Cleaning a dataframe
Perform robust centering and scaling followed by soft clipping.

Deduplicate categorical data by hierarchically clustering similar strings.

Column-wise consistency checks and sanitization of dtypes, null values and dates.

Drop column if it is found to be uninformative according to various criteria.
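
To make two of the operations listed above more concrete, here is a minimal sketch in plain NumPy and pandas. It is not the library's implementation. The function names `robust_scale_soft_clip` and `drop_uninformative`, the parameters `max_abs` and `null_threshold`, and the sample data are all hypothetical. The sketch assumes that "robust" means centering on the median and scaling by the interquartile range, that soft clipping can be done with the smooth squashing function z / sqrt(1 + (z / max_abs)^2), and that "uninformative" means a column is constant or entirely null. The actual criteria may differ.

```python
import numpy as np
import pandas as pd


def robust_scale_soft_clip(values, max_abs=3.0):
    """Center on the median, scale by the IQR, then squash into (-max_abs, max_abs).

    Hypothetical helper: the choice of median/IQR and of the squashing
    function is an assumption made for illustration.
    """
    x = np.asarray(values, dtype=float)
    median = np.nanmedian(x)
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    scale = iqr if iqr > 0 else 1.0  # avoid dividing by zero when the IQR is zero
    z = (x - median) / scale
    # Soft clipping: close to the identity near 0, saturates towards +/- max_abs.
    return z / np.sqrt(1.0 + (z / max_abs) ** 2)


def drop_uninformative(df, null_threshold=1.0):
    """Drop columns that are constant or whose null fraction reaches the threshold.

    Hypothetical helper: these two criteria are assumptions.
    """
    keep = []
    for name in df.columns:
        col = df[name]
        too_many_nulls = col.isna().mean() >= null_threshold
        is_constant = col.nunique(dropna=True) <= 1
        if not (too_many_nulls or is_constant):
            keep.append(name)
    return df[keep]


# Made-up sample data: one extreme outlier, one constant column, one empty column.
df = pd.DataFrame(
    {
        "income": [30.0, 32.0, 35.0, 31.0, 1_000_000.0],
        "country": ["FR", "FR", "FR", "FR", "FR"],
        "notes": [None, None, None, None, None],
        "city": ["Paris", "Lyon", "Paris", "Nice", "Lyon"],
    }
)

cleaned = drop_uninformative(df)
cleaned = cleaned.assign(income=robust_scale_soft_clip(cleaned["income"]))
print(cleaned)
```

In this sketch the `country` and `notes` columns are dropped, and the outlier in `income` ends up just below `max_abs` instead of dominating the scale, while the ordering of the values is preserved.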