Finding Correlated Columns in a DataFrame
In addition to TableReport's Associations tab, you can compute associations using the column_associations() function, which returns a dataframe containing the associations.
Reported metrics include Cramér's V statistic and Pearson's correlation coefficient. Each row of the result describes one pair of columns: it contains the name and index (idx) of the left and right column, followed by both association values. Rows are sorted in descending order of Cramér's V.
This gives you direct access to the information displayed in the TableReport, so that it can be reused later (e.g., to select which columns to drop).