set_config
- skrub.set_config(use_table_report=None, use_table_report_data_ops=None, max_plot_columns=None, max_association_columns=None, subsampling_seed=None, enable_subsampling=None, float_precision=None, cardinality_threshold=None)
Set global skrub configuration.
- Parameters:
- use_table_report
bool, default=None
The type of display used for dataframes. Default is True.
If True, replace the default DataFrame HTML displays with TableReport.
If False, the original Pandas or Polars dataframe HTML representation will be used.
This configuration can also be set with the SKB_USE_TABLE_REPORT environment variable.
- use_table_report_data_ops
bool, default=None
The type of HTML representation used for the dataframe previews in skrub DataOps. Default is False.
If True, TableReport will be used.
If False, the original Pandas or Polars dataframe display will be used.
This configuration can also be set with the SKB_USE_TABLE_REPORT_DATA_OPS environment variable.
- max_plot_columns
int, default=None
Set the max_plot_columns argument of TableReport. Default is 30. If "all", all columns will be plotted.
This configuration can also be set with the SKB_MAX_PLOT_COLUMNS environment variable.
- max_association_columns
int, default=None
Set the max_association_columns argument of TableReport. Default is 30. If "all", all columns will be plotted.
This configuration can also be set with the SKB_MAX_ASSOCIATION_COLUMNS environment variable.
- subsampling_seed
int, default=None
Set the random seed used for subsampling by skrub.DataOp.skb.subsample() in skrub DataOps when how="random" is passed.
This configuration can also be set with the SKB_SUBSAMPLING_SEED environment variable.
- enable_subsampling
{"default", "disable", "force"}, default=None
Control the activation of subsampling in skrub DataOps skrub.DataOp.skb.subsample(). Default is "default".
If "default", the behavior of skrub.DataOp.skb.subsample() is used.
If "disable", subsampling is never used, so skb.subsample becomes a no-op.
If "force", subsampling is used in all DataOps evaluation modes (eval(), fit_transform, etc.).
This configuration can also be set with the SKB_ENABLE_SUBSAMPLING environment variable.
- float_precision
int, default=None
Control the number of significant digits shown when formatting floats. The value sets the overall precision rather than a fixed number of decimal places. Default is 3.
This configuration can also be set with the SKB_FLOAT_PRECISION environment variable.
- cardinality_threshold
int, default=None
Set the cardinality_threshold argument of TableVectorizer. Controls the threshold used to warn users when their dataset contains high-cardinality columns. Default is 40.
This configuration can also be set with the SKB_CARDINALITY_THRESHOLD environment variable.
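The float_precision entry above distinguishes significant digits from fixed decimal places. The difference can be illustrated with Python's built-in format specifiers. This is only an analogy: it does not call skrub, and skrub's own formatting may differ in details such as when scientific notation is used.

```python
# Illustration only: uses Python's built-in formatting, not skrub internals.
values = [3.14159, 12.3456, 0.00012345]

# ".3g" keeps 3 significant digits, whatever the magnitude of the number.
significant = [format(v, ".3g") for v in values]

# ".3f" keeps 3 digits after the decimal point, so very small values collapse to zero.
fixed = [format(v, ".3f") for v in values]

print(significant)  # ['3.14', '12.3', '0.000123']
print(fixed)        # ['3.142', '12.346', '0.000']
```

With a significant-digit setting, small values such as 0.00012345 remain readable, which a fixed number of decimal places would round away.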
See also
get_config : Retrieve current values for global configuration.
config_context : Context manager for global skrub configuration.
Examples
>>> from skrub import set_config
>>> set_config(use_table_report=True)