.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/10_apply_on_cols.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_10_apply_on_cols.py>`
        to download the full example code, or to run this example in your
        browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_10_apply_on_cols.py:

Hands-On with Column Selection and Transformers
===============================================

In previous examples, we saw how skrub provides powerful abstractions like
:class:`~skrub.TableVectorizer` and :func:`~skrub.tabular_learner` to create
pipelines. In this example, we show how to create more flexible pipelines by
selecting and transforming dataframe columns using arbitrary logic.

.. GENERATED FROM PYTHON SOURCE LINES 13-15

We begin by loading a dataset with heterogeneous datatypes and replacing the
default pandas display with the ``TableReport`` display via
:func:`skrub.set_config`.

.. GENERATED FROM PYTHON SOURCE LINES 15-23

.. code-block:: Python

    import skrub
    from skrub.datasets import fetch_employee_salaries

    skrub.set_config(use_tablereport=True)

    data = fetch_employee_salaries()
    X, y = data.X, data.y
    X
.. GENERATED FROM PYTHON SOURCE LINES 24-36

Our goal is now to apply a :class:`~skrub.StringEncoder` to two columns of our
choosing: ``division`` and ``employee_position_title``. We can achieve this
using :class:`~skrub.ApplyToCols`, whose job is to apply a transformer to
multiple columns independently, while letting unmatched columns pass through
unchanged. It can be seen as a handy drop-in replacement for
:class:`~sklearn.compose.ColumnTransformer`.

Since we selected two columns and set the number of components to ``30`` for
each, :class:`~skrub.ApplyToCols` will create ``2 * 30`` embedding columns in
the dataframe ``Xt``, which we prefix with ``lsa_``.

.. GENERATED FROM PYTHON SOURCE LINES 36-46

.. code-block:: Python

    from skrub import ApplyToCols, StringEncoder

    apply_string_encoder = ApplyToCols(
        StringEncoder(n_components=30),
        cols=["division", "employee_position_title"],
        rename_columns="lsa_{}",
    )
    Xt = apply_string_encoder.fit_transform(X)
    Xt
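Since :class:`~skrub.ApplyToCols` is described above as a drop-in replacement
for :class:`~sklearn.compose.ColumnTransformer`, here is a minimal sketch of
the equivalent plain scikit-learn construction, on a small hypothetical frame
(:class:`~sklearn.feature_extraction.text.TfidfVectorizer` stands in for the
skrub :class:`~skrub.StringEncoder`; the toy data is made up for illustration):

.. code-block:: Python

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Hypothetical toy data standing in for the employee salaries table.
    toy = pd.DataFrame(
        {
            "division": ["police", "fire", "police"],
            "employee_position_title": ["officer", "chief", "sergeant"],
            "year_first_hired": [2001, 1998, 2010],
        }
    )

    # One transformer per text column; the other columns pass through
    # unchanged, mirroring the unmatched-columns behavior of ApplyToCols.
    ct = ColumnTransformer(
        [
            ("lsa_division", TfidfVectorizer(), "division"),
            ("lsa_title", TfidfVectorizer(), "employee_position_title"),
        ],
        remainder="passthrough",
    )
    out = ct.fit_transform(toy)
    # 2 tf-idf features for "division", 3 for the titles, 1 passthrough column.
    print(out.shape)

The main practical difference is that ``ColumnTransformer`` asks for one
``(name, transformer, columns)`` triple per block, while ``ApplyToCols``
clones a single transformer across every matched column.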
.. GENERATED FROM PYTHON SOURCE LINES 47-59

In addition to :class:`~skrub.ApplyToCols`, the :class:`~skrub.ApplyToFrame`
class is useful for transformers that work on multiple columns at once, such
as :class:`~sklearn.decomposition.PCA`, which reduces the number of
components.

To select columns without hardcoding their names, we introduce
:ref:`selectors`, which allow for flexible matching patterns and composable
logic. The regex selector below matches all columns prefixed with ``"lsa"``
and passes them to :class:`~skrub.ApplyToFrame`, which assembles these columns
into a dataframe and finally passes it to the PCA.

.. GENERATED FROM PYTHON SOURCE LINES 59-68

.. code-block:: Python

    from sklearn.decomposition import PCA

    from skrub import ApplyToFrame
    from skrub import selectors as s

    apply_pca = ApplyToFrame(PCA(n_components=8), cols=s.regex("lsa"))
    Xt = apply_pca.fit_transform(Xt)
    Xt
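To make the column flow of this step explicit, the regex-then-PCA pattern can
be sketched with plain pandas and scikit-learn on hypothetical data: gather
the matching columns into one block, reduce that block, and reattach the
result to the untouched columns.

.. code-block:: Python

    import numpy as np
    import pandas as pd
    from sklearn.decomposition import PCA

    # Hypothetical frame with three "lsa"-prefixed embedding columns.
    rng = np.random.default_rng(0)
    frame = pd.DataFrame(
        rng.normal(size=(10, 5)),
        columns=["lsa_0", "lsa_1", "lsa_2", "year_first_hired", "salary"],
    )

    # Gather the columns matching the "lsa" prefix into one block...
    lsa_cols = frame.filter(regex="^lsa").columns
    # ...reduce that block with a single PCA fit...
    reduced = PCA(n_components=2).fit_transform(frame[lsa_cols])
    # ...and let the remaining columns through unchanged.
    out = pd.concat(
        [
            frame.drop(columns=lsa_cols),
            pd.DataFrame(reduced, columns=["pca0", "pca1"], index=frame.index),
        ],
        axis=1,
    )

This is exactly what distinguishes ``ApplyToFrame`` from ``ApplyToCols``: the
matched columns are processed jointly, as one frame, instead of one by one.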
.. GENERATED FROM PYTHON SOURCE LINES 69-71

These two transformers are scikit-learn compatible and can be chained together
within a :class:`~sklearn.pipeline.Pipeline`.

.. GENERATED FROM PYTHON SOURCE LINES 71-78

.. code-block:: Python

    from sklearn.pipeline import make_pipeline

    model = make_pipeline(
        apply_string_encoder,
        apply_pca,
    ).fit_transform(X)

.. GENERATED FROM PYTHON SOURCE LINES 79-81

Note that selectors also come in handy in a pipeline to select or drop
columns, using :class:`~skrub.SelectCols` and :class:`~skrub.DropCols`!

.. GENERATED FROM PYTHON SOURCE LINES 81-92

.. code-block:: Python

    from sklearn.preprocessing import StandardScaler

    from skrub import SelectCols

    # Select only numerical columns
    pipeline = make_pipeline(
        SelectCols(cols=s.numeric()),
        StandardScaler(),
    ).set_output(transform="pandas")
    pipeline.fit_transform(Xt)
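As a rough plain-pandas analogue of ``SelectCols(cols=s.numeric())`` followed
by scaling, ``select_dtypes`` can stand in for the numeric selector (the toy
frame below is hypothetical):

.. code-block:: Python

    import pandas as pd
    from sklearn.preprocessing import StandardScaler

    frame = pd.DataFrame(
        {
            "division": ["a", "b", "c"],
            "salary": [1.0, 2.0, 3.0],
            "years": [10, 20, 30],
        }
    )

    # Keep only the numeric columns, then standardize them.
    numeric = frame.select_dtypes("number")
    scaled = pd.DataFrame(
        StandardScaler().fit_transform(numeric),
        columns=numeric.columns,
        index=numeric.index,
    )

The advantage of ``SelectCols`` over this manual version is that the selection
is part of the pipeline, so it is re-applied consistently at ``transform``
time on new data.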
.. GENERATED FROM PYTHON SOURCE LINES 93-100

Let's run through one more example to showcase the expressiveness of the
selectors. Suppose we want to apply an
:class:`~sklearn.preprocessing.OrdinalEncoder` on categorical columns with low
cardinality (e.g., fewer than ``40`` unique values).

We define a column filter using skrub selectors with a lambda function. Note
that the same effect can be obtained directly by using
:func:`~skrub.selectors.cardinality_below`.

.. GENERATED FROM PYTHON SOURCE LINES 100-105

.. code-block:: Python

    from sklearn.preprocessing import OrdinalEncoder

    low_cardinality = s.filter(lambda col: col.nunique() < 40)
    ApplyToCols(OrdinalEncoder(), cols=s.string() & low_cardinality).fit_transform(X)
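The cardinality filter itself is easy to emulate with plain pandas, which can
help when checking which columns a ``s.filter(...)`` selector would match.
This sketch uses a hypothetical toy frame and threshold:

.. code-block:: Python

    import pandas as pd
    from sklearn.preprocessing import OrdinalEncoder

    frame = pd.DataFrame(
        {
            "division": ["a", "b", "a", "c"],
            "employee_position_title": ["t1", "t2", "t3", "t4"],
            "salary": [1.0, 2.0, 3.0, 4.0],
        }
    )

    # String columns whose number of unique values falls below the threshold,
    # mirroring s.string() & s.filter(lambda col: col.nunique() < threshold).
    threshold = 4
    low_card = [
        c
        for c in frame.columns
        if frame[c].dtype == object and frame[c].nunique() < threshold
    ]
    frame[low_card] = OrdinalEncoder().fit_transform(frame[low_card])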
.. GENERATED FROM PYTHON SOURCE LINES 106-113

Notice how we composed the selector with :func:`~skrub.selectors.string`
using a logical operator. The resulting selector matches string columns with
cardinality below ``40``.

We can also define the opposite selector, ``high_cardinality``, using the
negation operator ``~``, and apply a :class:`~skrub.StringEncoder` to
vectorize those columns.

.. GENERATED FROM PYTHON SOURCE LINES 113-129

.. code-block:: Python

    from sklearn.ensemble import HistGradientBoostingRegressor

    high_cardinality = ~low_cardinality
    pipeline = make_pipeline(
        ApplyToCols(
            OrdinalEncoder(),
            cols=s.string() & low_cardinality,
        ),
        ApplyToCols(
            StringEncoder(),
            cols=s.string() & high_cardinality,
        ),
        HistGradientBoostingRegressor(),
    ).fit(X, y)
    pipeline

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Pipeline(steps=[('applytocols-1',
                     ApplyToCols(cols=(string() & filter(<lambda>)),
                                 transformer=OrdinalEncoder())),
                    ('applytocols-2',
                     ApplyToCols(cols=(string() & (~filter(<lambda>))),
                                 transformer=StringEncoder())),
                    ('histgradientboostingregressor',
                     HistGradientBoostingRegressor())])


.. GENERATED FROM PYTHON SOURCE LINES 130-135

Interestingly, the pipeline above is similar to the datatype dispatching
performed by :class:`~skrub.TableVectorizer`, which is also used in
:func:`~skrub.tabular_learner`. Click on the dropdown arrows next to each
datatype to see how the columns are mapped to the different transformers in
:class:`~skrub.TableVectorizer`.

.. GENERATED FROM PYTHON SOURCE LINES 135-138

.. code-block:: Python

    from skrub import tabular_learner

    tabular_learner("regressor").fit(X, y)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    /home/circleci/project/skrub/_tabular_pipeline.py:75: FutureWarning: tabular_learner will be deprecated in the next release. Equivalent functionality is available in skrub.set_config.

    Pipeline(steps=[('tablevectorizer',
                     TableVectorizer(low_cardinality=ToCategorical())),
                    ('histgradientboostingregressor',
                     HistGradientBoostingRegressor())])


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 8.264 seconds)


.. _sphx_glr_download_auto_examples_10_apply_on_cols.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/skrub-data/skrub/main?urlpath=lab/tree/notebooks/auto_examples/10_apply_on_cols.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../lite/lab/index.html?path=auto_examples/10_apply_on_cols.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: 10_apply_on_cols.ipynb <10_apply_on_cols.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: 10_apply_on_cols.py <10_apply_on_cols.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: 10_apply_on_cols.zip <10_apply_on_cols.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_