Introduction to machine-learning pipelines with skrub DataOps
In this example, we show how to use skrub’s DataOps to build a machine-learning pipeline that records all the operations involved in preprocessing data and training a model. We also show how to save the model, load it back, and use it to make predictions on new, unseen data.
This example is an introduction to skrub DataOps and does not cover every feature. Further examples in the Skrub DataOps section of the gallery go into more detail on how to use DataOps for more complex tasks.
The data
We begin by loading the employee salaries dataset, which is a regression dataset
that contains information about employees and their current annual salaries.
By default, the datasets.fetch_employee_salaries()
function returns the training set.
We will load the test set later, to evaluate our model on unseen data.
from skrub.datasets import fetch_employee_salaries
training_data = fetch_employee_salaries(split="train").employee_salaries
We can take a look at the dataset using the TableReport.
This dataset contains numerical, categorical, and datetime features. The column current_annual_salary is the target variable we want to predict.
import skrub
skrub.TableReport(training_data)
| | gender | department | department_name | division | assignment_category | employee_position_title | date_first_hired | year_first_hired | current_annual_salary |
---|---|---|---|---|---|---|---|---|---|
0 | F | POL | Department of Police | MSB Information Mgmt and Tech Division Records Management Section | Fulltime-Regular | Office Services Coordinator | 09/22/1986 | 1,986 | 6.92e+04 |
1 | M | POL | Department of Police | ISB Major Crimes Division Fugitive Section | Fulltime-Regular | Master Police Officer | 09/12/1988 | 1,988 | 9.74e+04 |
2 | F | HHS | Department of Health and Human Services | Adult Protective and Case Management Services | Fulltime-Regular | Social Worker IV | 11/19/1989 | 1,989 | 1.05e+05 |
3 | M | COR | Correction and Rehabilitation | PRRS Facility and Security | Fulltime-Regular | Resident Supervisor II | 05/05/2014 | 2,014 | 5.27e+04 |
4 | M | HCA | Department of Housing and Community Affairs | Affordable Housing Programs | Fulltime-Regular | Planning Specialist III | 03/05/2007 | 2,007 | 9.34e+04 |
7,995 | M | HHS | Department of Health and Human Services | Adult Drug Court | Fulltime-Regular | Supervisory Therapist | 11/27/1994 | 1,994 | 1.06e+05 |
7,996 | F | POL | Department of Police | FSB Traffic Division School Safety Section | Parttime-Regular | Crossing Guard | 05/04/2015 | 2,015 | 1.66e+04 |
7,997 | M | DPS | Department of Permitting Services | Building Construction Permit Processing | Fulltime-Regular | Permit Technician II | 09/05/2006 | 2,006 | 5.88e+04 |
7,998 | F | HHS | Department of Health and Human Services | School Health Services | Fulltime-Regular | Community Health Nurse II | 07/28/2015 | 2,015 | 8.00e+04 |
7,999 | M | CAT | County Attorney's Office | Insurance Defense Litigation | Fulltime-Regular | Assistant County Attorney III | 05/19/2014 | 2,014 | 1.11e+05 |
gender (ObjectDType): null values 10 (0.1%); unique values 2 (< 0.1%); most frequent values: ['M', 'F']
department (ObjectDType): null values 0 (0.0%); unique values 37 (0.5%); most frequent values: ['POL', 'HHS', 'FRS', 'DOT', 'COR', 'DLC', 'DGS', 'LIB', 'DPS', 'SHF']
department_name (ObjectDType): null values 0 (0.0%); unique values 37 (0.5%); most frequent values: ['Department of Police', 'Department of Health and Human Services', 'Fire and Rescue Services', 'Department of Transportation', 'Correction and Rehabilitation', 'Department of Liquor Control', 'Department of General Services', 'Department of Public Libraries', 'Department of Permitting Services', "Sheriff's Office"]
division (ObjectDType): null values 0 (0.0%); unique values 681 (8.5%), high cardinality (> 40); most frequent values: ['Transit Silver Spring Ride On', 'School Health Services', 'Transit Gaithersburg Ride On', 'Highway Services', 'Child Welfare Services', 'FSB Traffic Division School Safety Section', 'PSB 3rd District Patrol', 'Income Supports', 'PSB 4th District Patrol', 'Transit Nicholson Ride On']
assignment_category (ObjectDType): null values 0 (0.0%); unique values 2 (< 0.1%); most frequent values: ['Fulltime-Regular', 'Parttime-Regular']
employee_position_title (ObjectDType): null values 0 (0.0%); unique values 431 (5.4%), high cardinality (> 40); most frequent values: ['Bus Operator', 'Police Officer III', 'Firefighter/Rescuer III', 'Manager III', 'Firefighter/Rescuer II', 'Master Firefighter/Rescuer', 'Office Services Coordinator', 'School Health Room Technician I', 'Community Health Nurse II', 'Police Officer II']
date_first_hired (ObjectDType): null values 0 (0.0%); unique values 2,101 (26.3%), high cardinality (> 40); most frequent values: ['12/12/2016', '01/14/2013', '02/24/2014', '03/10/2014', '08/12/2013', '09/22/2014', '10/06/2014', '03/19/2007', '07/16/2012', '07/29/2013']
year_first_hired (Int64DType): null values 0 (0.0%); unique values 51 (0.6%), high cardinality (> 40); mean ± std 2.00e+03 ± 9.33; median ± IQR 2,005 ± 14; min | max 1,965 | 2,016
current_annual_salary (Float64DType): null values 0 (0.0%); unique values 3,063 (38.3%), high cardinality (> 40); mean ± std 7.34e+04 ± 2.93e+04; median ± IQR 6.92e+04 ± 3.94e+04; min | max 9.20e+03 | 3.03e+05
Column | Column name | dtype | Is sorted | Null values | Unique values | Mean | Std | Min | Median | Max |
---|---|---|---|---|---|---|---|---|---|---|
0 | gender | ObjectDType | False | 10 (0.1%) | 2 (< 0.1%) | | | | | |
1 | department | ObjectDType | False | 0 (0.0%) | 37 (0.5%) | | | | | |
2 | department_name | ObjectDType | False | 0 (0.0%) | 37 (0.5%) | | | | | |
3 | division | ObjectDType | False | 0 (0.0%) | 681 (8.5%) | | | | | |
4 | assignment_category | ObjectDType | False | 0 (0.0%) | 2 (< 0.1%) | | | | | |
5 | employee_position_title | ObjectDType | False | 0 (0.0%) | 431 (5.4%) | | | | | |
6 | date_first_hired | ObjectDType | False | 0 (0.0%) | 2101 (26.3%) | | | | | |
7 | year_first_hired | Int64DType | False | 0 (0.0%) | 51 (0.6%) | 2.00e+03 | 9.33 | 1,965 | 2,005 | 2,016 |
8 | current_annual_salary | Float64DType | False | 0 (0.0%) | 3063 (38.3%) | 7.34e+04 | 2.93e+04 | 9.20e+03 | 6.92e+04 | 3.03e+05 |
Column 1 | Column 2 | Cramér's V | Pearson's Correlation |
---|---|---|---|
department | department_name | 1.00 | |
assignment_category | current_annual_salary | 0.706 | |
division | assignment_category | 0.611 | |
assignment_category | employee_position_title | 0.513 | |
department | assignment_category | 0.423 | |
department_name | assignment_category | 0.423 | |
division | employee_position_title | 0.420 | |
department_name | employee_position_title | 0.418 | |
department | employee_position_title | 0.418 | |
gender | department_name | 0.388 | |
gender | department | 0.388 | |
department_name | division | 0.367 | |
department | division | 0.367 | |
gender | employee_position_title | 0.295 | |
employee_position_title | current_annual_salary | 0.273 | |
gender | assignment_category | 0.262 | |
gender | division | 0.259 | |
employee_position_title | date_first_hired | 0.248 | |
division | current_annual_salary | 0.246 | |
department | current_annual_salary | 0.206 |
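The null-value and unique-value counts shown in the report can also be computed by hand. The following is a minimal pandas sketch on a small, hypothetical table (the `toy` data below is made up for illustration and is not the real dataset); it is not how TableReport is implemented.

```python
import pandas as pd

# Hypothetical toy data that mimics a few columns of the employee salaries table.
toy = pd.DataFrame(
    {
        "gender": ["F", "M", None, "M"],
        "department": ["POL", "HHS", "POL", "COR"],
        "year_first_hired": [1986, 1988, 1989, 2014],
    }
)

# One row per column of `toy`, similar in spirit to the report's summary table.
summary = pd.DataFrame(
    {
        "dtype": toy.dtypes.astype(str),
        "null_values": toy.isna().sum(),
        "unique_values": toy.nunique(),  # missing values are not counted
    }
)
print(summary)
```

The report adds distributions, most frequent values, and column associations on top of these basic counts.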
Assembling our DataOps plan
Our goal is to predict the current_annual_salary of employees based on their other features. We will use skrub’s DataOps to combine both skrub and scikit-learn objects into a single DataOps plan, which will allow us to preprocess the data, train a model, and tune hyperparameters.
We begin by defining a skrub var(), which is the entry point for our DataOps plan.
data_var = skrub.var("data", training_data)
Next, we define the initial features X and the target variable y.
We use the DataOp.skb.mark_as_X() and DataOp.skb.mark_as_y() methods to mark these variables in the DataOps plan. This allows skrub to properly split these objects into training and validation sets when executing cross-validation or hyperparameter tuning.
X = data_var.drop("current_annual_salary", axis=1).skb.mark_as_X()
y = data_var["current_annual_salary"].skb.mark_as_y()
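To illustrate why the features and the target must be identified, here is a sketch of what a cross-validation split has to do: slice X and y with the same row indices in every fold. It uses scikit-learn's KFold on a hypothetical toy table and is only an analogy; it is not skrub's internal code.

```python
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical toy table; only the target column name comes from the example.
data = pd.DataFrame(
    {
        "feature": [1, 2, 3, 4, 5, 6],
        "current_annual_salary": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    }
)
X = data.drop("current_annual_salary", axis=1)
y = data["current_annual_salary"]

fold_sizes = []
for train_idx, val_idx in KFold(n_splits=3).split(X, y):
    # The same row indices are applied to both X and y,
    # so features and targets stay aligned in every fold.
    X_train, X_val = X.iloc[train_idx], X.iloc[val_idx]
    y_train, y_val = y.iloc[train_idx], y.iloc[val_idx]
    assert X_train.index.equals(y_train.index)
    assert X_val.index.equals(y_val.index)
    fold_sizes.append((len(X_train), len(X_val)))

print(fold_sizes)
```

Marking X and y in the plan gives skrub the information it needs to perform this kind of aligned split on our behalf.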
Our first step is to vectorize the features in X. We will use the TableVectorizer to convert the categorical, datetime, and numerical features into a numerical format that can be used by machine learning algorithms.
We apply the vectorizer to X using the .skb.apply() method, which allows us to apply any scikit-learn compatible transformer to the skrub variable.
from skrub import TableVectorizer
vectorizer = TableVectorizer()
X_vec = X.skb.apply(vectorizer)
X_vec
| | gender_F | gender_M | gender_nan | department_BOA | department_BOE | department_CAT | department_CCL | department_CEC | department_CEX | department_COR | department_CUS | department_DEP | department_DGS | department_DHS | department_DLC | department_DOT | department_DPS | department_DTS | department_ECM | department_FIN | department_FRS | department_HCA | department_HHS | department_HRC | department_IGR | department_LIB | department_MPB | department_NDA | department_OAG | department_OCP | department_OHR | department_OIG | department_OLO | department_OMB | department_PIO | department_POL | department_PRO | department_REC | department_SHF | department_ZAH | department_name_Board of Appeals Department | department_name_Board of Elections | department_name_Community Engagement Cluster | department_name_Community Use of Public Facilities | department_name_Correction and Rehabilitation | department_name_County Attorney's Office | department_name_County Council | department_name_Department of Environmental Protection | department_name_Department of Finance | department_name_Department of General Services | department_name_Department of Health and Human Services | department_name_Department of Housing and Community Affairs | department_name_Department of Liquor Control | department_name_Department of Permitting Services | department_name_Department of Police | department_name_Department of Public Libraries | department_name_Department of Recreation | department_name_Department of Technology Services | department_name_Department of Transportation | department_name_Ethics Commission | department_name_Fire and Rescue Services | department_name_Merit System Protection Board Department | department_name_Non-Departmental Account | department_name_Office of Agriculture | department_name_Office of Consumer Protection | department_name_Office of Emergency Management and Homeland Security | department_name_Office of Human Resources | department_name_Office of Human Rights | department_name_Office of Intergovernmental Relations Department | department_name_Office of Legislative Oversight | department_name_Office of Management and Budget | department_name_Office of Procurement | department_name_Office of Public Information | department_name_Office of Zoning and Administrative Hearings | department_name_Office of the Inspector General | department_name_Offices of the County Executive | department_name_Sheriff's Office | division_00 | division_01 | division_02 | division_03 | division_04 | division_05 | division_06 | division_07 | division_08 | division_09 | division_10 | division_11 | division_12 | division_13 | division_14 | division_15 | division_16 | division_17 | division_18 | division_19 | division_20 | division_21 | division_22 | division_23 | division_24 | division_25 | division_26 | division_27 | division_28 | division_29 | assignment_category_Parttime-Regular | employee_position_title_00 | employee_position_title_01 | employee_position_title_02 | employee_position_title_03 | employee_position_title_04 | employee_position_title_05 | employee_position_title_06 | employee_position_title_07 | employee_position_title_08 | employee_position_title_09 | employee_position_title_10 | employee_position_title_11 | employee_position_title_12 | employee_position_title_13 | employee_position_title_14 | employee_position_title_15 | employee_position_title_16 | employee_position_title_17 | employee_position_title_18 | employee_position_title_19 | employee_position_title_20 | employee_position_title_21 | employee_position_title_22 | employee_position_title_23 | employee_position_title_24 | employee_position_title_25 | employee_position_title_26 | employee_position_title_27 | employee_position_title_28 | employee_position_title_29 | date_first_hired_year | date_first_hired_month | date_first_hired_day | date_first_hired_total_seconds | year_first_hired |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.216 | 0.345 | -0.0275 | -0.0810 | -0.448 | -0.241 | -0.188 | -0.0626 | -0.232 | -0.0581 | -0.0529 | -0.103 | -0.0745 | 0.0235 | 0.101 | -0.0732 | -0.00734 | 0.137 | -0.173 | 0.181 | -0.154 | -0.332 | -0.168 | -0.172 | -0.235 | 0.0589 | 0.138 | -0.0128 | -0.00341 | 0.00186 | 0.00 | 0.395 | -0.159 | 0.182 | -0.0678 | 0.0959 | 0.0905 | 0.819 | 0.283 | -0.294 | 0.00592 | -0.145 | -0.400 | -0.0371 | -0.100 | 0.195 | 0.0225 | -0.121 | -0.00824 | 0.0719 | -0.119 | -0.0518 | -0.0862 | -0.00321 | -0.0235 | 0.0201 | -0.0342 | 0.0533 | 0.0670 | -0.0962 | 0.0413 | 1.99e+03 | 9.00 | 22.0 | 5.28e+08 | 1.99e+03 |
1 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.165 | 0.231 | -0.0198 | -0.0562 | -0.397 | -0.0296 | -0.100 | -0.0623 | 0.112 | 0.00881 | -0.0712 | -0.106 | -0.0217 | 0.0299 | -0.115 | 0.00641 | -0.0376 | -0.0409 | -0.268 | 0.309 | -0.0296 | 0.157 | -0.00405 | 0.00838 | 0.0309 | -0.0254 | -0.0903 | 0.0457 | 0.105 | -0.116 | 0.00 | 0.845 | -0.147 | -0.0485 | -0.110 | -0.0557 | -0.0474 | -0.108 | 0.0171 | -0.0499 | 0.0378 | -0.0588 | -0.00324 | 0.0683 | -0.0462 | 0.00206 | 0.00282 | -0.00366 | -0.0874 | -0.172 | -0.0264 | -0.140 | -0.195 | -0.0716 | 0.0201 | -0.116 | 0.654 | 0.143 | -0.0420 | 0.0881 | 0.0886 | 1.99e+03 | 9.00 | 12.0 | 5.90e+08 | 1.99e+03 |
2 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.130 | 0.235 | 0.359 | -0.0251 | -0.0793 | -0.400 | -0.158 | -0.0323 | -0.437 | -0.0652 | -0.0520 | -0.0741 | 0.0250 | 0.0621 | -0.00840 | -0.0325 | 0.0538 | 0.252 | -0.170 | 0.0967 | -0.00443 | -0.146 | -0.115 | -0.122 | 0.266 | -0.174 | -0.242 | 0.132 | -0.285 | 0.172 | 0.00 | 0.0480 | 0.0136 | 0.00654 | 0.0844 | 0.126 | 0.0546 | 0.203 | -0.247 | 0.691 | -0.423 | -0.476 | 0.0741 | 0.0239 | 0.0818 | 0.172 | 0.0429 | 0.0165 | -0.0117 | -0.0611 | -0.0628 | -0.0795 | -0.0181 | -0.0265 | 0.0437 | -0.00711 | 0.0521 | 0.0306 | 0.0497 | -0.00622 | 0.0787 | 1.99e+03 | 11.0 | 19.0 | 6.27e+08 | 1.99e+03 |
3 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.0577 | 0.0825 | 0.0688 | -0.00147 | -0.300 | -0.145 | 0.582 | 0.0897 | -0.0912 | 0.00313 | 0.0237 | 0.0344 | 0.0335 | 0.436 | -0.186 | 0.0915 | 0.0551 | -0.0194 | -0.00690 | 0.00482 | -0.134 | -0.111 | 0.172 | 0.0708 | 0.00370 | 0.114 | -0.0350 | -0.0817 | -0.0192 | 0.0104 | 0.00 | 0.0469 | 0.0220 | 0.0703 | 0.0383 | 0.0602 | 0.0488 | 0.134 | 0.0520 | 0.0254 | -0.00979 | 0.0599 | -0.0224 | -0.0489 | -0.0221 | -0.0366 | -0.0268 | 0.131 | 0.0390 | -0.0632 | 0.164 | 0.188 | -0.0417 | 0.0522 | 0.144 | -0.0250 | 0.0431 | -0.278 | 0.414 | 0.526 | 0.00463 | 2.01e+03 | 5.00 | 5.00 | 1.40e+09 | 2.01e+03 |
4 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.0147 | 0.0255 | 0.00183 | 0.0279 | -0.0377 | -0.0326 | -0.0224 | 0.0495 | -0.0238 | 0.00618 | 0.0173 | 0.0283 | 0.0232 | 0.0966 | -0.00106 | -0.0597 | 0.0295 | 0.115 | -0.0134 | -0.0389 | 0.0655 | 0.0221 | -0.0928 | -0.0401 | -0.0488 | 0.0480 | -0.122 | 0.00697 | -0.00809 | 0.0343 | 0.00 | 0.0893 | 0.0201 | 0.0236 | 0.228 | 0.395 | -0.0670 | -0.0202 | -0.146 | -0.00909 | -0.0368 | -0.0172 | -0.0662 | 0.00549 | 0.0933 | -0.225 | 0.0390 | -0.0750 | 0.0509 | 0.0655 | -0.0975 | 0.00344 | -0.0588 | 0.170 | -0.172 | 0.0173 | -0.0968 | 0.164 | -0.101 | 0.195 | 0.0505 | 2.01e+03 | 3.00 | 5.00 | 1.17e+09 | 2.01e+03 |
7,995 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.0116 | 0.0203 | 0.0125 | 0.00599 | -0.0234 | -0.0362 | -0.0124 | -0.00408 | -0.0316 | 0.0131 | 0.0532 | 0.0205 | 0.0303 | 0.0487 | 0.0455 | 0.0147 | 0.0259 | 0.194 | -0.0468 | -0.0250 | -0.0419 | 0.0105 | 0.150 | 0.0735 | 0.100 | -0.0872 | 0.104 | 0.240 | -0.0927 | -0.0407 | 0.00 | 0.0181 | -0.00205 | 0.0754 | 0.0419 | 0.102 | 0.0155 | 0.0909 | 0.0399 | 0.0194 | -0.0125 | 0.0573 | -0.0133 | -0.0416 | -0.0390 | -0.0555 | -0.0459 | 0.124 | -0.0260 | 0.160 | 0.133 | 0.325 | -0.317 | 0.473 | 0.571 | 0.133 | 0.133 | -0.250 | 0.186 | 0.261 | 0.0402 | 1.99e+03 | 11.0 | 27.0 | 7.86e+08 | 1.99e+03 |
7,996 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.335 | 0.447 | 0.157 | -0.0181 | -0.804 | 0.480 | -0.159 | -0.0751 | 0.219 | -0.122 | -0.211 | -0.592 | 0.214 | 0.00268 | -0.105 | 0.0188 | -0.000719 | 0.0540 | 0.194 | -0.253 | 0.0142 | 0.0596 | -0.0553 | 0.0368 | -0.0593 | 0.0237 | -0.0239 | 0.0271 | -0.0848 | 0.0537 | 1.00 | 0.00297 | 9.70e-06 | 0.00343 | 0.0126 | 0.0326 | -0.00644 | 0.0308 | 0.0491 | 0.0447 | -0.00769 | 0.0141 | -0.0386 | -0.0334 | -0.0328 | -0.423 | 1.08 | 0.0252 | -0.130 | 0.0188 | 0.00689 | -0.0254 | -0.0752 | -0.0533 | 0.0290 | -1.05e-05 | -0.0416 | 0.00246 | 0.0145 | -0.00750 | 0.0164 | 2.02e+03 | 5.00 | 4.00 | 1.43e+09 | 2.02e+03 |
7,997 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.0873 | 0.131 | -0.0301 | 0.0439 | -0.108 | -0.0670 | -0.0490 | 0.0855 | 0.00327 | 0.0369 | 0.0132 | 0.0665 | 0.0841 | 0.0917 | 0.0533 | -0.127 | 0.101 | 0.204 | -0.0498 | -0.0468 | 0.0103 | -0.0260 | -0.0985 | 0.359 | 0.0447 | -0.183 | -0.0891 | -0.161 | 0.112 | 0.174 | 0.00 | 0.0445 | 0.0117 | 0.0467 | 0.0853 | 0.163 | 0.557 | -0.182 | 0.0483 | 0.0338 | 0.0229 | -0.0752 | -0.268 | -0.0684 | -0.0439 | -0.00258 | 0.0215 | 0.0408 | 0.0895 | 0.0275 | 0.231 | 0.112 | 0.204 | 0.0824 | -0.0316 | 0.0654 | 0.0949 | 0.118 | 0.0318 | 0.0628 | -0.0219 | 2.01e+03 | 9.00 | 5.00 | 1.16e+09 | 2.01e+03 |
7,998 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.315 | 0.455 | 1.00 | -0.0593 | 0.147 | 0.695 | 0.0155 | 0.0921 | -0.176 | -0.0552 | 0.0799 | 0.145 | 0.000196 | -0.0299 | 0.0185 | -0.00339 | -0.0145 | -0.0493 | -0.0107 | 0.0344 | -0.0309 | -0.0365 | -0.0278 | 0.0317 | -0.00520 | -0.00738 | 0.0277 | -0.0240 | -0.0761 | -0.0309 | 0.00 | 0.0497 | 0.00947 | 0.0100 | 0.0885 | 0.127 | 0.519 | 0.264 | 0.0543 | -0.216 | -0.00347 | 0.0676 | 0.852 | 0.196 | 0.153 | -0.125 | -0.0115 | -0.125 | 0.0280 | -0.0138 | -0.0700 | -0.134 | 0.0527 | -0.0356 | 0.148 | 0.0358 | -0.0112 | 0.203 | 0.176 | -0.0286 | -0.0210 | 2.02e+03 | 7.00 | 28.0 | 1.44e+09 | 2.02e+03 |
7,999 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.0682 | 0.127 | -0.0618 | -0.0244 | -0.0555 | -0.0280 | -0.0220 | -0.00956 | 0.0109 | 0.0206 | 0.0403 | 0.0608 | 0.0272 | 0.173 | -0.144 | 0.0416 | -0.00607 | -0.0114 | -0.0687 | 0.160 | 0.00634 | 0.106 | -0.0952 | 0.0940 | -0.0154 | 0.0257 | 0.0638 | -0.0155 | -0.0825 | 0.0150 | 0.00 | 0.0731 | 0.0109 | 0.0302 | 0.0897 | 0.123 | -0.0341 | 0.0405 | 0.243 | 0.200 | 0.0224 | 0.140 | 0.0834 | -0.0525 | -0.134 | -0.0964 | -0.0294 | 0.00728 | 0.0662 | 0.0485 | -0.219 | 0.245 | 0.134 | -0.119 | 0.0254 | 0.0178 | 0.0233 | -0.00593 | -0.0159 | -0.0937 | 0.0532 | 2.01e+03 | 5.00 | 19.0 | 1.40e+09 | 2.01e+03 |
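The column names in the preview hint at what the vectorizer did: low-cardinality columns such as gender became one indicator column per category (plus one for missing values), and date_first_hired was parsed and expanded into numeric components. The following pandas sketch reproduces those two kinds of columns by hand on a hypothetical toy table. It is a rough approximation under those assumptions, not the TableVectorizer implementation, and it leaves out the encoding used for high-cardinality columns such as division.

```python
import pandas as pd

# Hypothetical toy data using values of the same shape as the real columns.
toy = pd.DataFrame(
    {
        "gender": ["F", "M", None],
        "date_first_hired": ["09/22/1986", "09/12/1988", "11/19/1989"],
    }
)

# One-hot encode the low-cardinality column, with an extra indicator for missing values.
gender_onehot = pd.get_dummies(
    toy["gender"], prefix="gender", dummy_na=True, dtype="float32"
)

# Parse the date strings, then extract numeric components.
dates = pd.to_datetime(toy["date_first_hired"], format="%m/%d/%Y")
date_features = pd.DataFrame(
    {
        "date_first_hired_year": dates.dt.year,
        "date_first_hired_month": dates.dt.month,
        "date_first_hired_day": dates.dt.day,
        # Seconds elapsed since the Unix epoch (1970-01-01).
        "date_first_hired_total_seconds": (
            dates - pd.Timestamp("1970-01-01")
        ).dt.total_seconds(),
    }
)

features = pd.concat([gender_onehot, date_features], axis=1)
print(features.columns.tolist())
```

Every resulting column is numeric, which is the property a downstream scikit-learn estimator needs.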
gender_F (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.405 ± 0.491; median ± IQR 0.00 ± 1.00; min | max 0.00 | 1.00
gender_M (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.594 ± 0.491; median ± IQR 1.00 ± 1.00; min | max 0.00 | 1.00
gender_nan (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00125 ± 0.0353; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_BOA (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.000375 ± 0.0194; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_BOE (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00262 ± 0.0512; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_CAT (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00763 ± 0.0870; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_CCL (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0100 ± 0.0995; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_CEC (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00800 ± 0.0891; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_CEX (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00387 ± 0.0621; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_COR (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0526 ± 0.223; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_CUS (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00313 ± 0.0558; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_DEP (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0182 ± 0.134; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_DGS (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0434 ± 0.204; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_DHS (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00137 ± 0.0371; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_DLC (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0434 ± 0.204; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_DOT (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.134 ± 0.341; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_DPS (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0231 ± 0.150; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_DTS (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0148 ± 0.121; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_ECM (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.000250 ± 0.0158; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_FIN (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.0125 ± 0.111; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_FRS (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.145 ± 0.352; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_HCA (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.00825 ± 0.0905; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_HHS (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.168 ± 0.374; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_HRC (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.000875 ± 0.0296; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_IGR (Float32DType): null values 0 (0.0%); unique values 2 (< 0.1%); mean ± std 0.000375 ± 0.0194; median ± IQR 0.00 ± 0.00; min | max 0.00 | 1.00
department_LIB
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0399 ± 0.196
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_MPB
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000250 ± 0.0158
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_NDA
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00175 ± 0.0418
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_OAG
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000875 ± 0.0296
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_OCP
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00187 ± 0.0433
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_OHR
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00738 ± 0.0856
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_OIG
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000500 ± 0.0224
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_OLO
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00125 ± 0.0353
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_OMB
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00325 ± 0.0569
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_PIO
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00625 ± 0.0788
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_POL
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.198 ± 0.399
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_PRO
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00338 ± 0.0580
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_REC
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0139 ± 0.117
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_SHF
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0188 ± 0.136
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_ZAH
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000500 ± 0.0224
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Board of Appeals Department
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000375 ± 0.0194
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Board of Elections
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00262 ± 0.0512
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Community Engagement Cluster
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00800 ± 0.0891
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Community Use of Public Facilities
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00313 ± 0.0558
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Correction and Rehabilitation
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0526 ± 0.223
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_County Attorney's Office
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00763 ± 0.0870
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_County Council
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0100 ± 0.0995
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Environmental Protection
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0182 ± 0.134
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Finance
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0125 ± 0.111
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of General Services
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0434 ± 0.204
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Health and Human Services
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.168 ± 0.374
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Housing and Community Affairs
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00825 ± 0.0905
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Liquor Control
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0434 ± 0.204
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Permitting Services
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0231 ± 0.150
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Police
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.198 ± 0.399
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Public Libraries
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0399 ± 0.196
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Recreation
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0139 ± 0.117
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Technology Services
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0148 ± 0.121
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Department of Transportation
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.134 ± 0.341
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Ethics Commission
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000250 ± 0.0158
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Fire and Rescue Services
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.145 ± 0.352
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Merit System Protection Board Department
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000250 ± 0.0158
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Non-Departmental Account
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00175 ± 0.0418
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Agriculture
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000875 ± 0.0296
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Consumer Protection
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00187 ± 0.0433
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Emergency Management and Homeland Security
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00137 ± 0.0371
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Human Resources
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00738 ± 0.0856
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Human Rights
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000875 ± 0.0296
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Intergovernmental Relations Department
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000375 ± 0.0194
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Legislative Oversight
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00125 ± 0.0353
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Management and Budget
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00325 ± 0.0569
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Procurement
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00338 ± 0.0580
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Public Information
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00625 ± 0.0788
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of Zoning and Administrative Hearings
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000500 ± 0.0224
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Office of the Inspector General
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.000500 ± 0.0224
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Offices of the County Executive
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.00387 ± 0.0621
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
department_name_Sheriff's Office
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0188 ± 0.136
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
division_00
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
674 (8.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.224 ± 0.281
- Median ± IQR
- 0.134 ± 0.215
- Min | Max
- 7.31e-05 | 1.12
division_01
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
674 (8.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.176 ± 0.284
- Median ± IQR
- 0.172 ± 0.298
- Min | Max
- -0.557 | 0.821
division_02
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
674 (8.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0400 ± 0.314
- Median ± IQR
- -0.00556 ± 0.158
- Min | Max
- -0.600 | 1.00
division_03
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
674 (8.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0438 ± 0.306
- Median ± IQR
- -0.00797 ± 0.0794
- Min | Max
- -0.404 | 1.12
division_04
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
677 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.0385 ± 0.232
- Median ± IQR
- -0.00726 ± 0.201
- Min | Max
- -0.804 | 0.372
division_05
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
680 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.0438 ± 0.218
- Median ± IQR
- -0.0441 ± 0.131
- Min | Max
- -0.657 | 0.695
division_06
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
680 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0114 ± 0.211
- Median ± IQR
- -0.00379 ± 0.116
- Min | Max
- -0.344 | 1.19
division_07
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
680 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00159 ± 0.207
- Median ± IQR
- -0.00348 ± 0.0523
- Min | Max
- -0.825 | 0.936
division_08
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
679 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.0156 ± 0.199
- Median ± IQR
- -0.0131 ± 0.0934
- Min | Max
- -0.846 | 0.665
division_09
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0151 ± 0.185
- Median ± IQR
- 0.00198 ± 0.0367
- Min | Max
- -0.531 | 1.03
division_10
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0275 ± 0.180
- Median ± IQR
- 0.00560 ± 0.0586
- Min | Max
- -0.377 | 1.23
division_11
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0166 ± 0.175
- Median ± IQR
- 0.00324 ± 0.128
- Min | Max
- -0.592 | 0.641
division_12
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0123 ± 0.167
- Median ± IQR
- -0.00468 ± 0.0731
- Min | Max
- -0.671 | 0.870
division_13
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0275 ± 0.159
- Median ± IQR
- 0.00233 ± 0.0789
- Min | Max
- -0.364 | 1.06
division_14
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00634 ± 0.155
- Median ± IQR
- -0.000898 ± 0.0868
- Min | Max
- -0.441 | 0.867
division_15
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.0135 ± 0.151
- Median ± IQR
- -0.0107 ± 0.0607
- Min | Max
- -0.617 | 0.684
division_16
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.000139 ± 0.147
- Median ± IQR
- 7.10e-06 ± 0.0788
- Min | Max
- -0.559 | 0.513
division_17
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0222 ± 0.140
- Median ± IQR
- 0.00427 ± 0.138
- Min | Max
- -0.432 | 0.533
division_18
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0128 ± 0.140
- Median ± IQR
- 0.00525 ± 0.0941
- Min | Max
- -0.581 | 0.539
division_19
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00710 ± 0.139
- Median ± IQR
- -0.00310 ± 0.0836
- Min | Max
- -0.369 | 0.635
division_20
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00180 ± 0.134
- Median ± IQR
- -0.00179 ± 0.0481
- Min | Max
- -0.418 | 0.983
division_21
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00984 ± 0.131
- Median ± IQR
- -0.00554 ± 0.0607
- Min | Max
- -0.585 | 0.612
division_22
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00700 ± 0.127
- Median ± IQR
- -0.00312 ± 0.0950
- Min | Max
- -0.383 | 0.602
division_23
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00332 ± 0.122
- Median ± IQR
- -0.00361 ± 0.0686
- Min | Max
- -0.502 | 0.596
division_24
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00320 ± 0.118
- Median ± IQR
- -0.00298 ± 0.0517
- Min | Max
- -0.712 | 0.815
division_25
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00349 ± 0.115
- Median ± IQR
- 0.00631 ± 0.0470
- Min | Max
- -0.337 | 0.718
division_26
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00146 ± 0.114
- Median ± IQR
- -0.00936 ± 0.0648
- Min | Max
- -0.326 | 0.634
division_27
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00709 ± 0.112
- Median ± IQR
- -0.00120 ± 0.0572
- Min | Max
- -0.446 | 0.627
division_28
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00526 ± 0.109
- Median ± IQR
- -0.00524 ± 0.0682
- Min | Max
- -0.343 | 0.526
division_29
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
681 (8.5%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.000560 ± 0.108
- Median ± IQR
- 0.000696 ± 0.0703
- Min | Max
- -0.295 | 0.591
assignment_category_Parttime-Regular
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
- Mean ± Std
- 0.0896 ± 0.286
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 1.00
employee_position_title_00
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.234 ± 0.313
- Median ± IQR
- 0.0893 ± 0.304
- Min | Max
- 0.000532 | 1.10
employee_position_title_01
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0777 ± 0.344
- Median ± IQR
- 0.00421 ± 0.0502
- Min | Max
- -0.374 | 1.08
employee_position_title_02
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.109 ± 0.301
- Median ± IQR
- 0.0149 ± 0.0476
- Min | Max
- -0.0620 | 1.16
employee_position_title_03
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.106 ± 0.275
- Median ± IQR
- 0.0314 ± 0.183
- Min | Max
- -0.166 | 0.989
employee_position_title_04
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0800 ± 0.240
- Median ± IQR
- 0.0430 ± 0.161
- Min | Max
- -0.584 | 0.684
employee_position_title_05
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0523 ± 0.208
- Median ± IQR
- -0.00336 ± 0.106
- Min | Max
- -0.310 | 0.963
employee_position_title_06
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0323 ± 0.192
- Median ± IQR
- 0.00383 ± 0.163
- Min | Max
- -0.306 | 0.819
employee_position_title_07
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0106 ± 0.190
- Median ± IQR
- -0.00498 ± 0.141
- Min | Max
- -0.533 | 0.636
employee_position_title_08
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0360 ± 0.177
- Median ± IQR
- 0.00421 ± 0.105
- Min | Max
- -0.294 | 0.773
employee_position_title_09
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00682 ± 0.180
- Median ± IQR
- 0.00592 ± 0.0636
- Min | Max
- -0.514 | 0.780
employee_position_title_10
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0219 ± 0.175
- Median ± IQR
- -0.00711 ± 0.0960
- Min | Max
- -0.520 | 0.696
employee_position_title_11
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00189 ± 0.173
- Median ± IQR
- 0.00308 ± 0.0772
- Min | Max
- -0.437 | 0.852
employee_position_title_12
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00592 ± 0.168
- Median ± IQR
- -0.00404 ± 0.0911
- Min | Max
- -0.499 | 0.895
employee_position_title_13
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0128 ± 0.165
- Median ± IQR
- -0.00431 ± 0.116
- Min | Max
- -0.449 | 0.657
employee_position_title_14
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00423 ± 0.160
- Median ± IQR
- -0.00619 ± 0.173
- Min | Max
- -0.445 | 0.397
employee_position_title_15
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0178 ± 0.155
- Median ± IQR
- -0.000739 ± 0.0529
- Min | Max
- -0.235 | 1.08
employee_position_title_16
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0113 ± 0.145
- Median ± IQR
- -0.00241 ± 0.103
- Min | Max
- -0.292 | 0.798
employee_position_title_17
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00434 ± 0.138
- Median ± IQR
- -0.00123 ± 0.0911
- Min | Max
- -0.251 | 0.687
employee_position_title_18
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00572 ± 0.136
- Median ± IQR
- 0.00568 ± 0.0863
- Min | Max
- -0.325 | 1.00
employee_position_title_19
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.0107 ± 0.132
- Median ± IQR
- -0.00328 ± 0.119
- Min | Max
- -0.318 | 0.438
employee_position_title_20
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00352 ± 0.129
- Median ± IQR
- -0.00315 ± 0.102
- Min | Max
- -0.486 | 0.379
employee_position_title_21
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00475 ± 0.127
- Median ± IQR
- 0.0138 ± 0.122
- Min | Max
- -0.427 | 0.531
employee_position_title_22
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00190 ± 0.125
- Median ± IQR
- -0.00285 ± 0.0864
- Min | Max
- -0.378 | 0.614
employee_position_title_23
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00778 ± 0.123
- Median ± IQR
- -0.00752 ± 0.0615
- Min | Max
- -0.427 | 0.698
employee_position_title_24
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00280 ± 0.122
- Median ± IQR
- 0.00189 ± 0.0399
- Min | Max
- -0.593 | 0.721
employee_position_title_25
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00826 ± 0.120
- Median ± IQR
- -0.00104 ± 0.0966
- Min | Max
- -0.221 | 0.654
employee_position_title_26
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00167 ± 0.114
- Median ± IQR
- 0.00128 ± 0.0931
- Min | Max
- -0.403 | 0.492
employee_position_title_27
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- 0.00499 ± 0.108
- Median ± IQR
- -0.00236 ± 0.0547
- Min | Max
- -0.376 | 0.548
employee_position_title_28
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.000102 ± 0.105
- Median ± IQR
- -0.00226 ± 0.0777
- Min | Max
- -0.333 | 0.554
employee_position_title_29
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
431 (5.4%)
This column has a high cardinality (> 40).
- Mean ± Std
- -0.00551 ± 0.0973
- Median ± IQR
- 0.0128 ± 0.0664
- Min | Max
- -0.702 | 0.221
date_first_hired_year
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
51 (0.6%)
This column has a high cardinality (> 40).
- Mean ± Std
- 2.00e+03 ± 9.33
- Median ± IQR
- 2.00e+03 ± 14.0
- Min | Max
- 1.96e+03 | 2.02e+03
date_first_hired_month
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 12 (0.1%)
- Mean ± Std
- 6.35 ± 3.48
- Median ± IQR
- 7.00 ± 6.00
- Min | Max
- 1.00 | 12.0
date_first_hired_day
Float32DType- Null values
- 0 (0.0%)
- Unique values
- 31 (0.4%)
- Mean ± Std
- 15.3 ± 8.61
- Median ± IQR
- 15.0 ± 14.0
- Min | Max
- 1.00 | 31.0
date_first_hired_total_seconds
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
2,101 (26.3%)
This column has a high cardinality (> 40).
- Mean ± Std
- 1.08e+09 ± 2.94e+08
- Median ± IQR
- 1.12e+09 ± 4.39e+08
- Min | Max
- -1.34e+08 | 1.48e+09
year_first_hired
Float32DType- Null values
- 0 (0.0%)
- Unique values
-
51 (0.6%)
This column has a high cardinality (> 40).
- Mean ± Std
- 2.00e+03 ± 9.33
- Median ± IQR
- 2.00e+03 ± 14.0
- Min | Max
- 1.96e+03 | 2.02e+03
Column | Column name | dtype | Is sorted | Null values | Unique values | Mean | Std | Min | Median | Max |
---|---|---|---|---|---|---|---|---|---|---|
0 | gender_F | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.405 | 0.491 | 0.00 | 0.00 | 1.00 |
1 | gender_M | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.594 | 0.491 | 0.00 | 1.00 | 1.00 |
2 | gender_nan | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00125 | 0.0353 | 0.00 | 0.00 | 1.00 |
3 | department_BOA | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000375 | 0.0194 | 0.00 | 0.00 | 1.00 |
4 | department_BOE | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00262 | 0.0512 | 0.00 | 0.00 | 1.00 |
5 | department_CAT | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00763 | 0.0870 | 0.00 | 0.00 | 1.00 |
6 | department_CCL | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0100 | 0.0995 | 0.00 | 0.00 | 1.00 |
7 | department_CEC | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00800 | 0.0891 | 0.00 | 0.00 | 1.00 |
8 | department_CEX | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00387 | 0.0621 | 0.00 | 0.00 | 1.00 |
9 | department_COR | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0526 | 0.223 | 0.00 | 0.00 | 1.00 |
10 | department_CUS | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00313 | 0.0558 | 0.00 | 0.00 | 1.00 |
11 | department_DEP | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0182 | 0.134 | 0.00 | 0.00 | 1.00 |
12 | department_DGS | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0434 | 0.204 | 0.00 | 0.00 | 1.00 |
13 | department_DHS | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00137 | 0.0371 | 0.00 | 0.00 | 1.00 |
14 | department_DLC | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0434 | 0.204 | 0.00 | 0.00 | 1.00 |
15 | department_DOT | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.134 | 0.341 | 0.00 | 0.00 | 1.00 |
16 | department_DPS | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0231 | 0.150 | 0.00 | 0.00 | 1.00 |
17 | department_DTS | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0148 | 0.121 | 0.00 | 0.00 | 1.00 |
18 | department_ECM | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000250 | 0.0158 | 0.00 | 0.00 | 1.00 |
19 | department_FIN | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0125 | 0.111 | 0.00 | 0.00 | 1.00 |
20 | department_FRS | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.145 | 0.352 | 0.00 | 0.00 | 1.00 |
21 | department_HCA | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00825 | 0.0905 | 0.00 | 0.00 | 1.00 |
22 | department_HHS | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.168 | 0.374 | 0.00 | 0.00 | 1.00 |
23 | department_HRC | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000875 | 0.0296 | 0.00 | 0.00 | 1.00 |
24 | department_IGR | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000375 | 0.0194 | 0.00 | 0.00 | 1.00 |
25 | department_LIB | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0399 | 0.196 | 0.00 | 0.00 | 1.00 |
26 | department_MPB | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000250 | 0.0158 | 0.00 | 0.00 | 1.00 |
27 | department_NDA | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00175 | 0.0418 | 0.00 | 0.00 | 1.00 |
28 | department_OAG | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000875 | 0.0296 | 0.00 | 0.00 | 1.00 |
29 | department_OCP | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00187 | 0.0433 | 0.00 | 0.00 | 1.00 |
30 | department_OHR | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00738 | 0.0856 | 0.00 | 0.00 | 1.00 |
31 | department_OIG | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000500 | 0.0224 | 0.00 | 0.00 | 1.00 |
32 | department_OLO | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00125 | 0.0353 | 0.00 | 0.00 | 1.00 |
33 | department_OMB | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00325 | 0.0569 | 0.00 | 0.00 | 1.00 |
34 | department_PIO | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00625 | 0.0788 | 0.00 | 0.00 | 1.00 |
35 | department_POL | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.198 | 0.399 | 0.00 | 0.00 | 1.00 |
36 | department_PRO | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00338 | 0.0580 | 0.00 | 0.00 | 1.00 |
37 | department_REC | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0139 | 0.117 | 0.00 | 0.00 | 1.00 |
38 | department_SHF | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0188 | 0.136 | 0.00 | 0.00 | 1.00 |
39 | department_ZAH | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000500 | 0.0224 | 0.00 | 0.00 | 1.00 |
40 | department_name_Board of Appeals Department | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000375 | 0.0194 | 0.00 | 0.00 | 1.00 |
41 | department_name_Board of Elections | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00262 | 0.0512 | 0.00 | 0.00 | 1.00 |
42 | department_name_Community Engagement Cluster | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00800 | 0.0891 | 0.00 | 0.00 | 1.00 |
43 | department_name_Community Use of Public Facilities | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00313 | 0.0558 | 0.00 | 0.00 | 1.00 |
44 | department_name_Correction and Rehabilitation | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0526 | 0.223 | 0.00 | 0.00 | 1.00 |
45 | department_name_County Attorney's Office | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00763 | 0.0870 | 0.00 | 0.00 | 1.00 |
46 | department_name_County Council | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0100 | 0.0995 | 0.00 | 0.00 | 1.00 |
47 | department_name_Department of Environmental Protection | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0182 | 0.134 | 0.00 | 0.00 | 1.00 |
48 | department_name_Department of Finance | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0125 | 0.111 | 0.00 | 0.00 | 1.00 |
49 | department_name_Department of General Services | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0434 | 0.204 | 0.00 | 0.00 | 1.00 |
50 | department_name_Department of Health and Human Services | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.168 | 0.374 | 0.00 | 0.00 | 1.00 |
51 | department_name_Department of Housing and Community Affairs | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00825 | 0.0905 | 0.00 | 0.00 | 1.00 |
52 | department_name_Department of Liquor Control | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0434 | 0.204 | 0.00 | 0.00 | 1.00 |
53 | department_name_Department of Permitting Services | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0231 | 0.150 | 0.00 | 0.00 | 1.00 |
54 | department_name_Department of Police | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.198 | 0.399 | 0.00 | 0.00 | 1.00 |
55 | department_name_Department of Public Libraries | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0399 | 0.196 | 0.00 | 0.00 | 1.00 |
56 | department_name_Department of Recreation | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0139 | 0.117 | 0.00 | 0.00 | 1.00 |
57 | department_name_Department of Technology Services | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0148 | 0.121 | 0.00 | 0.00 | 1.00 |
58 | department_name_Department of Transportation | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.134 | 0.341 | 0.00 | 0.00 | 1.00 |
59 | department_name_Ethics Commission | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000250 | 0.0158 | 0.00 | 0.00 | 1.00 |
60 | department_name_Fire and Rescue Services | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.145 | 0.352 | 0.00 | 0.00 | 1.00 |
61 | department_name_Merit System Protection Board Department | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000250 | 0.0158 | 0.00 | 0.00 | 1.00 |
62 | department_name_Non-Departmental Account | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00175 | 0.0418 | 0.00 | 0.00 | 1.00 |
63 | department_name_Office of Agriculture | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000875 | 0.0296 | 0.00 | 0.00 | 1.00 |
64 | department_name_Office of Consumer Protection | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00187 | 0.0433 | 0.00 | 0.00 | 1.00 |
65 | department_name_Office of Emergency Management and Homeland Security | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00137 | 0.0371 | 0.00 | 0.00 | 1.00 |
66 | department_name_Office of Human Resources | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00738 | 0.0856 | 0.00 | 0.00 | 1.00 |
67 | department_name_Office of Human Rights | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000875 | 0.0296 | 0.00 | 0.00 | 1.00 |
68 | department_name_Office of Intergovernmental Relations Department | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000375 | 0.0194 | 0.00 | 0.00 | 1.00 |
69 | department_name_Office of Legislative Oversight | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00125 | 0.0353 | 0.00 | 0.00 | 1.00 |
70 | department_name_Office of Management and Budget | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00325 | 0.0569 | 0.00 | 0.00 | 1.00 |
71 | department_name_Office of Procurement | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00338 | 0.0580 | 0.00 | 0.00 | 1.00 |
72 | department_name_Office of Public Information | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00625 | 0.0788 | 0.00 | 0.00 | 1.00 |
73 | department_name_Office of Zoning and Administrative Hearings | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000500 | 0.0224 | 0.00 | 0.00 | 1.00 |
74 | department_name_Office of the Inspector General | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.000500 | 0.0224 | 0.00 | 0.00 | 1.00 |
75 | department_name_Offices of the County Executive | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.00387 | 0.0621 | 0.00 | 0.00 | 1.00 |
76 | department_name_Sheriff's Office | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0188 | 0.136 | 0.00 | 0.00 | 1.00 |
77 | division_00 | Float32DType | False | 0 (0.0%) | 674 (8.4%) | 0.224 | 0.281 | 7.31e-05 | 0.134 | 1.12 |
78 | division_01 | Float32DType | False | 0 (0.0%) | 674 (8.4%) | 0.176 | 0.284 | -0.557 | 0.172 | 0.821 |
79 | division_02 | Float32DType | False | 0 (0.0%) | 674 (8.4%) | 0.0400 | 0.314 | -0.600 | -0.00556 | 1.00 |
80 | division_03 | Float32DType | False | 0 (0.0%) | 674 (8.4%) | 0.0438 | 0.306 | -0.404 | -0.00797 | 1.12 |
81 | division_04 | Float32DType | False | 0 (0.0%) | 677 (8.5%) | -0.0385 | 0.232 | -0.804 | -0.00726 | 0.372 |
82 | division_05 | Float32DType | False | 0 (0.0%) | 680 (8.5%) | -0.0438 | 0.218 | -0.657 | -0.0441 | 0.695 |
83 | division_06 | Float32DType | False | 0 (0.0%) | 680 (8.5%) | 0.0114 | 0.211 | -0.344 | -0.00379 | 1.19 |
84 | division_07 | Float32DType | False | 0 (0.0%) | 680 (8.5%) | 0.00159 | 0.207 | -0.825 | -0.00348 | 0.936 |
85 | division_08 | Float32DType | False | 0 (0.0%) | 679 (8.5%) | -0.0156 | 0.199 | -0.846 | -0.0131 | 0.665 |
86 | division_09 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.0151 | 0.185 | -0.531 | 0.00198 | 1.03 |
87 | division_10 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.0275 | 0.180 | -0.377 | 0.00560 | 1.23 |
88 | division_11 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.0166 | 0.175 | -0.592 | 0.00324 | 0.641 |
89 | division_12 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.0123 | 0.167 | -0.671 | -0.00468 | 0.870 |
90 | division_13 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.0275 | 0.159 | -0.364 | 0.00233 | 1.06 |
91 | division_14 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | -0.00634 | 0.155 | -0.441 | -0.000898 | 0.867 |
92 | division_15 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | -0.0135 | 0.151 | -0.617 | -0.0107 | 0.684 |
93 | division_16 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.000139 | 0.147 | -0.559 | 7.10e-06 | 0.513 |
94 | division_17 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.0222 | 0.140 | -0.432 | 0.00427 | 0.533 |
95 | division_18 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.0128 | 0.140 | -0.581 | 0.00525 | 0.539 |
96 | division_19 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.00710 | 0.139 | -0.369 | -0.00310 | 0.635 |
97 | division_20 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | -0.00180 | 0.134 | -0.418 | -0.00179 | 0.983 |
98 | division_21 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.00984 | 0.131 | -0.585 | -0.00554 | 0.612 |
99 | division_22 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | -0.00700 | 0.127 | -0.383 | -0.00312 | 0.602 |
100 | division_23 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | -0.00332 | 0.122 | -0.502 | -0.00361 | 0.596 |
101 | division_24 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.00320 | 0.118 | -0.712 | -0.00298 | 0.815 |
102 | division_25 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.00349 | 0.115 | -0.337 | 0.00631 | 0.718 |
103 | division_26 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.00146 | 0.114 | -0.326 | -0.00936 | 0.634 |
104 | division_27 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | -0.00709 | 0.112 | -0.446 | -0.00120 | 0.627 |
105 | division_28 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.00526 | 0.109 | -0.343 | -0.00524 | 0.526 |
106 | division_29 | Float32DType | False | 0 (0.0%) | 681 (8.5%) | 0.000560 | 0.108 | -0.295 | 0.000696 | 0.591 |
107 | assignment_category_Parttime-Regular | Float32DType | False | 0 (0.0%) | 2 (< 0.1%) | 0.0896 | 0.286 | 0.00 | 0.00 | 1.00 |
108 | employee_position_title_00 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.234 | 0.313 | 0.000532 | 0.0893 | 1.10 |
109 | employee_position_title_01 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0777 | 0.344 | -0.374 | 0.00421 | 1.08 |
110 | employee_position_title_02 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.109 | 0.301 | -0.0620 | 0.0149 | 1.16 |
111 | employee_position_title_03 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.106 | 0.275 | -0.166 | 0.0314 | 0.989 |
112 | employee_position_title_04 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0800 | 0.240 | -0.584 | 0.0430 | 0.684 |
113 | employee_position_title_05 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0523 | 0.208 | -0.310 | -0.00336 | 0.963 |
114 | employee_position_title_06 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0323 | 0.192 | -0.306 | 0.00383 | 0.819 |
115 | employee_position_title_07 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0106 | 0.190 | -0.533 | -0.00498 | 0.636 |
116 | employee_position_title_08 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0360 | 0.177 | -0.294 | 0.00421 | 0.773 |
117 | employee_position_title_09 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.00682 | 0.180 | -0.514 | 0.00592 | 0.780 |
118 | employee_position_title_10 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0219 | 0.175 | -0.520 | -0.00711 | 0.696 |
119 | employee_position_title_11 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00189 | 0.173 | -0.437 | 0.00308 | 0.852 |
120 | employee_position_title_12 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00592 | 0.168 | -0.499 | -0.00404 | 0.895 |
121 | employee_position_title_13 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0128 | 0.165 | -0.449 | -0.00431 | 0.657 |
122 | employee_position_title_14 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00423 | 0.160 | -0.445 | -0.00619 | 0.397 |
123 | employee_position_title_15 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0178 | 0.155 | -0.235 | -0.000739 | 1.08 |
124 | employee_position_title_16 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0113 | 0.145 | -0.292 | -0.00241 | 0.798 |
125 | employee_position_title_17 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.00434 | 0.138 | -0.251 | -0.00123 | 0.687 |
126 | employee_position_title_18 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.00572 | 0.136 | -0.325 | 0.00568 | 1.00 |
127 | employee_position_title_19 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.0107 | 0.132 | -0.318 | -0.00328 | 0.438 |
128 | employee_position_title_20 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00352 | 0.129 | -0.486 | -0.00315 | 0.379 |
129 | employee_position_title_21 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00475 | 0.127 | -0.427 | 0.0138 | 0.531 |
130 | employee_position_title_22 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.00190 | 0.125 | -0.378 | -0.00285 | 0.614 |
131 | employee_position_title_23 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.00778 | 0.123 | -0.427 | -0.00752 | 0.698 |
132 | employee_position_title_24 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00280 | 0.122 | -0.593 | 0.00189 | 0.721 |
133 | employee_position_title_25 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.00826 | 0.120 | -0.221 | -0.00104 | 0.654 |
134 | employee_position_title_26 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00167 | 0.114 | -0.403 | 0.00128 | 0.492 |
135 | employee_position_title_27 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | 0.00499 | 0.108 | -0.376 | -0.00236 | 0.548 |
136 | employee_position_title_28 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.000102 | 0.105 | -0.333 | -0.00226 | 0.554 |
137 | employee_position_title_29 | Float32DType | False | 0 (0.0%) | 431 (5.4%) | -0.00551 | 0.0973 | -0.702 | 0.0128 | 0.221 |
138 | date_first_hired_year | Float32DType | False | 0 (0.0%) | 51 (0.6%) | 2.00e+03 | 9.33 | 1.96e+03 | 2.00e+03 | 2.02e+03 |
139 | date_first_hired_month | Float32DType | False | 0 (0.0%) | 12 (0.1%) | 6.35 | 3.48 | 1.00 | 7.00 | 12.0 |
140 | date_first_hired_day | Float32DType | False | 0 (0.0%) | 31 (0.4%) | 15.3 | 8.61 | 1.00 | 15.0 | 31.0 |
141 | date_first_hired_total_seconds | Float32DType | False | 0 (0.0%) | 2101 (26.3%) | 1.08e+09 | 2.94e+08 | -1.34e+08 | 1.12e+09 | 1.48e+09 |
142 | year_first_hired | Float32DType | False | 0 (0.0%) | 51 (0.6%) | 2.00e+03 | 9.33 | 1.96e+03 | 2.00e+03 | 2.02e+03 |
By clicking on Show graph, we can see the DataOps plan that has been created:
the plan shows the steps that have been applied to the data so far.
Now that we have the vectorized features, we can proceed to train a model.
We use a scikit-learn HistGradientBoostingRegressor to predict the target variable.
We apply the model to the vectorized features using .skb.apply, and pass y
as the target variable.
Note that the resulting predictor will show the prediction results on the
preview subsample, but the actual model has not been fitted yet.
from sklearn.ensemble import HistGradientBoostingRegressor
hgb = HistGradientBoostingRegressor()
predictor = X_vec.skb.apply(hgb, y=y)
predictor
current_annual_salary | |
---|---|
0 | 6.79e+04 |
1 | 9.70e+04 |
2 | 1.03e+05 |
3 | 5.39e+04 |
4 | 8.82e+04 |
7,995 | 1.08e+05 |
7,996 | 1.83e+04 |
7,997 | 6.21e+04 |
7,998 | 7.56e+04 |
7,999 | 1.03e+05 |
current_annual_salary (Float64DType)
- Null values: 0 (0.0%)
- Unique values: 6,521 (81.5%). This column has a high cardinality (> 40).
- Mean ± Std: 7.34e+04 ± 2.82e+04
- Median ± IQR: 7.00e+04 ± 3.71e+04
- Min | Max: 1.32e+04 | 2.30e+05
Column | Column name | dtype | Is sorted | Null values | Unique values | Mean | Std | Min | Median | Max |
---|---|---|---|---|---|---|---|---|---|---|
0 | current_annual_salary | Float64DType | False | 0 (0.0%) | 6521 (81.5%) | 7.34e+04 | 2.82e+04 | 1.32e+04 | 7.00e+04 | 2.30e+05 |
Now that we have built our entire plan, we can explore it in more detail
with the .skb.full_report() method:
predictor.skb.full_report()
This produces a folder on disk rather than displaying inline in a notebook, so we do not run it here. But you can see the output here.
This method evaluates each step in the plan and shows detailed information about the operations that are being performed.
Turning the DataOps plan into a learner, for later reuse#
Now that we have defined the predictor, we can create a learner, a
standalone object that contains all the steps in the DataOps plan. We fit the
learner, so that it can be used to make predictions on new data.
trained_learner = predictor.skb.make_learner(fitted=True)
A big advantage of the learner is that it can be pickled and saved to disk,
allowing us to reuse the trained model later without needing to retrain it.
The learner contains all steps in the DataOps plan, including the fitted
vectorizer and the trained model. We can save it using Python’s pickle module:
here we use pickle.dumps to serialize the learner object into a byte string.
import pickle
saved_model = pickle.dumps(trained_learner)
We can now load the saved model back into memory using pickle.loads.
loaded_model = pickle.loads(saved_model)
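The example above keeps the serialized learner in memory as a byte string. To reuse a model in a later session, one would usually write it to a file instead. The following is a minimal sketch of that round trip using only the standard library. The dictionary `stand_in_learner`, the helper names `save_object` and `load_object`, and the file name `learner.pkl` are illustrative assumptions, not part of skrub; a fitted learner would take the place of the stand-in object.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a fitted learner (NOT a skrub object): any
# picklable Python object goes through the same save/load round trip.
stand_in_learner = {"name": "placeholder-learner", "parameters": [0.5, 1.5]}


def save_object(obj, path):
    """Serialize `obj` to the file at `path` using pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Load a pickled object. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)


# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "learner.pkl"  # hypothetical file name
    save_object(stand_in_learner, model_path)
    restored = load_object(model_path)

print(restored == stand_in_learner)
```

Unpickling can execute arbitrary code, so only load files that come from a trusted source. A pickled scikit-learn model is also not guaranteed to load under a different library version, so it is safest to load it in the same environment that produced it.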
Now, we can make predictions on new data using the loaded model, by passing
a dictionary with the skrub variable names as keys.
We don’t have to create a new variable, as this will be done internally by the
learner.
In fact, the learner is similar to a scikit-learn estimator, but rather
than taking X and y as inputs, it takes a dictionary (the “environment”),
where each key is the name of one of the skrub variables in the plan.
We can now get the test set of the employee salaries dataset:
unseen_data = fetch_employee_salaries(split="test").employee_salaries
Then, we can use the loaded model to make predictions on the unseen data by passing the environment as a dictionary.
predicted_values = loaded_model.predict({"data": unseen_data})
predicted_values
array([116382.06417108, 45114.33938599, 46680.82086958, ...,
105486.55018287, 146020.37131876, 73028.94144409], shape=(1228,))
We can also evaluate the model’s performance using the score method, which relies on the scikit-learn scoring function of the predictor:
loaded_model.score({"data": unseen_data})
0.9407037991754476
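For scikit-learn regressors such as HistGradientBoostingRegressor, the default score is the coefficient of determination, R². Assuming that this is the quantity reported above, the sketch below shows how R² is computed, in plain Python and on made-up numbers. The function name `r2_score_manual` and the toy salaries are illustrative assumptions and are not taken from the dataset.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean of the targets.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up salaries, for illustration only (not taken from the dataset).
toy_true = [50_000.0, 60_000.0, 70_000.0, 80_000.0]
toy_pred = [52_000.0, 58_000.0, 71_000.0, 79_000.0]

toy_r2 = r2_score_manual(toy_true, toy_pred)
print(round(toy_r2, 3))
```

A score of 1.0 corresponds to perfect predictions, while always predicting the mean of the targets gives 0.0. Under the assumption above, a value around 0.94 means the model accounts for most of the variance of the salaries in the test set.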
Conclusion#
In this example, we have briefly introduced skrub DataOps and shown how they can be used to build powerful machine learning pipelines. We have seen how to preprocess data and train a model. We have also shown how to save and load the trained model, and how to use it to make predictions on new data.
However, skrub DataOps are significantly more powerful than what we have shown here: for more advanced examples, see Skrub DataOps.
Total running time of the script: (0 minutes 5.708 seconds)