TableReport
Usage examples at the bottom of this page.
- class skrub.TableReport(dataframe, order_by=None, title=None, column_filters=None)
Summarize the contents of a dataframe.
This class summarizes a dataframe, providing information such as the type and summary statistics (mean, number of missing values, etc.) for each column.
- Parameters:
- dataframe
pandas or polars DataFrame
The dataframe to summarize.
- order_by
str
Name of the column to sort by. Other numerical columns are plotted as a function of this column, which must be of numerical or datetime type.
- title
str
Title for the report.
- column_filters
dict
A dict for adding custom entries to the column filter dropdown menu. Each key is an id for the filter (e.g. "first_10") and each value is a mapping with the keys display_name (the name shown in the menu, e.g. "First 10 columns") and columns (a list of column names). See the end of the “Examples” section below for details.
- Attributes:
Notes
You can see some example reports for a few datasets online. We also provide an experimental online demo that allows you to select a CSV or parquet file and generate a report directly in your web browser.
Examples
>>> import pandas as pd
>>> from skrub import TableReport
>>> df = pd.DataFrame(dict(a=[1, 2], b=['one', 'two'], c=[11.1, 11.1]))
>>> report = TableReport(df)
In a Jupyter notebook, make the report the last expression evaluated in a cell; it is then displayed in that cell’s output.
>>> report
<TableReport: use .open() to display>
(Note that above we only see the string representation, not the report itself, because we are not in a notebook.)
Whether you are using a notebook or not, you can always open the report as a full page in a separate browser tab with its open method: report.open().
You can also get the HTML report as a string. For a full, standalone web page:
>>> report.html()
Processing...
'<!DOCTYPE html>\n<html lang="en-US">\n\n<head>\n <meta charset="utf-8"...'
For an HTML fragment that can be inserted into a page:
>>> report.html_snippet()
'\n<div id="report_...-wrapper" hidden>\n <template id="report_...'
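Because html() returns the page as a plain string, saving a report to disk only needs ordinary file I/O. The sketch below is one way to do it; save_html is a hypothetical helper, not part of skrub, and placeholder_page stands in for the string that report.html() would return.

```python
from pathlib import Path


def save_html(html_text, path):
    """Write an HTML string to ``path`` as UTF-8 and return the resolved path.

    Hypothetical helper for illustration; it is not part of skrub.
    """
    target = Path(path)
    # The standalone page declares charset="utf-8", so write it as UTF-8.
    target.write_text(html_text, encoding="utf-8")
    return target.resolve()


# Stand-in for the string returned by report.html().
placeholder_page = '<!DOCTYPE html>\n<html lang="en-US"><body>report</body></html>'

# With skrub installed and a report built as above, usage would be:
# saved = save_html(report.html(), "report.html")
```

The saved file can then be opened in any browser or shared, without going through report.open().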
Advanced configuration: you can add custom column filters that will appear in the report’s dropdown menu.
>>> filters = {
...     "at_least_2": {
...         "display_name": "Columns with at least 2 unique values",
...         "columns": ["a", "b"],
...     }
... }
>>> report = TableReport(df, column_filters=filters)
With the code above, in addition to the default filters such as “All columns”, “Numeric columns”, etc., the added “Columns with at least 2 unique values” will be available in the report, selecting columns “a” and “b”.
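When several custom filters are needed, writing the nested dict by hand gets repetitive. The following is a minimal sketch of a helper that builds a mapping in the shape described for column_filters and checks that every listed column exists. make_column_filters and the "text_only" filter are hypothetical names used for illustration; they are not part of skrub.

```python
def make_column_filters(groups, available_columns):
    """Build a ``column_filters`` dict from {filter_id: (display_name, columns)}.

    Hypothetical helper for illustration; it is not part of skrub.
    """
    available = set(available_columns)
    filters = {}
    for filter_id, (display_name, columns) in groups.items():
        # Fail early if a filter refers to a column the dataframe lacks.
        missing = [c for c in columns if c not in available]
        if missing:
            raise ValueError(
                f"filter {filter_id!r} names unknown columns: {missing}"
            )
        filters[filter_id] = {
            "display_name": display_name,
            "columns": list(columns),
        }
    return filters


filters = make_column_filters(
    {
        "at_least_2": ("Columns with at least 2 unique values", ["a", "b"]),
        "text_only": ("Text columns", ["b"]),  # illustrative extra filter
    },
    available_columns=["a", "b", "c"],  # e.g. list(df.columns)
)

# With skrub installed: report = TableReport(df, column_filters=filters)
```

Checking the column names up front is a design choice of this sketch; how TableReport itself reacts to unknown column names is not stated on this page.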
Methods
- html(): Get the report as a full HTML page.
- html_snippet(): Get the report as an HTML fragment that can be inserted in a page.
- json(): Get the report data in JSON format.
- open(): Open the HTML report in a web browser.