.. _installation_instructions:

.. currentmodule:: skrub

=======
Install
=======

.. raw:: html

.. code:: console

    pip install skrub -U

|

**Deep learning dependencies**

Deep-learning based encoders like :class:`TextEncoder` require optional dependencies.
The following will install `torch `_, `transformers `_, and `sentence-transformers `_.

.. code:: console

    $ pip install "skrub[transformers]" -U

.. raw:: html

.. code:: console

    conda install -c conda-forge skrub

|

**Deep learning dependencies**

Deep-learning based encoders like :class:`TextEncoder` require optional dependencies.
The following will install `torch `_, `transformers `_, and `sentence-transformers `_.

.. code:: console

    $ conda install -c conda-forge skrub[transformers]

.. raw:: html

.. code:: console

    mamba install -c conda-forge skrub

|

**Deep learning dependencies**

Deep-learning based encoders like :class:`TextEncoder` require optional dependencies.
The following will install `torch `_, `transformers `_, and `sentence-transformers `_.

.. code:: console

    $ mamba install -c conda-forge skrub[transformers]

.. raw:: html

.. _installing_from_source:

Advanced Usage for Contributors
-------------------------------

1. Fork the project
'''''''''''''''''''

To contribute to the project, you first need to
`fork skrub on GitHub `_.

That will enable you to push your commits to a branch *on your fork*.

2. Clone your fork
''''''''''''''''''

Clone your forked repo to your local machine:

.. code:: console

    git clone https://github.com//skrub
    cd skrub

Next, add the *upstream* remote (i.e. the official skrub repository). This allows you
to pull the latest changes from the main repository:

.. code:: console

    git remote add upstream https://github.com/skrub-data/skrub.git

Verify that both the origin (your fork) and upstream (official repo) are correctly set up:

.. code:: console

    git remote -v

You should see something like this:

.. code:: console

    origin  git@github.com:/skrub.git (fetch)
    origin  git@github.com:/skrub.git (push)
    upstream        git@github.com:skrub-data/skrub.git (fetch)
    upstream        git@github.com:skrub-data/skrub.git (push)

The URL scheme (``https://`` or ``git@``) depends on how you cloned your fork and
added the remote.

3. Set up your environment
''''''''''''''''''''''''''

Now, set up a development environment. You can create a virtual environment with
conda or with Python's ``venv``:

- With `conda `__:

  .. code:: console

      conda create -n env_skrub python=3.13
      conda activate env_skrub

- With `venv `__:

  .. code:: console

      python -m venv env_skrub
      source env_skrub/bin/activate

Then, with the environment activated and at the root of your local copy of skrub,
install the local package in editable mode with development dependencies:

.. code:: console

    pip install -e ".[dev]"

Enable the pre-commit hooks, which keep code style consistent by running checks
(mainly formatting) every time you run ``git commit``:

.. code:: console

    pre-commit install

Optionally, configure Git to ignore certain revisions in ``git blame`` and IDE
integrations. These revisions are listed in ``.git-blame-ignore-revs``:

.. code:: console

    git config blame.ignoreRevsFile .git-blame-ignore-revs

4. Run the tests
''''''''''''''''

To ensure your environment is correctly set up, run the test suite:

.. code:: console

    pytest --pyargs skrub

Testing should take about 5 minutes.

You may see warnings like the following:

.. code:: sh

    UserWarning: Only pandas and polars DataFrames are supported, but input is a Numpy array. Please convert Numpy arrays to DataFrames before passing them to skrub transformers. Converting to pandas DataFrame with columns ['0', '1', …].
      warnings.warn(

These warnings are expected, and you can proceed to the next steps.
However, no tests should fail at this point: if any do, please let us know.

After that, your environment is ready for development!

**Deep learning dependencies**

Deep-learning based encoders like :class:`TextEncoder` require optional dependencies.
The following will install `torch `_, `transformers `_, and `sentence-transformers `_.

.. code:: console

    $ pip install -e ".[transformers]"

Now that you're set up, you may return to
:ref:`writing your first pull request` and start coding!

.. raw:: html
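
Whichever installation method you used, it can be handy to check which of the packages mentioned above ended up in the active environment. The following is a minimal sketch using only the Python standard library; the helper name ``installed_versions`` is hypothetical and not part of skrub, and the distribution names are assumed to match the package names listed above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it is not part of skrub.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    core = ["skrub"]
    # Assumed distribution names of the optional deep learning dependencies.
    optional = ["torch", "transformers", "sentence-transformers"]
    for name, version in installed_versions(core + optional).items():
        print(f"{name}: {version or 'not installed'}")
```

If ``skrub`` is reported as not installed, the environment in which you ran the install command is probably not the one currently activated.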