This project welcomes contributions and suggestions.
Developer certificate of origin¶
Contributions require you to sign a developer certificate of origin (DCO) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://developercertificate.org/.
When you submit a pull request, a DCO-bot will automatically determine whether you need to provide a DCO and decorate the PR appropriately (e.g., label, comment).
Signing off means you need to have your name and email address attached as a commit comment, which you can automate using git hooks as shown here.
Development happens against the master branch following the GitHub flow model. Contributors should use their own forks of the repository. In their fork, they create feature branches off of master, and their pull requests should target the master branch. Maintainers are responsible for prompt review of pull requests.
Pull requests against master trigger automated tests that are run through Azure DevOps or GitHub Actions. Additional test suites are run periodically. When adding new code paths or features, tests are a requirement to complete a pull request. They should be added in the test directory.
Investigating automated test failures¶
For every pull request to master with automated tests, you can check the logs of the tests to find the root cause of failures. Our tests currently run through Azure Pipelines with steps for setup, testing, and teardown. The Checks tab of a pull request contains a link to the [Azure Pipelines page](dev.azure.com/responsibleai/fairlearn/_build/results), where you can review the logs by clicking on a specific step in the automated test sequence. If you encounter problems with this workflow, please reach out through GitHub issues.
To run the same tests locally, find the corresponding pipeline definition (a yml file) in the devops directory. It either directly contains the command to execute the tests (usually starting with python -m pytest) or it refers to a template file with the command.
We are in the process of setting up regular contributor meetings. Stay tuned! <!– We have a TBD development meeting every TBD at TBD US Pacific Time and all are welcome. We keep notes from each meeting in a TBD repository. –>
Design proposals for larger changes go into the fairlearn-proposals repository. If you want to submit a proposal, just create a pull request there and maintainers will review it.
This section relies on the definitions from our Fairness in Machine Learning guide, including the definitions of “estimator”, “reduction”, “sensitive features”, “moment”, and “parity”.
Unfairness mitigation algorithms¶
Unfairness mitigation algorithms take form of scikit-learn-style estimators. Any algorithm-specific parameters are passed to the constructor. The resulting instance of the algorithm should support methods fit and predict with APIs resembling those of scikit-learn as much as possible. Any deviations are noted below.
Reduction constructors require a parameter corresponding to an estimator that implements the fit method with the sample_weight argument. Parity constraints for reductions are expressed via instances of various subclasses of the class fairlearn.reductions.Moment. Formally, instances of the class Moment implement vector-valued random variables whose sample averages over the data are required to be bounded (above and/or below).
constraints = Moment()
reduction = Reduction(estimator, constraints)
Reductions provide fit and predict methods with the following signatures:
reduction.fit(X, y, **kwargs)
All of the currently supported parity constraints (subclasses of Moment) are based on sensitive features that need to be provided to fit as a keyword argument sensitive_features. In the future, it will also be possible to provide sensitive features as columns of X.
The constructors of post-processing algorithms require either an already trained predictor or an estimator (which is trained on the data when executing fit). For post-processing algorithms, the constraints argument is provided as a string.
postprocessor = PostProcessing(estimator=estimator, constraints=constraints)
Post-processing algorithms (such as the ones under fairlearn.postprocessing) provide the same functions as the reductions above albeit with sensitive_features as a required argument for predict. In the future, we will make sensitive_features optional if the sensitive features are already provided through X.
postprocessor.fit(X, y, sensitive_features=sensitive_features)
Creating new releases¶
First add a description of the changes introduced in the package version you want to release to CHANGES.md.
It is also best to verify that the Dashboard loads correctly. This is slightly involved: 1. Create a wheel by running python setup.py sdist bdist_wheel from the repository root. This will create a dist directory which contains a .whl file. 1. Create a new Conda environment for the test 1. In this new environment, install this wheel by running pip install dist/<FILENAME>.whl 1. Install any pip packages required for the notebooks 1. Check that the dashboard loads in the notebooks
We have a Azure DevOps Pipeline which takes care of building wheels and pushing to PyPI. Validations are also performed prior to any deployments, and also following the uploads to Test-PyPI and PyPI. To use it:
Ensure that _base_version in fairlearn/__init__.py is set correctly for PyPI.
Put down a tag corresponding to this _base_version but preprended with v. For example, version 0.5.0 should be tagged wtih v0.5.0
Queue the pipeline at this tag, with a variable DEV_VERSION set to zero. When the package is uploaded to Test-PyPI, this number will be appended to the version as a dev[n] suffix
The pipeline requires sign offs immediately prior to the deployments to Test-PyPI and PyPI. If there is an issue found, then after applying the fix, update the location of the tag, and queue a new release pipeline with the value of DEV_VERSION increased by one.
The DEV_VERSION variable is to work around the PyPI behaviour where uploads are immutable and published immediately. This means that each upload to PyPI ‘uses up’ that particular version (on that particular PyPI instance). Since we wish to deploy exactly the same bits to both Test-PyPI and PyPI, without this workaround the version released to PyPI would depend on the number of issues discovered when uploading to Test-PyPI. If PyPI updates their release process to separate the ‘upload’ and ‘publish’ steps (i.e. until a package is published, it remains hidden and mutable) then the code associated with DEV_VERSION should be removed.
As part of the release process, the build_wheels.py script uses process_readme.py to turn all the relative links in the ReadMe file into absolute ones (this is the reason why the applied tag has be of the form v[_base_version]). The process_readme.py script is slightly fragile with respect to the contents of the ReadMe, so after significant changes its output should be verified.