Project for open sourcing research efforts on Backward Compatibility in Machine Learning
MIT License
Updates that may improve an AI system's accuracy can also introduce new and unanticipated errors that damage user trust. These new errors can also break trust between software components and machine learning models, as they propagate and compound throughout larger integrated AI systems. The Backward Compatibility ML library is an open-source project for evaluating AI system updates in a new way, with the goal of increasing system reliability and human trust in AI predictions.
The Backward Compatibility ML project has two components:
A series of loss functions in which users can vary the weight assigned to the dissonance factor and explore performance/capability tradeoffs during machine learning optimization.
Visualization widgets that help users examine metrics and error data in detail. They provide a view of error intersections between models and of how incompatibility is distributed across classes.
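The performance/compatibility trade-off above can be made concrete with two quantities from the referenced Bansal et al. (AAAI 2019) paper: backward trust compatibility (the fraction of examples the old model h1 classified correctly that the new model h2 also gets right) and a loss augmented with a weighted dissonance penalty for newly introduced errors. Below is a minimal pure-Python sketch of the idea; the function names are illustrative only, not the library's actual API:

```python
def btc(y_true, h1_pred, h2_pred):
    # Backward Trust Compatibility: among examples the old model h1
    # classified correctly, the fraction the new model h2 also gets right.
    h1_correct = [i for i, (y, p1) in enumerate(zip(y_true, h1_pred)) if p1 == y]
    return sum(h2_pred[i] == y_true[i] for i in h1_correct) / len(h1_correct)

def dissonance_weighted_loss(base_loss, y_true, h1_pred, h2_pred, lambda_c=1.0):
    # Add a penalty for "new errors": examples h1 got right but h2 gets wrong.
    # lambda_c is the dissonance weight that the library lets users vary.
    new_errors = sum(p1 == y and p2 != y
                     for y, p1, p2 in zip(y_true, h1_pred, h2_pred))
    return base_loss + lambda_c * new_errors / len(y_true)

y_true  = [0, 1, 1, 0, 1]
h1_pred = [0, 1, 0, 0, 1]  # old model: correct on indices 0, 1, 3, 4
h2_pred = [0, 0, 1, 0, 1]  # new model: introduces a new error at index 1
print(btc(y_true, h1_pred, h2_pred))  # 0.75
```

Sweeping lambda_c from zero upward traces out the performance/compatibility curve that the visualization widgets help users explore.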
```
pip install -r requirements.txt
npm install
npm run build && pip install -e .
```

or:

```
NODE_ENV=production npx webpack && pip install -e .
```
You should now be able to import the backwardcompatibilityml module and use it. Start your Jupyter Notebook server and load the example notebooks under the examples folder to see how the backwardcompatibilityml module is used.
To demo the compatibility analysis widget, open the notebook compatibility-analysis.ipynb inside the examples folder. Below is a list of other sample notebooks that may be of interest. For the full list of example notebooks, please refer to Running the Backward Compatibility ML library examples.
Notebook name | Framework | Notes |
---|---|---|
compatibility-analysis-cifar10-resnet18-pretrained | PyTorch | Uses a pre-trained model |
model-comparison-MNIST | PyTorch | Uses ModelComparison widget |
tensorflow-new-error-cross-entropy-loss | TensorFlow | General TensorFlow usage example |
tensorflow-MNIST | TensorFlow | Uses CompatibilityModel class |
Compatibility sweeps are automatically logged with MLflow. MLflow runs are logged in a folder named mlruns in the same directory as the notebook. To view the MLflow dashboard, start the MLflow server by running mlflow server --port 5200 --backend-store-uri ./mlruns. Then open the MLflow UI in your browser by navigating to localhost:5200.
To run tests, make sure that you are in the project root folder and do:
```
pip install -r dev-requirements.txt
pytest tests/
npm install
npm run test
```
This is provided as a convenience for developers, allowing development of the widget to proceed outside of a Jupyter notebook environment.
The widget can be loaded in the web browser at localhost:3000 or <your-ip>:3000, independently of a Jupyter notebook. The APIs will be hosted at localhost:5000 or <your-ip>:5000.
Changes to the CSS or TypeScript code will be hot loaded automatically in the browser. Flask will run in debug mode and automatically restart whenever the Python code is changed.
On Linux, run:

```
FLASK_ENV=development FLASK_APP=development/compatibility-analysis/app.py flask run --host 0.0.0.0 --port 5000
```

On Windows, run:

```
set FLASK_ENV=development && set FLASK_APP=development\compatibility-analysis\app.py && flask run --host 0.0.0.0 --port 5000
```

This will start the Flask server for the APIs used by the widget. Then run npm run start-compatibility-analysis and open the widget in your browser at http://<your-ip-address>:3000.
On Linux, run:

```
FLASK_ENV=development FLASK_APP=development/model-comparison/app.py flask run --host 0.0.0.0 --port 5000
```

On Windows, run:

```
set FLASK_ENV=development && set FLASK_APP=development\model-comparison\app.py && flask run --host 0.0.0.0 --port 5000
```

This will start the Flask server for the APIs used by the widget. Then run npm run start-model-comparison and open the widget in your browser at http://<your-ip-address>:3000.
Check the CONTRIBUTING page.
This project materializes and implements ideas from ongoing research on Backward Compatibility in Machine Learning and Model Comparison. Here is a list of development and research contributors:
Current Contributors: Xavier Fernandes, Nicholas King, Kathleen Walker, Juan Lema, Besmira Nushi
Research Contributors: Gagan Bansal, Megha Srivastava, Besmira Nushi, Ece Kamar, Eric Horvitz, Dan Weld, Shital Shah
References
"Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff." Gagan Bansal, Besmira Nushi, Ece Kamar, Daniel S Weld, Walter S Lasecki, Eric Horvitz; AAAI 2019. Pdf
"An Empirical Analysis of Backward Compatibility in Machine Learning Systems." Megha Srivastava, Besmira Nushi, Ece Kamar, Shital Shah, Eric Horvitz; KDD 2020. Pdf
"Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure." Besmira Nushi, Ece Kamar, Eric Horvitz; HCOMP 2018. Pdf
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project is licensed under the terms of the MIT license. See LICENSE.txt for additional details.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.