nannyml

nannyml: post-deployment data science in python

APACHE-2.0 License

Downloads: 10.7K
Stars: 1.8K
Committers: 33

nannyml - v0.6.0

Published by github-actions[bot] about 2 years ago

Added

  • Added support for regression problems across all calculators and estimators.
    In some cases a problem_type parameter is now required during calculator/estimator initialization; this
    is a breaking change. Read more about using regression in our
    tutorials and about our new performance estimation
    for regression using the Direct Loss Estimation (DLE) algorithm.

Changed

  • Improved tox running speed by skipping some unnecessary package installations.
    Thanks @baskervilski!

Fixed

  • Fixed an issue where some Pandas column datatypes were not recognized as continuous by NannyML, causing them to be
    dropped in calculations. Thanks for reporting @Dbhasin1!
  • Fixed an issue where some helper columns for visualization crept into the stored reference results. Good catch
    @Dbhasin1!
  • Fixed an issue where a Reader instance would raise a WriteException. Thanks for those eagle eyes
    @baskervilski!
nannyml - v0.5.3

Published by github-actions[bot] about 2 years ago

Changed

  • We've completely overhauled the way we determine the "stability" of our estimations. Instead of determining
    a minimum Chunk size, we now estimate the sampling error for an operation on a Chunk.
    • A sampling error value is provided per metric per Chunk in the result data for the
      reconstruction error multivariate drift calculator, all performance calculation metrics and
      all performance estimation metrics.
    • Confidence bounds are now also based on this sampling error and display a range of +/- 3 times the
      sampling error around an estimate in CBPE and the reconstruction error multivariate drift calculator.
      Be sure to check out our in-depth documentation
      on how it works or dive right into the implementation.
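
As a rough illustration of the confidence bounds described above, the sketch below derives a band of +/- 3 sampling errors around a per-chunk estimate. It assumes the metric is a plain mean and approximates the sampling error with the standard error of the mean. The function name `confidence_band` and the sample values are hypothetical, and this is not NannyML's actual implementation.

```python
import math
import statistics


def confidence_band(values, n_sigma=3):
    """Return (estimate, lower, upper) for the mean of one chunk.

    Assumption: the sampling error is approximated by the standard error
    of the mean, i.e. sample standard deviation / sqrt(chunk size).
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    estimate = statistics.fmean(values)
    sampling_error = statistics.stdev(values) / math.sqrt(len(values))
    return (
        estimate,
        estimate - n_sigma * sampling_error,
        estimate + n_sigma * sampling_error,
    )


# Hypothetical per-row metric values for a single chunk.
chunk = [0.71, 0.69, 0.74, 0.70, 0.72, 0.68, 0.73, 0.71]
estimate, lower, upper = confidence_band(chunk)
print(f"estimate={estimate:.3f}, band=[{lower:.3f}, {upper:.3f}]")
```

Under this assumption, larger chunks give a smaller sampling error and therefore a narrower band, which is the intuition behind replacing a hard minimum chunk size.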

Fixed

  • Fixed issue where an outdated version of Numpy caused Pandas to fail reading string columns in some scenarios
    (#93). Thank you, @Bernhard and
    @Gabriel for the investigative work!
nannyml - v0.5.2

Published by github-actions[bot] about 2 years ago

Changed

  • Swapped out ASCII art library from 'art' to 'PyFiglet' because the former was not yet present in conda-forge.

Fixed

  • Fixed a leftover parameter that was missed during cleanup and broke CLI functionality.
  • Fixed the CLI progress bar, which broke because a boolean check treated task ID 0 as false.
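
The task ID issue is a common Python pitfall: `0` is falsy, so a truthiness check cannot tell "task ID 0" apart from "no task". The sketch below reproduces the pattern with hypothetical function names; it is not the actual CLI code.

```python
def advance_buggy(task_id):
    """Buggy: treats task ID 0 as 'no task' because 0 is falsy."""
    if task_id:
        return f"advanced task {task_id}"
    return "skipped"


def advance_fixed(task_id):
    """Fixed: only None means 'no task'; 0 is a valid ID."""
    if task_id is not None:
        return f"advanced task {task_id}"
    return "skipped"


print(advance_buggy(0))  # skipped
print(advance_fixed(0))  # advanced task 0
```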
nannyml - v0.5.1

Published by github-actions[bot] about 2 years ago

Added

  • Added simple CLI implementation to support automation and MLOps toolchain use cases. Supports reading/writing to
    cloud storage using S3, GCS, ADL, ABFS and AZ protocols. Containerized version available at
    dockerhub.

Changed

  • make clean now also clears __pycache__
  • Fixed some inconsistencies in docstrings (they still need some additional love though)
nannyml - v0.5.0

Published by github-actions[bot] over 2 years ago

Changed

  • Replaced the whole Metadata system with a more intuitive approach.

Fixed

nannyml - v0.4.1

Published by github-actions[bot] over 2 years ago

Added

  • Added limited support for regression use cases: create or extract RegressionMetadata and use it for drift
    detection. Performance estimation and calculation require more research.

Changed

  • DefaultChunker now splits the data into 10 chunks of equal size.
  • SizeBasedChunker no longer drops an incomplete last chunk by default; this behavior is now configurable.
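
The two chunking behaviors above can be sketched in plain Python. This is a minimal illustration on lists, not the NannyML chunker classes: the function names and the `drop_incomplete` flag are hypothetical, and the count-based split assumes that leftover rows are spread over the first chunks when the data does not divide evenly.

```python
def split_by_size(rows, chunk_size, drop_incomplete=False):
    """Split rows into consecutive chunks of chunk_size rows.

    The last chunk may be shorter; it is kept unless drop_incomplete is True.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    if drop_incomplete and chunks and len(chunks[-1]) < chunk_size:
        chunks.pop()
    return chunks


def split_by_count(rows, chunk_count=10):
    """Split rows into chunk_count chunks whose sizes differ by at most one row."""
    if chunk_count <= 0:
        raise ValueError("chunk_count must be positive")
    base, extra = divmod(len(rows), chunk_count)
    chunks, start = [], 0
    for i in range(chunk_count):
        end = start + base + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


rows = list(range(25))
print([len(c) for c in split_by_size(rows, 10)])                        # [10, 10, 5]
print([len(c) for c in split_by_size(rows, 10, drop_incomplete=True)])  # [10, 10]
print(len(split_by_count(rows)))                                        # 10
```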
nannyml - v0.4.0

Published by github-actions[bot] over 2 years ago

Added

  • Added support for new metrics in the Confidence Based Performance Estimator (CBPE). It now estimates roc_auc,
    f1, precision, recall, specificity and accuracy.
  • Added support for multiclass classification. This includes
    • Specifying multiclass classification metadata + support in automated metadata extraction (by introducing a
      model_type parameter).
    • Support for all CBPE metrics.
    • Support for realized performance calculation using the PerformanceCalculator.
    • Support for all types of drift detection (model inputs, model output, target distribution).
    • A new synthetic toy dataset.
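
To give a feel for how a metric such as accuracy can be estimated without targets, here is a deliberately simplified sketch for the binary case. It assumes perfectly calibrated predicted probabilities and a fixed 0.5 threshold; the function name `estimate_accuracy` is hypothetical, and the real CBPE implementation covers more metrics and more detail than this.

```python
def estimate_accuracy(probabilities, threshold=0.5):
    """Expected accuracy of thresholded predictions, assuming calibrated scores.

    For a calibrated score p, a positive prediction is correct with
    probability p and a negative prediction with probability 1 - p.
    """
    if not probabilities:
        raise ValueError("need at least one prediction")
    total = 0.0
    for p in probabilities:
        predicted_positive = p >= threshold
        # Probability that this single prediction is correct.
        total += p if predicted_positive else 1.0 - p
    return total / len(probabilities)


# Hypothetical calibrated probabilities for the positive class.
scores = [0.9, 0.8, 0.3, 0.1]
print(round(estimate_accuracy(scores), 3))  # 0.825
```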

Changed

  • Removed the identifier property from the ModelMetadata class. Joining analysis data and
    analysis target values should be done upfront or index-based.
  • Added an exclude_columns parameter to the extract_metadata function. Use it to specify the columns that should
    not be considered as model metadata or features.
  • All fit methods now return the fitted object. This allows chaining Calculator/Estimator instantiation
    and fitting into a single line.
  • Custom metrics are no longer supported in the PerformanceCalculator. Only the predefined metrics remain supported.
  • Big documentation revamp: we've tweaked overall structure, page structure and incorporated lots of feedback.
  • Improvements to consistency and readability for the 'hover' visualization in the step plots, including consistent
    color usage, conditional formatting, icon usage etc.
  • Improved indication of "realized" and "estimated" performance in all CBPE step plots
    (changes to hover, axes and legends)
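
The change to fit methods follows a familiar pattern: returning `self` from `fit` lets instantiation and fitting happen in one expression. The class below is a hypothetical toy calculator written only to show that pattern; it is not part of the NannyML API.

```python
class MeanShiftCalculator:
    """Toy calculator (hypothetical) comparing an analysis mean to a reference mean."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.reference_mean = None

    def fit(self, reference):
        self.reference_mean = sum(reference) / len(reference)
        return self  # returning self is what makes chaining possible

    def calculate(self, analysis):
        if self.reference_mean is None:
            raise RuntimeError("call fit() first")
        shift = abs(sum(analysis) / len(analysis) - self.reference_mean)
        return {"shift": shift, "alert": shift > self.threshold}


# Instantiate and fit in a single line, then use the fitted object.
calc = MeanShiftCalculator(threshold=0.5).fit([1.0, 2.0, 3.0])
print(calc.calculate([2.0, 3.0, 4.0]))
```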

Fixed

  • Updated homepage in project metadata
  • Added missing metadata modification to the quickstart
  • Added some additional checks on reference data during preprocessing
  • Various documentation suggestions (#58)
nannyml - v0.3.2

Published by github-actions[bot] over 2 years ago

Fixed

  • Deal with out-of-time-order data when chunking (thanks for the assist @SoyGema!)
  • Fix reversed Y-axis and plot labels in continuous distribution plots
nannyml - v0.3.1

Published by github-actions[bot] over 2 years ago

nannyml - v0.3.0

Published by github-actions[bot] over 2 years ago

Added

  • Added support for both predicted labels and predicted probabilities in ModelMetadata.
  • Support for monitoring model performance metrics using the PerformanceCalculator.
  • Support for monitoring target distribution using the TargetDistributionCalculator

Changed

  • Plotting will default to using step plots.
  • Restructured the nannyml.drift package and subpackages. Breaking changes!
  • Metadata completeness check will now fail when there are features of FeatureType.UNKNOWN.
  • Chunk date boundaries are now calculated differently for a PeriodBasedChunker, using the
    theoretical period for boundaries as opposed to the observed boundaries within the chunk observations.
  • Updated version of the black pre-commit hook due to breaking changes in its click dependency.
  • The minimum chunk size will now be provided by each individual calculator / estimator / metric,
    allowing for each of them to warn the end user when chunk sizes are suboptimal.
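
The difference between observed and theoretical chunk boundaries can be shown with a small standard-library sketch. It assumes a monthly period and that all timestamps in the chunk fall within the same month; both function names are hypothetical and this is not the PeriodBasedChunker code.

```python
import calendar
from datetime import datetime


def observed_bounds(timestamps):
    """Boundaries taken from the first and last observation in the chunk."""
    return min(timestamps), max(timestamps)


def theoretical_month_bounds(timestamps):
    """Boundaries of the calendar month that contains the observations.

    Assumption: every timestamp falls within the same month.
    """
    first = min(timestamps)
    last_day = calendar.monthrange(first.year, first.month)[1]
    start = datetime(first.year, first.month, 1)
    end = datetime(first.year, first.month, last_day, 23, 59, 59, 999999)
    return start, end


# Hypothetical observations from March 2022.
observations = [
    datetime(2022, 3, 4, 9, 30),
    datetime(2022, 3, 17, 14, 0),
    datetime(2022, 3, 29, 8, 15),
]
print(observed_bounds(observations))
print(theoretical_month_bounds(observations))
```

With theoretical boundaries, chunks line up on the period itself, so sparse data at the start or end of a month does not shift the reported chunk edges.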

Fixed

  • Restrict version of the scipy dependency to be >=1.7.3, <1.8.0. Planned to be relaxed ASAP.
  • Deal with missing values in chunks causing NaN values when concatenating.
  • Fixed a crash when estimating CBPE without a target column present
  • Fixed an incorrect label in the ModelMetadata printout
Package Rankings
Top 6.64% on Proxy.golang.org
Top 28.69% on Conda-forge.org
Top 5.63% on Pypi.org