eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

APACHE-2.0 License

Downloads: 15.9K · Stars: 641 · Committers: 37

eland - 8.15.0

Published by miguelgrinberg 2 months ago

  • Added "second" as the default truncation for text similarity (#713)
  • Added note about using text_similarity for rerank in the CLI (#716)
  • Added support for lists in result hits (#707)
  • Removed input fields from exported LTR models (#708)
eland - 8.14.0

Published by pquentin 4 months ago

Added

  • Added Elasticsearch Serverless support in DataFrames (#690, contributed by @AshokChoudhary11) and eland_import_hub_model (#698)

Fixed

  • Fixed Python 3.8 support (#695, contributed by @bartbroere)
  • Fixed non-_source fields missing from the result hits (#693, contributed by @bartbroere)
eland - 8.13.1

Published by pquentin 6 months ago

Added

  • Added support for HTTP proxies in eland_import_hub_model (#688)
eland - 8.13.0

Published by pquentin 7 months ago

Added

  • Added support for Python 3.11 (#681)
  • Added eland.DataFrame.to_json function (#661, contributed by @bartbroere)
  • Added override option to specify the model's max input size (#674)

Changed

  • Upgraded torch to 2.1.2 (#671)
  • Mirrored pandas' lineterminator instead of line_terminator in to_csv (#595, contributed by @bartbroere)
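
To illustrate what the renamed keyword controls, here is a minimal sketch built on Python's standard csv module, which also spells the option lineterminator. It does not call eland or pandas, and the helper name rows_to_csv is hypothetical.

```python
import csv
import io


def rows_to_csv(rows, lineterminator="\n"):
    """Serialize rows to CSV text, ending each record with `lineterminator`."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator=lineterminator)
    writer.writerows(rows)
    return buffer.getvalue()


# Same rows, two different record separators.
unix_text = rows_to_csv([["a", "b"], [1, 2]])
windows_text = rows_to_csv([["a", "b"], [1, 2]], lineterminator="\r\n")
```

Only the record separator differs between the two results; the field contents are identical.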
eland - 8.12.1

Published by pquentin 9 months ago

Fixed

  • Fixed missing value support for XGBRanker (#654)
eland - 8.12.0

Published by pquentin 9 months ago

Added

  • Added support for the XGBRanker model (#649)
  • Accepted LTR (learning to rank) model config when importing a model (#645, #651)
  • Added LTR feature logger (#648)
  • Added prefix_string config option to the eland_import_hub_model script (#642)
  • Made online retail analysis notebook runnable in Colab (#641)
  • Added new movie dataset to the tests (#646)
eland - 8.11.1

Published by pquentin 11 months ago

Added

  • Made demo notebook runnable in Colab (#630)

Changed

  • Bumped Shap version to 0.43 (#636)

Fixed

  • Fixed failed import of Sentence Transformer RoBERTa models (#637)
eland - 8.11.0

Published by pquentin 11 months ago

Added

  • Added support for the E5 small multilingual model (#625)

Changed

  • Streamed writes in ed.DataFrame.to_csv() (#579)
  • Improved memory estimation for NLP models (#568)
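
As a rough illustration of what streaming writes means for CSV export, the sketch below writes each batch of rows as soon as it arrives instead of collecting the whole result first. It uses only the standard library; stream_csv and fake_batches are hypothetical names and do not reflect eland's internal implementation.

```python
import csv
import io


def stream_csv(batches, out, header):
    """Write the header once, then append each batch as soon as it arrives."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    written = 0
    for batch in batches:
        writer.writerows(batch)
        written += len(batch)
    return written


def fake_batches():
    # Stand-in for paginated search results: one batch is produced at a time.
    yield [[1, "a"], [2, "b"]]
    yield [[3, "c"]]


out = io.StringIO()
row_count = stream_csv(fake_batches(), out, header=["id", "value"])
```

Because the writer consumes a generator, peak memory depends on the batch size rather than on the total number of rows.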

Fixed

  • Fixed deprecations in preparation for Pandas 2.0 support (#602, #603, contributed by @bartbroere)
eland - 8.10.1

Published by pquentin about 1 year ago

Fixed

  • Fixed direct usage of TransformerModel (#619)
eland - 8.10.0

Published by pquentin about 1 year ago

Added

  • Published pre-built Docker images to docker.elastic.co/eland/eland (#613)
  • Allowed importing private HuggingFace models (#608)
  • Added Apple Silicon (arm64) support to Docker image (#615)
  • Allowed importing some DPR models like ance-dpr-context-multi (#573)
  • Allowed using the Pandas API without monitoring/main permissions (#581)

Changed

  • Updated Docker image to Debian 12 Bookworm (#613)
  • Reduced Docker image size by not installing unused PyTorch GPU support on amd64 (#615)
  • Reduced model chunk size to 1MB (#605)
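
The chunk size matters because a serialized model is uploaded in pieces. The following sketch shows one way to split a byte string into fixed-size parts and base64-encode each one. It assumes that "1MB" means 1 MiB, and iter_chunks is a hypothetical helper, not eland's upload code.

```python
import base64

CHUNK_SIZE = 1024 * 1024  # assumption: "1MB" is taken to mean 1 MiB


def iter_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (part_number, base64_text) pairs for consecutive slices of `data`."""
    for part, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        yield part, base64.b64encode(chunk).decode("ascii")


payload = bytes(2 * CHUNK_SIZE + 10)  # dummy bytes standing in for a model
parts = list(iter_chunks(payload))
restored = b"".join(base64.b64decode(text) for _, text in parts)
```

Smaller chunks mean more requests per model but a smaller body per request.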

Fixed

  • Fixed deprecations in preparation for Pandas 2.0 support (#593, #596)
eland - 8.9.0

Published by ezimuel about 1 year ago

Added

  • Simplified embedding model support and loading (#569)
  • Made eland_import_hub_model easier to find on Windows (#559)
  • Updated the trained model inference endpoint (#556)
  • Added BertJapaneseTokenizer support with the bert_ja tokenization configuration (#534)
  • Added the ability to upload xlm-roberta tokenized models (#518)
  • Tolerated different model output formats when measuring embedding size (#535)
  • Generated a valid NLP model ID from the file path (#541)
  • Upgraded torch to 1.13.1 and added a cluster version check before uploading an NLP model (#522)
  • Set the embedding_size config parameter for text embedding models (#532)
  • Added support for the pass_through task (#526)
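
Deriving a model ID from a local path could look roughly like the sketch below. The normalization rules here (lowercase, a restricted character set, alphanumeric first and last characters) are assumptions made for illustration, and model_id_from_path is a hypothetical helper rather than eland's actual function.

```python
import re
from pathlib import PurePosixPath


def model_id_from_path(path: str) -> str:
    """Derive an identifier from the last path component (assumed rules)."""
    name = PurePosixPath(path).name.lower()
    # Replace anything outside lowercase letters, digits, '.', '_' and '-'.
    name = re.sub(r"[^a-z0-9._-]+", "_", name)
    # Trim separators so the ID starts and ends with a letter or digit.
    return name.strip("._-")


spaced_id = model_id_from_path("/models/My Model v2/")
dotted_id = model_id_from_path("./local/BERT-Base.Uncased")
```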

Fixed

  • Fixed black formatting to comply with the code style (#557)
  • Fixed the "No module named 'torch'" error (#553)
  • Fixed the autosummary directive by removing hack autosummaries (#548)
  • Prevented a TypeError by adding a None check (#525)
eland - 8.7.0

Published by sethmlarson over 1 year ago

Added

Fixed

eland - 8.3.0

Published by sethmlarson over 2 years ago

Added

  • Added a new NLP model task type "auto" which infers the task type based on model configuration and architecture (#475)
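
One way such inference could work is to match the architecture names listed in a model's configuration against known suffixes. The sketch below is a hypothetical illustration: the suffix table and the infer_task_type function are assumptions and are not taken from eland's source.

```python
from typing import Optional

# Hypothetical suffix-to-task table; not copied from eland.
ARCHITECTURE_SUFFIX_TO_TASK = {
    "ForMaskedLM": "fill_mask",
    "ForTokenClassification": "ner",
    "ForSequenceClassification": "text_classification",
    "ForQuestionAnswering": "question_answering",
}


def infer_task_type(config: dict) -> Optional[str]:
    """Return the first task whose suffix matches an architecture name."""
    for architecture in config.get("architectures", []):
        for suffix, task in ARCHITECTURE_SUFFIX_TO_TASK.items():
            if architecture.endswith(suffix):
                return task
    return None  # the caller would have to ask for an explicit task type


masked = infer_task_type({"architectures": ["BertForMaskedLM"]})
unknown = infer_task_type({})
```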

Changed

  • Changed required version of 'torch' package to >=1.11.0,<1.12 to match required PyTorch version for Elasticsearch 8.3 (was >=1.9.0,<2) (#479)
  • Changed the default value of the --task-type parameter for the eland_import_hub_model CLI to be "auto" (#475)

Fixed

  • Fixed decision tree classifier serialization to account for probabilities (#465)
  • Fixed PyTorch model quantization (#472)
eland - 8.2.0

Published by sethmlarson over 2 years ago

Added

  • Added support for passing Cloud ID via --cloud-id to eland_import_hub_model CLI tool (#462)
  • Added support for authenticating via --es-username, --es-password, and --es-api-key to the eland_import_hub_model CLI tool (#461)
  • Added support for XGBoost 1.6 (#458)
  • Added support for question_answering NLP tasks (#457)
eland - 8.1.0

Published by sethmlarson over 2 years ago

Added

  • Added support for eland.Series.unique() (#448, contributed by @V1NAY8)
  • Added --ca-certs and --insecure options to eland_import_hub_model for configuring TLS (#441)
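
The connection options named in the 8.1.0 and 8.2.0 notes can be combined on one command line. The sketch below only assembles the argument list and does not run the tool. The --hub-model-id flag is not mentioned in these notes and is assumed here, the model name is a placeholder, and build_import_command is a hypothetical helper.

```python
import shlex


def build_import_command(hub_model_id, cloud_id=None, api_key=None,
                         ca_certs=None, insecure=False, task_type="auto"):
    """Assemble an eland_import_hub_model argument list without executing it."""
    argv = ["eland_import_hub_model",
            "--hub-model-id", hub_model_id,  # assumed flag, not in these notes
            "--task-type", task_type]
    if cloud_id:
        argv += ["--cloud-id", cloud_id]
    if api_key:
        argv += ["--es-api-key", api_key]
    if ca_certs:
        argv += ["--ca-certs", ca_certs]
    if insecure:
        argv.append("--insecure")
    return argv


argv = build_import_command("example-org/example-model",
                            cloud_id="<cloud-id>", api_key="<api-key>")
command_line = shlex.join(argv)  # shell-quoted form, handy for logging
```

Building the list first keeps secrets out of string concatenation and lets the same arguments be passed to subprocess.run without a shell.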
eland - 8.0.0

Published by sethmlarson over 2 years ago

Added

  • Added support for Natural Language Processing (NLP) models using PyTorch (#394)
  • Added new extra eland[pytorch] for installing all dependencies needed for PyTorch (#394)
  • Added a CLI script eland_import_hub_model for uploading HuggingFace models to Elasticsearch (#403)
  • Added support for v8.0 of the Python Elasticsearch client (#415)
  • Added a warning if Eland detects it's communicating with an incompatible Elasticsearch version (#419)
  • Added support for number_samples to LightGBM and Scikit-Learn models (#397, contributed by @V1NAY8)
  • Added ability to use datetime types for filtering dataframes (#284, contributed by @Fju)
  • Added pandas datetime64 type to use the Elasticsearch date type (#425, contributed by @Ashton-Sidhu)
  • Added es_verify_mapping_compatibility parameter to disable schema enforcement with pandas_to_eland (#423, contributed by @Ashton-Sidhu)
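
The incompatibility warning mentioned above could be approximated by comparing major versions. This is a sketch under that assumption: the exact rule Eland applies is not stated in these notes, and check_compatibility is a hypothetical function.

```python
import warnings


def check_compatibility(client_version: str, server_version: str) -> bool:
    """Warn and return False when the major versions differ (assumed rule)."""
    client_major = int(client_version.split(".")[0])
    server_major = int(server_version.split(".")[0])
    if client_major != server_major:
        warnings.warn(
            f"Client {client_version} may be incompatible with "
            f"Elasticsearch {server_version}",
            stacklevel=2,
        )
        return False
    return True


# Capture warnings so the example can inspect them instead of printing.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mismatch_ok = check_compatibility("8.0.0", "7.17.0")
    match_ok = check_compatibility("8.0.0", "8.3.1")
```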

Changed

  • Changed to_pandas() to only use Point-in-Time and search_after instead of using Scroll APIs for pagination.
eland - 8.0.0-beta1

Published by sethmlarson almost 3 years ago

Added

Changed

  • Changed to_pandas() to only use Point-in-Time and search_after instead of using Scroll APIs for pagination.
eland - 7.14.1b1

Published by sethmlarson about 3 years ago

Added

  • Added support for DataFrame.iterrows() and DataFrame.itertuples() (#380, contributed by @kxbin)

Performance

  • Simplified result collectors to increase performance transforming Elasticsearch results to pandas (#378, contributed by @V1NAY8)
  • Changed search pagination function to yield batches of hits (#379)
eland - 7.14.0b1

Published by sethmlarson about 3 years ago

Added

  • Added support for Pandas 1.3.x (#362, contributed by @V1NAY8)
  • Added support for LightGBM 3.x (#362, contributed by @V1NAY8)
  • Added DataFrame.idxmax() and DataFrame.idxmin() methods (#353, contributed by @V1NAY8)
  • Added type hints to eland.ndframe and eland.operations (#366, contributed by @V1NAY8)

Removed

  • Removed support for Pandas <1.2 (#364)
  • Removed support for Python 3.6 to match Pandas (#364)

Changed

  • Changed paginated search function to use Point-in-Time and Search After features instead of Scroll when connected to Elasticsearch 7.12+ (#370 and #376, contributed by @V1NAY8)
  • Optimized the FieldMappings.aggregate_field_name() method (#373, contributed by @V1NAY8)
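
The Search After approach resumes each request from the sort value of the last hit already received, rather than holding a server-side scroll cursor. The sketch below simulates that loop against an in-memory list. DOCS, fake_search, and paginate are hypothetical stand-ins, and the Point-in-Time handle that a real request would also carry is not modeled.

```python
from typing import Iterator, List, Optional

# Hypothetical stand-in for an index: hits already sorted by a unique key.
DOCS = [{"sort": [i], "_source": {"n": i}} for i in range(1, 8)]


def fake_search(size: int, search_after: Optional[list] = None) -> List[dict]:
    """Return up to `size` hits whose sort value follows `search_after`."""
    if search_after is None:
        remaining = DOCS
    else:
        remaining = [d for d in DOCS if d["sort"] > search_after]
    return remaining[:size]


def paginate(size: int) -> Iterator[List[dict]]:
    """Yield batches of hits, resuming after the last hit's sort value."""
    search_after = None
    while True:
        hits = fake_search(size, search_after)
        if not hits:
            return
        yield hits
        search_after = hits[-1]["sort"]


batches = list(paginate(size=3))
batch_sizes = [len(batch) for batch in batches]
seen = [hit["_source"]["n"] for batch in batches for hit in batch]
```

The loop needs a sort key that orders hits unambiguously; otherwise hits sharing a sort value could be skipped or repeated across pages.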
eland - 7.13.0b1

Published by sethmlarson over 3 years ago

Added

  • Added DataFrame.quantile(), Series.quantile(), and DataFrameGroupBy.quantile() aggregations (#318 and #356, contributed by @V1NAY8)

Changed

  • Changed the error raised when es_index_pattern doesn't point to any indices to be more user-friendly (#346)

Fixed

  • Fixed a warning about conflicting field types when wildcards are used in es_index_pattern (#346)
  • Fixed sorting when using DataFrame.groupby() with dropna (#322, contributed by @V1NAY8)
  • Fixed deprecated usage of numpy.int by replacing it with numpy.int_ (#354, contributed by @V1NAY8)