sense2vec

🦆 Contextually-keyed word vectors

MIT License

Downloads: 2.7K
Stars: 1.6K
Committers: 18

sense2vec - v2.0.2 Latest Release

Published by shadeMe over 1 year ago

sense2vec - v2.0.1

Published by adrianeboyd almost 2 years ago

  • In the sense2vec.teach Prodigy recipe, only fail if no seeds are available.
  • Extend support for wasabi to v1.1.x.

sense2vec - v2.0.0

Published by ines over 3 years ago

  • Update component and internals for spaCy v3.

sense2vec - v1.0.3

Published by ines over 3 years ago

  • Various small fixes and improvements.
  • Improve training scripts.
  • Fix issue #102: split binary .spacy files.
  • Fix issue #118: Fix typo in s2v_other_senses.

Thanks to @ahalterman, @dshefman1 and @Anxo06 for the pull requests!

sense2vec - v1.0.2: Fix deserialization of components

Published by ines almost 5 years ago

🔴 Bug fixes

  • Add defaults for config if attributes are not included in saved model.
  • Fix serialization and deserialization of string store in component.

sense2vec - v1.0.1: Fix caching bug

Published by ines almost 5 years ago

🔴 Bug fixes

  • Fix bug that caused scores to be read incorrectly from precomputed most_similar caches.
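
For readers unfamiliar with the idea of a precomputed most_similar cache, here is a toy sketch in plain Python. It does not use the sense2vec package and does not reflect its actual cache format; the helper names (cosine, build_cache, most_similar) and the two-dimensional vectors are hypothetical and chosen only for illustration. The point is that neighbors and their scores are computed once up front, so a later query is a simple lookup.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_cache(vectors, n=2):
    """Precompute the n nearest neighbors (key, score) for every key."""
    cache = {}
    for key, vec in vectors.items():
        scored = [
            (other, cosine(vec, other_vec))
            for other, other_vec in vectors.items()
            if other != key
        ]
        # Highest similarity first
        scored.sort(key=lambda item: item[1], reverse=True)
        cache[key] = scored[:n]
    return cache


def most_similar(cache, key, n=1):
    """Answer a query from the cache without recomputing similarities."""
    return cache[key][:n]


# Hypothetical toy vectors keyed by "phrase|SENSE"
vectors = {
    "duck|NOUN": [1.0, 0.0],
    "goose|NOUN": [0.9, 0.1],
    "duck|VERB": [0.0, 1.0],
}
cache = build_cache(vectors)
best_key, best_score = most_similar(cache, "duck|NOUN")[0]
```

If the scores stored alongside the cached keys are misread, the lookup still returns neighbors but with the wrong similarity values, which is the kind of problem this fix addresses.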

✨ New features and improvements

  • Completely rewrite package from scratch.
  • Replace built-in vector storage with spaCy's Vectors, making this package a pure Python package and allowing easy out-of-the-box serialization of vectors.
  • Add fully serializable spaCy pipeline component and extension attributes.
  • Add new methods get_best_sense and get_other_senses and improve most_similar.
  • Add script for precomputing index of nearest neighbors for super fast "most similar" queries.
  • Add annotation recipes for Prodigy to easily create word lists and match patterns from similar phrases using sense2vec vectors (like the terms.teach recipe, just with multi-word expressions).
  • New and more efficient training and preprocessing scripts using GloVe and fastText.
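
To make the two new lookup methods in the list above more concrete, here is a toy sketch in plain Python of what they are meant to do. It assumes keys of the form phrase|SENSE with a frequency count per key, and it does not call the sense2vec package: the freqs dictionary and the function bodies are hypothetical stand-ins, not the library's implementation.

```python
def make_key(word, sense):
    """Build a key such as "natural_language_processing|NOUN"."""
    return word.replace(" ", "_") + "|" + sense


def split_key(key):
    """Split a key back into (word, sense)."""
    word, sense = key.rsplit("|", 1)
    return word.replace("_", " "), sense


def get_best_sense(word, freqs):
    """Return the most frequent key for a word, or None if it is unknown."""
    candidates = [key for key in freqs if split_key(key)[0] == word]
    return max(candidates, key=freqs.get) if candidates else None


def get_other_senses(key, freqs):
    """Return all keys that share the word but carry a different sense."""
    word, _ = split_key(key)
    return [k for k in freqs if k != key and split_key(k)[0] == word]


# Hypothetical frequency counts, for illustration only
freqs = {
    "duck|NOUN": 120,
    "duck|VERB": 45,
    "natural_language_processing|NOUN": 80,
}

best = get_best_sense("duck", freqs)
others = get_other_senses("duck|NOUN", freqs)
```

Here best is the noun sense because it has the higher count, and others lists the remaining verb sense of the same word.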

⚠️ Backwards incompatibilities

  • The sense2vec.load method has been removed. Use Sense2Vec.from_disk instead.
  • The previous VectorMap and VectorStorage have been removed.
  • This package now requires Python 3.6+.
  • This update requires a new vectors format (see attached files).

📖 Documentation and examples

  • Rewrite README from scratch and include full API docs.

👥 Contributors

Thanks to @kabirkhan for contributing the initial Prodigy recipes!

sense2vec - v1.0.0a10

Published by ines almost 5 years ago

sense2vec - v1.0.0a9

Published by ines almost 5 years ago

sense2vec - v1.0.0a8

Published by ines almost 5 years ago

sense2vec - v1.0.0a7

Published by ines almost 5 years ago

sense2vec - v1.0.0a6

Published by ines almost 5 years ago

sense2vec - v1.0.0a5

Published by ines almost 5 years ago

sense2vec - v1.0.0a4

Published by ines almost 5 years ago

sense2vec - v1.0.0a3

Published by ines almost 5 years ago

⚠️ This is an alpha release and not yet ready for production. You can download sense2vec via pip by specifying the exact version.

pip install sense2vec==1.0.0a2

The converted Reddit vectors (trained on all comments of 2015) are attached to this release as a .tar.gz file. For more details and usage instructions, see the README.


✨ New features and improvements

  • Completely rewrite package from scratch.
  • Replace built-in vector storage with spaCy's Vectors, making this package a pure Python package and allowing easy out-of-the-box serialization of vectors.
  • Add fully serializable spaCy pipeline component and extension attributes.
  • Add new methods get_best_sense and get_other_senses and improve most_similar.
  • Add annotation recipes for Prodigy to easily create word lists and match patterns from similar phrases using sense2vec vectors (like the terms.teach recipe, just with multi-word expressions).
  • New and more efficient training and preprocessing scripts using GloVe.

⚠️ Backwards incompatibilities

  • The sense2vec.load method has been removed. Use Sense2Vec.from_disk instead.
  • The previous VectorMap and VectorStorage have been removed.
  • This package now requires Python 3.6+.
  • This update requires a new vectors format (see attached .tar.gz).

📖 Documentation and examples

  • Rewrite README from scratch and include full API docs.

👥 Contributors

Thanks to @kabirkhan for contributing the Prodigy recipes!

sense2vec - v1.0.0a1: Update sense2vec for spaCy v2.1.x or standalone use

Published by ines about 5 years ago

⚠️ This is an alpha release and not yet ready for production. You can download sense2vec via pip by specifying the exact version.

pip install sense2vec==1.0.0a1

Note that the library doesn't depend on spaCy anymore, so you might have to install spaCy and the English model separately. The Reddit vectors (trained on all comments of 2015) are attached to this release as a .tar.gz file. For more details and usage instructions, see the README.


✨ New features and improvements

  • NEW: Remove spaCy dependency and allow standalone use of the sense2vec library.
  • NEW: Include spaCy v2.x pipeline component to add sense2vec-compatible token merging and token attributes and methods.
  • Attach the reddit_vectors model to the release and make models easier to download and load.

📖 Documentation and examples

  • Rewrite README from scratch and include full API docs.

🚧 Todo

  • Replace VectorMap implementation with spaCy's Vectors class.
  • Don't merge tokens at runtime and adjust extension attributes accordingly.
  • Update training and pre-processing scripts for spaCy v2.x.
  • Retrain vectors on more data.

sense2vec - v1.0.0a0: Update sense2vec for spaCy v2.x or standalone use

Published by ines over 6 years ago

⚠️ This is an alpha release and not yet ready for production. You can download sense2vec via pip by specifying the exact version.

pip install sense2vec==1.0.0a0

Note that the library doesn't depend on spaCy anymore, so you might have to install spaCy and the English model separately. The Reddit vectors (trained on all comments of 2015) are attached to this release as a .tar.gz file. For more details and usage instructions, see the README.


✨ New features and improvements

  • NEW: Remove spaCy dependency and allow standalone use of the sense2vec library.
  • NEW: Include spaCy v2.x pipeline component to add sense2vec-compatible token merging and token attributes and methods.
  • Attach the reddit_vectors model to the release and make models easier to download and load.

📖 Documentation and examples

  • Rewrite README from scratch and include full API docs.

🚧 Todo

  • Update training and pre-processing scripts for spaCy v2.x.