text

Making text a first-class citizen in TensorFlow.

APACHE-2.0 License

Downloads: 7.4M
Stars: 1.2K
Committers: 118

text - v2.17.0 Latest Release

Published by rtg0795 3 months ago

Release 2.17.0

Bug Fixes and Other Changes

  • negative sampling excludes positive class
  • revert html encoding
  • much faster set-intersection based version
  • Fix notebook failure with Keras 3.
  • Remove tensorflow-macos from setup.py
  • Update tensorflow-macos to 2.16.1
  • Update version

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Alex Shroyer, C. Antonio Sánchez, Maggie Zhang

text - v2.17.0-rc0

Published by rtg0795 4 months ago

Release 2.17.0-rc0

Bug Fixes and Other Changes

  • negative sampling excludes positive class
  • revert html encoding
  • much faster set-intersection based version
  • Fix notebook failure with Keras 3.
  • Remove tensorflow-macos from setup.py
  • Update tensorflow-macos to 2.16.1
  • Update version

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Alex Shroyer, C. Antonio Sánchez, Maggie Zhang

text - v2.16.1

Published by rtg0795 7 months ago

Release 2.16.1

Major Features and Improvements

Breaking Changes

Bug Fixes and Other Changes

  • Update tf-text setup scripts.
  • Support resource manager scoped Sentencepiece resources.
  • Remove use_unique_shared_resource_name.
  • Remove tensorflow_text dependency on tf_hub library.
  • Fix TF patch, bump TF commit
  • Update version

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Raviteja Gorijala

text - v2.16.0-rc0

Published by rtg0795 8 months ago

Release 2.16.0-rc0

Major Features and Improvements

Breaking Changes

Bug Fixes and Other Changes

  • Update tf-text setup scripts.
  • Support resource manager scoped Sentencepiece resources.
  • Remove use_unique_shared_resource_name.
  • Remove tensorflow_text dependency on tf_hub library.
  • Fix TF patch, bump TF commit
  • Update version

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Raviteja Gorijala

text - v2.15.0

Published by rtg0795 11 months ago

Release 2.15.0

Bug Fixes and Other Changes

  • Update TF versions and scripts to allow consistently building against tf-nightly.
  • No public description
  • Update phrase tokenizer to handle end-punctuation.
  • Remove private Keras imports.
  • Update tensorflow_hub dependency.
  • Sprawling .pyi updates related to pybind11 PRs #4831, #4833.
  • Report unsupported tensor type in RaggedTensorToTensor in Prepare.
  • Check in generated pyi files for some py_extension targets.
  • Update version
  • Update WORKSPACE

text - v2.15.0-rc0

Published by nallave 12 months ago

Release 2.15.0-rc0

Bug Fixes and Other Changes

  • Update TF versions and scripts to allow consistently building against tf-nightly.
  • No public description
  • Update phrase tokenizer to handle end-punctuation.
  • Remove private Keras imports.
  • Update tensorflow_hub dependency.
  • Sprawling .pyi updates related to pybind11 PRs #4831, #4833.
  • Report unsupported tensor type in RaggedTensorToTensor in Prepare.
  • Check in generated pyi files for some py_extension targets.
  • Update version
  • Update WORKSPACE

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

nallave

text - v2.14.0

Published by nallave about 1 year ago

Release 2.14.0

Bug Fixes and Other Changes

  • Fix nullptr dereference issue in UnicodeScriptTokenizeWithOffsetOp.
  • Bump tensorflow_hub to 0.13.0
  • Add @tensorflow/docs-team to CODEOWNERS
  • Internal change
  • Update TF Text API page to emphasize KerasNLP as the API of first choice.
  • Add a note about the implementation differences.
  • Fix out-of-bounds absl::string_view handling in RegexSplitImpl
  • Disable coerce_to_valid_utf8_op_test test on mac
  • Update /text/tutorials and /text/guide to highlight KerasNLP.
  • Move remaining text tutorials to text/
  • Update /text/tutorials and /text/guide index pages to reflect updated navigation.
  • Update broken image links
  • Creates a patch to use non_hermetic python for tf text.
  • Check in generated pyi files for some py_extension targets.
  • Update ops.Tensor references to //third_party/tensorflow/python/framework/tensor.py.
  • Remove invalid stub file
  • Update tensorflow-text from 2.11 to 2.13
  • Check in generated pyi files for some py_extension targets.
  • Remove py38 classifiers in setup.py
  • Update version

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

text - v2.14.0-rc0

Published by nallave about 1 year ago

Release 2.14.0-rc0

Bug Fixes and Other Changes

  • Bump tensorflow_hub to 0.13.0
  • Add @tensorflow/docs-team to CODEOWNERS
  • Internal change
  • Update TF Text API page to emphasize KerasNLP as the API of first choice.
  • Add a note about the implementation differences.
  • Fix out-of-bounds absl::string_view handling in RegexSplitImpl
  • Disable coerce_to_valid_utf8_op_test test on mac
  • Update /text/tutorials and /text/guide to highlight KerasNLP.
  • Move remaining text tutorials to text/
  • Update /text/tutorials and /text/guide index pages to reflect updated navigation.
  • Update broken image links
  • Creates a patch to use non_hermetic python for tf text.
  • Check in generated pyi files for some py_extension targets.
  • Update ops.Tensor references to //third_party/tensorflow/python/framework/tensor.py.
  • Remove invalid stub file
  • Update tensorflow-text from 2.11 to 2.13
  • Check in generated pyi files for some py_extension targets.
  • Remove py38 classifiers in setup.py
  • Update version

text - v2.13.0

Published by rtg0795 over 1 year ago

Release 2.13.0

Bug Fixes and Other Changes

  • Update word_embeddings.ipynb with Time Series as default dashboard.
  • Python ops for new RoundRobinTrimmer kernels.
  • Fix bug where rank 1 max sequence lengths were breaking round robin trimmer.
  • Fix roundrobintrimmer not being linked in correctly.
  • Move control_flow_ops.Assert into its own file, control_flow_assert.py, as a first step in removing circular dependencies with control_flow_ops.cond.
  • Move control_flow_ops.while_loop into its own file, while_loop.py.
  • Redirect references to stack and unstack to their new location in array_ops_stack.py.
  • Prevent crashes with new trimmer op when max_sequence_length is set to a negative value.
  • Fix run_build not getting tensorflow bazel version correctly. Also removed some "set -x" that were added for debugging.
  • Add RetVec-style UTF-8 binarization
  • Allow pad_model_inputs to work with Tensors as well.
  • Move usages of python/util:util to the newly split up targets.
  • (Generated change) Update tf.Text versions and/or docs.
  • Fix typo in Transformer tutorial
  • Pin protobuf version to prevent failure. See https://github.com/tensorflow/text/issues/1115 for more info.
  • Avoid an expensive temporary std::string.
  • Call out the differences compared to the reference paper.
  • Remove decoding_api.ipynb from tf-text docs (this belongs to TF-Models)
  • Altered build scripts to use python3 before python.
  • Removes un-used tensorflow/core/platform:status dependency from round_robin_trimmer.
  • Remove usages of tsl::Status::error_message.
  • Use Github API to fetch full commit hash from short
  • Avoid using jq in prepare_tf_dep.sh since it breaks macos builds
  • Update version

text - v2.13.0-rc0

Published by rtg0795 over 1 year ago

Release 2.13.0-rc0

Bug Fixes and Other Changes

  • Update word_embeddings.ipynb with Time Series as default dashboard.
  • Python ops for new RoundRobinTrimmer kernels.
  • Fix bug where rank 1 max sequence lengths were breaking round robin trimmer.
  • Fix roundrobintrimmer not being linked in correctly.
  • Move control_flow_ops.Assert into its own file, control_flow_assert.py, as a first step in removing circular dependencies with control_flow_ops.cond.
  • Move control_flow_ops.while_loop into its own file, while_loop.py.
  • Redirect references to stack and unstack to their new location in array_ops_stack.py.
  • Prevent crashes with new trimmer op when max_sequence_length is set to a negative value.
  • Fix run_build not getting tensorflow bazel version correctly. Also removed some "set -x" that were added for debugging.
  • Add RetVec-style UTF-8 binarization
  • Allow pad_model_inputs to work with Tensors as well.
  • Move usages of python/util:util to the newly split up targets.
  • (Generated change) Update tf.Text versions and/or docs.
  • Fix typo in Transformer tutorial
  • Pin protobuf version to prevent failure. See https://github.com/tensorflow/text/issues/1115 for more info.
  • Avoid an expensive temporary std::string.
  • Call out the differences compared to the reference paper.
  • Remove decoding_api.ipynb from tf-text docs (this belongs to TF-Models)
  • Altered build scripts to use python3 before python.
  • Removes un-used tensorflow/core/platform:status dependency from round_robin_trimmer.
  • Remove usages of tsl::Status::error_message.
  • Use Github API to fetch full commit hash from short
  • Avoid using jq in prepare_tf_dep.sh since it breaks macos builds
  • Update version

text - v2.12.1

Published by rtg0795 over 1 year ago

Release 2.12.1

Major Features and Improvements

Breaking Changes

Bug Fixes and Other Changes

  • Replace usage of the tsl::Status constructor with a tsl::{error, errors}::Code.
  • Replace usage of the tsl::Status(tsl::error::Code, ...) constructor.
  • Update version
  • Pin tensorflow-datasets version to 4.8.3

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Raviteja Gorijala

text - v2.12.0

Published by nallave over 1 year ago

Release 2.12.0

Major Features and Improvements

  • New PhraseTokenizer.
  • New ByteSplitter.split_by_offsets which splits a string using byte offsets.
  • New concatenate_segments op.
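As a rough illustration of what splitting by byte offsets means, the pure-Python sketch below slices the UTF-8 bytes of a string at given [start, end) offsets. It does not call tensorflow_text. The function and argument names are hypothetical and chosen only for this example.

```python
def split_by_byte_offsets(text: str, starts: list[int], ends: list[int]) -> list[bytes]:
    """Return the byte slices text[start:end) of the UTF-8 encoding of `text`.

    Hypothetical helper for illustration only; not the library's API.
    """
    data = text.encode("utf-8")
    # Offsets index into bytes, not characters, so a multi-byte character
    # such as "ö" occupies two positions.
    return [data[s:e] for s, e in zip(starts, ends)]


pieces = split_by_byte_offsets("hello wörld", [0, 6], [5, 12])
print(pieces)
```

Because the offsets count bytes rather than characters, the second slice ends at 12 even though the string has only 11 characters.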

Bug Fixes and Other Changes

  • Updated kernel code and Python API for BoiseTagsToOffsets op
  • Fix the bug that we should not re-build the config in the create function.
  • Register kernel and ops for phrase tokenizer.
  • fix the issue of conversion.
  • Fix typos in nmt_with_attention.ipynb
  • MacOS TF library was renamed. Update build configuration.
  • Update tokenization_layers_test.py
  • (Generated change) Update tf.Text versions and/or docs.
  • Update TF Text's TF Lite guide with ops that are convertible to TF Lite.
  • Update transformer test size.
  • Fix typos in uncertainty_quantification_with_sngp_bert.ipynb
  • (Generated change) Update tf.Text versions and/or docs.
  • Adds LastNItemSelector, an ItemSelector that selects the last n items in the batch.
  • Temporarily remove tests for EOS offset since this is being changed in SP.
  • Update test files for new ICU version.
  • New helper function in the Op Kernel Shim for writing out data to the output tensors.
  • Adds configuration flags to enable switch to Fast Wordpiece Tokenizer implementation alternative for on device
  • New kernels to enable TF Lite conversion for SentenceFragmenterV2 op.
  • Fix possible heap overflow bug in sentence fragmenter op.
  • Deprecate PY37 support for TF-Text
  • Fix BUILD file by moving tf dep in the appropriate place for FBN to prevent conflicts when building on mobile.
  • Clean up a couple dependencies in the kernel BUILD file.
  • C++ API for new kernel for the RoundRobinTrimmer which fixes a bug and makes it available for conversion to TF Lite.
  • New kernels for the RoundRobinTrimmer which fixes a bug and makes it available for conversion to TF Lite.
  • Add two functions to implementations of the OpKernelShim for accessing the name & doc string. Accessing internals directly causes problems when trying to use techniques like Object composition as the op template. In particular, this change is needed for improvements to the polymorphic wrapper.
  • Allow int32 or int64 as types for RoundRobinTrimmer ops' splits.
  • Extend RoundRobinTrimmer kernels to allow any type as the value.
  • Return empty results if get_offsets is false.
  • Skip-uncompressing of bazel to try and locate error for mac ci tests.
  • Fix scraping full commit from short commit sha
  • Update tensorflow-text notebooks from 2.8 to 2.11
  • Fix bazel version scraping logic for .bazelversion in install_bazel.sh
  • Fix conditional so it works better with Apple silicon. See issue #1077 for more details.
  • Force osname check to always be in lower-case. See #1077
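Several entries above concern the RoundRobinTrimmer. The general idea of round-robin trimming can be sketched in plain Python as below. This is a simplified illustration under stated assumptions, not the kernel's implementation: the function names are hypothetical, and clamping a negative budget to zero is an assumption made only for this sketch.

```python
def round_robin_budgets(lengths: list[int], max_seq_length: int) -> list[int]:
    """Hand out a total budget one item at a time, cycling over the segments."""
    budgets = [0] * len(lengths)
    remaining = max(max_seq_length, 0)  # assumption: negative budgets act as zero
    progressed = True
    while remaining > 0 and progressed:
        progressed = False
        for i, n in enumerate(lengths):
            if remaining == 0:
                break
            if budgets[i] < n:  # this segment can still use another slot
                budgets[i] += 1
                remaining -= 1
                progressed = True
    return budgets


def round_robin_trim(segments: list[list[int]], max_seq_length: int) -> list[list[int]]:
    """Keep the first budget[i] items of each segment."""
    budgets = round_robin_budgets([len(s) for s in segments], max_seq_length)
    return [seg[:b] for seg, b in zip(segments, budgets)]


trimmed = round_robin_trim([[1, 2, 3], [10, 20], [100, 200, 300, 400]], 5)
print(trimmed)
```

With a budget of 5, each segment receives one slot in the first pass and the remaining two slots go to the first two segments in the second pass.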

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

synandi, tilakrayal

text - v2.12.0-rc0

Published by rtg0795 over 1 year ago

Release 2.12.0-rc0

Bug Fixes and Other Changes

  • BOISE TF op:
    • Add kernel code and Python API for BoiseTagsToOffsets op
  • Other:
    • Internal change
    • Add model builder for phrase tokenizer.
    • Fix the bug that we should not re-build the config in the create function.
    • Register kernel and ops for phrase tokenizer.
    • fix the issue of conversion.
    • Fix typos in nmt_with_attention.ipynb
    • Fix broken link in transformer.ipynb
    • MacOS TF library was renamed. Update build configuration.
    • Update tokenization_layers_test.py
    • (Generated change) Update tf.Text versions and/or docs.
    • Update TF Text's TF Lite guide with ops that are convertible to TF Lite.
    • Update transformer test size.
    • Fix typos in uncertainty_quantification_with_sngp_bert.ipynb
    • (Generated change) Update tf.Text versions and/or docs.
    • Adds LastNItemSelector, an ItemSelector that selects the last n items in the batch.
    • Add split_by_offsets method to ByteSplitter.
    • Temporarily remove tests for EOS offset since this is being changed in SP.
    • Update test files for new ICU version.
    • New helper function in the Op Kernel Shim for writing out data to the output tensors.
    • Adds configuration flags to enable switch to Fast Wordpiece Tokenizer implementation alternative for on device
    • New kernels to enable TF Lite conversion for SentenceFragmenterV2 op.
    • Fix possible heap overflow bug in sentence fragmenter op.
    • Deprecate PY37 support for TF-Text
    • Fix BUILD file by moving tf dep in the appropriate place for FBN to prevent conflicts when building on mobile.
    • Clean up a couple dependencies in the kernel BUILD file.
    • C++ API for new kernel for the RoundRobinTrimmer which fixes a bug and makes it available for conversion to TF Lite.
    • New kernels for the RoundRobinTrimmer which fixes a bug and makes it available for conversion to TF Lite.
    • Add two functions to implementations of the OpKernelShim for accessing the name & doc string. Accessing internals directly causes problems when trying to use techniques like Object composition as the op template. In particular, this change is needed for improvements to the polymorphic wrapper.
    • Allow int32 or int64 as types for RoundRobinTrimmer ops' splits.
    • Extend RoundRobinTrimmer kernels to allow any type as the value.
    • internal
    • license rules update
    • Remove license changes for now since it has broken the builds.
    • Return empty results if get_offsets is false.
    • tf_text: Add a "concatenate_segments" function.
    • Skip-uncompressing of bazel to try and locate error for mac ci tests.
    • Fix scraping full commit from short commit sha
    • Update tensorflow-text notebooks from 2.8 to 2.11
    • Fix bazel version scraping logic for .bazelversion in install_bazel.sh

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

synandi, tilakrayal

text - v2.11.0

Published by rtg0795 almost 2 years ago

Release 2.11.0

Major Features and Improvements

  • Added op for converting to/from BOISE labels to offsets

Bug Fixes and Other Changes

  • tensorflow:
    • Moving logging.h and bitmap from tf/core to tf/tsl.
  • BOISE TF op:
    • Add main C++ functions for converting to/from BOISE labels to offsets
    • Add kernel code and Python API for OffsetsToBoiseTags op
  • Other:
    • Add link to KPLs, fix typo in Neural machine translation with attention tutorial
    • Update README.md
    • Publish the tensorflow_models.nlp guide docs to tensorflow.org
    • Add missing dependency to constrained sequence kernel.
    • Add missing absl status dependency to sentence breaking utils.
    • Another missing absl status dependency, this time for sentence fragmenter.
    • Add absl status to sentence fragmenter v2.
    • Update pybind11 to 2.10.0 to match tensorflow.
    • Better error message for WordPiece when the vocabulary file has unicode issues.
    • Update Transformer tutorial with Keras MultiHeadAttention
    • transformers.ipynb: fix length filter and target slicing
    • transformers.ipynb: cleanup wording, create a PositionalEmbedding layer.
    • Replace tensorflow::Status::OK() with tensorflow::OkStatus().
    • Update README with note about various OS releases.
    • Cast the step type.
    • Reactivate TFLite ByteSplitter test.
    • Modify tokenizer to process pt_examples to tokenizers.pt
    • fix words alignment in documentation
    • Update nmt_with_attention:
    • transformers.ipynb: Factor out CrossAttention, GlobalSelfAttention, and CausalSelfAttention layers.
    • Switch the transformer to train with Model.fit.
    • Whitespace changes to force republish.
    • Fix tutorial display, again.
    • Update version

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

satojkovic

text - v2.11.0-rc0

Published by rtg0795 almost 2 years ago

Release 2.11.0-rc0

Bug Fixes and Other Changes

  • tensorflow:
    • Moving logging.h and bitmap from tf/core to tf/tsl.
  • BOISE TF op:
    • Add main C++ functions for converting to/from BOISE labels to offsets
    • Add kernel code and Python API for OffsetsToBoiseTags op
  • Other:
    • Add link to KPLs, fix typo in Neural machine translation with attention tutorial
    • Update README.md
    • Publish the tensorflow_models.nlp guide docs to tensorflow.org
    • Add missing dependency to constrained sequence kernel.
    • Add missing absl status dependency to sentence breaking utils.
    • Another missing absl status dependency, this time for sentence fragmenter.
    • Add absl status to sentence fragmenter v2.
    • Update pybind11 to 2.10.0 to match tensorflow.
    • Better error message for WordPiece when the vocabulary file has unicode issues.
    • Update Transformer tutorial with Keras MultiHeadAttention
    • transformers.ipynb: fix length filter and target slicing
    • transformers.ipynb: cleanup wording, create a PositionalEmbedding layer.
    • Replace tensorflow::Status::OK() with tensorflow::OkStatus().
    • Update README with note about various OS releases.
    • Cast the step type.
    • Reactivate TFLite ByteSplitter test.
    • Modify tokenizer to process pt_examples to tokenizers.pt
    • fix words alignment in documentation
    • Update nmt_with_attention:
    • transformers.ipynb: Factor out CrossAttention, GlobalSelfAttention, and CausalSelfAttention layers.
    • Switch the transformer to train with Model.fit.
    • Whitespace changes to force republish.
    • Add a phrase-based tokenizer
    • Fix tutorial display, again.
    • Update version

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

satojkovic

text - v2.10.0

Published by rtg0795 about 2 years ago

Release 2.10.0

Major Features and Improvements

  • New ByteSplitter which tokenizes strings into bytes.
  • New tutorial: Fine tune BERT with Orbit [will be added to tensorflow.org/text soon].
  • Fixed an issue where dynamic TF Lite tensors were not getting resized correctly.
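To make "tokenizes strings into bytes" concrete, here is a minimal pure-Python sketch of the idea. It does not use tensorflow_text, and the function name is hypothetical.

```python
def byte_split(text: str) -> list[int]:
    """Return the UTF-8 byte values of `text`, one integer per byte."""
    return list(text.encode("utf-8"))


tokens = byte_split("hé")
print(tokens)
```

Non-ASCII characters expand to more than one token, since "é" is encoded as two bytes in UTF-8.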

Bug Fixes and Other Changes

  • Fix typo error in subwords_tokenizer guide with text.WordpieceTokenizer
  • Fixes prepare_tf_dep.sh for OSX.
  • Add cross-links to tensorflow_models.nlp API reference.
  • (Generated change) Update tf.Text versions and/or docs.
  • Update shape inference of kernel template for fast wordpiece and activate the op test.
  • Update configure.sh for Apple Silicon.
  • Export Trimmer ABC to be usable as tf_text.Trimmer
  • Fix TensorFlow checkpoint and trackable imports.
  • Correct tutorial explanation: meaning of attention weights
  • Modernize fine_tune_bert.
  • Lint and update the Fine-tuning a BERT model tutorial
  • Use pointer for pointer math instead of iterator. Fixes c++17 compilation for regex_split on windows.
  • Add install_bazel.sh script to make it easy to install the correctly needed version of Bazel. (#946)
  • Make install_bazel.sh script executable.
  • Prevent runtime errors from happening due to invalid regular expressions using regex_split & RegexSplitter.
  • Centralize tensorflow-models docs into a top-level docs/ directory.
  • Remove link to nonexistent section on tf.org.
  • Move fine_tune_bert guide.
  • Updated the spelling mistakes in subwords_tokenizer.ipynb
  • Fixes a bug caused by passing an empty tensor into SentencepieceTokenizer's detokenize method.
  • Update build for Sentencepiece. Darts was not properly being depended on.
  • Improve Sentencepiece build by adding missing dependency - str_format.
  • Fix typos and lint Neural machine translation with attention tutorial
  • Fix external link formatting, lint NMT with attention tutorial

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

gadagashwini, mnahinkhan, Steve R. Sun, synandi

text - v2.10.0-rc0

Published by rtg0795 about 2 years ago

Release 2.10.0-rc0

Major Features and Improvements

  • New ByteSplitter which tokenizes strings into bytes.
  • New tutorial: Fine tune BERT with Orbit [will be added to tensorflow.org/text soon].
  • Fixed an issue where dynamic TF Lite tensors were not getting resized correctly.

Bug Fixes and Other Changes

  • Fix typo error in subwords_tokenizer guide with text.WordpieceTokenizer
  • Fixes prepare_tf_dep.sh for OSX.
  • Add cross-links to tensorflow_models.nlp API reference.
  • (Generated change) Update tf.Text versions and/or docs.
  • Update shape inference of kernel template for fast wordpiece and activate the op test.
  • Update configure.sh for Apple Silicon.
  • Export Trimmer ABC to be usable as tf_text.Trimmer
  • Fix TensorFlow checkpoint and trackable imports.
  • Correct tutorial explanation: meaning of attention weights
  • Modernize fine_tune_bert.
  • Lint and update the Fine-tuning a BERT model tutorial
  • Use pointer for pointer math instead of iterator. Fixes c++17 compilation for regex_split on windows.
  • Add install_bazel.sh script to make it easy to install the correctly needed version of Bazel. (#946)
  • Make install_bazel.sh script executable.
  • Prevent runtime errors from happening due to invalid regular expressions using regex_split & RegexSplitter.
  • Centralize tensorflow-models docs into a top-level docs/ directory.
  • Remove link to nonexistent section on tf.org.
  • Move fine_tune_bert guide.
  • Updated the spelling mistakes in subwords_tokenizer.ipynb
  • Fixes a bug caused by passing an empty tensor into SentencepieceTokenizer's detokenize method.
  • Update build for Sentencepiece. Darts was not properly being depended on.
  • Improve Sentencepiece build by adding missing dependency - str_format.
  • Fix typos and lint Neural machine translation with attention tutorial
  • Fix external link formatting, lint NMT with attention tutorial

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

gadagashwini, mnahinkhan, Steve R. Sun, synandi

text - v2.9.0

Published by broken over 2 years ago

Release 2.9

Major Features and Improvements

  • New FastBertNormalizer that improves speed for BERT normalization and is convertible to TF Lite.
  • New FastBertTokenizer that combines FastBertNormalizer and FastWordpieceTokenizer.
  • New ngrams kernel for handling STRING_JOIN reductions.
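For readers unfamiliar with string-join n-grams, the reduction mentioned above can be sketched in plain Python: slide a window of fixed width over the tokens and join each window with a separator. This is an illustration of the concept only, not the new kernel; the function and parameter names are hypothetical.

```python
def ngrams_string_join(tokens: list[str], width: int, separator: str = " ") -> list[str]:
    """Join every run of `width` consecutive tokens into one string."""
    if width < 1:
        raise ValueError("width must be at least 1")
    # range() is empty when there are fewer than `width` tokens.
    return [
        separator.join(tokens[i:i + width])
        for i in range(len(tokens) - width + 1)
    ]


bigrams = ngrams_string_join(["a", "b", "c", "d"], 2)
print(bigrams)
```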

Bug Fixes and Other Changes

  • NgramsStringJoin shape inference fixed to handle unranked tensors
  • Upgrade pybind11 and reenable tests that were broken.
  • Rename a couple files to match the naming of the other tflite kernels. Also adds some deps to tflite_ops that were missing and causing an error when testing :all.
  • Add to TF Lite documentation that ngrams is a convertible op.
  • Fix public access and missing ICU data to build_fast_bert_normalizer_model and enable the disabled tests.
  • Update the doc for FastWordpieceTokenizer.
  • Refine the doc for FastWordpieceTokenizer.
  • Bug fix: make BertTokenizer work for RaggedTensors with row_splits_dtype=int32
  • Fix typo error text.WordpieceTokenizer
  • Added comma at missing places in emoticons for normalizer
  • Refactor build and test scripts to use prepare_tf_dep.sh
  • Fixes prepare_tf_dep.sh for OSX.
  • Fixed bug in setup.py that was requiring the wrong version.
  • Updated package with the correct versions of Python we release on.
  • Update documentation on TF Lite convertible ops.
  • Transition to use TF's version of bazel.
  • Transition to use TF's bazel configuration.
  • Add missing symbols for tokenization layers
  • Fix typo in text_generation.ipynb
  • Fix grammar typo
  • Allow fast wordpiece tokenizer to take in external wordpiece model.
  • Internal change
  • Improvement to guide where mean call is redundant. See https://github.com/tensorflow/text/issues/810 for more info.
  • Update broken link and fix typo in BERT-SNGP demo notebook
  • Consolidate disparate test-related files into a single testing_infra folder.
  • Pin tf-text version to guides & tutorials.
  • Fix bug in constrained sequence op. Added a check for the edge case num_steps = 0, which should do nothing, to prevent SIGSEGV crashes.
  • Remove outdated Keras tests due to them no longer making the testing utilities available.
  • Update bert preprocessing by padding correct tensors
  • Update tensorflow-text notebooks from 2.7 to 2.8
  • Optimize FastWordPiece to only generate requested outputs.
  • Add a note about byte-indexing vs character indexing.
  • Add a MAX_TOKENS to the transformer tutorial.
  • Only export tensorflow symbols from shared libs.
  • (Generated change) Update tf.Text versions and/or docs.
  • Do not run the prepare_tf_dep script for Apple M1 macs.
  • Update text_classification_rnn.ipynb
  • Fix the exported symbols for the linker test. Adding it to the shared objects instead of the C++ code allows the code to be compiled together into one large shared lib.
  • Implement FastBertNormalizer based on codepoint-wise mappings.
  • Add pybind for fast_bert_normalizer_model_builder.
  • Remove unused comments related to Python 2 compatibility.
  • update transformer.ipynb
  • Update toolchain & temporarily disable tf lite tests.
  • Define manylinux2014 for the new toolchain target, and have presubmits use it.
  • Move tflite build deps to custom target.
  • Add FastBertTokenizer.
  • Update bazel version to 5.1.0
  • Update TF Text to use new Ngrams kernel.
  • Don't try to set dimension if shape is unknown for ngrams.

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Aflah, Connor Brinton, devnev39, Janak Ramakrishnan, Martin, Nathan Luehr, Pierre Dulac, Rabin Adhikari, gadagashwini, mohantym, rtg0795

text - v2.10.0-b2

Published by broken over 2 years ago

Release 2.10.0-b2

Major Features and Improvements

  • Added FastSentencepieceTokenizer which is convertible to TF Lite. Please note the op name in the graph will change, so any models trained with this version will need to be retrained when the release candidate for 2.10 is released.

Important Notes

  • This beta release is outside the normal release cycle and is meant to work with TF versions 2.8.x.
  • Again, the op name for FSP will change in future releases.

text - v2.8.2

Published by broken over 2 years ago

Release 2.8.2

Major Features and Improvements

  • 📦️ Fix macOS packaging so it works with package managers like Poetry (#838)

Bug Fixes and Other Changes

  • Package metadata updated with the correct available python versions.

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Connor Brinton

Package Rankings
Top 1.01% on Pypi.org