transform

Input pipeline framework

APACHE-2.0 License

Downloads
479K
Stars
982
Committers
27

transform - TensorFlow Transform 1.15.0 Latest Release

Published by rtg0795 6 months ago

Major Features and Improvements

  • Added support for sparse labels in AMI vocabulary computation.

Bug Fixes and Other Changes

  • Bumped the Ubuntu version on which tensorflow_transform is tested to 20.04
    (previously was 16.04).
  • Explicitly use Keras 2 or tf_keras if Keras 3 is installed.
  • Added python 3.11 support.
  • Depends on tensorflow 2.15.
  • Enable passing tf.saved_model.SaveOptions to model saving functionality.
  • Census and sentiment examples updated to only use Keras instead of
    estimator.
  • Depends on apache-beam[gcp]>=2.53.0,<3 for Python 3.11 and on
    apache-beam[gcp]>=2.47.0,<3 for 3.9 and 3.10.
  • Depends on protobuf>=4.25.2,<5 for Python 3.11 and on protobuf>3.20.3,<5
    for 3.9 and 3.10.

Breaking Changes

  • Existing analyzer cache is automatically invalidated.

Deprecations

  • Deprecated python 3.8 support.
transform - TensorFlow Transform 1.14.0

Published by rtg0795 about 1 year ago

Major Features and Improvements

  • Adds a reserved_tokens parameter to vocabulary APIs, a list of tokens that
    must appear in the vocabulary and maintain their order at the beginning of
    the vocabulary.
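
The placement rule for reserved_tokens can be illustrated with a small
pure-Python sketch. This is not the library's implementation: the helper
build_vocabulary is hypothetical, and ordering the computed part by
descending count and skipping reserved tokens in it are assumptions made
for illustration.

```python
from collections import Counter


def build_vocabulary(tokens, reserved_tokens=()):
    """Hypothetical helper: reserved tokens first, then tokens seen in the data."""
    counts = Counter(tokens)
    reserved = list(reserved_tokens)
    reserved_set = set(reserved)
    # Assumption: a reserved token is not repeated in the computed part.
    computed = [t for t in counts if t not in reserved_set]
    # Assumption: the computed part is ordered by descending count.
    computed.sort(key=lambda t: -counts[t])
    return reserved + computed


vocab = build_vocabulary(
    ["b", "a", "b", "c", "a", "b"], reserved_tokens=["[PAD]", "[UNK]"])
print(vocab)
```

The reserved tokens keep the order in which they were passed and occupy
the first indices, so a lookup such as vocab.index("[PAD]") stays stable
regardless of the data.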

Bug Fixes and Other Changes

  • approximate_vocabulary now returns tokens with the same frequency in
    reverse lexicographical order (similarly to tft.vocabulary).
  • Transformed data batches are now sliced into smaller chunks if their size
    exceeds 200MB.
  • Depends on pyarrow>=10,<11.
  • Depends on apache-beam>=2.47,<3.
  • Depends on numpy>=1.22.0.
  • Depends on tensorflow>=2.13.0,<3.
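
The tie-breaking rule described for approximate_vocabulary above can be
sketched in plain Python. The helper order_vocabulary is hypothetical and
only demonstrates the ordering (descending frequency, then reverse
lexicographic order among equal frequencies); it says nothing about how
the analyzer computes its counts.

```python
from collections import Counter


def order_vocabulary(tokens):
    """Hypothetical helper: order tokens by count, ties in reverse lexicographic order."""
    counts = Counter(tokens)
    # Sorting (count, token) keys in reverse puts higher counts first and,
    # within one count, lexicographically larger tokens first.
    return sorted(counts, key=lambda t: (counts[t], t), reverse=True)


ordered = order_vocabulary(["apple", "pear", "fig", "pear", "apple", "kiwi"])
print(ordered)
```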

Breaking Changes

  • Vocabulary related APIs now require passing non-positional parameters by
    key.
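
In Python terms, this kind of requirement corresponds to keyword-only
parameters. The sketch below uses a hypothetical stand-in function,
vocabulary_like, rather than the real API, to show what callers may see
when they still pass such arguments positionally.

```python
def vocabulary_like(x, *, top_k=None, frequency_threshold=None):
    """Hypothetical stand-in: everything after `*` must be passed by keyword."""
    return {"x": x, "top_k": top_k, "frequency_threshold": frequency_threshold}


# Passing the option by key works.
ok = vocabulary_like(["a", "b"], top_k=10)

# Passing it positionally raises a TypeError.
try:
    vocabulary_like(["a", "b"], 10)
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(ok["top_k"], positional_rejected)
```

Call sites that relied on argument position therefore need to be
rewritten to name each option explicitly.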

Deprecations

  • N/A
transform - TensorFlow Transform 1.13.0

Published by rtg0795 over 1 year ago

Major Features and Improvements

  • RaggedTensors can now be automatically inferred for variable length
    features by setting represent_variable_length_as_ragged=true in TFMD
    schema.
  • New experimental APIs added for annotating sparse output tensors:
    tft.experimental.annotate_sparse_output_shape and
    tft.experimental.annotate_true_sparse_output.
  • DatasetKey.non_cacheable added to allow for some datasets to not produce
    cache. This may be useful for gradual cache generation when operating on a
    large rolling range of datasets.
  • Vocabularies produced by compute_and_apply_vocabulary can now store
    frequencies. Controlled by the store_frequency parameter.
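
A vocabulary that stores frequencies needs slightly different downstream
parsing than a plain token list. The sketch below assumes a text
vocabulary file in which each line has the form "frequency token",
separated by a single space; that layout and the helper name
parse_vocabulary_line are assumptions to verify against the files your
pipeline actually produces.

```python
def parse_vocabulary_line(line):
    """Split one assumed 'frequency token' line into (token, frequency)."""
    # Split only on the first space so tokens containing spaces stay intact.
    frequency, token = line.rstrip("\n").split(" ", 1)
    return token, int(frequency)


lines = ["3 hello world\n", "1 bye\n"]
parsed = [parse_vocabulary_line(line) for line in lines]
print(parsed)
```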

Bug Fixes and Other Changes

  • Depends on numpy~=1.22.0.
  • Depends on tensorflow>=2.12.0,<2.13.
  • Depends on protobuf>=3.20.3,<5.
  • Depends on tensorflow-metadata>=1.13.1,<1.14.0.
  • Depends on tfx-bsl>=1.13.0,<1.14.0.
  • Modifies get_vocabulary_size_by_name to return a minimum of 1.

Breaking Changes

  • N/A

Deprecations

  • Deprecated python 3.7 support.
transform - TensorFlow Transform 1.12.0

Published by venkat2469 almost 2 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • Depends on tensorflow>=2.11,<2.12
  • Depends on tensorflow-metadata>=1.12.0,<1.13.0.
  • Depends on tfx-bsl>=1.12.0,<1.13.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.11.0

Published by venkat2469 almost 2 years ago

Major Features and Improvements

  • This is the last version that supports TensorFlow 1.15.x. TF 1.15.x support
    will be removed in the next version. Please check the TF2 migration guide
    to migrate to TF2.

  • Introduced tft.experimental.document_frequency and tft.experimental.idf
    which map each term to its document frequency and inverse document frequency
    in the same order as the terms in documents.

  • schema_utils.schema_as_feature_spec now supports struct features as a way
    to describe tf.SequenceExample data.

  • TensorRepresentations in schema used for
    schema_utils.schema_as_feature_spec can now share name with their source
    features.

  • Introduced tft_beam.EncodeTransformedDataset which can be used to easily
    encode transformed data in preparation for materialization.
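
The idea behind the document frequency and IDF mappings listed above,
one value per term in the same order as the terms appear in each
document, can be sketched in plain Python. The helper names mirror the
concept only, and the smoothed formula log((N + 1) / (df + 1)) + 1 is an
assumption chosen for illustration; the experimental APIs may use a
different definition.

```python
import math


def document_frequency(documents):
    """For each term occurrence, the number of documents containing that term."""
    df = {}
    for doc in documents:
        for term in set(doc):  # count each term once per document
            df[term] = df.get(term, 0) + 1
    return [[df[term] for term in doc] for doc in documents]


def idf(documents, smooth=True):
    """Assumed formula: log((N + s) / (df + s)) + 1, with s = 1 when smoothing."""
    n = len(documents)
    s = 1 if smooth else 0
    return [[math.log((n + s) / (f + s)) + 1 for f in row]
            for row in document_frequency(documents)]


docs = [["a", "b", "a"], ["b", "c"]]
df_values = document_frequency(docs)
idf_values = idf(docs)
print(df_values)
print(idf_values)
```

The output keeps the shape of the input: every position in a document
is replaced by the statistic of the term at that position.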

Bug Fixes and Other Changes

  • Depends on tensorflow>=1.15.5,<2 or tensorflow>=2.10,<2.11
  • Depends on apache-beam[gcp]>=2.41,<3.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.10.1

Published by venkat2469 about 2 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • Depends on tfx-bsl>=1.10.1,<1.11.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.10.0

Published by venkat2469 about 2 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • Assign different close_to_resources resource hints to both original and
    cloned PTransforms in deep copy optimization. These resource hints prevent
    root Reads generated by the deep copy from being merged by common
    subexpression elimination.
  • Depends on apache-beam[gcp]>=2.40,<3.
  • Depends on pyarrow>=6,<7.
  • Depends on tensorflow-metadata>=1.10.0,<1.11.0.
  • Depends on tfx-bsl>=1.10.0,<1.11.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.9.0

Published by rtg0795 over 2 years ago

Major Features and Improvements

  • Adds element-wise scaling support to scale_by_min_max_per_key,
    scale_to_0_1_per_key and scale_to_z_score_per_key for
    key_vocabulary_filename = None.
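
Element-wise per-key scaling means that statistics are kept separately
for each key and for each element position, instead of being reduced to
one scalar per key. A minimal pure-Python sketch of the 0-1 case
follows. The function name is hypothetical, one key per row is assumed,
and mapping a degenerate range to 0.5 is an arbitrary choice for this
sketch, not a statement about the library.

```python
def scale_to_0_1_per_key_elementwise(rows, keys):
    """rows: equal-length lists of numbers; keys: one key per row (assumed)."""
    lo, hi = {}, {}
    for row, key in zip(rows, keys):
        if key not in lo:
            lo[key], hi[key] = list(row), list(row)
        else:
            # Track min and max per key and per element position.
            lo[key] = [min(a, b) for a, b in zip(lo[key], row)]
            hi[key] = [max(a, b) for a, b in zip(hi[key], row)]
    scaled = []
    for row, key in zip(rows, keys):
        scaled.append([
            (v - l) / (h - l) if h > l else 0.5  # arbitrary value for min == max
            for v, l, h in zip(row, lo[key], hi[key])
        ])
    return scaled


rows = [[0.0, 10.0], [4.0, 20.0], [2.0, 15.0], [1.0, 1.0]]
keys = ["a", "a", "a", "b"]
scaled = scale_to_0_1_per_key_elementwise(rows, keys)
print(scaled)
```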

Bug Fixes and Other Changes

  • Depends on tensorflow>=1.15.5,<2 or tensorflow>=2.9,<2.10
  • Depends on tensorflow-metadata>=1.9.0,<1.10.0.
  • Depends on tfx-bsl>=1.9.0,<1.10.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.8.0

Published by rtg0795 over 2 years ago

Major Features and Improvements

  • Adds tft.DatasetMetadata and its factory method from_feature_spec as
    public APIs to be used when using the "instance dict" data format.

Bug Fixes and Other Changes

  • Depends on apache-beam[gcp]>=2.38,<3.
  • Depends on
    tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<2.9.
  • Depends on tensorflow-metadata>=1.8.0,<1.9.0.
  • Depends on tfx-bsl>=1.8.0,<1.9.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.6.1

Published by rtg0795 over 2 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • Depends on
    tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<2.9.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.7.0

Published by rtg0795 over 2 years ago

Major Features and Improvements

  • Introduced tft.experimental.compute_and_apply_approximate_vocabulary which
    computes and applies an approximate vocabulary.

Bug Fixes and Other Changes

  • Fixed an issue where tft.experimental.approximate_vocabulary with text
    output format did not filter out tokens with newline characters.
  • Add a dummy value to the result of tft.experimental.approximate_vocabulary
    as is done for the exact variant, in order for downstream code to easily
    handle it.
  • Update tft.get_analyze_input_columns to ensure its output includes
    preprocessing_fn inputs which are not used in any TFT analyzers, but end
    up in a control dependency (automatic control dependencies are not present
    in TF1, hence this change will only affect the native TF2 implementation).
  • Assign different resource hint tags to both original and cloned PTransforms
    in deep copy optimization. These tags prevent root Reads generated by the
    deep copy from being merged by common subexpression elimination.
  • Fixed an issue when large int64 values would be incorrectly bucketized in
    tft.apply_buckets.
  • Depends on apache-beam[gcp]>=2.36,<3.
  • Depends on
    tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<2.9.
  • Depends on tensorflow-metadata>=1.7.0,<1.8.0.
  • Depends on tfx-bsl>=1.7.0,<1.8.0.
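
The release note on tft.apply_buckets above does not describe the root
cause of the int64 issue. The sketch below only illustrates one general
way large integers can land in the wrong bucket: comparing them after a
lossy cast to floating point. Both helper functions are hypothetical,
and the bucket convention (a value equal to a boundary goes to the
higher bucket) is an assumption of this sketch.

```python
from bisect import bisect_right


def bucketize_exact(value, boundaries):
    """Bucket index using exact integer comparisons."""
    return bisect_right(boundaries, value)


def bucketize_via_float(value, boundaries):
    """Bucket index after casting everything to float64 (lossy above 2**53)."""
    return bisect_right([float(b) for b in boundaries], float(value))


boundaries = [2**53 + 1]
value = 2**53  # strictly below the boundary

exact_bucket = bucketize_exact(value, boundaries)
lossy_bucket = bucketize_via_float(value, boundaries)
print(exact_bucket, lossy_bucket)
```

After the cast, the value and the boundary become the same float, so
the value is treated as if it had reached the boundary.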

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.4.1

Published by rtg0795 over 2 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • Depends on future package.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.6.0

Published by jay90099 over 2 years ago

Major Features and Improvements

  • Introduced tft.experimental.get_vocabulary_size_by_name that can retrieve
    the size of a vocabulary computed using tft.vocabulary within the
    preprocessing_fn.
  • tft.experimental.ptransform_analyzer now supports analyzer cache using the
    newly added tft.experimental.CacheablePTransformAnalyzer container.
  • tft.bucketize_per_key now supports weights.

Bug Fixes and Other Changes

  • Depends on numpy>=1.16,<2.
  • Depends on apache-beam[gcp]>=2.35,<3.
  • Depends on absl-py>=0.9,<2.0.0.
  • Depends on tensorflow-metadata>=1.6.0,<1.7.0.
  • Depends on tfx-bsl>=1.6.0,<1.7.0.
  • Depends on
    tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<3.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.5.0

Published by jay90099 almost 3 years ago

Major Features and Improvements

  • Introduced the tft.experimental.approximate_vocabulary analyzer, an
    approximate version of tft.vocabulary that is more efficient with a
    smaller number of unique elements or a smaller top_k threshold.

Bug Fixes and Other Changes

  • Raise a RuntimeError if the order of analyzers in the traced TensorFlow
    graph is non-deterministic in TF2.
  • Fix issue where a tft.experimental.ptransform_analyzer's output dtype
    could be propagated incorrectly if it was a primitive as opposed to
    np.ndarray.
  • Depends on apache-beam[gcp]>=2.34,<3.
  • Depends on
    tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<2.8.
  • Depends on tensorflow-metadata>=1.5.0,<1.6.0.
  • Depends on tfx-bsl>=1.5.0,<1.6.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.4.0

Published by jay90099 almost 3 years ago

Major Features and Improvements

  • Added tf.RaggedTensor support to all analyzers and mappers with
    reduce_instance_dims=True.

Bug Fixes and Other Changes

  • Fix re-loading a transform graph containing pyfuncs exported as a TF1
    SavedModel (added using tft.apply_pyfunc) in TF2.
  • Depends on pyarrow>=1,<6.
  • Depends on tensorflow-metadata>=1.4.0,<1.5.0.
  • Depends on tfx-bsl>=1.4.0,<1.5.0.
  • Depends on apache-beam[gcp]>=2.33,<3.

Breaking Changes

  • N/A

Deprecations

  • Deprecated python 3.6 support.
transform - TensorFlow Transform 1.3.0

Published by dhruvesh09 about 3 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • tft.quantiles, tft.mean and tft.var now ignore NaNs and infinite input
    values. Previously, these would lead to incorrect output calculation.
  • Improved error message for tft_beam.AnalyzeDataset,
    tft_beam.AnalyzeAndTransformDataset and tft_beam.AnalyzeDatasetWithCache
    when the input metadata is empty.
  • Added best-effort TensorFlow Decision Forests (TF-DF) and Struct2Tensor op
    registration when loading transformation graphs.
  • Depends on
    tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,<2.7.
  • Depends on tfx-bsl>=1.3.0,<1.4.0.
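
The effect of ignoring NaN and infinite inputs, as described for
tft.mean and tft.var above, can be sketched in plain Python. The helper
finite_mean_and_var is hypothetical, and the use of the population
(biased) variance and of NaN for an input with no finite values are
assumptions of this sketch.

```python
import math


def finite_mean_and_var(values):
    """Mean and population variance over the finite values only."""
    finite = [v for v in values if math.isfinite(v)]
    if not finite:
        return math.nan, math.nan  # assumed result when nothing is finite
    mean = sum(finite) / len(finite)
    var = sum((v - mean) ** 2 for v in finite) / len(finite)
    return mean, var


mean, var = finite_mean_and_var([1.0, 2.0, float("nan"), 3.0, float("inf")])
print(mean, var)
```

Without the filter, a single NaN or infinity in the input would
propagate into both statistics.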

Breaking Changes

  • Existing tft.mean and tft.var caches are automatically invalidated.

Deprecations

  • N/A
transform - TensorFlow Transform

Published by dhruvesh09 about 3 years ago

Major Features and Improvements

  • Added RaggedTensor support to output schema inference and transformed
    tensors conversion to instance dicts and pa.RecordBatch with TF 2.x.

Bug Fixes and Other Changes

  • Depends on apache-beam[gcp]>=2.31,<3.
  • Depends on tensorflow-metadata>=1.2.0,<1.3.0.
  • Depends on tfx-bsl>=1.2.0,<1.3.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.1.1

Published by dhruvesh09 over 3 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • Depends on google-cloud-bigquery>=1.28.0,<2.21.
  • Depends on tfx-bsl>=1.1.1,<1.2.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.1.0

Published by dhruvesh09 over 3 years ago

Major Features and Improvements

  • Improved resource usage for tft.vocabulary when top_k is set by removing
    stages performing repetitive sorting.
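
The release note does not detail how the sorting stages were removed.
As a general illustration of why top-k selection does not need a full
sort, the sketch below keeps only the k largest entries with a heap. The
helper top_k_tokens is hypothetical and is not a description of the
library's Beam pipeline.

```python
import heapq
from collections import Counter


def top_k_tokens(tokens, k):
    """Select the k most frequent tokens without sorting all of them."""
    counts = Counter(tokens)
    # nlargest maintains a heap of size k instead of sorting every entry.
    return heapq.nlargest(k, counts, key=lambda t: (counts[t], t))


top = top_k_tokens(["a", "b", "b", "c", "c", "c"], k=2)
print(top)
```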

Bug Fixes and Other Changes

  • Support invoking Keras models inside the preprocessing_fn using
    tft.make_and_track_object when force_tf_compat_v1=False with TF2
    behaviors enabled.
  • Fix an issue when computing the metadata for a function with automatic
    control dependencies added, where dependencies on inputs which should not
    be evaluated were being retained.
  • Census TFT example: wrapped table initialization with a tf.init_scope() in
    order to avoid reinitializing the table for each batch of data.
  • Stopped depending on six.
  • Depends on protobuf>=3.13,<4.
  • Depends on tensorflow-metadata>=1.1.0,<1.2.0.
  • Depends on tfx-bsl>=1.1.0,<1.2.0.

Breaking Changes

  • N/A

Deprecations

  • N/A
transform - TensorFlow Transform 1.0.0

Published by dhruvesh09 over 3 years ago

Major Features and Improvements

  • N/A

Bug Fixes and Other Changes

  • Depends on apache-beam[gcp]>=2.29,<3.
  • Depends on
    tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<2.6.
  • Depends on tensorflow-metadata>=1.0.0,<1.1.0.
  • Depends on tfx-bsl>=1.0.0,<1.1.0.

Breaking Changes

  • tft.ptransform_analyzer has been moved under tft.experimental. The order
    of args in the API has also been changed.
  • tft_beam.PTransformAnalyzer has been moved under tft_beam.experimental.
  • The default value of the drop_unused_features parameter to
    TFTransformOutput.transform_raw_features is now True.

Deprecations

  • N/A
Package Rankings
Top 0.84% on Pypi.org
Top 6.73% on Proxy.golang.org