Library for exploring and validating machine learning data
APACHE-2.0 License
Published by rtg0795 6 months ago
tensorflow>=2.15,<2.16.
Published by vkarampudi 6 months ago
macos_arm64 config setting to the TFDV build file. NOTE: At this
tensorflow~=2.15.0.
apache-beam[gcp]>=2.53.0,<3 for Python 3.11 and on apache-beam[gcp]>=2.47.0,<3 for 3.9 and 3.10.
protobuf>=4.25.2,<5 for Python 3.11 and on protobuf>3.20.3,<5
Published by rtg0795 about 1 year ago
pyarrow>=10,<11.
apache-beam>=2.47,<3.
numpy>=1.22.0.
tensorflow>=2.13.0,<3.
Published by rtg0795 over 1 year ago
HistogramSelection to allow numeric drift/skew
statistics_io_impl and default_record_sink (not part of public API).
numpy~=1.22.0.
pyfarmhash>=0.2.2,<0.4.
tensorflow>=2.12.0,<2.13.
protobuf>=3.20.3,<5.
tfx-bsl>=1.13.0,<1.14.0.
tensorflow-metadata>=1.13.1,<1.14.0.
Published by venkat2469 almost 2 years ago
tensorflow>=2.11,<3
tfx-bsl>=1.12.0,<1.13.0.
tensorflow-metadata>=1.12.0,<1.13.0.
Published by venkat2469 almost 2 years ago
This is the last version that supports TensorFlow 1.15.x. TF 1.15.x support
will be removed in the next version. Please check the
TF2 migration guide to migrate
to TF2.
Add a custom_validate_statistics function to the validation API, and
support passing custom validations to validate_statistics. Note that
custom validation is not supported on Windows.
Fix bug in implementation of semantic_domain_stats_sample_rate.
Add Beam metrics on string length.
Determine whether to calculate string statistics based on the
is_categorical field in the schema string domain.
Histogram counts should now be more accurate for distributions with few
distinct values, or with frequent individual values.
Nested list length histogram counts are no longer based on the number of
values one up in the nested list hierarchy.
Support using Jensen-Shannon divergence to detect drift and skew for string
and categorical features.
get_drift_skew_dataframe now includes a threshold column.
Adds support for NormalizedAbsoluteDifference comparator.
Depends on tensorflow>=1.15.5,<2 or tensorflow>=2.10,<3.
Depends on joblib>=1.2.0.
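The Jensen-Shannon divergence mentioned in the notes above for string and categorical features can be illustrated with a small pure-Python sketch. This is not TFDV's implementation: the function name, the count-map inputs, and the DRIFT_THRESHOLD value are all assumptions made for illustration, and the divergence is computed in base 2 so that it falls between 0 (identical distributions) and 1 (no shared values).

```python
import math
from collections import Counter


def jensen_shannon_divergence(p_counts, q_counts):
    """Base-2 Jensen-Shannon divergence between two categorical count maps.

    Hypothetical helper for illustration; not part of the TFDV API.
    """
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    if p_total == 0 or q_total == 0:
        raise ValueError("both distributions need at least one observation")

    divergence = 0.0
    # Walk the union of values so categories missing on one side still count.
    for value in set(p_counts) | set(q_counts):
        p = p_counts.get(value, 0) / p_total
        q = q_counts.get(value, 0) / q_total
        m = 0.5 * (p + q)  # mixture distribution
        if p > 0:
            divergence += 0.5 * p * math.log2(p / m)
        if q > 0:
            divergence += 0.5 * q * math.log2(q / m)
    return divergence


# Toy value counts for one categorical feature in two datasets.
training = Counter(["red", "red", "blue", "green"])
serving = Counter(["red", "blue", "blue", "blue"])

score = jensen_shannon_divergence(training, serving)

# Hypothetical threshold; in practice it would be chosen per feature.
DRIFT_THRESHOLD = 0.1
drifted = score > DRIFT_THRESHOLD
print(f"JS divergence: {score:.4f}, drifted: {drifted}")
```

Because the measure is symmetric and bounded, a single threshold can be applied regardless of which dataset is treated as the baseline.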
Published by venkat2469 about 2 years ago
apache-beam[gcp]>=2.40,<3.
pyarrow>=6,<7.
tfx-bsl>=1.10.1,<1.11.0.
tensorflow-metadata>=1.10.0,<1.11.0.
Published by venkat2469 over 2 years ago
tensorflow>=1.15.5,<2 or tensorflow>=2.9,<3
tfx-bsl>=1.9.0,<1.10.0.
tensorflow-metadata>=1.9.0,<1.10.0.
Published by rtg0795 over 2 years ago
get_statistics_html to the public API.
StatsOptions.to_json now raises an error if it encounters unsupported
apache-beam[gcp]>=2.38,<3.
tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<3.
tensorflow-metadata>=1.8.0,<1.9.0.
tfx-bsl>=1.8.0,<1.9.0.
Published by rtg0795 over 2 years ago
DetectFeatureSkew PTransform to the public API, which can be used
pyfarmhash>=0.2,<0.4.
tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<3.
tensorflow-metadata>=1.7.0,<1.8.0.
tfx-bsl>=1.7.0,<1.8.0.
apache-beam[gcp]>=2.36,<3.
Published by rtg0795 over 2 years ago
numpy>=1.16,<2.
absl-py>=0.9,<2.0.0.
tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<3.
tensorflow-metadata>=1.6.0,<1.7.0.
tfx-bsl>=1.6.0,<1.7.0.
apache-beam[gcp]>=2.35,<3.
Published by rtg0795 almost 3 years ago
apache-beam[gcp]>=2.34,<3.
tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<3.
tensorflow-metadata>=1.5.0,<1.6.0.
tfx-bsl>=1.5.0,<1.6.0.
Published by jay90099 almost 3 years ago
pyarrow>=3.
load_anomalies_binary utility function.
pyarrow>=1,<6.
tensorflow-metadata>=1.4,<1.5.
tfx-bsl>=1.4,<1.5.
Published by dhruvesh09 about 3 years ago
apache-beam[gcp]>=2.32,<3.
tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,<3.
tfx-bsl>=1.3,<1.4.
Published by jay90099 about 3 years ago
apache-beam[gcp]>=2.31,<3.
tensorflow-metadata>=1.2,<1.3.
tfx-bsl>=1.2,<1.3.
Published by jay90099 about 3 years ago
google-cloud-bigquery>=1.28.0,<2.21.
tfx-bsl>=1.1.1,<1.2.
Published by jay90099 over 3 years ago
protobuf>=3.13,<4.
tensorflow-metadata>=1.1,<1.2.
tfx-bsl>=1.1,<1.2.
Published by jay90099 over 3 years ago
apache-beam[gcp]>=2.29,<3.
tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3.
tensorflow-metadata>=1.0,<1.1.
tfx-bsl>=1.0,<1.1.
tfdv.validate_instance
tfdv.lift_stats_generator
tfdv.partitioned_stats_generator
tfdv.get_feature_value_slicer
compression_type in tfdv.generate_statistics_from_tfrecord
Published by dhruvesh09 over 3 years ago
apache-beam[gcp]>=2.25,!=2.26.*,<2.29.
Published by jay90099 over 3 years ago
This is the last version before TFDV 1.0. From 1.0 onward, all TFDV
public APIs (i.e. symbols in the root __init__.py) will be subject to
semantic versioning. We are deprecating some public APIs in this version
and they will be removed in 1.0.
The sketch-based top-k/unique stats generator can now detect invalid
UTF-8 sequences and large texts and replace them with a placeholder.
It no longer suffers from the memory issues usually caused by image or
large text features in the data. Note that this generator is not yet used
by default.
Added StatsOptions.experimental_use_sketch_based_topk_uniques, which
enables the sketch-based top-k/unique stats generator.
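The placeholder behaviour described above can be pictured with a minimal pure-Python sketch. It is an illustration under stated assumptions, not TFDV code: PLACEHOLDER, MAX_BYTES, sanitize_value and top_k are hypothetical names and values, and an exact Counter stands in for the sketch data structure the real generator uses.

```python
from collections import Counter

# Hypothetical values chosen for this example only.
PLACEHOLDER = "__PLACEHOLDER__"
MAX_BYTES = 1024


def sanitize_value(raw: bytes, max_bytes: int = MAX_BYTES) -> str:
    """Return the decoded text, or a placeholder for oversized/invalid input."""
    if len(raw) > max_bytes:
        # Large blobs (e.g. encoded images) are never kept in memory as keys.
        return PLACEHOLDER
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid UTF-8 sequences collapse into the same placeholder bucket.
        return PLACEHOLDER


def top_k(values, k):
    """Exact top-k over sanitized values (stand-in for a sketch-based count)."""
    counts = Counter(sanitize_value(v) for v in values)
    return counts.most_common(k)


values = [b"cat", b"cat", b"dog", b"\xff\xfe", b"x" * 5000]
print(top_k(values, 3))
```

Collapsing every problematic value into one bucket keeps the number of distinct keys small, which is the property that bounds memory use in this illustration.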
display_schema that caused domains not to be displayed.
get_schema_dataframe outputs numeric domains.
tensorflow-metadata>=0.30,<0.31.
tfx-bsl>=0.30,<0.31.
tfdv.LiftStatsGenerator is going to be removed in the next version from
StatsOptions.label_feature
tfdv.NonStreamingCustomStatsGenerator is going to be removed in the next
tfdv.validate_instance is going to be removed in the next
tfdv.DecodeCSV, tfdv.DecodeTFExample (deprecated in 0.27).
feature_whitelist in tfdv.StatsOptions (deprecated in 0.28).
feature_allowlist instead.
tfdv.get_feature_value_slicer is deprecated.
tfdv.experimental_get_feature_value_slicer is introduced as a replacement.
StatsOptions.slicing_functions is deprecated.
StatsOptions.experimental_slicing_functions is introduced as a
tfdv.WriteStatisticsToText is removed (deprecated in 0.25.0).
compression_type in tfdv.generate_statistics_from_tfrecord