Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
APACHE-2.0 License
Published by MthwRobinson almost 2 years ago
translate_text brick for translating text between languages
apply method to make it easier to apply cleaners to elements
Published by MthwRobinson almost 2 years ago
partition
Published by MthwRobinson almost 2 years ago
Text elements to argilla dataset classes.
replace_unicode_quotes brick
partition_html for partitioning HTML documents.
Published by yuming-long almost 2 years ago
Final to support Google Colab
Published by MthwRobinson almost 2 years ago
Published by MthwRobinson almost 2 years ago
Published by MthwRobinson about 2 years ago
PDFDocument to use the from_file method
transformers.