BabelEnte: Entity Extractor and Translator using BabelFy and Babelnet.org
FoLiA Document Server - HTTP webservice backend for serving and annotating FoLiA documents using ...
Multilingual text (NLP) processing toolkit
Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informat...
A linter for prose.
Public release of the TransCoder research project https://arxiv.org/pdf/2006.03511.pdf
This system utilizes a large language model (LLM) and reflection
Simple sentence mining tool for language learning
Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code
This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost a...
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
Access a database of word frequencies, in various natural languages.
leeky - training data contamination techniques for blackbox models