Project that aims to sentenize all the open data of Riksdagen and other sources to create an easily linkable dataset of sentences that can be referred to from Wikidata lexemes and other resources
GPL-3.0 License
Get a pragmatic assessment of how understandable a German text is.
spaCy + UDPipe
skweak: A software toolkit for weak supervision applied to NLP tasks
NLP, before and after spaCy
Basically SentEval with German language downstream tasks
Simple sentence mining tool for language learning
Datasets for intent classification and entity extraction including converters.
Inference and training library for high-quality TTS models.
✔️ Contextual word checker for better suggestions
The purpose of this script is to get from Wikidata all the senses for all the words in an SRT file
Data repository for pretrained NLP models and NLP corpora.
A Reddit bot that summarizes news articles written in Spanish or English. It uses a custom built ...
Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code
Improving Language Model Performance through Smart Vocabularies