Simple Solution for Multi-Criteria Chinese Word Segmentation
GPL-3.0 License
One minute of voice data is enough to train a good TTS model (few-shot voice cloning)
Chain-of-Hindsight, A Scalable RLHF Method
100+ Chinese Word Vectors (over a hundred kinds of pretrained Chinese word vectors)
GLM (General Language Model)
Python scripts preprocessing Penn Treebank and Chinese Treebank
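Text mining and preprocessing tools (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods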
Data repository for pretrained NLP models and NLP corpora.
2019 SOTA spell checker for Simplified and Traditional Chinese: FASPell Chinese Spell Checker (Chinese spell check / Chinese spelling error detection / Chinese spelling correction)
Datasets for intent classification and entity extraction including converters.
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Biomedical Entity Linking Benchmark
Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023
Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's...