Playing around with Sino-Korean words
General Assembly's 2015 Data Science course in Washington, DC
Uses frequency analysis to summarize text.
2019 SOTA spell checker for Simplified and Traditional Chinese: FASPell Chinese Spell Checker (Chinese spell check / Chinese spelling error detection / Chinese spelling correction)
A Hanzi learning suite, with levels based on the Hanzi Level Project, a.k.a. another attempt to clone W...
100+ Chinese Word Vectors (over a hundred sets of pre-trained Chinese word vectors)
Chinese version of GPT2 training code, using BERT tokenizer.
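Natural language processing (NLP): Xiaojiang robot (a retrieval-based chit-chat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNET sentence embeddings and similarity (text xlnet embeddin...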
Datasets for intent classification and entity extraction including converters.
AiLearning: data analysis + hands-on machine learning + linear algebra + PyTorch + NLTK + TF2
(Japanese language) Tries to determine the readings of individual characters in a word, given its...
The data used by Hanzi Writer for Japanese
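Text mining and preprocessing toolkit (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods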
Generate compositions, supercompositions and variants for a given Hanzi / Kanji
Chinese word segmentation
Improving Language Model Performance through Smart Vocabularies