Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it through tfds-korean.
A Korean sentence spacing model (removes/adds spaces). Written so that you can train it yourself once the data is prepared.
Tools for working with the Yle corpus
Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code
Namuwiki dataset segmented into sentences. Download it from Releases or through tfds-korean.
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation i...
Speech-to-Text-WaveNet : End-to-end sentence level English speech recognition based on DeepMind's...
pytextclassifier is a toolkit for text classification. Text classification, LR, Xgboost, TextCNN, FastText, TextRNN, B...
AiLearning: data analysis + machine learning in action + linear algebra + PyTorch + NLTK + TF2
Transformer-based Text Auto-encoder (T-TA) using TensorFlow 2.
Automatic extraction of edited sentences from text edition histories.
Chinese version of GPT2 training code, using BERT tokenizer.
2019 SOTA spell checker for Simplified and Traditional Chinese: FASPell Chinese Spell Checker (Chinese spell check / Chinese spelling error detection / Chinese spelling correction)
Free English to Chinese Dictionary Database
Chinese text keyword extraction implemented in Python, using three methods: TF-IDF, TextRank, and Word2Vec word clustering.