NLP pre-/post-processing tools.
Apache-2.0 License
UNIX command-line tool for line-based stream processing with Python
pytextclassifier is a toolkit for text classification. Text classification, LR, XGBoost, TextCNN, FastText, TextRNN, B...
My own personal tech cheatsheet. This covers the stuff I use quite regularly.
A friendly yet powerful LR parser written in Python
Plumbum: Shell Combinators
Python implementations of selected Princeton Java Algorithms and Clients by Robert Sedgewick and ...
Aims to make JapaneseTokenizer as easy to use as possible
Yet another text augmentation package for Python
A robust caching library for Python that supports multiple storage engines, serializers, and enco...
Making sbatch more user-friendly (for Python users of Jean-Zay).
Let your pipe lines flow through Python code in xonsh.
A collection of font engineering utilities
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
A portal page for learning experiences in Computer Programming, and more.
Python module to make testing easier; it can generate random data like names and text, run comman...