ttok

Count and truncate text based on tokens

Apache-2.0 License

Downloads: 1.3K
Stars: 263
Committers: 1

ttok - 0.3 (Latest Release)

Published by simonw 6 months ago

  • New --allow-special option for allowing special tokens: ttok '<|endoftext|>' --encode --allow-special #13
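
The opt-in behaviour behind a flag like this can be sketched in Python. This is a toy under stated assumptions, not ttok's implementation: `encode`, `SPECIAL_TOKENS`, and the `allow_special` parameter are hypothetical names, the one-ID-per-character encoding stands in for a real BPE tokenizer, and the ID 100257 is only illustrative. It mirrors the common tokenizer convention (tiktoken, for example, raises an error by default) of rejecting text that matches a reserved special token unless the caller opts in.

```python
import re

# Hypothetical table of reserved tokens; the ID is illustrative only.
SPECIAL_TOKENS = {"<|endoftext|>": 100257}

def encode(text, allow_special=False):
    """Toy encoder: one ID per character, plus opt-in special tokens."""
    # Split the text so each special token becomes its own piece.
    pattern = "(" + "|".join(re.escape(s) for s in SPECIAL_TOKENS) + ")"
    ids = []
    for piece in re.split(pattern, text):
        if piece in SPECIAL_TOKENS:
            if not allow_special:
                raise ValueError(f"special token {piece!r} is not allowed")
            ids.append(SPECIAL_TOKENS[piece])
        else:
            # Ordinary text: use the code point as a stand-in token ID.
            ids.extend(ord(ch) for ch in piece)
    return ids

if __name__ == "__main__":
    print(encode("hi<|endoftext|>", allow_special=True))
```

Making the caller opt in keeps untrusted input from smuggling control tokens into a prompt by accident, which is presumably why such text is refused by default.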

ttok - 0.2

Published by simonw over 1 year ago

  • --encode now encodes text to integer tokens and displays them. This has been renamed from --tokens.
  • --decode against a sequence of integer tokens now turns those back into text. #7
  • --tokens for either of those options outputs the raw tokens in a format that helps show if they have a leading space or similar. #4
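
The three behaviours in this release (text to integer tokens, integer tokens back to text, and a raw view of each token) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `ToyTokenizer` and its methods are hypothetical, and a regex splitter with an on-the-fly vocabulary stands in for the real BPE tokenizer, so the IDs will not match anything ttok prints.

```python
import re

class ToyTokenizer:
    """Hypothetical stand-in tokenizer that builds its vocabulary as it goes."""

    # Words and punctuation keep an optional leading space, as BPE tokens often do.
    PATTERN = re.compile(r" ?\w+| ?[^\w\s]+|\s+")

    def __init__(self):
        self.pieces = []  # token ID -> text piece
        self.ids = {}     # text piece -> token ID

    def encode(self, text):
        """Turn text into a list of integer token IDs."""
        out = []
        for piece in self.PATTERN.findall(text):
            if piece not in self.ids:
                self.ids[piece] = len(self.pieces)
                self.pieces.append(piece)
            out.append(self.ids[piece])
        return out

    def decode(self, token_ids):
        """Turn integer token IDs back into text."""
        return "".join(self.pieces[i] for i in token_ids)

    def raw_tokens(self, token_ids):
        """Show each token as bytes, which makes leading spaces visible."""
        return [self.pieces[i].encode("utf-8") for i in token_ids]

if __name__ == "__main__":
    tok = ToyTokenizer()
    ids = tok.encode("hello world")
    print(ids, tok.decode(ids), tok.raw_tokens(ids))
```

Showing tokens as byte strings is one way to make a leading space visible: `b" world"` and `b"world"` are different tokens even though they look alike once decoded.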

ttok - 0.1

Published by simonw over 1 year ago

  • Initial release. Use cat input.txt | ttok to count tokens, -t 100 to truncate to 100 tokens, or --tokens to see what those integer tokens would be. #1
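
Counting and truncating by tokens, as described above, amounts to tokenizing the text and then either measuring or slicing the token list. The Python sketch below shows that idea under loud assumptions: `tokenize`, `count_tokens`, and `truncate` are hypothetical helpers, and a regex splitter replaces the real BPE tokenizer, so the counts will differ from what ttok reports.

```python
import re

def tokenize(text):
    """Toy tokenizer: words and punctuation, each with an optional leading space."""
    return re.findall(r" ?\w+| ?[^\w\s]+|\s+", text)

def count_tokens(text):
    """Number of toy tokens in the text."""
    return len(tokenize(text))

def truncate(text, max_tokens):
    """Keep only the first max_tokens toy tokens and join them back into text."""
    return "".join(tokenize(text)[:max_tokens])

if __name__ == "__main__":
    sample = "The quick brown fox"
    print(count_tokens(sample))
    print(truncate(sample, 2))
```

Truncating on token boundaries rather than character counts means the result fits a token budget exactly, which is the usual reason to trim text before sending it to a model with a context limit.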