llama-tokenizer-js

JS tokenizer for LLaMA and LLaMA 2

MIT License

Stars: 294
Committers: 2


llama-tokenizer-js - v1.2.1 Latest Release

Published by belladoreai 7 months ago

TypeScript fix

llama-tokenizer-js - v1.2.0

Published by belladoreai 7 months ago

  • Add TypeScript types definition file
  • Refactor tokenizer into a Class
  • Allow passing custom vocab and merge data to tokenizer
  • Allow passing custom tests to tokenizer test runner
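
The class refactor and custom-data support listed above can be pictured with a small sketch. Everything here is hypothetical: `ToyTokenizer`, `DEFAULT_VOCAB`, and `DEFAULT_MERGES` are illustration-only names, and the encoding loop is a simplified stand-in, not the library's actual API or algorithm. The point is only the pattern of a class whose constructor falls back to bundled data unless the caller supplies its own.

```javascript
// Hypothetical bundled defaults (stand-ins for real vocab and merge data).
const DEFAULT_VOCAB = ["<unk>", "a", "b", "ab"];
const DEFAULT_MERGES = ["a b"];

class ToyTokenizer {
  // Both arguments are optional; omitted ones fall back to the defaults.
  constructor(vocab = DEFAULT_VOCAB, merges = DEFAULT_MERGES) {
    this.vocab = vocab;
    this.ids = new Map(vocab.map((token, id) => [token, id]));
    this.merges = merges;
  }

  encode(text) {
    let symbols = Array.from(text);
    // Simplification: apply each merge rule in listed order, left to right.
    for (const rule of this.merges) {
      const [left, right] = rule.split(" ");
      const next = [];
      for (let i = 0; i < symbols.length; i++) {
        const canMerge =
          i < symbols.length - 1 && symbols[i] === left && symbols[i + 1] === right;
        if (canMerge) {
          next.push(left + right);
          i++; // skip the symbol consumed by the merge
        } else {
          next.push(symbols[i]);
        }
      }
      symbols = next;
    }
    // Symbols missing from the vocab map to id 0 ("<unk>").
    return symbols.map((s) => this.ids.get(s) ?? 0);
  }

  decode(ids) {
    return ids.map((id) => this.vocab[id]).join("");
  }
}

const defaultTokenizer = new ToyTokenizer();
console.log(defaultTokenizer.encode("abb")); // [ 3, 2 ]

const customTokenizer = new ToyTokenizer(["<unk>", "x", "y", "xy"], ["x y"]);
console.log(customTokenizer.encode("xyx")); // [ 3, 1 ]
```

Keeping the data in constructor parameters means several tokenizers with different vocabularies can coexist in one process, which a single module-level tokenizer would not allow.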

llama-tokenizer-js - v1.1.3

Published by belladoreai about 1 year ago

  • Fix a bug in an unused function (tokenizer results were not affected)
  • Support very large inputs. Previous versions were not guaranteed to produce correct results for inputs larger than 100,000 characters, although in practice they almost always did.

llama-tokenizer-js - v1.1.2

Published by belladoreai about 1 year ago

Bugfix to support Next.js and other environments where performance.now() is not available.
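
A common way to handle a missing performance.now() is to check for it and fall back to Date.now(). The sketch below shows that pattern; `nowMs` and `timeIt` are hypothetical helper names, and this is not necessarily how the library implemented its fix.

```javascript
// Hypothetical helper: return a millisecond timestamp from the best available clock.
function nowMs() {
  // `typeof` is safe even when `performance` is not defined in this runtime.
  if (typeof performance !== "undefined" && typeof performance.now === "function") {
    return performance.now();
  }
  // Lower resolution, but available in every JavaScript runtime.
  return Date.now();
}

// Hypothetical helper: run a function and report how long it took.
function timeIt(fn) {
  const start = nowMs();
  const result = fn();
  return { result, elapsedMs: nowMs() - start };
}

const timed = timeIt(() => "hello".split(""));
console.log(timed.result.length); // 5
```

Date.now() is not monotonic, so the fallback suits rough timing such as test output rather than precise benchmarks.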

llama-tokenizer-js - v1.1.1

Published by belladoreai over 1 year ago

Bugfix affecting results in extremely rare cases: equal-priority merges are now always performed left-to-right.
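
To see why tie-breaking order matters, consider a toy merge loop. The sketch below is a naive scan written for illustration, with a hypothetical `bpeMerge` function and made-up merge data; it is not the library's implementation. When the same merge rule matches at several positions, the strict `<` comparison keeps the leftmost match.

```javascript
// Toy merge table (made-up data): a lower rank means a higher priority.
const mergeRanks = new Map([
  ["a a", 0],
  ["aa b", 1],
]);

// Hypothetical helper: repeatedly apply the best-ranked merge until none applies.
function bpeMerge(symbols, ranks) {
  const out = symbols.slice();
  while (true) {
    let bestRank = Infinity;
    let bestIndex = -1;
    for (let i = 0; i < out.length - 1; i++) {
      const rank = ranks.get(out[i] + " " + out[i + 1]);
      // Strict "<" means a later pair with an equal rank never replaces an earlier one.
      if (rank !== undefined && rank < bestRank) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex === -1) return out; // no applicable merge left
    out.splice(bestIndex, 2, out[bestIndex] + out[bestIndex + 1]);
  }
}

console.log(bpeMerge(["a", "a", "a"], mergeRanks)); // [ 'aa', 'a' ]
console.log(bpeMerge(["a", "a", "a", "b"], mergeRanks)); // [ 'aa', 'a', 'b' ]
```

In the second call, merging right-to-left would instead produce `a`, `aa`, `b` and then `a`, `aab`, so the two orders yield different tokens for the same input.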

llama-tokenizer-js - v1.1.0

Published by belladoreai over 1 year ago

Add support for different runtimes

llama-tokenizer-js - v1.0.1

Published by belladoreai over 1 year ago

llama-tokenizer-js - v1.0.0

Published by belladoreai over 1 year ago