yet-another-retnet

A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (https://arxiv.org/pdf/2307.08621.pdf)

MIT License

yet-another-retnet - 0.5.1 Latest Release

Published by fkodom 11 months ago

What's Changed

Full Changelog: https://github.com/fkodom/yet-another-retnet/compare/0.5.0...0.5.1

yet-another-retnet - 0.5.0

Published by fkodom 11 months ago

Significant efficiency improvements to the chunkwise formulation, thanks to @leor-c 🎉
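
For readers unfamiliar with the chunkwise formulation, the following is a minimal, dependency-free sketch of the idea as described in the RetNet paper: positions inside a chunk are handled in parallel form, and earlier chunks are summarized by a recurrent state. It is not the library's implementation and does not reproduce the 0.5.0 optimizations. The names `retention_direct` and `retention_chunkwise_sketch` are hypothetical, and a single head with a scalar decay `gamma` is assumed.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retention_direct(q, k, v, gamma):
    """Reference: out[n] = sum over m <= n of gamma**(n - m) * (q[n] . k[m]) * v[m]."""
    out = []
    for n in range(len(q)):
        row = [0.0] * len(v[0])
        for m in range(n + 1):
            w = gamma ** (n - m) * dot(q[n], k[m])
            row = [r + w * x for r, x in zip(row, v[m])]
        out.append(row)
    return out


def retention_chunkwise_sketch(q, k, v, gamma, chunk_size):
    """Hypothetical sketch: parallel inside each chunk, recurrent state across chunks."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # decayed sum of k^T v over earlier chunks
    out = []
    for start in range(0, len(q), chunk_size):
        qc, kc, vc = (x[start:start + chunk_size] for x in (q, k, v))
        size = len(qc)  # the last chunk may be shorter than chunk_size
        for j in range(size):
            # Inner-chunk part: causal, decayed attention within the chunk.
            row = [0.0] * d_v
            for m in range(j + 1):
                w = gamma ** (j - m) * dot(qc[j], kc[m])
                row = [r + w * x for r, x in zip(row, vc[m])]
            # Cross-chunk part: read the state, decayed by the distance to the chunk start.
            decay = gamma ** (j + 1)
            cross = [decay * sum(qc[j][i] * state[i][c] for i in range(d_k)) for c in range(d_v)]
            out.append([r + x for r, x in zip(row, cross)])
        # Fold this chunk into the state so later chunks can read it.
        state = [
            [gamma ** size * state[i][c]
             + sum(gamma ** (size - 1 - j) * kc[j][i] * vc[j][c] for j in range(size))
             for c in range(d_v)]
            for i in range(d_k)
        ]
    return out


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
q, k, v = rand_matrix(7, 3), rand_matrix(7, 3), rand_matrix(7, 2)
direct = retention_direct(q, k, v, gamma=0.9)
chunked = retention_chunkwise_sketch(q, k, v, gamma=0.9, chunk_size=3)
max_diff = max(abs(a - b) for ra, rb in zip(direct, chunked) for a, b in zip(ra, rb))
```

With these toy inputs, `max_diff` should be at floating-point noise level, which is the property any chunkwise implementation has to preserve regardless of how it is optimized.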

What's Changed

Full Changelog: https://github.com/fkodom/yet-another-retnet/compare/0.4.2...0.5.0

yet-another-retnet - 0.4.2

Published by fkodom 11 months ago

What's Changed

Full Changelog: https://github.com/fkodom/yet-another-retnet/compare/0.4.1...0.4.2

yet-another-retnet - 0.4.1

Published by fkodom 11 months ago

yet-another-retnet - 0.4.0

Published by fkodom about 1 year ago

What's Changed

Full Changelog: https://github.com/fkodom/yet-another-retnet/compare/0.3.1...0.4.0

yet-another-retnet - 0.3.1

Published by fkodom about 1 year ago

What's Changed

Full Changelog: https://github.com/fkodom/yet-another-retnet/compare/0.3.0...0.3.1

yet-another-retnet - 0.3.0

Published by fkodom about 1 year ago

More streamlined support for training

  • example training script
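  • RetNet.forward is no longer just a wrapper for RetNet.forward_parallel. It now accepts inputs and labels Tensors and returns a loss value:
    class RetNet:
        ...
        def forward(self, inputs: Tensor, labels: Tensor) -> Tensor:
            # Parallel forward pass; the "b n c" pattern below implies (batch, seq_len, classes) logits.
            pred = self.forward_parallel(inputs)
            criterion = nn.CrossEntropyLoss()
            # Merge batch and sequence dims so every position counts as one classification.
            return criterion(rearrange(pred, "b n c -> (b n) c"), labels.flatten())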
    
  • include example TorchData datapipe -- top 100 Project Gutenberg books
  • example streaming text generation with trained RetNet
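
To make the loss in the RetNet.forward snippet above concrete, here is a dependency-free sketch of the same computation on nested lists: merge the batch and sequence dimensions, then average the cross-entropy over all positions (mean reduction is the nn.CrossEntropyLoss default). The function name `cross_entropy_flat` is hypothetical and the toy values are made up for illustration; the library itself uses PyTorch and einops as shown above.

```python
import math


def cross_entropy_flat(logits, labels):
    """Mean cross-entropy over all (batch, position) pairs.

    logits: nested list with shape (b, n, c); labels: nested list with shape (b, n).
    """
    flat_logits = [row for seq in logits for row in seq]  # like "b n c -> (b n) c"
    flat_labels = [y for seq in labels for y in seq]      # like labels.flatten()
    total = 0.0
    for row, y in zip(flat_logits, flat_labels):
        peak = max(row)  # subtract the max before exponentiating, for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[y]  # negative log-probability of the target class
    return total / len(flat_labels)


# One sequence of two positions, two classes, uniform logits.
logits = [[[0.0, 0.0], [0.0, 0.0]]]
labels = [[0, 1]]
loss = cross_entropy_flat(logits, labels)
```

Because the logits are uniform over two classes, the loss here equals log(2) whatever the labels are, which makes it a convenient sanity check.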

yet-another-retnet - 0.2.0

Published by fkodom about 1 year ago

yet-another-retnet - 0.1.3

Published by fkodom about 1 year ago

Set default layer_norm_eps=1e-6, as updated in the official implementation:
https://github.com/microsoft/torchscale/commit/2c29de0fb3e5e559181f0fb4854330c5b35961cd

yet-another-retnet - 0.1.2

Published by fkodom about 1 year ago

Remove extra complex conjugation from the relative position embedding.
Reference: https://github.com/microsoft/torchscale/issues/49
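
As background on why conjugation matters here, the sketch below uses plain complex numbers to show the property a relative position embedding relies on: when the query is rotated by its position and the key by the conjugate rotation, the score depends only on the offset n - m. It does not reconstruct the code that was fixed or the linked issue. The helper names and the "extra conjugation" variant are hypothetical illustrations.

```python
import cmath


def rotate(x, position, theta):
    """Multiply a complex feature by e^{i * position * theta}."""
    return x * cmath.exp(1j * position * theta)


def score(q, k, n, m, theta):
    """Query at position n times the conjugate of the key at position m."""
    return rotate(q, n, theta) * rotate(k, m, theta).conjugate()


def score_extra_conj(q, k, n, m, theta):
    """Hypothetical variant: the key's rotation is conjugated once too often."""
    return rotate(q, n, theta) * k.conjugate() * cmath.exp(1j * m * theta)


q, k, theta = 0.5 + 1.0j, -0.3 + 0.8j, 0.1

# Same offset (n - m = 3) at two different absolute positions.
good_a = score(q, k, 7, 4, theta)
good_b = score(q, k, 107, 104, theta)
bad_a = score_extra_conj(q, k, 7, 4, theta)
bad_b = score_extra_conj(q, k, 107, 104, theta)

relative_gap = abs(good_a - good_b)  # at floating-point noise level: depends only on n - m
absolute_gap = abs(bad_a - bad_b)    # clearly nonzero: phase depends on n + m instead
```

In the correct form the phases combine to e^{i(n - m)θ}; with one conjugation too many they combine to e^{i(n + m)θ}, so the score changes when both positions shift together.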

yet-another-retnet - 0.1.1

Published by fkodom about 1 year ago

Bug fix for automatic PyPI version resolver 😞

yet-another-retnet - 0.1.0

Published by fkodom about 1 year ago

First stable public release

  • End-to-end RetNet module
  • Layer modules (e.g. RetNetDecoderLayer, MultiScaleRetention)
  • Low-level retention ops (e.g. retention_parallel, retention_recurrent)
  • Inference benchmarks reproduced
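
The low-level ops named above correspond to two equivalent ways of computing retention in the RetNet paper. The following dependency-free sketch checks that equivalence on toy data. The function names `retention_parallel_ref` and `retention_recurrent_ref` are hypothetical and their signatures are not the library's; a single head with a scalar decay `gamma` is assumed, and details such as multiple heads, scaling, and normalization are left out.

```python
import random


def retention_parallel_ref(q, k, v, gamma):
    """Parallel form: out[n] = sum over m <= n of gamma**(n - m) * (q[n] . k[m]) * v[m]."""
    d_v = len(v[0])
    out = []
    for n in range(len(q)):
        row = [0.0] * d_v
        for m in range(n + 1):  # causal mask: only positions m <= n contribute
            w = gamma ** (n - m) * sum(a * b for a, b in zip(q[n], k[m]))
            row = [r + w * x for r, x in zip(row, v[m])]
        out.append(row)
    return out


def retention_recurrent_ref(q, k, v, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n, then out[n] = q_n S_n."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for q_n, k_n, v_n in zip(q, k, v):
        # Decay the old state and add the outer product of the current key and value.
        state = [[gamma * state[i][j] + k_n[i] * v_n[j] for j in range(d_v)]
                 for i in range(d_k)]
        out.append([sum(q_n[i] * state[i][j] for i in range(d_k)) for j in range(d_v)])
    return out


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
q, k, v = rand_matrix(5, 3), rand_matrix(5, 3), rand_matrix(5, 2)
parallel = retention_parallel_ref(q, k, v, gamma=0.9)
recurrent = retention_recurrent_ref(q, k, v, gamma=0.9)
max_diff = max(abs(a - b) for ra, rb in zip(parallel, recurrent) for a, b in zip(ra, rb))
```

The parallel form suits training, where the whole sequence is available at once, while the recurrent form keeps a fixed-size state per step, which is what makes constant-memory inference possible.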

yet-another-retnet - 0.1.0rc1

Published by fkodom about 1 year ago

First release candidate for PyPI
