Generate text captions for images from their embeddings.
MIT License
Using LLMs and pre-trained caption models for super-human performance on image captioning.
The LSTM model generates captions for the input images after extracting features from pre-trained...
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Pretrained BLIP with a similar API to CLIP.
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tune...
Home of StarCoder: fine-tuning & inference!
Text to Image Generation (Reverse image captioning): This task is just the reverse of image capti...
Non-local Neural Networks for Video Classification
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.
Tensorflow implementation of contextualized word representations from bi-directional language models
I decide to sync up this repo and self-critical.pytorch. (The old master is in old master branch ...
Home of StarCoder2!