[CVPR 2023] Official implementation of X-Decoder: generalized decoding for pixel, image, and language
APACHE-2.0 License
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MASS: Masked Sequence to Sequence Pre-training for Language Generation
CodeBERT
An efficient implementation of the popular sequence models for text generation, summarization, an...
VideoX: a collection of video cross-modal models
Set-of-Mark Prompting for GPT-4V and LMMs
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g...
Grounded Language-Image Pre-training
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabil...
Foundation Architecture for (M)LLMs
Bringing Old Photos Back to Life (CVPR 2020 oral)
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
Large-scale pretraining for dialogue
The implementation of DeBERTa