[CVPR 2023] Official implementation of X-Decoder: generalized decoding for pixel, image, and language
APACHE-2.0 License
VideoX: a collection of video cross-modal models
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Grounded Language-Image Pre-training
The implementation of DeBERTa
CodeBERT
Foundation Architecture for (M)LLMs
Set-of-Mark Prompting for GPT-4V and LMMs
This repository contains resources for accessing the official benchmarks, code, and checkpoints ...
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g...
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
An efficient implementation of the popular sequence models for text generation, summarization, an...
MASS: Masked Sequence to Sequence Pre-training for Language Generation
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Bringing Old Photos Back to Life (CVPR 2020 oral)
Large-scale pretraining for dialogue