Chinese pretrained RoBERTa models: RoBERTa for Chinese
Code for recent knowledge distillation algorithms, with benchmark results, implemented via the TF2.0 low-level API
No-teacher BART distillation experiments for NLI tasks
Fork of huggingface/pytorch-pretrained-BERT for BERT on STILTs
Code for the paper "Fine-tune BERT for Extractive Summarization"
Temporarily remove unused tokens during training to save RAM and speed up training.
Unofficial PyTorch implementation of "Masked Autoencoders Are Scalable Vision Learners"
Keras implementation of Transformers, for humans
Open-source code for the paper "Dataset Distillation"
🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
PyTorch implementation for Channel Distillation
[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods
An official implementation of "An Efficient Combinatorial Optimization Model Using Learning-to-Rank Distillation"
Code for the 2020 Tencent College Algorithm Contest; the online result ranked 1st.
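
Several of the repositories above implement variants of knowledge distillation. For orientation, here is a minimal PyTorch sketch of the classic soft-target distillation loss (Hinton et al., 2015); the function name and default hyperparameters are illustrative assumptions, not code from any repository listed.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Classic soft-target knowledge distillation loss.

    Combines a KL-divergence term between temperature-softened teacher
    and student distributions with a standard cross-entropy term on the
    hard labels. Hyperparameter values here are common defaults, not
    taken from any of the repositories above.
    """
    # Soften both distributions; the T^2 factor rescales gradients so the
    # soft term stays comparable in magnitude to the hard-label term.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Ordinary supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```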