AutoBERT-Zero-pytorch

AutoBERT-Zero

Paper

AutoBERT-Zero: Evolving BERT Backbone from Scratch
Jiahui Gao, Hang Xu, Han Shi, Xiaozhe Ren, Philip L.H. Yu, Xiaodan Liang, Xin Jiang, Zhenguo Li

Install

pip install git+https://github.com/JunnYu/AutoBERT-Zero-pytorch
or
pip install autobert

Wandb Logs

sdconv

https://wandb.ai/junyu/autobert-small/runs/2flfy8gx

light

https://wandb.ai/junyu/autobert-small/runs/howc6tps

small

results

Usage

import torch
from transformers import BertTokenizerFast

from autobert import AutoBertModelForMaskedLM

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

for model_name in ["junnyu/autobert-small-light", "junnyu/autobert-small-sdconv"]:
    tokenizer = BertTokenizerFast.from_pretrained(model_name)
    model = AutoBertModelForMaskedLM.from_pretrained(model_name)
    model.to(device)

    text = "Beijing is the capital of [MASK]."
    inputs = tokenizer(text, return_tensors="pt").to(device)

    # Run inference without tracking gradients; keep the logits of the single sentence.
    with torch.no_grad():
        logits = model(**inputs).logits[0]

    pt_outputs_sentence = ""
    for i, token_id in enumerate(inputs["input_ids"][0].tolist()):
        if token_id == tokenizer.mask_token_id:
            # Top-5 candidates for the masked position, with their probabilities.
            probs, indices = logits[i].softmax(-1).topk(k=5)
            tokens = tokenizer.convert_ids_to_tokens(indices.tolist())
            slist = [t + "+" + str(round(p.item(), 4)) for p, t in zip(probs, tokens)]
            pt_outputs_sentence += " " + "[ " + " || ".join(slist) + " ]"
        else:
            # Special tokens such as [CLS] and [SEP] are dropped from the output.
            pt_outputs_sentence += " " + "".join(
                tokenizer.convert_ids_to_tokens([token_id], skip_special_tokens=True)
            )

    print(pt_outputs_sentence.strip())
# light:  beijing is the capital of [ china+0.1801 || india+0.0273 || america+0.0181 || japan+0.0166 || us+0.0143 ] .
# sdconv: beijing is the capital of [ china+0.1533 || india+0.054 || delhi+0.0414 || beijing+0.0389 || london+0.022 ] .
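
The softmax and top-k step in the usage example can be hard to picture without running the model. The following is a minimal sketch of that step in plain Python, with no model or tokenizer involved. The helper names `top_k_probs` and `format_predictions`, and the toy vocabulary and logits, are assumptions made up for illustration; they are not part of `autobert` or `transformers`.

```python
import math


def top_k_probs(logits, vocab, k=5):
    """Return the k most probable (token, probability) pairs for one position."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank vocabulary indices by probability, highest first, and keep k of them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(vocab[i], probs[i]) for i in ranked]


def format_predictions(pairs):
    """Render pairs in the "[ token+prob || ... ]" style used in the usage output."""
    return "[ " + " || ".join(f"{t}+{round(p, 4)}" for t, p in pairs) + " ]"


# Toy vocabulary and logits, invented for illustration only.
toy_vocab = ["china", "india", "japan", "london"]
toy_logits = [2.0, 1.0, 0.5, 0.0]

pairs = top_k_probs(toy_logits, toy_vocab, k=2)
print(format_predictions(pairs))
```

In the real example, `logits[i].softmax(-1).topk(k=5)` does the same work on the model's full vocabulary in a single tensor operation.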
