NLP course, MIPT
Anton Emelianov ([email protected], @king_menin), Albina Akhmetgareeva ([email protected])
Videos here
Exam questions here
final_mark = sum(max_score(HW_i) for i = 1..3) / count(HWs) * 10
Distributional semantics. Count-based (pre-neural) methods. Word2Vec: learn vectors directly. GloVe: count, then learn. N-grams (collocations). RusVectores. t-SNE.
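The count-based approach above can be sketched in a few lines: build each word's vector from its window co-occurrence counts, then compare words by cosine similarity. A toy illustration on a made-up corpus, not the lecture's exact setup:

```python
from collections import Counter, defaultdict
from math import sqrt

def cooccurrence_vectors(sentences, window=2):
    """Count-based word vectors: each word is represented by the
    frequencies of words seen within a symmetric context window."""
    vecs = defaultdict(Counter)
    for sent in sentences:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    vecs[w][sent[j]] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v[k] for k, c in u.items() if k in v)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat and a dog played".split(),
]
vecs = cooccurrence_vectors(corpus)
# "cat" and "dog" appear in similar contexts, so their vectors are close
print(cosine(vecs["cat"], vecs["dog"]))
```

Real count-based pipelines add PPMI weighting and dimensionality reduction (e.g. SVD) on top of the raw counts.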
Neural language models: recurrent and convolutional models. Text classification (architectures).
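The core of a recurrent model is a single state update applied at every time step. A minimal, dependency-free sketch of one Elman RNN cell (weights chosen arbitrarily for illustration):

```python
from math import tanh

def rnn_step(x, h, Wx, Wh, b):
    """One Elman RNN update: h_new = tanh(Wx @ x + Wh @ h + b),
    written with plain lists instead of a tensor library."""
    return [
        tanh(sum(Wx[i][k] * x[k] for k in range(len(x)))
             + sum(Wh[i][k] * h[k] for k in range(len(h)))
             + b[i])
        for i in range(len(b))
    ]

# toy 2-dim hidden state carried across a 3-step input sequence
Wx = [[0.5, -0.2], [0.1, 0.3]]
Wh = [[0.4, 0.0], [0.0, 0.4]]
b = [0.0, 0.1]
h = [0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    h = rnn_step(x, h, Wx, Wh, b)
print(h)
```

The same recurrence underlies LSTM/GRU cells; they only change how the update gates information in and out of `h`.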
Language modelling: task description, methods (Markov models, RNNs), evaluation (perplexity). N-gram language models. Sequence labelling (NER, POS tagging, chunking, etc.): HMM, MEMM, CRF.
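Perplexity, the evaluation metric mentioned above, is the exponentiated average negative log-probability the model assigns to held-out text. A toy sketch with an add-one-smoothed bigram model (data invented for the example):

```python
from collections import Counter
from math import exp, log

def bigram_perplexity(train, test):
    """Perplexity of an add-one-smoothed bigram LM on test sentences
    (lower is better; a perfect model would score 1)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in train:
        toks = ["<s>"] + sent + ["</s>"]
        unigrams.update(toks[:-1])
        bigrams.update(zip(toks, toks[1:]))
    V = len(set(w for s in train for w in s) | {"</s>"})
    log_prob, n = 0.0, 0
    for sent in test:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, w in zip(toks, toks[1:]):
            p = (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)  # add-one smoothing
            log_prob += log(p)
            n += 1
    return exp(-log_prob / n)

train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
print(bigram_perplexity(train, [["the", "cat", "sat"]]))
```

A sentence made of seen bigrams gets lower perplexity than one made of unseen ones, which is exactly what the metric is meant to capture.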
Basics: encoder-decoder framework, inference (e.g., beam search), evaluation (BLEU). Attention: general idea, score functions, models; Bahdanau and Luong models. Transformer: self-attention, masked self-attention, multi-head attention.
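Beam search, the inference method named above, keeps the `beam_size` best partial hypotheses by summed log-probability instead of greedily taking the top token. A toy sketch that (unlike a real decoder) assumes the per-step distributions do not depend on the prefix:

```python
from math import log

def beam_search(step_probs, beam_size=2):
    """step_probs: list of {token: prob} dicts, one per decoding step.
    Returns the beam_size best hypotheses as (tokens, log_prob) pairs."""
    beams = [([], 0.0)]  # start with one empty hypothesis
    for probs in step_probs:
        candidates = [
            (seq + [tok], score + log(p))   # extend each beam by each token
            for seq, score in beams
            for tok, p in probs.items()
        ]
        # prune: keep only the beam_size highest-scoring hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

steps = [
    {"a": 0.6, "b": 0.4},
    {"a": 0.3, "b": 0.7},
    {"a": 0.5, "b": 0.5},
]
best, score = beam_search(steps)[0]
print(best)
```

In a real seq2seq decoder the `probs` at each step come from the model conditioned on the hypothesis so far, and hypotheses ending in an end-of-sequence token are moved to a finished list.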
BERTology (BERT, the GPT family, T5, etc.), subword segmentation (BPE), evaluation of large LMs.
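BPE learns its subword vocabulary by repeatedly merging the most frequent adjacent symbol pair. A sketch of the learning loop in the style of Sennrich et al., using the classic toy word-frequency example:

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge operations from a {word: frequency} dict.
    Words start as character sequences with an end-of-word marker."""
    vocab = {tuple(w) + ("</w>",): f for w, f in words.items()}
    merges = []
    for _ in range(num_merges):
        # count all adjacent symbol pairs, weighted by word frequency
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # apply the merge everywhere it occurs
        merged = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged[tuple(out)] = freq
        vocab = merged
    return merges

merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 3)
print(merges)  # first merge is ('e', 's'), then ('es', 't')
```

At tokenization time the learned merges are replayed in order on each new word, so frequent words collapse to single tokens while rare words stay split into subwords.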
Lecture & Practical: How to train big models? Distributed training
Training multi-billion-parameter language models. Model parallelism. Data parallelism.
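Data parallelism can be shown in miniature without any framework: each "worker" computes gradients on its own shard of the batch, the gradients are averaged (the all-reduce step), and every replica applies the same update. A toy sketch fitting y = w·x by gradient descent on invented data:

```python
def grad_mse_linear(w, shard):
    """Gradient of mean squared error for y ≈ w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
shards = [data[:2], data[2:]]  # one shard per simulated worker
w, lr = 0.0, 0.05
for _ in range(100):
    grads = [grad_mse_linear(w, s) for s in shards]  # computed in parallel in reality
    avg = sum(grads) / len(grads)                    # the all-reduce step
    w -= lr * avg                                    # identical update on every replica
print(round(w, 3))  # converges toward 2.0
```

Model parallelism is the complementary strategy: instead of splitting the data, the parameters themselves (layers or tensor slices) are partitioned across devices, which is what makes multi-billion-parameter models fit in memory at all.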
Question answering: SQuAD-style datasets (one-hop, multi-hop), architectures, retrieval and search, chat-bots.
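The retrieval component of open-domain QA can be illustrated with the simplest classical baseline: rank documents by TF-IDF cosine similarity to the query. A toy sketch on invented documents (real systems use inverted indexes like BM25/Lucene or dense retrievers):

```python
from collections import Counter
from math import log, sqrt

def tfidf_retrieve(query, docs):
    """Return document indices ranked by TF-IDF cosine similarity to the query."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))  # document frequency
    N = len(docs)

    def vec(tokens):
        tf = Counter(tokens)
        # weight = term frequency * inverse document frequency;
        # words appearing in every document carry no signal and are dropped
        return {w: tf[w] * log(N / df[w]) for w in tf if w in df and df[w] < N}

    def cos(u, v):
        dot = sum(c * v.get(k, 0.0) for k, c in u.items())
        nu = sqrt(sum(x * x for x in u.values()))
        nv = sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    q = vec(query.lower().split())
    return sorted(range(N), key=lambda i: cos(q, vec(tokenized[i])), reverse=True)

docs = [
    "the transformer uses self attention",
    "beam search decodes sequences",
    "retrieval finds relevant passages for question answering",
]
ranking = tfidf_retrieve("question answering retrieval", docs)
print(ranking)  # index of the most relevant document first
```

In a retriever-reader QA pipeline this ranking step supplies candidate passages, and a reader model then extracts or generates the answer span from the top hits.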