NLP course, MIPT
Anton Emelianov ([email protected], @king_menin), Albina Akhmetgareeva ([email protected])
Videos here
Exam questions here
final_mark = sum(max_score(HW_i) for i = 1..3) / count(HWs) * 10
Distributional semantics. Count-based (pre-neural) methods. Word2Vec: learn vectors directly. GloVe: count, then learn. N-grams (collocations). RusVectores. t-SNE.
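The count-based approach above can be sketched in a few lines: build each word's vector from its window co-occurrence counts, then compare words by cosine similarity. A toy illustration on a made-up corpus, not the lecture's exact setup:

```python
from collections import Counter, defaultdict
from math import sqrt

def cooccurrence_vectors(sentences, window=2):
    """Count-based word vectors: each word is represented by the
    frequencies of words seen within a symmetric context window."""
    vecs = defaultdict(Counter)
    for sent in sentences:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    vecs[w][sent[j]] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v[k] for k, c in u.items() if k in v)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat and a dog played".split(),
]
vecs = cooccurrence_vectors(corpus)
# "cat" and "dog" appear in similar contexts, so their vectors are close
print(cosine(vecs["cat"], vecs["dog"]))
```

Real count-based pipelines add PPMI weighting and dimensionality reduction (e.g. SVD) on top of the raw counts.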
Neural language models: recurrent and convolutional models. Text classification (architectures).
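The core of a recurrent model is a single state update applied at every time step. A minimal, dependency-free sketch of one Elman RNN cell (weights chosen arbitrarily for illustration):

```python
from math import tanh

def rnn_step(x, h, Wx, Wh, b):
    """One Elman RNN update: h_new = tanh(Wx @ x + Wh @ h + b),
    written with plain lists instead of a tensor library."""
    return [
        tanh(sum(Wx[i][k] * x[k] for k in range(len(x)))
             + sum(Wh[i][k] * h[k] for k in range(len(h)))
             + b[i])
        for i in range(len(b))
    ]

# toy 2-dim hidden state carried across a 3-step input sequence
Wx = [[0.5, -0.2], [0.1, 0.3]]
Wh = [[0.4, 0.0], [0.0, 0.4]]
b = [0.0, 0.1]
h = [0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    h = rnn_step(x, h, Wx, Wh, b)
print(h)
```

The same recurrence underlies LSTM/GRU cells; they only change how the update gates information in and out of `h`.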
Language modelling: task description, methods (Markov models, RNNs), evaluation (perplexity). N-gram language models. Sequence labelling (NER, POS tagging, chunking, etc.): HMM, MEMM, CRF.
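Perplexity, the evaluation metric mentioned above, is the exponentiated average negative log-probability the model assigns to held-out text. A toy sketch with an add-one-smoothed bigram model (data invented for the example):

```python
from collections import Counter
from math import exp, log

def bigram_perplexity(train, test):
    """Perplexity of an add-one-smoothed bigram LM on test sentences
    (lower is better; a perfect model would score 1)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in train:
        toks = ["<s>"] + sent + ["</s>"]
        unigrams.update(toks[:-1])
        bigrams.update(zip(toks, toks[1:]))
    V = len(set(w for s in train for w in s) | {"</s>"})
    log_prob, n = 0.0, 0
    for sent in test:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, w in zip(toks, toks[1:]):
            p = (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)  # add-one smoothing
            log_prob += log(p)
            n += 1
    return exp(-log_prob / n)

train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
print(bigram_perplexity(train, [["the", "cat", "sat"]]))
```

A sentence made of seen bigrams gets lower perplexity than one made of unseen ones, which is exactly what the metric is meant to capture.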
Basics: encoder-decoder framework, inference (e.g., beam search), evaluation (BLEU). Attention: general idea, score functions, models; Bahdanau and Luong models. Transformer: self-attention, masked self-attention, multi-head attention.
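Beam search, the inference method named above, keeps the `beam_size` best partial hypotheses by summed log-probability instead of greedily taking the top token. A toy sketch that (unlike a real decoder) assumes the per-step distributions do not depend on the prefix:

```python
from math import log

def beam_search(step_probs, beam_size=2):
    """step_probs: list of {token: prob} dicts, one per decoding step.
    Returns the beam_size best hypotheses as (tokens, log_prob) pairs."""
    beams = [([], 0.0)]  # start with one empty hypothesis
    for probs in step_probs:
        candidates = [
            (seq + [tok], score + log(p))   # extend each beam by each token
            for seq, score in beams
            for tok, p in probs.items()
        ]
        # prune: keep only the beam_size highest-scoring hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

steps = [
    {"a": 0.6, "b": 0.4},
    {"a": 0.3, "b": 0.7},
    {"a": 0.5, "b": 0.5},
]
best, score = beam_search(steps)[0]
print(best)
```

In a real seq2seq decoder the `probs` at each step come from the model conditioned on the hypothesis so far, and hypotheses ending in an end-of-sequence token are moved to a finished list.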
BERTology (BERT, the GPT family, T5, etc.), subword segmentation (BPE), evaluation of large LMs.
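BPE learns its subword vocabulary by repeatedly merging the most frequent adjacent symbol pair. A sketch of the learning loop in the style of Sennrich et al., using the classic toy word-frequency example:

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge operations from a {word: frequency} dict.
    Words start as character sequences with an end-of-word marker."""
    vocab = {tuple(w) + ("</w>",): f for w, f in words.items()}
    merges = []
    for _ in range(num_merges):
        # count all adjacent symbol pairs, weighted by word frequency
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # apply the merge everywhere it occurs
        merged = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged[tuple(out)] = freq
        vocab = merged
    return merges

merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 3)
print(merges)  # first merge is ('e', 's'), then ('es', 't')
```

At tokenization time the learned merges are replayed in order on each new word, so frequent words collapse to single tokens while rare words stay split into subwords.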
Lecture & Practical: How to train big models? Distributed training
Training multi-billion-parameter language models. Model parallelism. Data parallelism.
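Data parallelism can be shown in miniature without any framework: each "worker" computes gradients on its own shard of the batch, the gradients are averaged (the all-reduce step), and every replica applies the same update. A toy sketch fitting y = w·x by gradient descent on invented data:

```python
def grad_mse_linear(w, shard):
    """Gradient of mean squared error for y ≈ w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
shards = [data[:2], data[2:]]  # one shard per simulated worker
w, lr = 0.0, 0.05
for _ in range(100):
    grads = [grad_mse_linear(w, s) for s in shards]  # computed in parallel in reality
    avg = sum(grads) / len(grads)                    # the all-reduce step
    w -= lr * avg                                    # identical update on every replica
print(round(w, 3))  # converges toward 2.0
```

Model parallelism is the complementary strategy: instead of splitting the data, the parameters themselves (layers or tensor slices) are partitioned across devices, which is what makes multi-billion-parameter models fit in memory at all.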
Question answering: SQuAD-style datasets (one-hop, multi-hop), architectures, retrieval and search, chat-bots.
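The retrieval component of open-domain QA can be illustrated with the simplest classical baseline: rank documents by TF-IDF cosine similarity to the query. A toy sketch on invented documents (real systems use inverted indexes like BM25/Lucene or dense retrievers):

```python
from collections import Counter
from math import log, sqrt

def tfidf_retrieve(query, docs):
    """Return document indices ranked by TF-IDF cosine similarity to the query."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))  # document frequency
    N = len(docs)

    def vec(tokens):
        tf = Counter(tokens)
        # weight = term frequency * inverse document frequency;
        # words appearing in every document carry no signal and are dropped
        return {w: tf[w] * log(N / df[w]) for w in tf if w in df and df[w] < N}

    def cos(u, v):
        dot = sum(c * v.get(k, 0.0) for k, c in u.items())
        nu = sqrt(sum(x * x for x in u.values()))
        nv = sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    q = vec(query.lower().split())
    return sorted(range(N), key=lambda i: cos(q, vec(tokenized[i])), reverse=True)

docs = [
    "the transformer uses self attention",
    "beam search decodes sequences",
    "retrieval finds relevant passages for question answering",
]
ranking = tfidf_retrieve("question answering retrieval", docs)
print(ranking)  # index of the most relevant document first
```

In a retriever-reader QA pipeline this ranking step supplies candidate passages, and a reader model then extracts or generates the answer span from the top hits.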