Hi, my name is Dani and I ❤️ AI and Open-Source

Fields of interest: LLM, NLP, RL, Graphs, Distributed Systems

Skills 🛠️

  • Languages: Python, SQL
  • DS/ML/DL: SkLearn, PyTorch, Transformers
  • Big Data: Hadoop, Spark
  • DevOps: Linux, Git, Docker

Work experience 👔

| Job Position | Company | Field | Work Period |
|---|---|---|---|
| Head of AI Transformation | Social Discovery Group | LLM, Conversational AI | 2024-05 – now |
| Research Scientist Lead | SberDevices | LLM, GigaChat | 2023-04 – 2024-05 |
| NLP Team Lead | SberDevices | Search, Information Retrieval | 2022-10 – 2023-04 |
| NLP Tech Lead | Sber AI Lab | NLP, MLOps, Mentoring | 2021-05 – 2022-10 |
| Senior NLP Engineer | Tinkoff AI Lab | Virtual Assistant "Oleg" | 2021-02 – 2021-04 |
| Middle NLP Engineer | MTS AI Lab | NER with Pseudo-Labeling | 2020-05 – 2021-02 |
| Junior Data Scientist | Sberbank | ML with Tabular Data, CV | 2018-07 – 2020-05 |

Education 🎓

Projects 🐾

  • MUSE TF -> PT - convert Multilingual Universal Sentence Encoder from TensorFlow to PyTorch and ONNX
  • QaNER - unofficial implementation of QaNER paper (NER via Extractive Question Answering)
  • RLLib - Reinforcement Learning library
  • MUSE as Service - REST API for sentence embedding using Multilingual Universal Sentence Encoder
  • PyTorch NER - pipeline for training NER models using PyTorch
  • Text Classification Baseline - pipeline for building text classification TF-IDF + LogReg baselines
  • Graph-Based Clustering - clustering using graph connected components and spanning trees

Public talks 🗣

Certifications 📜

Hackathon participation 💻

Achievements 🏆

  • Key contributor to GigaChat, Russia's most advanced LLM
  • 500+ stars on GitHub and 10 packages on PyPI with 38k+ downloads
  • Contributor to PyTorch, Scikit-Learn, SciPy
  • Open Data Science Best Contributor 2020

GitHub Stats ⭐

