Andrey Vladimirovich

Data Science & Machine Learning Consultant

Moscow, Russian Federation

Summary

Techs
Python | R | pandas | numpy | scipy | scikit-learn | seaborn | plotly | bokeh | docker | jupyter

Machine Learning
supervised learning, unsupervised learning, regression, classification, anomaly detection, time series, class imbalance, synthetic data generation, xgboost, lightgbm, catboost, ensemble methods, spacy, gensim, glove, prophet, shap, eli5

Deep Learning
keras | tensorflow | autoencoders | object detection | segmentation | classification | image augmentation

Databases
sql, mongodb, arangodb, aql, redis

Math
data analysis | statistics | probability theory | data visualization | regularization

Other
*nix, bash, logging, git, github, flask, gunicorn

Languages

English, Russian

Experience

  • Recommendation system for activity search startup (Russian language)
    Sep 2019 – Present
    Text processing (spaCy, pymorphy2); entity detection (DeepPavlov); spelling correction (Yandex Speller API); custom rule-based weighted metric over entities, tags and processed n-grams.
  • Amazon ads analysis and keywords detection
    Apr 2019 – Present
    spaCy with GloVe and Gensim embeddings (self-trained and pre-trained; transfer learning); image recognition (self-trained and pre-trained VGG16 in Keras; transfer learning); API. Current task: making the solution language-independent.
  • New employee adaptation quality estimation
    2019
    Feature engineering; short-and-wide dataset problems: custom multicollinearity removal; feature selection (BorutaPy); model (LightGBM); results interpretation (customized SHAP for the original model and a Keras RNN; clustering via HDBSCAN; IsolationForest for anomaly detection).
  • Common patterns detection in time series (HR data)
    2018
    Data processing (smoothing, scaling and autoencoding via Keras); custom hierarchical clustering with Dynamic Time Warping as the main distance metric (dtaidistance).
  • Financial reports (10-K and 6-K forms) structure detection, analysis and purity estimation
    2017 – 2018
    spaCy, feature engineering and HDBSCAN to detect outliers (i.e. chapter names); building a graph of tokens; synonym expansion and similarity estimation (self-trained and pre-trained GloVe and Gensim); storage and visualization of results (ArangoDB); AQL scripts (path detection tasks, etc.).

Skills

Artificial Intelligence, Data Science, Flask, Git, Keras, Machine Learning, Natural Language Processing, NumPy, Pandas, Redis, SQL, SciPy, TensorFlow

Joined: January 2018