Andrey Vladimirovich

Data Science & Machine Learning Consultant

Moscow, Russian Federation


Python | R | pandas | numpy | scipy | scikit-learn | seaborn | plotly | bokeh | docker | jupyter

Machine Learning
supervised learning, unsupervised learning, regression, classification, anomaly detection, time series, class imbalance, synthetic data generation, xgboost, lightgbm, catboost, ensemble methods, spacy, gensim, glove, prophet, shap, eli5

Deep Learning
keras | tensorflow | autoencoding | object detection | segmentation | classification | image augmentation

sql, mongodb, arango, aql, redis

data analysis | statistics | probability theory | data visualization | regularization

*nix, bash, logging, git, github, flask, gunicorn


English, Russian


  • Recommendation system for activity search startup (Russian language)
    Sep 2019 – Present
    Text processing (spaCy, pymorphy2); entity detection (DeepPavlov); spelling correction (Yandex Speller API); custom rule-based weighted metric over entities, tags and processed n-grams.
  • Amazon ads analysis and keywords detection
    Apr 2019 – Present
    spaCy with GloVe and Gensim embeddings (self-trained and pre-trained; transfer learning); image recognition with Keras VGG16 (self-trained and pre-trained; transfer learning). API. Current task: making the pipeline language-independent.
  • New employee adaptation quality estimation
    Feature engineering; handling a short, wide dataset: custom multicollinearity removal; feature selection (boruta_py); model (LightGBM); results interpretation (customized SHAP for the original model and a Keras RNN; clustering via HDBSCAN; IsolationForest for anomaly detection).
  • Common patterns detection in time series (HR data)
    Data processing (smoothing, scaling and autoencoding via Keras); custom hierarchical clustering with Dynamic Time Warping as the main distance metric (dtaidistance).
  • Financial reports (10-K and 6-K forms) structure detection, analysis and purity estimation
    2017 – 2018
    spaCy, feature engineering and HDBSCAN to detect outliers (i.e. chapter names). Building a token graph; synonym expansion and similarity estimation (self-trained and pre-trained GloVe and Gensim); storage and visualization of results (ArangoDB); AQL scripts (path detection tasks, etc.).


Artificial Intelligence, Data Science, Flask, Git, Keras, Machine Learning, Natural Language Processing, NumPy, Pandas, Redis, SQL, SciPy, TensorFlow

Joined: January 2018