Andrey Vladimirovich
Data Science & Machine Learning Consultant
Moscow, Russian Federation
Summary
Technologies
Python | R | pandas | numpy | scipy | scikit-learn | seaborn | plotly | bokeh | docker | jupyter
Machine Learning
supervised learning, unsupervised learning, regression, classification, anomaly detection, time series, class imbalance, synthetic data generation, xgboost, lightgbm, catboost, ensemble methods, spacy, gensim, glove, prophet, shap, eli5
Deep Learning
keras | tensorflow | autoencoding | object detection | segmentation | classification | image augmentation
Databases
sql, mongodb, arango, aql, redis
Math
data analysis | statistics | probability theory | data visualization | regularization
Other
*nix, bash, logging, git, github, flask, gunicorn
Languages
English, Russian
Experience
- Recommendation system for an activity-search startup (Russian language)
Sep 2019 – Present
Text processing (spaCy, pymorphy2); entity detection (DeepPavlov); spelling correction (Yandex Speller API); custom rule-based weighted metric built on entities, tags, and processed n-grams.
- Amazon ads analysis and keyword detection
Apr 2019 – Present
spaCy with GloVe and Gensim (self-trained and pre-trained; transfer learning); image recognition (self-trained and pre-trained VGG16 in Keras; transfer learning). API. Current task: make the solution language-independent.
- New employee adaptation quality estimation
2019
Feature engineering; short, wide dataset problems: custom multicollinearity removal; feature selection (Boruta_py); model (LightGBM); results interpretation (customized SHAP for the original model and a Keras RNN; clustering via HDBSCAN; IsolationForest for anomaly detection).
- Common pattern detection in time series (HR data)
2018
Data processing (smoothing, scaling, and autoencoding via Keras); custom hierarchical clustering with Dynamic Time Warping as the main distance metric (dtaidistance).
- Financial reports (10-K and 6-K forms): structure detection, analysis, and purity estimation
2017 – 2018
spaCy, feature engineering, and HDBSCAN to detect outliers (i.e. chapter names). Building a graph of tokens; synonym expansion; similarity estimation (self-trained and pre-trained GloVe and Gensim); storage and visualization of results (ArangoDB); AQL scripts (path detection tasks, etc.).
Skills
Artificial Intelligence (AI), Data Science, Flask, Git, Keras, Machine Learning, Natural Language Processing, NumPy, Pandas, Redis, SQL, SciPy, TensorFlow