Aleksei Dobrov

Lead Python Developer

St. Petersburg, Russian Federation

Summary

Experienced Lead Python Developer / Development Head with a background in complex algorithmic calculations and software development. Expertise in OOP, Python, and designing complex architectures. Adept at working with JSON-based APIs, parallel computing, algorithms, logical inference systems, semantic search engines, and AI. Skilled in natural language processing, data mining, computational linguistics, and 3D programming. Proficient in a variety of programming languages, databases, frameworks, and methodologies. Strong commitment to code quality, documentation, and mentorship. Proven track record of successfully leading development teams and delivering innovative solutions on time and within budget.

Languages:

English, French, German, Russian

Favorite Python Packages:

nltk, jsonrpc, fastapi, pydantic, lxml, Django, django-rest-framework, openpyxl, python-poppler, requests, requests-toolbelt, psycopg2, shelve, sphinx, pytest, black, tensorflow, keras, opencv, numpy, pandas, Pillow, tqdm, scikit-learn, scipy

Experience

Recent Work Experience

October 2023 – Present

Project: Saby / SBIS platform development: projects for property accounting subsystems

Position: Head of Development

Tasks and Achievements: 

  • Mastered the original platform and codebase quickly (this typically takes more than a year), identified multiple areas for improvement, and created development tasks for them;
  • Fixed multiple issues in property accounting;
  • Initiated code refactoring;
  • Launched a new project to merge KLADR/FIAS locations with SBIS rooms, which previous teams had attempted and failed several times;
  • Ran the inventory redesign phase of the current migration project

 

March 2023 – January 2024

Project: Spark data processing and transfer core for Databricks / Snowflake / Immuta / Datadog

Position: Lead Software Engineer

Tasks and Achievements: 

  • Led the Immuta/Datadog integration team, performed code review and work planning
  • Refactored the internal Immuta integration package to support both existing Immuta API versions and to provide a unified CRUD API for applications
  • Fixed and developed pipelines to automate Sphinx documentation generation
  • Created a universal core logging package for Datadog integration
  • Refactored existing data transfer pipelines to perform fast configuration updates

Environment: Python, Databricks, Immuta, Snowflake, Hive, Datadog, Gitlab, Sphinx, AWS S3, black/blackbricks, pylint, isort, flake8, poetry, pip

 

November 2022 – August 2023

Project: Universal system for academic mobility among partner universities for regular and cross-institutional educational programs

Position: Full Stack Team Lead

Tasks and Achievements: 

  • Led backend, frontend, and database development;
  • Organized CI pipeline development and debugging;
  • Organized QA;
  • Developed a universal Django RESTful CRUD service for arbitrary models;
  • Developed a universal Vue.js SPA admin for Django;
  • Developed universal Django peer-to-peer networking models whose storage is backed simultaneously by the database and by all the other Django instances in the network;
  • Developed a parser for subjects, marks, and education results based on OCR of scanned documents;
  • Developed universal storage and UI representation for documents and document packages, including electronic signatures;
  • Developed a universal Word document templating system based on Jinja2;
  • Developed academic difference calculations for educational program transfer applications;
  • Developed the whole workflow for cross-institutional program applications for students, base university curators, and accepting university curators

Environment: Python, PostgreSQL, Vue.js, Django, Django REST framework, Tesseract, Jinja2

 

October 2022 – February 2023

Project: Developing a solution for tooth root segmentation using CBCT and separate optical scans, as well as automatic treatment (tooth displacement) planning and the creation of aligners with 3D printing

Position: Lead Software Engineer

Tasks and Achievements: 

  • Refactored the implementation of server-side patient case export according to the new architectural requirements
  • Fixed errors in tooth displacement import/export calculations
  • Refactored the system and implemented export hashing and validation to prevent security issues, e.g. to remove vulnerabilities in payment processing

Environment: Unity3D, C#, autofac, g3, .NET, AspNetCore, Newtonsoft.Json, Serilog, SharpZipLib, Lp Solve, Google Cloud Storage, Redis

 

July 2022 – February 2023

Project: Financial microservices migration to PAS

Position: Senior Software Developer

Tasks and Achievements: 

  • Refactored several on-prem microservices for migration to PAS, carried out the migration, and created Gitlab pipelines for public and private PAS;
  • Performed profiling and data transfer performance analysis, and optimized the performance of several microservices to compensate for PAS overhead;
  • Implemented a variety of financial big data calculations and services from scratch on the basis of existing data analysis;
  • Implemented the second version of the common REST package for asynchronous computations.

Environment: AWS, AWS PAS, AWS S3, Oracle, Python, Bloomberg, Pithon (Pimco Python), asyncio, aiohttp, REST, fastapi, gunicorn, Gitlab

 

July 2022 – September 2022

Project: Applying machine learning methods for document parsing

Position: ML Engineer

Tasks and Achievements: 

  • Converted ICPR dataset (line charts) to COCO dataset format
  • Trained and tested Detectron with the converted dataset
  • Visualized recognition and segmentation results

Environment: Google Cloud Console, Vertex AI Workbench, Python, PyTorch, Detectron, OpenCV, Matplotlib

 

April 2022 – December 2023

Project: FIM Platform for railway IT infrastructure

Position: Lead Python / Java Engineer, later Project Manager

Tasks and Achievements:

  • Developed a CCTV/DIVIS integration plugin for the FIM client (train-side)
  • Fixed multiple issues in the FIM Client, FIM Server, SRS (Seat Reservation System) plugin, and various other subsystems
  • Led all Python / Java development, QA, and project delivery
  • Executed several projects on FIM client functionality improvement and redesign
  • Investigated and fixed various issues with customers from GSP/Televic, DB, and OEBB

Environment: Python, Tornado, requests, requests-toolbelt, JSON, Trixi, Java, Spring Framework, Lombok

 

July 2021 – July 2022

Project: Providing a pipeline tool for parsing various kinds of documents

Position: Lead Python Engineer

Tasks and Achievements:

  • Designed a complete technical solution for the system architecture in accordance with the ever-changing customer requirements
  • Designed and implemented basic classes for the entities required by the customer and the logic behind them
  • Designed and implemented a universal vocabulary and ontology parsing and storage mechanism based on the above classes
  • Managed work on system redesign and migration from Python classes to an SQL database with PL/pgSQL functions
  • Designed and implemented the first version of the database, including the revision mechanism to efficiently store object states and restore them
  • Designed the logic around packaging and runtime utilities in order to efficiently store and execute pipelines
  • Created an NLP-based algorithm to extract entities from table headers and cells, including both common and customer-specific patterns
  • Designed and implemented an algorithm to extract customer-specific measurements and assays from the extracted tables and assays and to send them to the customer’s database REST API
  • Developed a mechanism to incorporate the customer-supported protein clone vocabulary into the data extraction pipeline to fill the required columns
  • Created various pipelines for both common and customer-specific data extraction configuration mappings
  • Designed and implemented table unpivoting and merged cell duplication/deduplication pipelines

Environment: Python, JSON, Pickle, CSV, PostgreSQL, PL/pgSQL, PostgREST, Domino, BadgerDoc, AIIRE, SUMO

 

March 2021 – June 2021

Project: Engineering a tool for interpreting chemical images

Position: Senior Python Engineer

Tasks and Achievements:

  • Designed and implemented image denoising and upscaling for chemical structure graphs based on the Hough line transform and an SRGAN neural network
  • Implemented an optimal script for SRGAN training based on Indigo-generated graphs for InChI codes from the training set
  • Designed and implemented molecule graph downscaling and random noise generation for the training algorithm to simulate the training set images as accurately as possible
  • Developed a VGG19 classifier analogue that can classify molecule graphs into classes that correspond to various length chains of atoms and bonds
  • Created an algorithm to vectorize atom and bond chains as class logits
  • Tuned the classifier neural network training algorithm to converge with categorical cross entropy loss on the produced dataset
  • Implemented an image similarity metric based on features extracted by the classifier network convolutional layers for the SRGAN generator loss function
  • Trained and tested SRGAN in various ways on a GPU and a high performance cluster with VGG19 based loss
  • Researched various error metrics for images and found an optimal one or a combination thereof based on various training runs
  • Designed and implemented an SSIM-based SRGAN training algorithm for molecule graph denoising and upscaling
  • Created a two-dimensional formal grammar parsing algorithm to recognize letters and bonds on molecule graph images

Environment: Python, Opencv-python, NumPy, TensorFlow, Keras, Tqdm, Pillow, Indigo

 

January 2021 – April 2021

Project: Working on software applications that calculate and present data from mass spectrograph devices

Position: Senior Python Engineer

Tasks and Achievements:

  • Designed a complete universal technical solution for mass spectrograph data extraction out of the customer’s PDF files from ChemStation and Astra
  • Implemented table extraction pipelines for ChemStation and Astra reports
  • Implemented a chromatogram extraction pipeline for ChemStation and Astra reports
  • Enabled peak extraction, evaluation, sorting, and selection pipelines from extracted data
  • Tuned consolidated XLSX report generation on data extracted from datasets
  • Reviewed code
  • Showcased the results to the customer
  • Clarified the logic required by the customer

Environment: Python, Poppler, PDFplumber, PyMuPDF, Opencv-python, Numpy, Pillow, Openpyxl

 

July 2020 – December 2020

Project: X5Id

Position: Lead Java Developer

Tasks and Achievements:

  • Developed adapters for Keycloak to store users in a Tarantool database and to authenticate them.
  • Fixed numerous issues both in the Java backend and Vue.js frontend related to the authentication flows and profile storage.
  • Performed code refactoring, introduced linting, improved code style / quality, performed code review.
  • Improved legacy token migration flows for older versions of the mobile apps.
  • Improved CIP integration.

Environment: Java, KeyCloak, Tarantool, GraphQL, Lua, Vue.js, Maven, JBoss, Lombok, Resteasy

 

 

July 2013 – December 2020

Project: An IT solution and service provider

Position: Lead Project Engineer

Tasks and Achievements:

  • Perform full cycle project management
  • Design technical solutions
  • Design solution architecture
  • Plan the development process
  • Manage releases
  • Create technical specifications
  • Perform cost estimation
  • Liaise with the customers

Environment: Python, Pyramid, Django, Django CMS, React.js, jQuery, PostgreSQL, Oracle SQL, Cython, CherryPy, ClickHouse

 

July 2018 – December 2018

Project: Engineering solutions for a data science company based in Switzerland

Position: Lead Software Engineer

Tasks and Achievements:

  • Developed deep learning strategies and models for renewable energy day-ahead price prediction
  • Built models for profit/loss optimization when trading on day-ahead and imbalance markets
  • Provided automation for historical data, forecast data, and actual data collection from various sources

Environment: Python, Scikit-learn, TensorFlow/Keras, Numpy, Scipy, Pandas, Angular.js, SQLAlchemy, Alembic, MySQL, Requests, Openpyxl, CSV

 

December 2008 – June 2013

Project: Developing a Linux migration enterprise solution

Position: Lead Software Engineer

Tasks and Achievements:

  • Ran the linguistic department
  • Managed the system development process
  • Led the Linux distribution localization processes

Environment: Scientific Linux, CentOS Linux, RPM, GNU autotools, GCC, YUM, Poedit, Pootle, Zope/Plone, Python, C, PostgreSQL, BerkeleyDB, StarDict, VIM

 

Participation in Scientific Research and Industrial Projects (Main Projects)

  1. "OOmnik" – free open-source interpreter for natural language texts (2003-2010, python/c)
  2. "Bukva Zakona" – corpus of Russian laws with linguistic markup and a search engine for this corpus (2004, python/sqlite)
  3. R&D projects "Gardemarin" / "Garmonia" / "Grazia" – modules of morphological, syntactic, and semantic analysis for Russian, English, Arabic, Chinese, Turkish, and French texts, and electronic dictionaries for cluster analysis of news messages in these languages (2004-2010, python/c)
  4. Developmental work "Semantics" – modules of linguistic analysis and automatic format and encoding detection for documents written in mixed languages (Russian, English, Arabic, Chinese, Turkish, and French) (2010, python/c)
  5. "HomeReading" – web portal and collaborative translation support system for fiction texts in English and German (2006-2007, python (zope+plone))
  6. "ArRus" – electronic dictionary and translation support system from Arabic to Russian and vice versa (2007-2008, python/plone)
  7. Search engine for the Subscribe.ru portal (globbie) – an intelligent (semantic) information retrieval system for textual information in the mailing-list archive of the Subscribe.ru portal (2008-2009, python/postgresql/Oracle Berkeley DB)
  8. Reference search engine for the Library of the Russian Academy of Sciences – full-text search in bibliographic records (2008-2010, python)
  9. «Informatization of the Republic of Abkhazia» – localization of the NauLinux distribution into the Abkhazian language, development of an Abkhazian electronic dictionary and morphological analyzer, development of a web portal for the Abkhazian State Ecological Service (2008-2012)
  10. Universal search engine for the Russian Constitutional Court – an intelligent (semantic) information retrieval system for textual information in the electronic archive of court decisions and bibliographic records of the Library of the Constitutional Court of Russia (2009-2011, python/c)
  11. "AIIRE" – free open-source natural language processor and intelligent information retrieval engine (AI Information Retrieval Engine) (2010 – now, vanilla c)
  12. Electronic Archive of Engineering Documentation of «ГИПРОДОРНИИ» (the Institute For Road and Bridge Building Design) – development of the electronic archive and a search engine for this archive (2011, python)
  13. "Everyqa.ru" – web portal of questions and answers («Every Question has an Answer») with a built-in ontology and intelligent information retrieval engine (2011-2013, python/php/postgresql)
  14. "Deprint" – development of intelligent OCR modules for Russian and English texts based on the ABBYY FineReader SDK libraries and the free open-source Tesseract library (2012, c/c++)
  15. «Expert» – expert and intelligent semantic search system for the Russian Ministry of Finance with a built-in editable ontology (more than 1,000,000 legal documents, 2012, c/c++/Oracle Berkeley DB/python)
  16. «Aeroflot-Benefit System» – loyalty system of PJSC Aeroflot for ticketing agents with complex calculation of promo-action charges and sales of discount coupons for these charges (2014-2016, python/pyramid/postgresql)
  17. «Aeroflot-OPO» – a system for planning and calculating vacations for employees of PJSC Aeroflot (2014, python/pyramid/oracle)
  18. «Aeroflot-RX14» – the unified system of the Aeroflot website vocabularies, web services, and partners catalogue (2014-2015, python/pyramid/postgresql)
  19. AIMiner – an opinion mining system to detect evaluations of actions or activity of Russian authorities in web publications for the Center of e-Government technologies of IFMO University (2014-2016)
  20. «Pravoslov» – machine translation system from modern Russian to 19th-century Russian (2015, c/python)
  21. «Aeroflot-MC» – website of the PJSC Aeroflot medical center (2015, python/django CMS/postgresql)
  22. Tibetan morphosyntactic analyzer – RFBR grant for the creation of a morphosyntactic analyzer of texts in the classical Tibetan language (2015-2018, c/python)
  23. «Aeroflot-CURCALC» – backend services and a single js-applet of the currency calculator for all subsystems of the PJSC «Aeroflot» website (2015-2016, python/pyramid/postgresql)
  24. «Aeroflot-POS» – subsystem of country and point of sale transfer from the main website of PJSC Aeroflot to the air ticket reservation system (2016, python/django CMS/postgresql)
  25. «Aeroflot-GIFTCERT» – system for selling virtual gift certificates on the main website of PJSC Aeroflot and in mobile applications (2017, python/pyramid/postgresql/reactjs)
  26. «Aeroflot-REVCALC» – reverse miles calculator for the main website of PJSC Aeroflot and mobile applications (2017-2018, python/django)

Skills

AngularJS, Ansible, Apache, Artificial Intelligence (AI), Backend Development, Celery, DevOps, Django, Django REST Framework, FastAPI, Frontend Development, Fullstack Development, Git, Gunicorn, Keras, Linux, Machine Learning, Matplotlib, Natural Language Processing, NumPy, Pandas, PostgreSQL, Pyramid, REST, ReactJS, TensorFlow, Test-Driven Development, Vue.js, Web Development, jQuery

Joined: March 2024