Sumbono Sumbono
Python Engineer
South Tangerang, Indonesia
Summary
Learning to code gave me problem-solving skills and a way to communicate with others on a technical level. I can develop web crawlers and APIs, and I use my coding skills to solve technical problems. As a Python Engineer, I work mostly on the backend, in data acquisition: scraping/crawling, cleaning, and validating data.
Languages:
English, Indonesian
Favorite Python Packages:
regex, requests-html, asyncio, aiohttp, spaCy, nltk, scrapy, cloudscraper, selenium.
Experience
- Development of Job Title Classification Using a Rule-Based Method (Dec 2019 - Jan 2020 & March 2020). Classified job titles into a list of departments. This project used regex matching and fuzzy string matching. The raw input contained both structured and unstructured job_title values. Before matching, the raw text was cleaned of extra spaces, tabs, numbers, punctuation, symbols, etc. The project mapped raw job_title input into 20 departments with 80% accuracy.
- Maintaining and updating a large-scale website crawler (Jan - Feb 2020). The crawler is designed to grab information from many websites at a time (around 50K web pages). It runs on a distributed system managed by Celery and Redis, and uses asyncio and aiohttp for asynchronous crawling.
- Development of Job Title Classification Using a Machine Learning Method (Dec 2019 - Dec 2019). Classified job titles into a list of departments. This project used scikit-learn to train the mapper model. The raw input contained both structured and unstructured job_title values. Before matching, the raw text was cleaned of extra spaces, tabs, numbers, punctuation, symbols, etc. The project mapped raw job_title input into 14 departments.
- Development of web crawler scripts using various libraries (May 2019 - Present). Crawling company and contact profiles from many websites. Libraries used: requests, scrapy, requests-html, cloudscraper, pyppeteer, selenium.
- Development of a Simple API using the Flask Framework (Apr 2019 - Apr 2019). An API for fetching company profile data stored in MongoDB.
- Development of a Crawler App from a List of URLs using Scrapy (Apr 2019 - Apr 2019). Crawling data from many websites.
- Development of Real-Time Monitoring Systems (Jun 2018 - Mar 2019). Development of monitoring systems for a real-time IoT project. The server-side web application received data from on-site sensors, then pushed the data to the client side over WebSocket.
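As a rough illustration of the rule-based job title classification described above (clean the raw text, try regex rules, then fall back to fuzzy string matching), here is a minimal sketch. It is not the project's actual code: the department names, patterns, reference titles, and the 0.8 threshold are all hypothetical, and the standard library's difflib stands in for whichever fuzzy matching tool was really used.

```python
import re
from difflib import SequenceMatcher

# Hypothetical regex rules; the real project mapped titles to 20 departments.
RULES = {
    "Engineering": re.compile(r"\b(engineer|developer|programmer)\b"),
    "Marketing": re.compile(r"\b(marketing|seo|brand)\b"),
    "Finance": re.compile(r"\b(accountant|finance|auditor)\b"),
}

# Hypothetical reference titles used only by the fuzzy fallback.
KNOWN_TITLES = {
    "software engineer": "Engineering",
    "marketing manager": "Marketing",
    "financial analyst": "Finance",
}


def clean_title(raw):
    """Lowercase the title and strip digits, punctuation, symbols, and extra whitespace."""
    text = raw.lower()
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()   # collapse spaces and tabs


def classify(raw, threshold=0.8):
    """Return a department name, or None when no rule or fuzzy match is good enough."""
    title = clean_title(raw)

    # Step 1: exact keyword rules via regex.
    for department, pattern in RULES.items():
        if pattern.search(title):
            return department

    # Step 2: fuzzy fallback against known titles (handles typos).
    best_department, best_score = None, 0.0
    for known, department in KNOWN_TITLES.items():
        score = SequenceMatcher(None, title, known).ratio()
        if score > best_score:
            best_department, best_score = department, score

    return best_department if best_score >= threshold else None
```

For example, `classify("  Sr.\tSoftware   Engineer #2 ")` is caught by the regex rules after cleaning, while a misspelled title such as `"Markting Manger"` only resolves through the fuzzy fallback.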
Skills
Backend Development, Data Science, Git, Linux, Machine Learning, Matplotlib, MongoDB, Natural Language Processing, NumPy, Pandas, Scrapy, Selenium, Virtualenv, Web Scraping