Biswapratap Chatterjee

Data Architect and ML Engineer

Bangalore, India

Summary

Data Architect with 11 years of experience in machine learning, data pipelines, data
analytics and transformation, and storage technology. Adept at designing and implementing
end-to-end data science projects. Broad skill set ranging from system programming, protocol
layer development, middleware development and infrastructure bring-up to some presentation
layer development. Innovation is the key strength, complemented equally by execution, so
projects can be handled from start to finish.

Languages:

English

Favorite Python Packages:

sklearn, keras, tensorflow, pandas, numpy, flask

Experience

https://www.linkedin.com/in/biswapratap-chatterjee-42277518/

2017-03 - present

Data Architect

Western Digital

Projects: QVisor, Jarvis, Optimaliseer, Log Analyzer, WD360 (an intra-company Appathon competition)

QVisor

  • An intelligent software/firmware quality analysis machine.
  • Targets validation testing of Western Digital firmware using attention-based
    reinforcement learning.
  • Patent applied, technical design.
  • Primary idea proposer, designer, architect and execution manager.
  • Responsible for the project from POC development for early risk mitigation through to
    full execution.

Jarvis

  • An intelligent agent for managing the datasets produced by various teams across
    Western Digital.
  • Automatic relationship determination, relationship heatmap generation, auto-highlighting
    of hidden interesting data correlations, and 2D and 3D graph projections.
  • Patent applied, technical design.
  • Primary idea proposer, designer, architect and developer.
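
The automatic relationship determination described for the dataset agent (Jarvis) could be approximated, in its simplest form, by pairwise correlation. The sketch below is only an illustration under that assumption, not the patented design: the function names, the 0.8 threshold, and the sample columns are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / sqrt(var_x * var_y)


def strong_relationships(columns, threshold=0.8):
    """Return (col_a, col_b, r) for column pairs whose |r| meets the threshold."""
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical dataset: column names and values are made up for illustration.
dataset = {
    "temperature": [30, 35, 40, 45, 50],
    "error_rate": [1, 2, 3, 4, 5],
    "queue_depth": [8, 1, 6, 3, 7],
}

if __name__ == "__main__":
    for a, b, r in strong_relationships(dataset):
        print(f"{a} <-> {b}: r = {r}")
```

The full matrix of `pearson` values is what a relationship heatmap would plot; the thresholded list corresponds to the auto-highlighting step.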
Optimaliseer

  • A reinforcement learning agent that helps Western Digital drive firmware achieve
    better performance.
  • Automatically tunes configurable firmware parameters to improve performance.
  • Achieved a 12% performance improvement over human (expert) tuning.
  • Primary idea proposer, designer, architect and developer.
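
As a rough illustration of automatic parameter tuning, the sketch below uses a simple epsilon-greedy bandit over a small grid of settings. This is a deliberately simplified stand-in and not the reinforcement learning method used in the project; the parameter names (`queue_depth`, `cache_kb`) and the `simulated_throughput` benchmark are hypothetical.

```python
import random


def tune(settings, measure, episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy search over discrete settings; returns the best-scoring one."""
    rng = random.Random(seed)
    totals = {s: 0.0 for s in settings}
    counts = {s: 0 for s in settings}

    def mean(s):
        return totals[s] / counts[s]

    for _ in range(episodes):
        untried = [s for s in settings if counts[s] == 0]
        if untried:
            choice = untried[0]               # try every setting at least once
        elif rng.random() < epsilon:
            choice = rng.choice(settings)     # explore a random setting
        else:
            choice = max(settings, key=mean)  # exploit the best average so far
        totals[choice] += measure(choice)
        counts[choice] += 1
    return max((s for s in settings if counts[s]), key=mean)


def simulated_throughput(setting):
    """Hypothetical benchmark: peaks at queue_depth=32, cache_kb=256."""
    queue_depth, cache_kb = setting
    return -((queue_depth - 32) ** 2) - ((cache_kb - 256) ** 2) / 64


# Hypothetical grid of (queue_depth, cache_kb) values.
SETTINGS = [(q, c) for q in (8, 16, 32, 64) for c in (128, 256, 512)]

if __name__ == "__main__":
    print(tune(SETTINGS, simulated_throughput))
```

On real hardware `measure` would run a workload and return a noisy score, which is where the averaging and the exploration rate start to matter.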
Log Analyzer

  • An algorithmic and machine-learning-based log analyzer for anomaly detection, parameter
    extraction, root cause analysis and bug classification.
  • Patent-applied algorithm to automatically cluster logs based on log structure and log time.
  • Generates finite state machines from time-based logs for debugging and root cause analysis.
  • Log vectorization and clustering for bug classification.
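
One common way to cluster logs by structure is to mask the variable tokens in each line and group lines that share the resulting template; counting transitions between consecutive templates then gives the edges of a simple state machine. The sketch below shows that generic idea only and is not the patent-applied algorithm; the masking rule and the sample log lines are assumptions, and the lines are assumed to be already in time order.

```python
import re
from collections import Counter, defaultdict

# Assumption: decimal and hex numbers are the only variable parts of a line.
_VARIABLE = re.compile(r"\b(0x[0-9a-fA-F]+|\d+)\b")


def template(line):
    """Replace numeric and hex tokens with a placeholder to expose the structure."""
    return _VARIABLE.sub("<*>", line)


def cluster_by_structure(lines):
    """Group log lines that share the same template."""
    clusters = defaultdict(list)
    for line in lines:
        clusters[template(line)].append(line)
    return dict(clusters)


def transitions(lines):
    """Count template-to-template transitions between consecutive log lines."""
    states = [template(line) for line in lines]
    return Counter(zip(states, states[1:]))


# Hypothetical firmware log, already sorted by timestamp.
LOG = [
    "read lba 100 ok",
    "write lba 0x1F ok",
    "read lba 200 ok",
    "write lba 0x2A ok",
    "timeout after 30 ms",
]

if __name__ == "__main__":
    for (src, dst), count in transitions(LOG).items():
        print(f"{src!r} -> {dst!r}: {count}")
```

Rare transitions in the resulting counts are natural candidates for anomaly detection, and the masked values are what parameter extraction would collect.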
WD360 - An Intra-Company Appathon Competition

  • A 360-degree contextual search engine for WD My Home (Personal Cloud).
  • The search engine could search by n-grams, context and objects (images/audio/documents).
  • Built with OpenCV, Sony Gracenote, IBM Watson and Elasticsearch.
  • Won 1st prize, worth $5000, in the competition.
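
The n-gram part of such a search can be illustrated without any search server. The sketch below ranks files by the Jaccard overlap of character trigrams between the query and a text description of each file. It is a standard-library stand-in for illustration only, not the Elasticsearch setup used in the competition entry, and the file names and descriptions are made up.

```python
def ngrams(text, n=3):
    """Set of character n-grams of the lower-cased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def search(query, documents, n=3):
    """Rank document names by Jaccard overlap of character n-grams with the query."""
    q = ngrams(query, n)
    scored = []
    for name, text in documents.items():
        d = ngrams(text, n)
        union = q | d
        score = len(q & d) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically by name.
    return [name for score, name in sorted(scored, key=lambda item: (-item[0], item[1]))]


# Hypothetical index: descriptions might come from image, audio or text analysis.
DOCUMENTS = {
    "beach.jpg": "family holiday at the beach",
    "invoice.pdf": "electricity invoice march",
    "song.mp3": "beach boys live recording",
}

if __name__ == "__main__":
    print(search("beach holiday", DOCUMENTS))
```

Matching on character n-grams rather than whole words tolerates partial words and small typos, which is the usual reason to index n-grams at all.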
    2015-03 - present

    Personal R&D for skill development

    Skills

    Machine Learning (advanced)
    a) Python: Keras, Keras-RL, TensorFlow
    b) Supervised Learning: classification problems, decision-making problems
    c) Reinforcement Learning: Q-learning, Policy Gradient learning, Deep Deterministic
       Policy Gradient (Actor-Critic) learning
    d) Genetic Algorithms
    e) Attention-Based Neural Networks
    f) GANs
    g) Unsupervised Learning
    h) NLP
    i) RNN
    j) LSTM

    Data Analytics (advanced)
    a) Infrastructure: Spark/Mesos, Elasticsearch, MongoDB, scalable REST APIs using Falcon
    b) Algorithms/Techniques: Pandas/NumPy/SciPy/scikit-learn etc., web scraping, Natural
       Language Processing (NLP), image analytics using OpenCV, audio analytics using Sony
       Gracenote, document/text/emotion analytics using IBM Watson, density distribution,
       timeline distribution, predictive analysis, decision tree classifiers

    Storage Application Development and Sustenance (advanced)
    a) Enterprise and consumer grade
    b) SMI-S, SNIA, CIM, OpenSSL, OpenSLP
    c) SCSI, SATA, NVMe, PCI protocol development stack
    d) RAID and initiator/target storage controllers

    System Application Development on Windows, Linux and VMware (advanced)
    a) C/C++
    b) RESTful applications
    c) Python
    d) Android
    e) Java
    f) NodeJS/Electron

    Self

    All the projects below are personal projects or online competitions done in private
    time, in parallel with professional work.

    Projects: Case Resolver, Buddy

    Case Resolver

    Problem statement: A company has more than 50,000 clients, with a dedicated Field
    Application Team (FAT) for each client. The FATs and the clients communicate through
    chat, email, Jira or other bug tools, and do not always write grammatically correct
    English. Moreover, the FATs often refer to the same client under different names, such
    as IBM, IBM Pvt. Ltd., IBM Corp and IBM Global. The company wants to segregate all
    communications by unique client name and to determine, from the discussions between
    the FATs and the clients, the accurate date on which each case was resolved.

  • Solved using algorithms, supervised ML, natural language processing, sentiment
    analysis and regular expressions.
  • The problem was solved, but the accuracy could still be improved.
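
The client-name part of the Case Resolver problem can be sketched with regular expressions alone: strip common corporate suffixes and punctuation, then group messages under the remaining canonical name. This is a minimal illustration under assumed rules, not the actual solution; the suffix list and the sample messages are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a small, hand-picked list of corporate suffixes to ignore.
_SUFFIXES = re.compile(
    r"\b(?:pvt|private|ltd|limited|corp|corporation|inc|global)\b\.?",
    re.IGNORECASE,
)


def canonical_client(name):
    """Reduce a client name variant to an upper-cased canonical form."""
    stripped = _SUFFIXES.sub(" ", name)
    stripped = re.sub(r"[^\w\s]", " ", stripped)  # drop leftover punctuation
    return " ".join(stripped.split()).upper()


def group_by_client(messages):
    """Group (client_name, text) pairs under the canonical client name."""
    grouped = defaultdict(list)
    for client, text in messages:
        grouped[canonical_client(client)].append(text)
    return dict(grouped)


# Hypothetical messages using the name variants from the problem statement.
MESSAGES = [
    ("IBM", "case opened"),
    ("IBM Pvt. Ltd.", "logs attached"),
    ("IBM Corp", "fix verified"),
    ("IBM Global", "case can be closed"),
]

if __name__ == "__main__":
    for client, texts in group_by_client(MESSAGES).items():
        print(client, texts)
```

A fixed suffix list is brittle: it would wrongly shorten a name in which a word like "Global" is part of the brand, which is one reason a learned or fuzzy matcher may be needed for better accuracy.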
    Buddy

  • An automatic e-shopping agent.
  • Shoppers spend a lot of time every day browsing millions of products on e-commerce
    websites, yet are never sure they made the best choice, not just with respect to price
    but also personal preferences (of which there can be many).
  • This ML agent addresses the problem using algorithms, web scraping, genetic algorithms
    and related techniques.
  • The problem was solved, but the accuracy could still be improved.
    2016-09 - 2017-03

    Senior Firmware Developer

    Seagate

    Dot Hill

    This project covered sustenance of the Dot Hill storage array, a mid-range array
    supporting SATA and SAS drives. I worked on the array's sustenance roadmap and covered
    the US night shift, addressing high-priority customer issues by analysing firmware logs
    and providing recovery or code-fix action plans. I worked across multiple modules of the
    product: Platform, Backend, RAID, Cache, Paged Storage, Snapshot and Tiering.

    2015-06 - 2016-09

    Senior Staff

    Broadcom

    Management Stack of 1G, 10G and 50G Network Controllers

    The project involved developing and maintaining the software management stack for
    Broadcom's high-end network controllers. I acted as a prime engineer in designing and
    implementing the entire stack, and have complete knowledge of it, from the CLI and CIM
    layers down to the OS library.

    2013-03 - 2015-06

    Staff Engineer

    SanDisk

    Projects related to storage protocols such as SATA, SCSI and NVMe.

    2008-09 - 2013-03

    Senior Software Engineer

    Wipro

    Projects related to storage protocols and technologies such as SATA, SCSI, NVMe and RAID.

    Web Application Development (hands on)
    a) AngularJS 4+
    b) JavaScript/jQuery/TypeScript/HTML/CSS

    Search Engine Development (average)

    Android Application Development in Java (hands on)

    Education

    2004-05 - 2008-05

    Kalinga Institute of Industrial Technology (KIIT) University

  • Bachelor of Technology in Computer Science
  • CGPA of 8.77
    Skills

    Artificial Intelligence, Big Data, Data Science, Elasticsearch, Machine Learning, Natural Language Processing
