The Senior Data Engineer is responsible for maintaining a unified high-volume data pipeline which collects and centralizes business data across the organization. Inbound data streams include core application metrics, sales records, and customer service data.
You will be responsible for integrating additional application and CS data streams, as well as enabling data scientists to respond to metrics requests from the business team.
What you will do:
- Maintain and extend scripts for the databases that hold marketing, business, and operations data.
- Manage Salesforce & Zendesk reporting data.
- Manage and direct all application and product user data.
- Manage the data stream from app users to centralize and grow the music training data pipeline.
- Opportunity to work with A.I. training data.
- All the data!
What you will need to do it:
- 4+ years of experience
- Bachelor's or Master's in Computer Science or a related field
- You will maintain and extend an existing pipeline that includes the following technologies:
  - Apache Spark jobs (PySpark and Scala)
  - Hadoop/AWS EMR (Java)
  - Redshift
  - Reporting dataflows making heavy use of AWS Data Pipeline, Python, and Highcharts
  - Docker with Kubernetes for job orchestration