Sr. Data Engineer
Our Sr. Data Engineers work with the data scientists and developers on our Engineering team to drive insights from our large datasets. We consume many terabytes of data that pose challenging problems such as classification, feature selection, ETL, and matching and association puzzles. As part of our core data team, this role is an essential partner in building the data pipelines and transformation algorithms that support our predictive analytics, production data resolution, and the insights and analytics we draw from our vast property, consumer, and geo-data.
You are a go-getter: someone who wants to develop code and break down interesting data problems, and who has worked with large datasets. You know how to hack together scripts or develop automated data pipelines.
We prefer candidates in the Washington, DC metro area but will also consider strong remote candidates.
Required:
- Undergraduate degree in computer science, mathematics, statistics, or the hard sciences, or relevant experience
- Experience working with pandas DataFrames (e.g., understanding of when to use .loc; columnar data manipulation; broadcast functions; how to use pandas.merge to perform joins)
- Solid Python development skills
- Ability to write and apply data manipulation functions in Python/pandas/NumPy
- Experience working with large datasets and with unstructured data
- Ability to quickly find, use, and integrate relevant Python packages
- Database ingestion experience (e.g., loading CSVs and basic ETL)
- Ability to write code against APIs (e.g., for querying datasets)
- Strong mathematical and statistical skills
Desirable:
- Experience with geospatial data and tools
- Some C# experience
- Experience with Spark, Dask, or other frameworks for working with large datasets
- Experience with AWS tools and libraries (e.g., Lambda, or using boto to work with S3)