We're looking for an experienced distributed systems/big data software engineer to join our team. You'll take ownership of our crawl and data pipeline, architecting and implementing new solutions, keeping them running with high uptime and scaling them up to handle load.
Experience writing high-quality software is a must, and experience with distributed systems and big data is a huge plus - we're looking for someone with strong intuition for what good architecture looks like.
Our current tech stack is mostly Python running on AWS (EC2, S3, EMR, ELB, Kinesis, etc.), with Redis, Apache Spark, and a bit of MongoDB. We have a large distributed infrastructure consisting of hundreds of systems: crawlers, data processing clusters, database servers, API servers, and more. If you haven't worked with some of these pieces before, that's fine - we like working with smart people who are comfortable in a distributed systems/big data/cloud environment, and we're happy to help you get up to speed with anything you haven't seen before.
We write tests and practice peer code review, pair programming, and continuous deployment across all our code and systems. We love blue-green deployments and anything immutable, and we push hard toward automation.