Creating new solutions based on data
Developing and maintaining ML models
Preparing reports and visualizations for the Product Team
Building ETL pipelines
Experience in Python
Data Science
the ability to present the results transparently and clearly
focus on delivering business value
knowledge of A/B tests
SQL
experience with both relational and NoSQL databases
ability to write complex SELECT queries
preferably MySQL and DynamoDB
Machine Learning
knowledge of ML algorithms for recommendations, classification, regression, forecasting and clustering, and of when to use a given model
Practical knowledge of Git
Experience in deploying and maintaining ML models in production
AWS (Athena, S3, SQS, Lambda, RDS, DynamoDB, QuickSight)
Java
Basic knowledge of data processing pipelines
PySpark, EMR, Hadoop
Some experience with web development in the Python ecosystem, e.g. Flask
Deep Learning experience with TensorFlow, PyTorch
Exposure to a Continuous Integration/Continuous Deployment (CI/CD) environment
DevOps - you build it, you run it
Small, tightly-knit groups of very skilled people
Code Reviews
Directly Responsible Individual
Pair programming
Paying back technical debt whenever you can
1:1s
Blameless postmortems
Seeking mastery. We read books, attend conferences and meetups. We have a library of books. We study alone and in groups. The company has a budget to support us. We do this because it is our passion.
Curiosity. If we use something, we want to know exactly how it works: what the constraints are and when it fails.
Direct, honest and timely feedback. This is how we improve.
Autonomy. We support each other and we actively avoid micromanagement.
Being a good human. We don’t tolerate jerks, no matter how brilliant they are.
Scalability - we ingest “tons” of data that must be processed in near real time. Traffic patterns change constantly and we have to adapt dynamically. We rely heavily on horizontal partitioning and auto-scaling.
Reliability - uptime, latencies, queue processing delays - we live and breathe by these metrics. We assume machines, disks, network and software will fail. Our approach is resilience engineering and automation.
AWS (25+ services), Java 11, Git, JIRA, Confluence, WebStorm, PhpStorm, IntelliJ, Redis, Hadoop, Spark