- 1+ years of experience as a Data Engineer or in a similar role
- Experience with data modeling, data warehousing, and building ETL pipelines
- Experience with SQL
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Business, or a related field
- 2+ years of coding experience with Python
- Knowledge of software engineering best practices for the full development life cycle (e.g., coding standards, code reviews, source control management, build processes, and testing)
- Highly adaptable and creative, with the ability to thrive in a fast-paced work environment
Amazon Connections is changing the way we use the voice of the employee to improve business outcomes and the employee experience of an incredibly complex, global workforce. We ask employees quick questions every day to get early signals and learn more about their experiences, which enables early interventions and positive changes with internal business partners around the world. The mission of the Connections Research team is to support Amazon in being the Earth’s Best Employer. We accomplish this mission by delivering high-quality questions and research-backed insights that help our line managers and leadership take action and make data-driven decisions that create a safe, productive, diverse, and inclusive work environment.
We are looking for an experienced data engineer to own the machine learning pipelines powering Connections. This role requires close collaboration with scientists, software developers, and data engineers from other teams. In this unique data engineering role, in addition to the typical model input and output data pipelines, you will also be in charge of dependency management for scientific model packages.
You will be responsible for:
- Model input and output data pipelines
- Standardizing and optimizing queries
- Modifying existing code and queries to handle upstream data changes
- Dependency management of scientific model packages (Python)
- Collaborating with software development teams and scientists to deploy new models
- Building a feature repository and a model health monitoring framework for machine learning models
- Ability to drive data engineering best practices (e.g., Data Discovery, Naming Conventions, Operational Excellence, and Data Security) and set standards
- Familiarity with scientific libraries (sklearn, numpy, tensorflow, nltk, etc.) and package management systems in Python
- Experience working with scientists and machine learning models
- Experience using AWS products (S3, Athena, SageMaker, EC2, Lambda, Batch, CloudWatch) to build data solutions
- Experience using Docker
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, visit https://www.amazon.jobs/en/disability/us