- Bachelor's degree in computer science, engineering, mathematics, or a related technical discipline
- 4+ years of industry experience in software development, data engineering, business intelligence, data science, or related field with a track record of manipulating, processing, and extracting value from large datasets
- Demonstrated strength in data modeling, ETL development, and data warehousing
- Experience using big data technologies (Hadoop, Hive, HBase, Spark, etc.)
- Knowledge of data management fundamentals and data storage principles
- Knowledge of distributed systems as they pertain to data storage and computing
Do you want to join an innovative team of engineers and scientists who use vast amounts of data combined with machine learning to help Amazon provide the best customer experience by automatically mitigating risk and providing support solutions? Are you excited by the prospect of analyzing and modeling terabytes of data and creating state-of-the-art algorithms to solve real-world problems? Do you like to build end-to-end business solutions and directly impact the profitability of the company? Do you like to innovate and simplify tasks and processes? If so, you may be a great fit for the Machine Learning Accelerator team in the Amazon Customer Trust and Partner Support group.
Our mission in the Machine Learning Accelerator team is to develop strategic solutions that make Amazon.com the safest place to transact online, with the highest quality products and brands, and the easiest access for sellers.
As a Data Engineer in the Machine Learning Accelerator team, you will work with scientists and engineers to develop scalable and innovative solutions that process terabytes of real-time unstructured data and enable us to gain insights. You will be responsible for designing systems that leverage cutting-edge AWS, proprietary, and open-source technologies for managing big data, monitoring for anomaly detection, and providing solutions for Amazon customers.
- Design, implement and support an analytical data infrastructure providing ad-hoc access to large datasets and computing power.
- Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL and AWS big data technologies.
- Create and support real-time data pipelines built on AWS technologies including EMR, Glue, Kinesis, Redshift/Spectrum and Athena.
- Continually research the latest big data, Elasticsearch and visualization technologies to provide new capabilities and increase efficiency.
- Collaborate with other tech teams to implement advanced analytics algorithms that exploit our rich datasets for statistical analysis, prediction, clustering and machine learning.
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, visit https://www.amazon.jobs/en/disability/us .
- Experience working with AWS big data technologies (Redshift, S3, EMR)
- Proven success in communicating with users, other technical teams, and senior management to collect requirements and to describe data modeling decisions and data engineering strategy
- Experience providing technical leadership and mentoring other engineers on data engineering best practices
- Familiarity with statistical models and data mining algorithms
- Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
- Master's degree in computer science, mathematics, statistics, economics, or another quantitative field