When you are part of the team at Thermo Fisher Scientific, you’ll do important work, like helping customers find cures for cancer, protecting the environment, or making sure our food is safe. Your work will have real-world impact, and you’ll be supported in achieving your career goals.
How will you make an impact?
Thermo Fisher Scientific is seeking a Data Engineer located in Carlsbad, CA to work with the Digital Marketing and Data Architecture teams to build Databricks-based data pipelines and bring data onto our enterprise-level data platform for Data Science, Analytics, and Digital Marketing needs. The data platform is primarily based on Oracle Exadata, AWS Redshift, and Databricks-based Delta technologies, transitioning toward a Lakehouse architecture to enable Data Science, Data Analytics, Customer Analytics, and Data Services for critical application and business enablement.
What will you do?
- Design, develop, test, deploy, support, and enhance data integration solutions that seamlessly connect and integrate Thermo Fisher enterprise systems with our Data Science and Enterprise Data Platform.
- Innovate on data integration in our Apache Spark-based platform to ensure technology solutions leverage cutting-edge integration capabilities.
- Facilitate requirements-gathering and process-mapping workshops; review business/functional requirement documents; and author technical design documents, testing plans, and scripts.
- Assist with implementing standard operating procedures, facilitate review sessions with functional owners and end-user representatives, and leverage technical knowledge and expertise to drive improvements.
- Define, design, and document reference architecture, and lead the implementation of BI and analytical solutions.
- Follow agile development methodologies and DevOps practices to deliver solutions and product features.
How will you get here?
- HS degree and 3-5 years of IT experience required; BS degree with a major in computer science or engineering (or equivalent) preferred.
Experience, Knowledge, Skills, Abilities
- Experience with Databricks, Data/Delta Lake, and relational databases such as Oracle, SQL Server, or AWS Redshift.
- Experience in ETL (data extraction, transformation, and load) processes.
- 3+ years working experience in data integration and pipeline development.
- Extensive experience with Databricks and Apache Spark.
- Data lake and Delta Lake experience with AWS Glue and Athena.
- 2+ years of experience with AWS cloud data integration across the Apache Spark, Glue, Kafka, Elasticsearch, Lambda, S3, Redshift, RDS, and MongoDB/DynamoDB ecosystems.
- Strong hands-on experience in Python development, especially PySpark in an AWS cloud environment.
- Ability to design, develop, test, deploy, maintain, and improve data integration pipelines.
- Experience with Python and common Python libraries.
- Strong analytical experience with databases, including writing complex queries, query optimization, debugging, user-defined functions, views, and indexes.
- Strong experience with source control systems such as Git, and with build and continuous integration tools such as Jenkins.
- Highly self-driven and execution-focused, with a willingness to "do what it takes" to deliver results, as you will be expected to rapidly cover a considerable volume of data integration demands.
- Understanding of development methodology and hands-on experience writing functional and technical design specifications.
- Excellent verbal and written communication skills, in person, by telephone, and with large teams.
- Strong prior technical development background in either Data Services or Engineering.
- Demonstrated experience resolving complex data integration problems.
- Must be able to work cross-functionally. Above all else, must be equal parts data-driven and results-driven.