Data Engineer - Machine Learning
Requisition #: 241343
Location: Johns Hopkins Health Care, Hanover, MD 21076
Category: Non-Clinical Professional
Work Shift: Day Shift
Work Week: Full Time (40 hours)
Weekend Work Required: No
Date Posted: June 26, 2020
Johns Hopkins HealthCare (JHHC) is the managed care and health services business of Johns Hopkins Medicine, one of the premier health delivery, academic, and research institutions in the United States. JHHC is a $2.5B business serving over 400,000 lives with lines of business in Medicaid, Medicare, commercial, military health, health solutions, and venture investments. JHHC has become a leader in provider-sponsored health plans and is poised for future growth.
Many organizations talk about transforming the future of healthcare; Johns Hopkins HealthCare is actually doing it. We develop innovative, analytics-driven health programs in collaboration with provider partners to drive improved quality and better health outcomes for the members and communities we serve. If you are interested in improving how healthcare is delivered, join the JHHC team.
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for implementing and maintaining products built on machine learning models, developing a SQL library to optimize our data pipeline, and building data cubes for the analytics team. The Data Engineer will support the team on data initiatives and data governance. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The candidate must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.
Responsibilities for Data Engineer
- Identify, design, and implement internal process improvements: deploying and maintaining machine learning models, automating manual processes, optimizing data delivery, etc.
- Build a SQL library for optimal extraction, transformation, and loading of data.
- Create and maintain optimal data pipeline architecture.
- Establish data governance standards for maintaining the ETL library and analytics products.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
- 3+ years of experience in a Data Engineer role, with a graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
- Experience implementing machine learning models and deploying web UIs into production environments.
- Advanced working SQL knowledge with a variety of databases.
- Experience building processes that support data transformation, data structures, metadata, dependency, and workload management. Must have robust programming experience, including solid Python experience and adherence to software engineering best practices, as well as experience building and maintaining data pipelines and data assets.
- Strong project management and organizational skills.
- Experience with tools such as Python, ASP.NET, IIS, Hadoop, Spark, etc. Experience implementing machine learning models in consumer-facing applications. Experience working with a variety of data environments, e.g., Hadoop, HDFS, SQL, Mongo, DataBricks, ElasticSearch, etc.
- Working knowledge of NoSQL / graph databases.
- Experience as an individual contributor (hands-on developer, non-manager role) executing engineering projects as a primary job responsibility
- Demonstrated knowledge of data management best practices
- Prioritization skills; ability to manage ad-hoc requests in parallel with ongoing projects
- Healthcare or wellness specific business knowledge and/or experience with behavioral and claims data preferred
- Experience running machine learning or AI applications at scale
- Experience with data pipeline frameworks such as Airflow, Luigi or Oozie
- Experience with search engines (Elasticsearch or Solr)
- Experience with cloud-based computing (AWS or Azure)
- Experience with Scala, in particular with Spark Scala API
- Familiarity with EHR data and standards (HL7 or FHIR)
- Experience with HBase, Neo4j, or other non-relational databases
- Experience with code and process documentation
- Experience with explaining, educating, presenting and/or training non-engineers on engineering concepts and processes
- Experience with continuous integration and delivery
- Experience with ETL
Johns Hopkins Health System and its affiliates are Equal Opportunity/Affirmative Action employers. All qualified applicants will receive consideration for employment without regard to race, color, religion, sexual orientation, gender identity, sex, age, national origin, disability, protected veteran status, or any other status protected by federal, state, or local law.