Job description

About Alaffia & Our Mission

The U.S. healthcare system suffers from over $300B in improper payments each year due to fraud, waste, abuse, and processing errors. We’re on a mission to change that. We’ve assembled a team of experienced technologists and industry-leading healthcare domain experts to prevent inaccurate payments before they happen. The Alaffia team includes alumni of Amazon, Goldman Sachs, the Centers for Medicare and Medicaid Services, and other leading healthcare and financial institutions. We’re also backed by industry-leading venture capital firms!

If you want to make a major impact at the core of U.S. healthcare by implementing the latest in cutting-edge technologies, then we’d like to meet you.

Our Culture

At Alaffia, we fundamentally believe that the whole is more valuable than the sum of its parts. To that point, we believe a diverse team of individuals with varied backgrounds, ideologies, and training generates the most value. Our people are entrepreneurial by nature, problem solvers, and passionate about what they do — both inside and outside the office.

About the Role & What You’ll Be Doing

Alaffia’s core value is derived from our health insurance payments data. Building pipelines and infrastructure for processing and transforming this data is an integral part of our work, and we’re looking for a talented data engineer who loves designing and building data pipelines and services that enable us to change the world of healthcare.

We’re looking for someone who relishes the challenges of dataflow system design, implementation, and optimization and enjoys quickly iterating on new pipelines and services in a fast-paced development environment. In this role, you’ll have the opportunity to build the centerpiece of a healthcare payments platform with real social and economic impact. You’ll be making a dent in the struggles of our nation’s healthcare payments system from your first day.

Your Responsibilities

Reporting directly to our CTO, you’ll be:

  • Writing production-level code in languages such as Python and Rust for our data pipelines and adjacent services
  • Designing new services to extract value in a real-time data ingestion context
  • Architecting complex data pipelines using Apache Airflow and Kubernetes to enable human-in-the-loop machine learning models in production
  • Innovating in data engineering by leveraging modern technologies such as Apache Arrow, Postgres Foreign Data Wrappers, Deltalake, and MLFlow
  • Working closely with our backend and machine learning engineers to implement new APIs and services to power new in-app functionality
  • Collaborating with our Product team to enable frontend features empowered by a robust data pipeline

What We’re Looking For

  • 5+ years of experience
  • Experience building and deploying production data processing pipelines
  • Experience with:
    • Dataframe frameworks such as Pandas, Polars, Nushell, or Spark
    • Orchestrators such as Airflow or Kubeflow
    • At least one scripting language, preferably Python
    • PostgreSQL
    • Datalakes such as Snowflake, Deltalake, Azure Synapse, or AWS Athena/Glue
  • Other relevant experience:
    • Team development and Agile development
    • Production deployments
    • AWS, GitHub, CI/CD using GitHub Actions, the PLG stack
    • Jest, Enzyme, Pytest


Location: New York, NY (preferred)

What Else Do You Get Working With Us?

  • Fully covered Medical, Vision, and Dental benefits
  • Competitive compensation package (cash + equity)
  • Unlimited PTO
  • Work in a flat organizational structure — direct access to Leadership
