Job description

About Alaffia & Our Mission

The U.S. healthcare system suffers from over $300B in improper payments each year due to fraud, waste, abuse, and processing errors. We’re on a mission to change that. We’ve assembled a team of experienced technologists and industry-leading healthcare domain experts to prevent inaccurate payments before they happen. The Alaffia team includes alumni of Amazon, Goldman Sachs, the Centers for Medicare and Medicaid Services, and other leading healthcare and financial institutions. We’re also backed by industry-leading venture capital firms!

If you want to make a major impact at the core of U.S. healthcare by implementing the latest in cutting-edge technologies, then we’d like to meet you.

Our Culture

At Alaffia, we fundamentally believe that the whole is more valuable than the sum of its parts. To that point, we believe a diverse team of individuals with varied backgrounds, ideologies, and training generates the most value. Our people are entrepreneurial by nature, problem solvers, and passionate about what they do — both inside and outside the office.

About the Role & What You’ll Be Doing

Alaffia’s core value is derived from our health insurance payments data. Building pipelines and infrastructure for processing and transforming this data is an integral part of our work, and we’re looking for a talented data engineer who loves designing and building data pipelines and services that enable us to change the world of healthcare.

We’re looking for someone who relishes the challenges of dataflow system design, implementation, and optimization and enjoys quickly iterating on new pipelines and services in a fast-paced development environment. In this role, you’ll have the opportunity to build the centerpiece of a healthcare payments platform with real social and economic impact. You’ll be making a dent in the struggles of our nation’s healthcare payments system from your first day.

Your Responsibilities

Reporting directly to our CTO, you’ll be:

  • Writing production-level code in languages such as Python and Rust for our data pipelines and adjacent services
  • Designing new services to extract value in a real-time data ingestion context
  • Architecting complex data pipelines using Apache Airflow and Kubernetes to enable human-in-the-loop machine learning models in production
  • Innovating in data engineering by leveraging modern technologies such as Apache Arrow, Postgres Foreign Data Wrappers, Deltalake, and MLFlow
  • Working closely with our backend and machine learning engineers to implement new APIs and services to power new in-app functionality
  • Collaborating with our Product team to enable frontend features empowered by a robust data pipeline

What We’re Looking For

  • 5+ years of experience
  • Experience building and deploying production data processing pipelines
  • Experience with:
    • Dataframe frameworks such as Pandas, Polars, Nushell, or Spark
    • Orchestrators such as Airflow or Kubeflow
    • At least one scripting language, preferably Python
    • PostgreSQL
    • Datalakes such as Snowflake, Deltalake, Azure Synapse, or AWS Athena/Glue
  • Other relevant experience:
    • Team development and Agile development
    • Production deployments
    • AWS, GitHub, CI/CD using GitHub Actions, the PLG stack
    • Jest, Enzyme, Pytest


Location: New York, NY (preferred)

What Else Do You Get Working With Us?

  • Fully covered Medical, Vision, and Dental benefits
  • Competitive compensation package (cash + equity)
  • Unlimited PTO
  • Work in a flat organizational structure — direct access to Leadership
