Lead Data Engineer (Series C Startup)

Location: San Francisco, CA

*** Mention DataYoshi when applying ***

Who is Recruiting from Scratch:

Recruiting from Scratch is a premier talent firm that focuses on placing the best product managers, software, and hardware talent at innovative companies. Our team is 100% remote and we work with teams across the United States to help them hire. We work with companies funded by the best investors including Sequoia Capital, Lightspeed Ventures, Tiger Global Management, A16Z, Accel, DFJ, and more.
If you are a fit, the team will reach out to you about this role or any others that may be a fit for our clients.

Our Client

The biggest bottleneck in bringing new treatments to patients is the clinical trial. On average, getting a drug through the trial process takes nearly a decade and frequently costs $1B+. And the problem is only getting worse.

Our client is a new healthcare company that owns the end-to-end drug development process. Their proprietary technology allows them to integrate and improve clinical research for patients, providers, and sponsors, while executing clinical trials faster and cheaper.

This role requires that the hire be able to work from the HQ office in New York City or Boston, with remote flexibility through 2022.

The Role

As a Lead Data Engineer you will be responsible for our client's clinical data platform. You will lead the engineering effort to ingest millions of Electronic Health Records, clean and structure this data for analytical and product use cases, and identify patients that will be served by a clinical trial. You will partner with the Data, Product, and Medical teams to set and achieve targets for data quality, and build a learning feedback loop to move the needle over time. You will evolve our data infrastructure to meet growing operational and data complexity and scale. You will become a domain expert in clinical data and its application to products and operations across the company. As a founding member of the Clinical Data team, you will play a significant role in developing the team’s culture and strategy. Ultimately, you will leverage data to bring treatments to patients who may not have had access otherwise.


  • Build and maintain pipelines to clean and structure complicated health data
  • Evolve infrastructure and data architecture to accommodate product needs
  • Partner with Data Analysts to assess the quality of our data and automate targeted improvements
  • Implement data privacy and security as necessary, for example by implementing de-identification of Personally Identifiable Information
  • Create tools to continuously monitor, test, and optimize our clinical data pipeline to ensure timely delivery and high quality
  • Collaborate with operational and product partners to achieve business and mission outcomes
  • Partner with our Data team to maintain and scale data warehousing and analytics as necessary (Redshift, DBT)
  • Help enforce best practices and promote testability and maintainability throughout our systems and codebase

What We’re Looking For

  • Minimum 4 years of professional software development experience
  • Professional experience building and maintaining data pipelines (e.g. Airflow, Prefect, or Luigi)
  • Fluency in SQL and at least one other programming language
  • Strong knowledge of data modeling
  • Experience architecting data systems
  • Comfortable with Linux, Docker, and cloud technologies
  • Excellent problem-solving and debugging skills
  • Strong communication skills with the ability to convey complicated systems to both technical and non-technical audiences
  • B.S. in Computer Science or related field, or equivalent experience

Nice to Have

  • Experience building cross-functional feedback loops
  • Experience with infrastructure-as-code tools (Ansible, Terraform, etc.)
  • Experience performance tuning row-based (e.g. PostgreSQL) and columnar (e.g. Redshift) data stores
  • Experience working with healthcare data (Electronic Health Records, Insurance Claims, etc.)

