Data Engineer

Location: New York, NY 10003

*** Mention DataYoshi when applying ***

Sema4 is a patient-centered health intelligence company founded on the idea that more information, deeper analysis, and increased engagement will improve the diagnosis, treatment, and prevention of disease. Sema4 is dedicated to transforming healthcare by building dynamic models of human health and defining optimal, individualized health trajectories, starting in the areas of reproductive health and oncology. Centrellis™, our innovative health intelligence platform, is enabling us to generate a more complete understanding of disease and wellness and to provide science-driven solutions to the most pressing medical needs. Sema4 believes that patients should be treated as partners, and that data should be shared for the benefit of all.

Our Engineering team seeks a talented Data Engineer to help design, build, and maintain the data pipelines that power Sema4’s advanced analytics tool, Centrellis. This health intelligence platform analyzes and interprets extensive information about known inherited diseases, including related mutations, frequency across populations, and the penetrance and expressivity of genetic changes.


  • Instrument and monitor data stores for performance and latency
  • Optimize data structures, schemas, indices, and storage engines
  • Establish and maintain data security protocols
  • Optimize data storage costs
  • Develop and maintain ETL processes
  • Create quality metrics and monitoring tools to ensure high-fidelity data
  • Participate in database schema design with our bioinformatics team
  • Work within a software engineering team to deliver data products to market
  • Take ownership of what we’re building and participate in technology stack decisions
  • Perform and report on ad-hoc data analyses
  • Administer databases, including backups, performance tuning, and load balancing


  • Bachelor’s Degree required – Advanced Degree a plus
  • 3+ years’ experience designing and building data pipelines
  • Proven track record of data engineering in support of shipped, real-world software products
  • Ability to write high-quality, well-tested, production-grade code
  • Strong work ethic
  • Strong knowledge of databases (relational and non-relational)
  • Proficiency in SQL and Python
  • Proficiency in SDLC tooling, including Git, CI/CD, and deployment automation
  • Strong communication skills, both written and verbal
