Data Engineer

Location: Somerville, MA 02143

*** Mention DataYoshi when applying ***

Company Overview

Imagine if we could better match patients with the treatments that prove the most effective for them...

GNS Healthcare applies a powerful form of AI called causal machine learning to predict which treatments will work for which patients, accelerating the clinical development of new drugs, and improving individual patient outcomes while reducing the total cost of care.

GNS Healthcare is headquartered in the biotechnology center of Somerville, Massachusetts. Our patented REFS™ technology builds on recent breakthroughs in causal machine learning and AI to transform massive quantities of patient data into in silico patients that simulate clinical trials. These computer models are simulated to discover new drugs for serious diseases and to power solutions that biopharmaceutical companies use to slow disease progression and improve therapeutic effectiveness. Our platforms and solutions have been validated across oncology, immunology, cardiovascular disease, metabolic disease, and neurology, among other areas, and have appeared in over 50 peer-reviewed publications.


Responsibilities

  • Assists team members with the design and development of RWD analyses and predictive models.
  • Designs and builds consistent, reproducible, and testable ETL pipelines to ingest, normalize, and store data from large healthcare datasets, ranging from clinical trials to claims and EHRs.
  • Supports projects including epidemiology, health outcomes, and other observational studies to better understand disease natural history, prevalence, comorbidities, treatment patterns, and health and safety outcomes in real-world patient populations.
  • Functions as a healthcare data subject matter expert to support the design, development, testing, implementation, and support of clinical information and intelligence solutions.
  • Provides complete documentation and communication of all processes, methods, and results.
  • Supports production solutions and the ongoing updates and maintenance of our reference data sources.


Required Skills

  • 3-5 years of data engineering experience with a thorough understanding of data lake architectures.
  • Expertise in cloud data warehousing tools (e.g. Snowflake, Amazon Redshift, BigQuery, Microsoft SQL Server, Oracle, PostgreSQL, or equivalent) and ELT tools (e.g. Stitch, Fivetran, dbt, Glue). You thrive on building modern, cloud-native data pipelines and operations.
  • Fluency in SQL scripting
  • Experience working in healthcare, life sciences, and/or with diverse healthcare data sets (e.g. medical, pharmacy claims, and lab results)
  • Experience with industry-standard measures and code sets (e.g. ICD-10 codes, ICD-9 codes, HCPCS codes, CPT codes, HEDIS metrics, ETGs, HCCs, DRGs) and publicly available sources (e.g. HCUP)
  • You thrive on mapping and designing ingestion and transformation of data from multiple sources, creating a cohesive data asset and Common Data Model (CDM)
  • Experience using a programming or scripting language (e.g. Python, R, Java, Scala, C++, C#, or Bash/PowerShell) to automate processes and support proprietary software. Machine learning, R, and Python skills highly preferred.
  • Experience using Git/Bitbucket and working on shared code repositories
  • Experience using Tableau or other in-app data visualization platforms

Nice to Have Skills

  • Background in statistics, biostatistics, public health, research design, health economics, or other related quantitative healthcare field.
  • Experience with backend web (API) development, Kubernetes, Docker, and Tableau
  • Experience developing data frames
  • Experience with data processing frameworks such as Apache Spark, Beam, Dataflow, Crunch, Scalding, Storm, Hive, and BigQuery
  • Experience extracting data from REST APIs and processing large datasets in parallel

Company Culture

Our philosophy at GNS is simple: we cannot transform biomedicine with anything less than an all-star team. We are seeking smart, driven people who are experts in their field and who have a track record of success and a passion for creating change. We believe that strong teams supercharge the performance of individuals, create a fun and dynamic workplace, and deliver great results for our clients and the people they serve.

We are passionate about our work and believe in the ability of our technology to change the world. Our core values of integrity, collaboration, value, diversity, and game-changing guide our behaviors with each other and our clients.

GNS offers competitive salaries, stock options, unlimited vacation, insurance coverage (health, dental, vision, life, and long-term disability), a 401(k), generous parental leave, tuition reimbursement, professional development, volunteering opportunities, virtual social gatherings, and more.

Equal Employment Opportunity

GNS Healthcare provides equal employment opportunities to all employees and applicants for employment without regard to race, color, national origin, religion, sexual orientation, gender, gender identity or expression, age, veteran status, disability, pregnancy or conditions related to pregnancy, or genetics. In addition to federal law requirements, GNS Healthcare complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities.


