Sema4 is a patient-centered health intelligence company founded on the idea that more information, deeper analysis, and increased engagement will improve the diagnosis, treatment, and prevention of disease. Sema4 is dedicated to transforming healthcare by building dynamic models of human health and defining optimal, individualized health trajectories, starting in the areas of reproductive health and oncology. Centrellis™, our innovative health intelligence platform, is enabling us to generate a more complete understanding of disease and wellness and to provide science-driven solutions to the most pressing medical needs. Sema4 believes that patients should be treated as partners, and that data should be shared for the benefit of all.
Our Clinical Informatics Team is seeking a Data Scientist to join our team of highly motivated and passionate scientists and engineers working to improve health. We are looking for a candidate interested in developing and implementing novel approaches in translational medicine, or applying analytic and interpretive methods to integrate a wide variety of health and genomic data and leverage it towards improving treatment and prevention.
Our goal is to use these data to improve diagnostics, identify novel treatments, and offer clinical insights into both disease and wellness. To improve patient outcomes, we aim to deliver clinical applications into practice that help clinicians target treatment and care to individuals’ health profiles rather than relying on a one-size-fits-all model. We are developing innovative tools that address a range of needs prioritized by our physician partners, using a vast repository of health and genomic data. The right candidate will work in a multidisciplinary environment with other scientists on the Clinical Informatics and Bioinformatics R&D teams, as well as with the Product and Business Development teams.
RESPONSIBILITIES
- Develop and apply novel computational methods for disease subtype stratification and digital phenotyping algorithms. Collaborate with team members and other scientists in a multidisciplinary manner to ensure digital phenotypes can be adopted by other hospitals or medical centers.
- Develop and apply Natural Language Processing (NLP) tools to process free-text clinical notes for improving disease prognosis and diagnosis.
- Build machine learning and deep learning pipelines to assess risk for a number of diseases using integrated big data from various data sources, including genetics, genomics, electronic medical records, social, behavioral, and environmental information, wearable data, and medical imaging data.
- Identify novel indications or side effects for medications prescribed to patients in certain settings, in order to recommend strategies to improve standard of care.
- Work with the Bioinformatics and Business Development teams, in collaboration with genetic labs and pharmaceutical or insurance partners, to use aggregated patient data to assess a variety of clinical questions.
QUALIFICATIONS
- PhD in Computer Science, Computer Engineering, Statistics, Bioinformatics, Computational Biology, or a related field.
- 2+ years of postgraduate experience analyzing large data sets in healthcare/biotech/pharma with advanced analytics approaches, or equivalent PhD research experience.
- Extensive experience in machine learning with a proven track record of developing and applying advanced computational techniques to solve complex disease problems.
- Proven experience in developing AI models for real-world environments and integrating ML into large-scale production applications.
- Knowledge of advanced machine learning methodologies such as deep learning architectures, and familiarity with at least one DL library (e.g., TensorFlow, Theano, PyTorch, Caffe)
- Experience with NLP and text mining
- Highly proficient in programming and scripting in at least one language (R, Python, etc.)
- Experience with SQL and Oracle databases
- Experience working with high-performance computing clusters or cloud platforms such as AWS, especially ones designed for AI/machine learning applications
- Familiarity with Common Data Models, such as OMOP CDM from OHDSI, is an advantage
- Excellent written and oral communication skills and ability to build strong relationships
- Ability to handle multiple competing priorities in a fast-paced environment