Join the Spark Team
We were born of innovation, springing from the curiosity, imagination and dedication of remarkable scientists and healthcare visionaries. Our shared mission is to challenge the inevitability of genetic disease by discovering, developing, and delivering treatments in ways unimaginable – until now.
We don’t follow footsteps. We create the path.
The Bioinformatics Group within the Data Science Organization at Spark Therapeutics is seeking an engaged and passionate Sr. Data Scientist to participate in and support projects involving high-dimensional and complex data, such as mass spectrometry or imaging data, across the Technology & Research Organizations. Projects may also require knowledge of computational protein modeling and structural prediction, and of the associated domain tools and resources. Key areas of focus include exploratory analytics, process and method development, and automation. The successful candidate will be responsible for:
- Analysis and interpretation of high-dimensional data, including genomic, imaging, and proteomic datasets
- Engaging in cross-functional discussions, providing conceptual input in experimental and study design and serving as subject matter expert in bioinformatics
- Developing custom bioinformatics tools and machine learning models
- Building and validating data pipelines using a combination of open source and in-house developed tools for various data types and studies
- Summarizing, visualizing, and presenting analyses and findings to key stakeholders
- Supporting evaluation and writing of study reports, scientific presentations, and SOPs
- Working with the rest of the bioinformatics group and data science organization to build infrastructure (e.g., data capture and analysis software, SOPs)
% of Time: Job Function and Description
50%: Bioinformatics analysis, interpretation, and communication of biological data to support various technology development projects
30%: Develop in-house computational tools and machine learning models to support analysis of high-throughput datasets
15%: Generate technical reports, prepare presentation slides, and generate novel concepts
5%: Training, lab meetings, and administrative work
Education and Experience Requirements
- Ph.D. in Computer Science, Computational Biology, Bioinformatics, or a related discipline
- Minimum of 3 years of post-graduate experience in genomics and bioinformatics
- Extensive experience in processing and analysis of large and complex data, e.g., imaging data, mass spectrometry (and other proteomics) data, and sequencing data
- Demonstrable track record in the core competency areas: high-dimensional data analysis, applying machine learning (ML) to biological datasets, and data visualization
Key Skills, Abilities, and Competencies
- Proficiency in programming using one or more common data science languages such as Python, R, SPARQL, or SQL
- Experience using bioinformatics workflow technologies such as WDL, CWL, Cromwell, Docker
- Familiarity with commonly used bioinformatics tools such as BWA, Samtools, BLAST, GATK suite
- Strong familiarity with core concepts in molecular biology and related lab technologies
- Familiarity with ML libraries such as scikit-learn, TensorFlow, and Keras
- Track record of following best practices in coding, version control (Git), code documentation, and reproducible research
- Proven ability to work both independently and in a collaborative group setting
- Self-motivated to learn and develop new methodologies, manage multiple analysis pipelines simultaneously, keep accurate records, follow instructions, and comply with company policies
- Excellent communication skills (both oral and written)
- Experience with AAV vectors and gene therapy is preferred, but not required
- Experience managing external collaborations is a plus
Please be aware that Spark mandates COVID-19 vaccination of all employees regardless of work location. Accommodations may be made in accordance with applicable law.