Scientific Data Engineer

Location: Seattle, WA 98109

*** Mention DataYoshi when applying ***

The mission of the Allen Institute is to unlock the complexities of bioscience and advance our knowledge to improve human health. Using an open science, multi-scale, team-oriented approach, the Allen Institute focuses on accelerating foundational research, developing standards and models, and cultivating new ideas to make a broad, transformational impact on science.

The mission of the Allen Institute for Immunology is to advance the fundamental understanding of human immunology through the study of immune health and diseases where excessive or impaired immune responses drive pathological processes. The Institute will employ a multi-disciplinary team approach in collaboration with academic centers of human immunology to generate novel mechanistic insights into the immune synapse in health and in diseases such as autoimmunity or oncology. The Institute will simultaneously provide a foundational data set and tools for future immunological research as well as a novel collaboration portal for the broader scientific community.

The Allen Institute for Immunology is seeking a Bioinformatics Data Engineer (Data Scientist) with broad experience in developing code and scripts to automate the analysis of omics data, especially next-generation sequencing (NGS) data, to join our Informatics and Computational Biology team.

You will be part of a multidisciplinary team and will be responsible for (i) developing and implementing data processing and analysis software as needed, (ii) assisting in both pipeline and exploratory analysis of data from diverse assays and sample types, and (iii) working towards visualizations and reports for internal and external dissemination. As such, ideal candidates should have a good understanding of sequencing technologies and a proven track record of developing analytical software packages. This role includes analysis and integration of “big data” types and close collaboration with the software development team for deployment on our interactive cloud environment to ensure user accessibility and generation of actionable insights. You will also support technology development projects in collaboration with the Molecular Biology and Immunology teams.

Good judgment and problem-solving skills are required for recognizing anomalous data, identifying and fixing code bugs, and participating in data-driven algorithm design and improvement. A successful candidate will have demonstrated success in big data science, code optimization, and deployment. The Bioinformatics Data Engineer must have excellent attention to detail and the eagerness to work in a team-science, deadline-driven atmosphere.

Essential Functions

  • Design and develop software programs to optimize scRNA-seq, scATAC-seq & CITE-seq processing pipelines and analysis algorithms, including PCA and other dimensionality reduction methods
  • Deploy automated pipelines in our interactive cloud environment with graphical user interface to facilitate user accessibility
  • Publish codebase or software as part of high impact publications or releases
  • Integrate multiple data streams for “Big Data” analysis (examples include scRNA-seq, scATAC-seq, flow cytometry, WGS)
  • Generate interactive data visualizations and work with end users to identify actionable insights
  • Exploratory data mining
  • Meet production deadlines for data analysis and be able to pivot between multiple projects

Required Education and Experience

  • Bachelor's degree in a big data computational field (e.g., Bioinformatics, Computer Science, Biostatistics, Physics, Mathematics) with a minimum of 2 years of experience in analyzing omics data.
  • Demonstrated success in a multidisciplinary team environment.
  • Good understanding of sequencing technologies, data processing and integrative analysis
  • Fluency in Java, Python, R and Unix shell scripting.
  • Experience in Big Data analysis, code optimization & parallel programming. Proven experience with big data analysis technologies and languages such as Apache Spark, BigTable, Scala or Rust.
  • Good knowledge of version control systems such as Git
  • Strong organizational, teamwork, and communication skills
  • Attention to detail, and good problem-solving skills

Preferred Education and Experience

  • Master's or PhD in Bioinformatics/Computational Biology or similar
  • Familiarity with immunology
  • Understanding of Flow Cytometry and CyTOF analysis a plus
  • Familiarity with cloud computing
  • Ability to implement, test, and share new computational tools quickly, in an iterative manner, after feedback from experimental, data production, and analysis teams
  • Excellent work ethic displayed as a reliable, self-motivated, enthusiastic team player
  • Ability to learn new programming languages and packages
  • Eager to learn new skills

Work Environment

  • Working at a computer and using a mouse for extended periods of time

Expected Hours of Work

  • May need to work outside of standard working hours at times
  • Allen Institute is a Washington State Employer. This role will be performed in Washington State. This role is currently able to work remotely, in Washington State, due to COVID-19 and our focus on employee safety. We continue to evaluate the safest options for our employees. As restrictions are lifted in relation to COVID-19, this role will return to work onsite. (Temporary to address COVID)
  • Some travel may be required

Additional Comments

  • We are open to full-time, part-time, and/or contract work for this role. When you apply, please specify which work arrangement you desire. We are flexible.

**Please note, this opportunity does sponsor work visas**

**Please note, this opportunity offers relocation assistance**

It is the policy of the Allen Institute to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, the Allen Institute will provide reasonable accommodations for qualified individuals with disabilities.
