Bioinformatics Data Scientist

Location: Charlottesville, VA 22911

*** Mention DataYoshi when applying ***

Position Purpose:

A bioinformatics data scientist is responsible for providing experimental design consulting and data analysis for large, high-throughput genomic experiments, with a focus on forensics and metagenomics. The bioinformatics data scientist will be responsible for designing and implementing annotated code for managing, manipulating, and analyzing large-scale genomic data, and for preparing thorough documentation and reporting.

Essential Duties and Responsibilities:

  • Develop tools for management, analysis, and interpretation of high-density microarray and whole-genome sequencing data.
  • Manage, manipulate, and analyze data using a combination of R, Python, and UNIX tools.
  • Use established domain-specific open-source software and tools to manipulate and analyze genomic data.
  • Implement and execute data processing workflows and automated analytic pipelines.
  • Create standardized summary tables and figures using literate programming and reproducible workflows.
  • Conduct workflow benchmarking and documentation, identifying inconsistencies and resolving data problems.
  • Prepare SOPs, document source code/workflows, and write reports to summarize computational
    requirements, processing status, and customized analysis results.

Required Knowledge, Skills & Abilities:

  • Expert proficiency working in a Unix/Linux environment.
  • Expert proficiency with R, RMarkdown, and the tidyverse tools for data analysis.
  • Advanced proficiency with open-source software, tools, and databases for analyzing next-generation sequencing data (whole-genome sequencing, RNA-seq, epigenetics, microbiome, and metagenomics).
  • Proficiency working with and developing using Docker and/or Singularity container technology.
  • Proficiency using version control software (e.g., Git or similar) to manage programming code.
  • Proficiency with Python, Perl, or another scripting language.
  • Preferred: Experience with Nextflow, Snakemake, or similar workflow/pipeline management systems.
  • Preferred: Familiarity with developing and querying relational databases.
  • Preferred: Familiarity with AWS and/or Azure cloud computing.


  • MS or PhD in Bioinformatics, Genomics, Data Science, or related field
  • Experience (5+ years with MS or 3+ years with PhD) managing and analyzing large-scale datasets produced by sequencing platforms, and delivering solutions for managing, visualizing, analyzing, and interpreting genomic data.
  • Advanced experience using Linux/Unix text processing tools, R, and other open-source tooling to manipulate and format data, assess data quality, and analyze data.


  • This position requires that the candidate be willing and able to complete a successful background screening for a security clearance. Candidates with a current security clearance will receive preference.

Supervisory Responsibilities:

  • May serve as a task/project lead.

Working Conditions/Equipment:

  • Ability to work in varying conditions, including traditional office environments with the extended sedentary periods required for code development and testing.

