MD Anderson Cancer Center

Data Scientist - Genetics

Job description


Led by Prof. Peter Van Loo, the Cancer Genomics and Evolution Laboratory aims to answer the big question: How do tumors evolve? The lab's research focuses on large-scale pan-cancer genomics to gain insight into the genes, mutational processes and evolution of cancer. Our work is highly data-driven, with a focus on large-scale data analysis to gain broad biological insight and on the development of computational methods to enable conceptually novel analyses. The research group is mostly computational with a small wet-lab component.

Since its inception, members of the Cancer Genomics laboratory have co-authored 15 papers in Nature, Science or Cell. Recent successes include pan-cancer studies of the evolutionary history of cancer (Gerstung et al., Nature 2020), intra-tumor heterogeneity (Dentro et al., Cell 2021), the mutational landscape in non-unique regions of the human genome (Tarabichi et al., Nature Biotechnology 2021), and biallelic mutations (Demeulemeester et al., Nature Genetics 2022).


Analyze cancer genomics datasets, perform pre-processing and bioinformatics analysis of bulk and single-cell sequencing data generated in-house and/or by large-scale consortia, and maintain and develop computational analysis pipelines and the lab's own bioinformatics methods.
  • Maintain and develop bulk and single-cell DNA, RNA, and epigenomic computational algorithms and data processing pipelines
  • Maintain and further develop the lab's computational methods and software packages, in collaboration with students and postdocs in the lab
  • Use bioinformatics approaches and pipelines to analyze whole-genome sequencing data, RNA sequencing data, and whole-genome bisulfite sequencing data
  • Analyze single-cell sequencing data and spatial genomics and transcriptomics profiling data, and integrate data across 'omics layers
  • Develop and maintain pipelines for bioinformatics and statistical analyses of the aforementioned data types, including raw data processing, downstream and statistical analysis, and summary of findings
  • Deploy bioinformatics pipelines in high-performance computing environments
  • Visualize data and critically interpret results
  • Maintain knowledge of the latest bioinformatics approaches and genomic technologies and implement these where appropriate
  • Present results at meetings
  • Prepare written reports, manuscripts for scientific publication, and grant applications
Expected Skills
  • Unix, R, Python, Perl, or other scripting and programming languages
  • Using high-performance computing platforms and pipelining tools (e.g. Nextflow)
  • Knowledge of bioinformatics tools used in genomics research
  • Demonstrated experience with and understanding of genomic technologies and analysis of the data they generate
  • Analyzing, summarizing, and interpreting data
  • Knowledge of statistics

Required: Bachelor's degree in Biomedical Engineering, Electrical Engineering, Computer Engineering, Physics, Applied Mathematics, Science, Engineering, Computer Science, Statistics, Computational Biology, or related field.

Preferred: Master's degree or PhD.


Required: Three years of experience in scientific software development/analysis. With Master's degree, one year of experience required. With PhD, no experience required.

It is the policy of The University of Texas MD Anderson Cancer Center to provide equal employment opportunity without regard to race, color, religion, age, national origin, sex, gender, sexual orientation, gender identity/expression, disability, protected veteran status, genetic information, or any other basis protected by institutional policy or by federal, state or local laws unless such distinction is required by law.
