Scientific Data Analyst

Job description


The EPA National Student Services Contract has an immediate opening for a full-time Scientific Data Analyst position with the Office of Research and Development at the EPA facility in Research Triangle Park, NC.

The Office of Research and Development at the EPA supports high-quality research to improve the scientific basis for decisions on national environmental issues and help EPA achieve its environmental goals. Research is conducted in a broad range of environmental areas by scientists in EPA laboratories and at universities across the country.

What the EPA project is about

The Center for Computational Toxicology and Exposure (CCTE) supports ORD by providing solutions-driven research to rapidly evaluate the potential human health and environmental risks due to exposures to environmental stressors and ensure the integrity of the freshwater environment and its capacity to support human well-being. CCTE researchers are developing and applying cutting-edge innovations in methods to rapidly evaluate chemical toxicity, transport, and exposure to people and environments. Within CCTE, the Scientific Computing & Data Curation Division (SCDCD) develops the knowledge and information architecture necessary for integrating, transforming, and managing large-scale data streams related to assessing the risk of chemicals. SCDCD creates and manages online tools and ensures compatibility with existing chemistry, toxicology, and other experimental data sources.
As a team member, you will support research under the Chemical Safety for Sustainability (CSS) research program, providing structured and computationally accessible data to support tiered toxicity testing of chemicals.

You will assist with support of data needs, including development of datasets to parametrize hazard models, assistance in the testing or evaluation of data and databases, and basic summaries and analyses of data. The work may include formatting datasets into standard templates and uploading into databases, testing and evaluating ease-of-use of software, and application of data science and machine learning techniques.

The duties of the team member will include, but are not limited to:

  • Write scripts in Python to support data extraction, transformation, loading, and migration capabilities for the ToxVal database, including identification of tasks for automation;
  • Use and monitor a code repository in Bitbucket;
  • Perform user acceptance testing for software upgrades or new features;
  • Implement data management processes including data provenance and back-up techniques;
  • Contribute to design and implementation improvements to performance, efficiency, reusability, scalability, and stability of databases, data pipelines, and ETL (Extract Transform Load) processes; and
  • Develop, generate, and review QC, technical testing, and troubleshooting logs and reports to monitor and flag data quality according to specified criteria.

Communications-related responsibilities will include:

  • Communicate verbally (in person or online) and in writing with an interdisciplinary team of developers and scientists to ensure that development work and the resulting data products meet scientific domain requirements;
  • Respond to and create Jira tickets describing work-related tasks;
  • Create data visualizations and presentations to communicate results to stakeholders;
  • Thoroughly document all work as directed by EPA mentor to comply with EPA quality assurance procedures for transparency and reproducibility of work;
  • Draft reports describing complex datasets or data manipulation procedures;
  • Present work in internal reports/memos to be used by EPA scientists; and
  • Present work at scientific conferences or contribute to scientific manuscripts, as opportunities arise.

Required Knowledge, Skills, Work Experience, and Education

  • Domain knowledge of data management and/or mining techniques;
  • Strong reading comprehension skills and experience logically interpreting pieces of information from a variety of data source types; and
  • Experience programming in Python.

Desired Knowledge, Skills, Work Experience, and Education
  • Experience programming in R;
  • Experience with databases (e.g. MySQL);
  • Experience with toxicology data; and
  • Knowledge of organic chemistry nomenclature.

This job will be located at EPA’s facility in Research Triangle Park, NC.

Salary: Selected applicant will become a temporary employee of ORAU and will receive an hourly wage of $30.76 for hours worked.

Hours: Full-time.

Travel: Travel related to the position is not anticipated.

Expected start date: The position is full-time and expected to begin September 2022. The selected applicant will become a temporary employee of ORAU working as a contractor to EPA. The project renews each May through 2025.

For more information, contact EPANSSC@orau.org. Do not contact EPA directly.

To be eligible, applicants must:

  • Be at least 18 years of age;
  • Have earned at least a Master’s degree in computer science, statistics, information technology, bioinformatics, engineering, or a related field, including intermediate training or coursework in computer programming, from an accredited university or college within the last 24 months; and
  • Be a citizen of the United States of America or a Legal Permanent Resident.

EPA ORD employees, their spouses, and children are not eligible to participate in this program.
