St. Michael's Hospital

Research Data Analyst, MAP Upstream Lab

Job description

At the Upstream Lab, our research is driven by the desire to promote health equity and address social determinants of health. We are currently looking for a permanent, full-time Research Data Analyst to join our data cluster to support research projects focused on leveraging data and technology to improve health systems.

The research data analyst's primary role will be to perform data management and data analysis tasks, including database extraction, dataset creation, database cuts, and writing and executing reproducible code for data cleaning, processing, and analysis in data science projects. They will use their specialized technical skills to support studies that involve advanced analysis such as AI/ML, geographical mapping, and ML/NLP analysis (using Python/R). The research data analyst will serve as the resource for consultation and execution on data analysis for all data science related and other quantitative research projects, and will advise on data collection tools/forms, variable lists and types, and analysis plans. They will support capacity building for the team, including training other lab members.

The research data analyst will report directly to the research program manager and work closely with the research team and collaborators. This is a hybrid position, with staff required to come into the office as needed.

Duties and Responsibilities

Data Management

  • Create and develop data management plans for grant submission and project initialization.
  • Maintain documentation on key datasets and program data requirements.
  • Develop standard operating procedures describing data analysis and error resolution processes.
  • Engage with partners/collaborators to discuss and clarify data requirements.
  • Attend meetings with clinicians, researchers, community, and leadership groups to understand project goals and deliverables.
  • Explain capabilities and limitations of datasets to researchers and other stakeholders.
  • Meet with the research community and leadership groups, as required, to provide an overview of using/interpreting statistical data.
  • Perform cross-functional and other duties as assigned and/or requested.

Data Quality Assurance

  • Perform quality control checks and data cleaning following standard operating procedures to ensure data is accurate and of high quality.
  • Analyze the causes of data errors and work with team members and external stakeholders to ensure errors/issues get resolved.

Analytic Methods

  • Identify methodological approaches and the data required to answer specific data science questions.
  • Assist with writing project protocols for data analyses, including protocols for data extraction, data cleaning, validation, and data quality checks.
  • Collaborate with team and external collaborators to develop new processes and solutions to meet their analytics needs.

Data Analysis

  • Lead data processing efforts for advanced analytics projects.
  • Pre-process raw data to prepare for analysis. This includes cleaning and merging data from multiple sources, as well as understanding overall data quality.
  • Conduct descriptive analyses to answer ad-hoc requests from team members.
  • Conduct exploratory analyses of data to assess feasibility of new data science projects using relevant software/programming language (e.g., R, Python, etc.).
  • Perform inferential statistical analysis in data analysis software (e.g., R and/or Python). Write reports summarizing the analysis.
  • Train machine learning models in Python and/or R with a focus on natural language processing.
  • Validate any output tables, listings or figures generated to ensure accuracy and reliability of analyses.
  • Write reports and contribute to research papers summarizing the analysis, tailored to different end-users (e.g. clinicians, researchers, etc.).
  • Provide technical support with any analytics required for interactive dashboards and for reports with interactive visualizations built using relevant libraries.
  • Efficiently debug analytic code.
  • Ensure that all analyses are reproducible.
  • Manage version control of code and documents using tools like git and GitHub.

The ideal candidate for this position will be highly motivated, adaptable, organized, team-minded, and results-oriented, and will have:

  • Bachelor's degree in Computer Science, Statistics/Biostatistics, Engineering, and/or related discipline required, OR demonstrable equivalent combination of specialized education and experience.
  • At least 2 years of relevant work experience and computer programming/coding experience in a relevant software/programming/scripting language (R (preferably using tidyverse), Python, or other computer applications).
  • Experience in implementing real-world ML/NLP projects.
  • Experience in healthcare is an asset.
  • Fully proficient in the use of relevant programming languages and tools (e.g., SQL, R, Python, git).
  • Experience in data management and monitoring.
  • Demonstrated ability to manage or support statistical programming activities to support research operations (dry, wet, clinical, etc.).
  • Experience merging and analyzing data coming from multiple sources (e.g., text files, databases, Excel, etc.).
  • Experience analyzing comparative studies and/or clinical trial data is an asset.
  • Experience working with and manipulating large datasets.
  • Experience in research data collection tools: proficiency with REDCap is an asset.
  • Must be detail-oriented.
  • Proficient with MS Office Software (Word, Excel, PowerPoint, Outlook, etc.).
  • Strong written and verbal communication.
  • Ability to adapt to changing priorities and workloads.
  • Ability to multi-task and prioritize.
  • Demonstrated ability to work effectively with and communicate with individuals of varying levels of statistical and computer expertise.
