Reporting to the Technical Manager, Digital Products, the Junior Data Scientist works as part of a team to analyze structured and unstructured data, model complex problems, and identify opportunities for process and product optimization by using statistical, algorithmic, mining, and visual techniques. This role assists in developing machine learning (ML) predictive and prescriptive analytics models through the innovative understanding and use of large data sets and the verification of effectiveness to improve clinical processes and patient outcomes. The Junior Data Scientist supports Providence Health Care (PHC) strategic priorities by understanding the clinical, financial, and operational issues to be solved and working closely with stakeholders, clinical and technical experts, and functional teams to leverage knowledge, interpret outputs, deploy solutions, and provide actionable insights. The role also assists in developing a solid and sustainable machine learning foundation and competency for PHC.
- Knowledge of supervised machine learning, decision trees, and logistic regression.
- Demonstrated comprehensive understanding of, and skill in using, statistical and data mining techniques such as GLM/regression, random forests, boosting, decision trees, text mining, network analysis, simulation, scenario analysis, and cluster analysis.
- Demonstrated ability to perform analytical functions and transform database structures including creating datasets and writing computer code to execute complex queries using statistical computer languages such as Python, R, and SQL.
- Demonstrated proficiency working with large volumes of data across multiple servers using distributed data/computing tools such as Hadoop, Spark, MySQL, AWS, etc.
- Demonstrated proficiency working with both relational (SQL) and non-relational databases (NoSQL).
- Demonstrated understanding of data privacy, security, and related tools such as anonymization and encryption.
- Demonstrated ability to use web services such as Redshift, S3, DigitalOcean, etc.
- Demonstrated skills in using data visualization tools (such as Jupyter, Matplotlib, D3, ggplot, Periscope, Business Objects) to visually present complex data to stakeholders for consideration.
- Demonstrated skills in knowledge synthesis and translation activities, including sorting and manipulating unstructured data from different platforms.
- Excellent oral and written communication skills and ability to clearly and fluently translate technical findings to non-technical partners and to communicate to multiple audiences using data storytelling and through graphics.
- Demonstrated ability to work collaboratively in an interdisciplinary environment and to develop recommendations using facilitation and consensus building.
- Strong analytical, critical thinking, and evaluation skills to discern and help solve the important problems facing health care, to identify new ways to leverage our data, and to focus efforts where they will have the most impact.
A Master’s degree in Mathematics, Statistics, Computer Science, Engineering, or another quantitative field is required, plus three (3) years’ experience working with large datasets and machine learning models, including experience using statistical and data mining techniques and distributed data/computing tools; writing computer code; querying databases; and using statistical computer languages; or an equivalent combination of education, training, and experience.
1. Assists in transforming data into critical information and knowledge by working as part of the digital products team and with clinical management and staff, project/program managers, and members of the health informatics team to develop and implement ML models. Uses these advanced ML models to identify patterns, trends, and opportunities to assist in making predictions or reducing workload that will have a significant impact across various clinical domains within PHC.
2. Identifies, cleans, and integrates large structured and unstructured datasets from disparate sources for use in ML models and products. Enhances data collection procedures to include information that is relevant for building advanced ML models. Provides input to applications, databases, and systems used to assess study data quality.
3. Works as part of the digital products team to use advanced ML processes to convert data from non-functional forms, such as scanned image text, to functional forms ready for use in further ML models.
4. Assists in developing predictive and prescriptive analytic models in support of the organization’s clinical and business initiatives and priorities. Works as part of the digital products team to apply advanced statistical and computational methods and innovative uses of data, collaborates with developers in the construction of analytic models, and maintains detailed project status plans to meet ML development cycle timelines and avoid development delays.
5. Reviews clinical data at aggregate levels on a regular basis using analytical reporting tools to support the identification of risks and data patterns or trends. Creates analytical reports and presentations to facilitate review and adoption of data-driven choices. Collaborates with project/program teams to address data-related questions and to recommend potential solutions.
6. Works with other members of the digital products team to assist with recommendations to management regarding strategic actions to maintain the ML development pipeline, analytic architectures, and life cycle, to avoid potential negative consequences and system failures, and to increase the positive impacts of ML systems.
7. Works closely with clinical and management teams across PHC to strategize, develop, and implement artificial intelligence (AI) products that translate into improved quality of care and clinical outcomes, reduced costs, time savings, and process improvements.
8. Identifies, engages, and collaborates with specific stakeholders as required for the development of AI products designed around PHC’s strategic priorities and clinical/business problems. Assesses and implements improvements to AI products as needed and creates anomaly detection systems to track performance and data accuracy.
9. Assists digital products team members in communicating analytic solutions to management and shares AI product status throughout the various stages of the product lifecycle.
10. Works with other members of the digital products team to support management in the development of strategies for scaling successful projects across the organization based on feedback from clinical/business clients and end-users by maintaining project and other documentation, reviewing findings, and presenting analysis and actionable insights for further discussion and decision.
11. Assists data scientists in fostering and developing a solid and sustainable machine learning foundation and competency for PHC. Assists management with the dissemination of successes and failures in an effort to increase analytics literacy and adoption across PHC.
12. Keeps up to date with the latest technology trends and methods by following state-of-the-art literature in the fields of operations research, statistical modeling, statistical process control, and mathematical optimization.
13. Performs other related duties as required.