We are looking for a Data Engineer to advance our Data Platform stack to support our growth and enable new product development in a dynamic, fast-paced startup environment.
Specific responsibilities include:
- Serve as a key stakeholder in shaping the roadmap of Infostretch’s Digital Therapeutics Data Platform.
- Build, operate, and maintain highly scalable and reliable data pipelines that collect data from wearables, partner systems, EMR systems, and third-party clinical sources.
- Enable analysis and generation of insights from structured and unstructured data.
- Build data warehouse solutions that provide end-to-end management and traceability of longitudinal patient data, and that enable and optimize internal processes and product features.
- Implement processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
- Deploy, support, and productize analytics and visualization solutions that help improve patient acquisition, increase treatment/medication adherence, raise awareness of products and services among payors, providers, and patients, and provide real-world clinical and patient-reported evidence.
- Build and develop tools that support the use of AI/ML and other analytical models to improve understanding of patient behavior, provider prescribing patterns, the patient experience on treatment, treatment patterns, and more.
- Work with product and data science teams to leverage batch and streaming data, including unstructured, IoT, image, kinematic, ePRO (Patient Reported Outcomes), and clinical sources, to understand treatment and prescribing patterns, support predictive maintenance of devices, publish real-world evidence for health economics and outcomes, and analyze patient behavior and sales effectiveness.
- Collaborate with internal stakeholders to develop business domain concepts and data modeling approaches to problems faced by the organization in the analytics arena.
- Maintain and optimize existing data platform services and capabilities, identifying potential enhancements, performance improvements, and design improvements.
- Write and maintain unit/integration tests and systems documentation.
Desired Skills and Experience
Minimum qualifications:
- Master's or bachelor's degree in Information Systems, MIS, Statistics, or a related field, or equivalent work experience.
- Extremely strong skills in at least one programming or scripting language (Java, Python, Julia, Ruby, or Go).
- Has built and deployed large-scale batch and real-time data pipelines into production using technologies such as the AWS stack, Airflow, Spark, Cloudera, Hortonworks, and H2O.
- Deep understanding of how big-data-specific algorithms work, with experience building and maintaining high-performance algorithms.
- Deep experience with the AWS big data platform and services (Redshift, Redshift Spectrum, S3, Glacier, DynamoDB, Parquet/Avro/ORC, EKS, ECS).
- Experience with one or more data analytics and visualization packages (Tableau, QuickSight, MicroStrategy).
- Strong communication skills for working with stakeholders from various backgrounds.
- Expert knowledge of scaling and tuning large-scale distributed SQL and NoSQL systems.
- Strong quantitative, analytical, process development, facilitation and organizational skills required.
- Ability to multi-task, prioritize assignments, and work well under deadlines in a changing environment as part of a cross-functional agile team.
- 5+ years of experience building and sustaining big data solutions, preferably in healthcare or another regulated industry.
- Familiarity with FDA Quality System Regulations, the Medical Device Directive, ISO 13485, ISO 14971, and IEC 62304 standards is a plus.