Job Description
Experience: 5-6 years
Roles & Responsibilities
This is a SaaS healthcare platform supporting revenue cycle management for providers and hospital systems. In this position you will actively contribute to the development of the company’s SaaS-based healthcare products and platform. We are looking for a savvy engineer to join our Data Products team. The hire will be responsible for developing our data and data pipeline architecture, and will work alongside a Data Scientist on data cleansing, exploration, organization, and building machine learning models.
Responsibilities For Data Engineer
- Create and maintain optimal data pipeline architecture for sourcing data from multiple structured and unstructured data sources.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Perform data cleaning, exploration, and statistical analysis to ensure data quality.
- Work alongside a Data Scientist to develop modeling techniques for various product capabilities.
Qualifications For Data Engineer
- Ability to solve complex problems involving data structures, irrespective of language (Java, Python, etc.).
- Ability to follow design principles, and a willingness to experiment with and learn new languages, frameworks, and tools.
- Proficiency with SQL or NoSQL – comfortable working with any database that offers advanced querying capabilities.
- Preferred (not required) - Experience building and optimizing ‘big data’ pipelines, architectures, and data sets; above all, we are looking for the ability to comprehend and work within complex design architectures.
- Knowledge of sprints and Scrum.
- Preferred - 5+ years of experience with ETL applications such as Apache Airflow, and with managed pipeline services such as AWS Glue.
- Preferred - Strong analytical skills related to working with unstructured datasets.
- Preferred - Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- Preferred - A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Preferred - Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Preferred - Experience with big data tools: Hadoop, Spark, Kafka.
- Preferred - Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
- Preferred - Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
Skills: Python, ETL, Big Data