Develop optimal engineering solutions for automating and monitoring production-grade machine learning and data-driven initiatives, supporting both batch and real-time delivery
Conduct design and code reviews with a major focus on performance, scalability and future expansion
Work closely with data scientists, business and IT teams to build platforms and frameworks that enable machine learning and data analytics activities at large scale
Manage, monitor and optimise the full machine learning lifecycle
Create and enhance data solutions that enable seamless delivery of data, including collecting, parsing, managing and analysing large sets of data
Create logical data models, implement the physical database structure, and construct and implement operational feature stores
Evaluate and renew implemented data and machine learning pipeline solutions to ensure their relevance and effectiveness in supporting business needs and growth
We are committed to a safe and healthy environment for our employees and customers, and will require all prospective employees to be fully vaccinated.
The ideal candidate should possess:
Degree in Computer Science, Data Science, IT or a related discipline.
2+ years of experience in software engineering or data engineering.
Programming experience in Python, Java, PySpark and Scala.
Knowledge of Big Data frameworks such as Hadoop, Spark, Impala and Hive.
Experience in data profiling, ETL development, testing and implementation.
Experience in building and optimising big data pipelines, architectures and data sets.