The successful incumbent will build and support data pipelines, and the datamarts built off those pipelines; both must be scalable, repeatable and secure. The role facilitates gathering data from a variety of sources in the correct format, ensuring that it conforms to data quality standards and that downstream users can access it timeously.
Design and develop data feeds from an on-premises environment into a datalake hosted in AWS.
Design and develop programmatic transformations within the solution by correctly partitioning and formatting the data and validating its quality (a sketch follows the responsibilities list below).
Design and develop programmatic transformations, combinations and calculations to populate complex datamarts based on feeds from the datalake.
Provide operational support for datamarts and their data feeds.
Design infrastructure required to develop and operate datalake data feeds.
Design the infrastructure required to develop and operate datamarts, their user interfaces, and the feeds that populate them from the datalake.
Take responsibility for the infrastructure that turns raw data into insights, handling and integrating diverse data sources seamlessly.
Enable solutions that handle large volumes of data in batch and in real time, leveraging emerging technologies from both the big data and cloud spaces.
Develop proofs of concept and implement complex big data solutions, with a focus on collecting, parsing, managing, analysing and visualising large datasets.
Apply appropriate technologies to solve the problems of working with large volumes of data in diverse formats, and deliver innovative solutions.
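As a rough illustration of the partition, format and validate responsibility above, the sketch below lands a raw feed in the datalake as partitioned Parquet. It is a minimal sketch only: the bucket paths, column names and validation rule are hypothetical examples, not a description of this role's actual environment.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("datalake-feed").getOrCreate()

# Read a raw extract landed from the on-premises source system.
raw = spark.read.option("header", "true").csv("s3://example-landing/sales/")

# Enforce types, then validate: rows missing business keys are quarantined
# rather than silently dropped.
typed = (raw
         .withColumn("sale_date", F.to_date("sale_date", "yyyy-MM-dd"))
         .withColumn("amount", F.col("amount").cast("decimal(18,2)")))
is_valid = F.col("store_id").isNotNull() & F.col("sale_date").isNotNull()

typed.filter(~is_valid).write.mode("append").parquet(
    "s3://example-datalake/quarantine/sales/")

# Publish valid rows as Parquet, partitioned by date so downstream datamart
# loads can prune partitions instead of scanning the full feed.
(typed.filter(is_valid)
      .write.mode("overwrite")
      .partitionBy("sale_date")
      .parquet("s3://example-datalake/curated/sales/"))
```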
Substantial expertise in a broad range of software development and programming fields.
Knowledge of data analysis and of end-user and business requirements analysis, in order to develop a clear understanding of the business need and incorporate it into a technical solution.
Solid understanding of physical database design and the systems development lifecycle. The incumbent must work well in a team environment.
IT Degree/Diploma (3 years)
AWS Certification, at least to Associate level
Experience working as a Technical Lead (3+ years)
Business Intelligence (8+ years)
Extract Transform and Load (ETL) processes (8+ years)
AWS Cloud (4+ years)
Agile exposure, Kanban or Scrum (5+ years)
Desirable: Retail Operations (5+ years)
Big Data (5+ years), including experience working as a Technical Lead in the space
Required Knowledge and Skills
Creating data feeds from on-premises environments to the AWS Cloud
Supporting production data feeds on a break/fix basis
Creating datamarts using Talend or a similar ETL development tool
Manipulating data using Python and PySpark (a sketch follows this list)
Processing data using the Hadoop paradigm, particularly on EMR, AWS's distribution of Hadoop
DevOps for Big Data and Business Intelligence, including automated testing and deployment
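To illustrate the PySpark and EMR items above, the sketch below populates a simple datamart measure from curated datalake data; a job like this could run as an EMR step. It is a minimal sketch under assumed inputs: the table paths, join key and revenue calculation are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("datamart-load").getOrCreate()

# Read curated datalake tables (hypothetical paths and schemas).
sales = spark.read.parquet("s3://example-datalake/curated/sales/")
stores = spark.read.parquet("s3://example-datalake/curated/stores/")

# Combine and calculate: daily revenue per region, the kind of derived
# measure a datamart exposes to BI users.
daily_revenue = (
    sales.join(stores, "store_id")
         .groupBy("region", "sale_date")
         .agg(F.sum("amount").alias("revenue"),
              F.countDistinct("store_id").alias("trading_stores"))
)

# Write the datamart table, partitioned by date for incremental refreshes.
(daily_revenue.write.mode("overwrite")
              .partitionBy("sale_date")
              .parquet("s3://example-datamart/daily_revenue/"))
```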