Job description

Job Purpose

The successful incumbent will build and support data pipelines and the datamarts built off those pipelines; both must be scalable, repeatable and secure. The role facilitates gathering data from a variety of sources in the correct format, ensuring that it conforms to data quality standards and that downstream users can access it timeously.

Job Objectives

  • Design and develop data feeds from an on-premise environment into a datalake hosted in AWS.
  • Design and develop programmatic transformations within the solution by correctly partitioning and formatting the data and validating data quality.
  • Design and develop programmatic transformations, combinations and calculations to populate complex datamarts based on feeds from the datalake.
  • Provide operational support for datamarts and their data feeds.
  • Design infrastructure required to develop and operate datalake data feeds.
  • Design infrastructure required to develop and operate datamarts, their user interfaces and the feeds required to populate the datalake.

Job Function

  • Responsible for the infrastructure that provides insights from raw data, handling and integrating diverse sources of data seamlessly.
  • Enable solutions that handle large volumes of data in batch and in real time, leveraging emerging technologies from both the big data and cloud spaces.
  • Develop proofs of concept and implement complex big data solutions with a focus on collecting, parsing, managing, analysing and visualising large datasets.
  • Apply technologies to solve the problems of working with large volumes of data in diverse formats to deliver innovative solutions.
  • Apply substantial expertise across a broad range of software development and programming fields.
  • Apply knowledge of data analysis, end-user requirements and business requirements analysis to develop a clear understanding of the business need and to incorporate that need into a technical solution.
  • Demonstrate a solid understanding of physical database design and the systems development lifecycle. This role must work well in a team environment.

Qualifications

  • IT Degree/Diploma (3 years)
  • AWS Certification at least to associate level

Experience

  • Experience working as a Technical Lead (3+ years)
  • Business Intelligence (8+ years)
  • Extract Transform and Load (ETL) processes (8+ years)
  • Cloud AWS (4+ years)
  • Agile exposure, Kanban or Scrum (5+ years)
  • Desirable: Retail Operations (5+ years)
  • Big Data (5+ years), including experience working as a Technical Lead in the space

Required Knowledge and Skills

  • Creating data feeds from on-premise to AWS Cloud
  • Supporting data feeds in production on a break-fix basis
  • Creating data marts using Talend or similar ETL development tool
  • Manipulating data using Python and PySpark (see the illustrative sketch after this list)
  • Processing data using the Hadoop paradigm, particularly on EMR, AWS's distribution of Hadoop
  • DevOps for Big Data and Business Intelligence, including automated testing and deployment
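
By way of illustration, the snippet below is a minimal PySpark sketch of the kind of work listed above: validating a raw feed that has landed in S3 and writing it to a partitioned datalake layer. The bucket paths, column names and quality rule are hypothetical placeholders, not a description of the actual environment.

    # Illustrative sketch only; paths, column names and the quality rule are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example-datalake-feed").getOrCreate()

    # Read a raw feed that has already landed in S3 (e.g. via an on-premise extract).
    raw = spark.read.option("header", "true").csv("s3://example-landing-bucket/sales/")

    # Basic data quality check: keep rows with a business key, quarantine the rest.
    valid = raw.filter(F.col("order_id").isNotNull())
    rejected = raw.filter(F.col("order_id").isNull())

    # Partition by load date so downstream datamart jobs can read incrementally.
    valid = valid.withColumn("load_date", F.current_date())

    valid.write.mode("append").partitionBy("load_date").parquet(
        "s3://example-datalake-bucket/curated/sales/")
    rejected.write.mode("append").parquet(
        "s3://example-datalake-bucket/quarantine/sales/")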

Skills:

  • Talend
  • AWS: EMR, EC2, S3
  • Python
  • Business Intelligence Data modelling
  • SQL
  • PySpark or Spark
