Lead Data Engineer

Company:
Location: Cape Town, Western Cape

*** Mention DataYoshi when applying ***

This role leads the Data Engineering team and reports to the Head of Data Engineering.

Team and technical leadership responsibilities form part of this role, including managing the team's performance and development, planning and reporting. You will act as mentor and Technical Lead to the Data Engineering team. Sound cloud experience (AWS or Azure) is non-negotiable for this role. You will be regarded as an expert in the technical and/or functional areas, and will introduce new systems, processes and methodologies. You will function as part of an Agile team.

You will be required to lead the Data Engineering team, which builds and supports data pipelines, and the data marts built off those pipelines, that are scalable, repeatable and secure.

As a Lead, you will ensure that the Data Engineering team helps gather data from a variety of sources in the correct format, assuring that it conforms to data quality standards and that downstream users can get to that data timeously.

Your team is responsible for the infrastructure that provides insights from raw data, handling and integrating diverse sources of data seamlessly. They enable solutions by handling large volumes of data in batch and real time, leveraging emerging technologies from both the big data and cloud spaces. This includes developing proofs of concept and implementing complex big data solutions with a focus on collecting, parsing, managing, analysing and visualising large datasets. They will know how to apply technologies to solve the problems of working with large volumes of data in diverse formats to deliver innovative solutions.

Data Engineering is a technical job that requires substantial expertise in a broad range of software development and programming fields. You will have knowledge of data analysis, end-user requirements and business requirements analysis, so that you can develop a clear understanding of the business need and incorporate it into a technical solution. A solid understanding of physical database design and the SDLC is also required.


Managing a team that:

  • Designs and develops data feeds from an on-premise environment into a data lake environment in an AWS cloud environment
  • Designs and develops programmatic transformations of the solution by correctly partitioning and formatting the data and validating its quality
  • Designs and develops programmatic transformations, combinations and calculations to populate complex data marts based on the feed from the data lake
  • Provides operational support to data mart data feeds and data marts
  • Designs infrastructure required to develop and operate data lake data feeds
  • Designs infrastructure required to develop and operate data marts, their user interfaces and the feeds required to populate the data lake


Minimum requirements

  • Completed Degree
  • AWS Certified (Associate Level)
  • Experience as a Technical Lead
  • 5 years' working knowledge of:
      • Creating data feeds from on-premise to AWS Cloud
      • Supporting data feeds in production on a break-fix basis
      • Creating data marts using Talend or a similar ETL development tool
      • Manipulating data using Python and PySpark
      • Processing data using the Hadoop paradigm, particularly using EMR, AWS's distribution of Hadoop
      • DevOps for Big Data and Business Intelligence, including automated testing and deployment


Required Working Experience

  • 6-7 years+ in Business Intelligence
  • 4-5 years+ in Business Intelligence Data Modelling
  • 4-5 years+ using SQL
  • 3-4 years+ in Big Data
  • 6-7 years+ in Extract, Transform and Load (ETL) processes
  • 3-4 years+ in Cloud AWS (EMR, EC2, S3)
  • 3-4 years+ in Agile (Kanban or Scrum)
  • 2-3 years+ using Talend
  • 2-3 years+ using Python
  • 2-3 years+ using PySpark or Spark
  • 6-7 years+ in Retail Ops or similar (preferred)

