Data Engineer

Location: Johannesburg, Gauteng

*** Mention DataYoshi when applying ***


South Africa

Job Details

Risk Management: understanding all risks – from the economic to the political – that could affect our global business, and offering guidance to all parts of the bank

Job Purpose

Apply data mining techniques to, and conduct statistical analysis on, large structured and unstructured data sets to understand and analyse phenomena. Model complex business problems, discovering insights and opportunities through statistical, algorithmic, machine learning and visualisation techniques, and work closely with clients, data and technology teams to turn data into critical information used to make sound business decisions. Execute intelligent automation and predictive modelling.

Key Responsibilities/Accountabilities


  • Directs the gathering of data for use in Data Science models, ensuring that chosen datasets best reflect the organisation's goals. Performs data preprocessing including data manipulation, transformation, normalisation, standardisation, visualisation and derivation of new variables/features.
  • Utilises advanced data analytics and mining techniques to analyse data, assessing data validity and usability; reviews data results to ensure accuracy; and communicates results and insights to stakeholders.
  • Designs and applies various mathematical, statistical, and simulation techniques to typically large and unstructured data sets in order to answer critical business questions and create predictive solutions which drive improvement in business outcomes. Drives analytics and insights across the organisation by developing advanced statistical models and computational algorithms based on business initiatives.
  • Codes, tests and maintains scientific models and algorithms; identifies trends, patterns, and discrepancies in data; and determines additional data needed to support insight. Processes, cleanses, and verifies the integrity of data used for analysis.
  • Uses data profiling and visualisation tools and techniques to understand and explain data characteristics that will inform modelling approaches.
  • Communicates data insights to business stakeholders of varying skill levels and roles, presenting trends, correlations and patterns found in complex datasets in a manner that clearly and concisely conveys meaningful insights, and defends recommendations.
  • Mines data using state-of-the-art methods. Enhances data collection procedures to include information that is relevant for building data models.


  • Creates, maintains and optimises modelling solutions that enable the forecast of quality data outcomes. Ensures that volumetric predictions are modelled so that resource requirements are optimally considered.
  • Develops and maintains optimal evaluation techniques to ensure that modelled outcomes are rigorous and creates model performance tracking.
  • Drives sustainable and effective modelling solutions.
  • Develops, implements, monitors and maintains a comprehensive operational intelligent automation (IA) plan, rules, methodologies and coding initiatives in order to drive IA for remediation efforts. Develops and co-ordinates a comprehensive strategy for productionising automation software so that it is accurate and well maintained.

Risk, Regulatory, Prudential and Compliance:

  • Provides input into data management and modelling infrastructure requirements and adheres to the organisation's infrastructure development processes, including the management of User Acceptance Testing (UAT). Conducts regression testing across all relevant systems as required.


  • Ensures business integration by integrating model outputs into endpoint production systems, understanding and adopting the data collection, integration and retention requirements, and incorporating business requirements and knowledge of best practices.

Technology and Infrastructure:

  • Builds machine learning models using distributed data processing and analysis methodologies. Competent in machine learning programming in R or Python, with supplementary skill in MATLAB, Java, etc.
  • Familiar with the Hadoop distributed computational platform, including the broader ecosystem of tools such as HDFS, Spark and Kafka.


  • Liaises and collaborates with the Data Science Guild, providing support to the entire department for its data-centric needs. Collaborates with subject matter experts to select the relevant sources of information and translates the business requirements into data mining/science outcomes. Presents findings and observations to the team for development of recommendations.
  • Acts as a subject matter expert from a data science perspective and provides input into all decisions relating to data science and the use thereof.
  • Educates the organisation on data science perspectives on new approaches, such as testing hypotheses and statistical validation of results.
  • Ensures ongoing knowledge of industry standards as well as best practice, and identifies gaps between these definitions/data elements and the organisation's data elements/definitions.

Preferred Qualification and Experience

Minimum Qualifications:

  • First Degree Field of study: Information Technology

  • Proficiency in application and web development. Structured and unstructured query languages, e.g. SQL, QlikView, Tableau, SSIS, SSRS, Python, JSON, C#, Java, C++, HTML.

Preferred Qualifications:

  • Honours Degree Field of study: Information Technology


  • 5-7 years' proven development experience in software and software engineering. Understanding of financial services data processes, systems, and products. Experience in technical business intelligence. Knowledge of IT infrastructure and data principles. Project management experience. Exposure to governance and regulatory matters as they relate to data. Experience in building models (credit scoring, propensity models, churn, etc.).
  • 5-7 years' experience in working with unstructured data (e.g. streams, images). Understanding of data flows, data architecture, ETL and processing of structured and unstructured data. Using data mining to discover new patterns from large datasets. Implementing standard and proprietary algorithms for handling and processing data. Experience with common data science toolkits, such as SAS, R, SPSS, etc. Experience with data visualisation tools, such as Power BI, Tableau, etc.

Knowledge/Technical Skills/Expertise

  • Diagramming and Modelling
  • Data Integrity
  • Research and Information Gathering
  • Data Analysis
  • Knowledge Classification
  • Database Administration

PLEASE NOTE: All our recruitment and selection processes comply with applicable local laws and regulations. We will never ask for money or any form of payment as part of our recruitment process. If you experience this, please contact our Fraudline on +27 800222050 or forward to
