Wholesale Clients Digital
Responsible for all components of product development and deployment of Data Science as a business capability for Standard Bank (CIB). The role requires involvement in problem definition, data cleansing, hypothesis generation and testing, model training and testing, and finally deployment and monitoring of predicted outcomes.
The candidate will need to rapidly understand business problems, internalise and contextualise domain-specific information, and identify deployable, pragmatic solutions that are relevant and add real business value.
The candidate will need to communicate the intended processes effectively and provide a clear understanding of what data is required and available. They will also need to ensure a clear understanding of the measurement criteria for success and manage expectations around deliverables.
Results need to be presented to stakeholders effectively, both visually and verbally, together with the appropriate recommendations and business value clearly elucidated.
- Directs the gathering of data for use in Data Science models, ensuring that chosen datasets best reflect the organisation's goals. Performs data pre-processing including data manipulation, transformation, normalisation, standardisation, visualisation and derivation of new variables/features. Utilises advanced data analytics and mining techniques to analyse data, assessing data validity and usability; reviews data results to ensure accuracy; and communicates results and insights to stakeholders.
- Designs and applies various mathematical, statistical, and simulation techniques to typically large and unstructured data sets in order to answer critical business questions and create predictive solutions which drive improvement in business outcomes. Drives analytics and insights across the organisation by developing advanced statistical models and computational algorithms based on business initiatives.
- Codes, tests and maintains scientific models and algorithms; identifies trends, patterns, and discrepancies in data; and determines additional data needed to support insight. Processes, cleanses, and verifies the integrity of data used for analysis.
- Uses data profiling and visualisation tools to understand and explain data characteristics that will inform modelling approaches. Communicates data insights to business stakeholders at various skill levels and in various roles, presenting trends, correlations and patterns found in complicated datasets in a manner that clearly and concisely conveys meaningful insights, and defends recommendations.
- Mines data using state-of-the-art methods. Enhances data collection procedures to include information that is relevant for building data models.
- Creates, maintains and optimises modelling solutions that enable the forecast of quality data outcomes. Ensures that volumetric predictions are modelled so that resource requirements are optimally considered. Develops and maintains optimal evaluation techniques to ensure that modelled outcomes are rigorous and creates model performance tracking. Drives sustainable and effective modelling solutions.
- Develops, implements, monitors and maintains a comprehensive operational IA plan, rules, methodologies and coding initiatives in order to drive IA for remediation efforts. Develops and co-ordinates a comprehensive strategy for productionalising automation software so that it is accurate and well maintained.
Risk, Regulatory, Prudential and Compliance
- Provides input into data management and modelling infrastructure requirements and adheres to the organisation's infrastructure development processes, including the management of User Acceptance Testing (UAT). Conducts regression testing across all relevant systems as required.
- Ensures business integration by integrating model outputs into end-point production systems, understanding and adopting the data collection, integration and retention requirements in line with business requirements and best practices.
Technology and Infrastructure
- Builds machine learning models using distributed data processing and analysis methodologies. Competent in Machine Learning programming in R or Python, with supplementary skill in Matlab, Java, etc. Familiar with the Hadoop distributed computational platform, including the broader ecosystem of tools such as HDFS / Spark / Kafka.
- Liaises and collaborates with the Data Science Guild, providing support to the entire department for its data-centric needs. Collaborates with subject matter experts to select the relevant sources of information and translates the business requirements into data mining/science outcomes. Presents findings and observations to the team for development of recommendations.
- Acts as a subject matter expert from a data science perspective and provides input into all decisions relating to data science and the use thereof. Educates the organisation on data science perspectives on new approaches, such as testing hypotheses and statistical validation of results. Ensures ongoing knowledge of industry standards and best practice, and identifies gaps between these definitions/data elements and the organisation's data elements/definitions.
Minimum Qualification and Experience
Degree in Mathematical Sciences (Honours Degree preferred).
- Business-oriented individuals with proven, commercially successful data-related projects will be given first preference.
- Proficiency in programming, data manipulation, application of machine learning techniques, data visualisation and story-telling.
- Experience in structured and unstructured query languages and data science toolkits, e.g. SQL, Jupyter, Zeppelin, Tableau, PowerBI, Spark, Python, R and SAS.
- Ability to conceptualise and frame a problem, identify objective measures to estimate the accuracy of machine learning/statistical processes and then provide detailed explanations (visually and verbally) in plain English to non-technical audiences around the proposed solution. Ability to develop well-defined processes that are methodical, provable, easily understood and operationalised within the production environment. Ability to work in a fast-paced, dynamic and multidisciplinary environment. Ability to advise senior management in clear language about the implications of their work for the organisation.
- Experience in working with unstructured data (e.g. streams, images). Understanding of data flows, data architecture, ETL and processing of structured and unstructured data. Experience using data mining to discover new patterns from large datasets. Ability to implement standard and proprietary algorithms for handling and processing data. Experience with common data science toolkits, such as SAS, R, SPSS, etc. Experience with data visualisation tools, such as Power BI, Tableau, etc.