Anglo American

Specialist Data Engineer

Job description

Business Unit / Group Function: Data Analytics
Location: Open
Experience / Work Type: Associate / Permanent Employee
Closing Date: 15 September, 2022



Company Description:

Anglo American is on a journey to create Intelligent Mines, where data-driven insight creates sustainable value across an integrated value chain. The aim is to create a system that empowers data-driven decision-making at all levels of the organization.

Data Analytics is a new discipline in the Technical and Sustainability (T&S) function of Anglo American. The work of the discipline is aimed at better leveraging our data to deliver new insights, value, and smarter ways of working from discovery to market.

We are going to generate more data than we ever have before, and we need to build the systems to support this. Data will come in from different steps in the value chain which will enable us to make better decisions. New digital technologies bring us better ways of doing the things we already do.

We are building the VOXEL Solution – an ecosystem of products, applications, training programmes and data policy that operates across the full mining value chain to provide a game-changing improvement in access to knowledge, decision-making and performance of the business. This role will work in the Data Team, which provides the foundational data platform, architecture and data management required to deliver all Data Analytics activities. We provide the fuel that the Products need to succeed and maximise the insights available in the data. Our mission is to deliver “Frictionless Access to Trusted Data to enable Better Decision Making”.

Job Description:

Provide data engineering solutions to ensure seamless integration of source systems and data flows across VOXEL products, maximizing insights and enabling better decision making.

Performance & Delivery


  • Design and develop a scalable approach to ingest and curate data into the VOXEL Data Lake.
  • Design and integrate an approach for ingesting and sharing unstructured data such as geospatial images, videos, etc.
  • Design an approach for sharing/exposing data with consuming applications that meets their functional and non-functional requirements.


  • Drive data engineering teams to adopt standards and best practices for data engineering.
  • Ensure that the data policies defined by the Data Governance group are implemented.
  • Ensure conformance with the overall architectural guidelines and optimise platform usage.
  • Validate solutions for migration of data components into the cloud.
  • Review and approve ETL data flow diagrams of project teams working outside of VOXEL data lake.
  • Perform QAs on the data solution and define backlog for solution hardening.


  • Integrate the VOXEL data lake with the Enterprise Data Catalogue so that data is discoverable and accessible.
  • Integrate the VOXEL data lake with a Data Quality tool to measure and report data quality to data stewards / data custodians.
  • Set objectives and track progress of the agile software delivery teams.

Role-specific knowledge:

  • Advanced knowledge and experience in data engineering and data management.
  • Sound knowledge of cloud architecture and PaaS services.
  • Good knowledge and experience of programming, database administration, database and application design.
  • Good understanding of enterprise data modelling, data quality, data integration, data lakes, and data quality workflows.


  • Advanced: Designing and building configurable and scalable data pipelines to ingest and curate data in near real time.
  • Advanced: Azure Data Lake Storage Gen2, Azure SQL, Azure Data Factory, Event Hub, IoT Hub, Databricks.
  • Advanced: Data engineering methodologies, including batch/streaming; delta/full/historical loads; managing data ingestion using APIs; handling schema drift at source, etc.
  • Proficient: Strong knowledge of other supporting Azure PaaS services such as Key Vault, Log Analytics workspaces, and App registrations.
  • Proficient: Building and deploying data pipelines at scale in production systems, using CI/CD methodologies.
  • Proficient: Programming and Scripting languages (Python, PowerShell).
  • Proficient: Data visualization and data migration skills; relational or foundational database skills, including Microsoft SQL Server experience; and other database skills, e.g., NoSQL and cloud computing.
  • Proficient: Building APIs and backend services to power applications.
  • Proficient: Semi structured and unstructured data & Big Data.
  • Proficient: Metadata management including using a data catalogue.
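To illustrate the kind of schema-drift handling the skills above refer to, here is a minimal, platform-agnostic sketch in plain Python. The record structure and field names are hypothetical; in the actual role this would typically be handled with schema-evolution features in Azure Data Factory or Databricks rather than hand-rolled code:

```python
from typing import Any

def evolve_schema(known_columns: list[str], batch: list[dict[str, Any]]) -> list[str]:
    """Extend the known column list with any new fields seen in a batch
    (additive evolution: new columns are appended, existing ones never dropped)."""
    for record in batch:
        for col in record:
            if col not in known_columns:
                known_columns.append(col)
    return known_columns

def normalise(batch: list[dict[str, Any]], columns: list[str]) -> list[dict[str, Any]]:
    """Project every record onto the full column set, padding missing fields with None."""
    return [{col: record.get(col) for col in columns} for record in batch]

# Hypothetical drift scenario: on day 2 the source starts sending a new 'grade' field.
day1 = [{"asset_id": "TRK-01", "tonnage": 120.0}]
day2 = [{"asset_id": "TRK-02", "tonnage": 95.5, "grade": 0.8}]

columns: list[str] = []
for batch in (day1, day2):
    columns = evolve_schema(columns, batch)
    rows = normalise(batch, columns)
```

The additive approach shown here is the usual conservative choice for a curated lake: downstream consumers keep working when upstream adds fields, and older records are back-filled with nulls rather than rejected.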


  • Required: Bachelor’s Degree in Informatics, Applied Mathematics and Statistics, Computer Science/Engineering/Information Technology, or Geosciences.
  • Desirable: Master’s Degree in Data Science or Computer Science
