UST Global® is looking for a Senior Azure Data Engineer to build data pipelines for one of the leading retail providers in the US.
The ideal candidate must have a strong background in data engineering technologies and extensive experience with Azure, along with excellent written and verbal communication skills and the ability to collaborate effectively with the domain and technical experts on the team.
- Experience working with Azure Data Lake Storage Gen2 (ADLS)
- Experience with Azure Data Factory (ADF) and ETL concepts
- Creating and defining pipelines and datasets
- Understanding of variables and parameters and their use in creating dynamic pipelines.
- ADF orchestration concepts.
- Scheduling and triggering ADF jobs.
- Defining data mappings and complex transformation logic within Mapping Data Flows
- Understanding of Linked Services
- Experience with Databricks in Azure
- Understanding of PySpark and Spark SQL to perform key activities:
- Creating DataFrames from a variety of sources
- Performing transformations, filtering, and data cleansing activities
- Experience writing PowerShell or a comparable scripting language.
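The ADF requirements above (parameters, variables, dynamic pipelines, triggers) can be illustrated with a minimal pipeline definition. This is a hedged sketch, not from the posting: the pipeline, dataset, and parameter names (`CopyDailyFile`, `SourceParquet`, `SinkTable`, `fileName`) are hypothetical.

```json
{
  "name": "CopyDailyFile",
  "properties": {
    "parameters": {
      "fileName": { "type": "String" }
    },
    "activities": [
      {
        "name": "CopyFromAdls",
        "type": "Copy",
        "inputs": [
          {
            "referenceName": "SourceParquet",
            "type": "DatasetReference",
            "parameters": { "path": "@pipeline().parameters.fileName" }
          }
        ],
        "outputs": [
          { "referenceName": "SinkTable", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "ParquetSource" },
          "sink": { "type": "AzureSqlSink" }
        }
      }
    ]
  }
}
```

The `@pipeline().parameters.fileName` expression is what makes the pipeline dynamic: a trigger or parent pipeline supplies `fileName` at run time, so one pipeline definition can process any file the scheduler points it at.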
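The PySpark activities listed above (creating DataFrames, filtering, data cleansing) can be sketched in plain Python. This is a stdlib stand-in, since a live Spark session isn't assumed here; the data, column names, and cleansing rules are hypothetical, and each step notes the rough PySpark equivalent in a comment.

```python
rows = [  # cf. spark.createDataFrame(...) from a source system
    {"customer_id": 1, "email": " Alice@Example.com ", "age": 34},
    {"customer_id": 2, "email": None,                  "age": 41},
    {"customer_id": 3, "email": "bob@example.com",     "age": -5},
    {"customer_id": 1, "email": " Alice@Example.com ", "age": 34},
]

def cleanse(records):
    """Drop nulls and invalid rows, normalize a column, deduplicate."""
    seen, out = set(), []
    for r in records:
        if r["email"] is None:          # cf. df.na.drop(subset=["email"])
            continue
        if r["age"] <= 0:               # cf. df.filter(col("age") > 0)
            continue
        # cf. df.withColumn("email", lower(trim(col("email"))))
        r = dict(r, email=r["email"].strip().lower())
        if r["customer_id"] in seen:    # cf. df.dropDuplicates(["customer_id"])
            continue
        seen.add(r["customer_id"])
        out.append(r)
    return out

clean = cleanse(rows)
print(clean)  # only customer 1 survives, with a normalized email
```

The same four operations (null handling, predicate filtering, column normalization, deduplication) map one-to-one onto the PySpark DataFrame calls noted in the comments.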
Generic Data Engineer requirements
- Experience with SQL and NoSQL databases. Data engineers must know how to work with database management systems (DBMS): software that provides an interface to databases for storing and retrieving information.
- Experience with data warehousing solutions. Data warehouses store huge volumes of current and historical data for query and analysis. This data is ingested from numerous sources, such as CRM systems, accounting software, and ERP software, and is then used by the organization for reporting, analytics, and data mining.
- Experience with ETL tools. ETL (Extract, Transform, Load) refers to how data is taken (extracted) from a source, converted (transformed) into a format suitable for analysis, and stored (loaded) in a data warehouse. This process typically uses batch processing to help users analyze data relevant to a specific business problem.
- Basic knowledge of machine learning. Machine learning algorithms, also called models, help data scientists make predictions based on current and historical data. Data engineers need only a working familiarity: it enables them to better understand a data scientist's needs (and, by extension, the organization's), get models into production, and build more accurate data pipelines.
- Knowledge of data APIs. An API is an interface that software applications use to access data; it allows two applications or machines to communicate with each other for a specified task. For example, web applications use APIs to let the user-facing front end communicate with back-end functionality and data: when a request is made on a website, an API lets the application read the database, retrieve information from the relevant tables, process the request, and return an HTTP response to the web template, which is then displayed in the browser. Data engineers build APIs over databases so that data scientists and business intelligence analysts can query the data.
- Python, Java, and Scala programming languages. Python is the top programming language for statistical analysis and modeling. Java is widely used in data architecture frameworks, and most of their APIs are designed for Java. Scala is a JVM language that is fully interoperable with Java, since both run on the JVM (the virtual machine that enables a computer to run Java programs).
- Knowledge of algorithms and data structures. Data engineers focus mostly on data filtering and data optimization, but a basic knowledge of algorithms helps them understand the big picture of the organization's overall data function and define checkpoints and end goals for the business problem at hand.
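The ETL flow described above (extract from a source, transform, load into a warehouse, query for analysis) can be sketched end to end in a few lines. This is a minimal illustration using Python's stdlib `sqlite3` as a stand-in warehouse; the source data, table, and column names are hypothetical.

```python
import csv
import io
import sqlite3

# Extract: hypothetical source data, as it might arrive from a CRM export.
raw = io.StringIO("order_id,amount_usd\n1001,19.99\n1002,not_a_number\n1003,5.00\n")

# Transform: parse, drop malformed rows, convert types.
clean_rows = []
for row in csv.DictReader(raw):
    try:
        clean_rows.append((int(row["order_id"]), float(row["amount_usd"])))
    except ValueError:
        continue  # skip rows that fail type conversion

# Load: write the cleaned batch into a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount_usd REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", clean_rows)

# Query for analysis, as a reporting consumer would.
total = conn.execute("SELECT SUM(amount_usd) FROM orders").fetchone()[0]
print(round(total, 2))  # 24.99
```

In production the extract step would read from ADLS or a source system, the transform would run in ADF Mapping Data Flows or Databricks, and the load target would be a real warehouse, but the three-stage shape is the same.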
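The data-API pattern described above (a request arrives, the application reads the relevant table, and an HTTP response is returned) can be sketched with the stdlib alone. A hedged sketch, not a production design: the `products` table and its rows are hypothetical, and a real data API would add routing, authentication, and pagination.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical warehouse table the API exposes to analysts.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE products (id INTEGER, name TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'widget'), (2, 'gadget')")

class DataApi(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the relevant table, process the request, return an HTTP response.
        rows = conn.execute("SELECT id, name FROM products").fetchall()
        body = json.dumps([{"id": i, "name": n} for i, n in rows]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# HTTPServer(("", 8080), DataApi).serve_forever()  # uncomment to serve
```

A BI tool or data scientist would then hit the endpoint and get JSON back, without needing direct database credentials.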