Data Engineer - PySpark/SQL

November 21, 2022

Gurugram, Haryana, India

Job description

Role Overview

Data Engineer with experience in PySpark, AWS, Git, Databricks, and SQL.

Responsibilities

Ingesting data from various sources into AWS
Building and maintaining scalable data pipelines on AWS
Building API integrations for data transfer
Scheduling and automating jobs on AWS
Performing ETL and optimizing code for performance with PySpark and AWS Glue
Designing schemas and data architecture
Collaborating cross-functionally with people at various levels across the organization

Working Conditions

Candidate Profile

Bachelor's or master's degree from an accredited four-year college in computer science, software engineering, information technology, or another quantitative field.
A minimum of 2 years of experience with AWS Glue and PySpark.
Demonstrated experience with Terraform.
Advanced knowledge of AWS Glue and PySpark, including Spark optimization techniques and the PySpark APIs, and familiarity with AWS Database Migration Service (DMS), including migrating data from sources such as SQL Server, Oracle, and CSV files.
Comfortable working with Amazon Kinesis.
Working knowledge of the Git workflow: pull, push, creating branches, merging, and related commands.
Experience with Databricks, particularly notebooks.
Highly innovative, flexible and self-directed.
Experience in communicating with users, other technical teams, and management to collect requirements.

Preferred Qualifications

Experience with AWS services (Database Migration Service, Kinesis, and AWS Glue with PySpark) and an understanding of core AWS services, their uses, and basic AWS architecture best practices.
Significant experience in the de-regulated energy industry or the power generation industry.
Knowledge of client terminology, concepts, and value chain.

Additional Knowledge, Skills And Abilities