Job description

Work Mode - Remote (candidates should be from Pune, Gurgaon, Bangalore, Hyderabad, Noida, or New Delhi)

Shift - EST timing

Employment Type - Contract, long-term project (based on performance and cultural fit, possible absorption by the client within 6 months)

Hiring for one of our clients, a global leader in data warehouse modernization and migration, helping businesses move their data, workloads, ETL, and analytics to the cloud using automation.

We are seeking a skilled Data Engineer with 4 to 5 years of experience specializing in DataBricks. The ideal candidate should have a robust understanding of, and hands-on expertise in, PySpark and the various components within DataBricks. As a key member of our data team, you will play a pivotal role in developing, optimizing, and maintaining our data infrastructure, ensuring seamless and efficient data flow.

Responsibilities:

  • Design, develop, and maintain data pipelines using DataBricks and PySpark to process and manipulate large-scale datasets.

  • Proven experience in optimizing Apache Spark batch processing workflows.
  • Extensive experience in building and maintaining streaming data pipelines.
  • Optimize and fine-tune existing DataBricks jobs and PySpark scripts for enhanced performance and reliability.
  • Troubleshoot issues related to data pipelines, identify bottlenecks, and implement effective solutions.
  • Implement best practices for data governance, security, and compliance within DataBricks environments.
  • Work closely with Data Scientists and Analysts to support their data requirements and enable efficient access to relevant datasets.
  • Stay updated with industry trends and advancements in DataBricks and PySpark technologies to propose and implement innovative solutions.
  • Demonstrated expertise in optimizing systems for low-latency and high-throughput performance.
  • Proficiency in using Spark SQL and DataFrame API for dynamic data transformations.
  • Experience using programming languages such as Python or Scala to implement advanced filtering logic in DataBricks notebooks or scripts.
  • Familiarity with the principles of distributed systems and their application in message brokering.
  • Collaborate with cross-functional teams to gather requirements, understand data needs, and implement scalable solutions.

Qualifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 4 to 5 years of proven experience as a Data Engineer with a strong emphasis on DataBricks.
  • Experience with AWS cloud environments, tools, and services is mandatory.
  • Proficiency in PySpark and extensive hands-on experience building and optimizing data pipelines using DataBricks.
  • Solid understanding of different components within DataBricks such as clusters, notebooks, jobs, and libraries.
  • Strong knowledge of SQL, data modeling, and ETL processes.
  • Ability to analyze complex data.
  • Excellent communication skills, with the ability to collaborate with cross-functional teams.

