Job Description

Key Responsibilities

Data Ingestion: Design and implement data ingestion pipelines using Databricks and PySpark, with a focus on Auto Loader for efficient, incremental processing of newly arriving files.
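
For illustration, a minimal PySpark sketch of the kind of Auto Loader ingestion stream this responsibility describes; the paths and target table name are hypothetical placeholders, not details from this posting.

    # Minimal Auto Loader sketch; all paths and the target table name
    # are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    raw_stream = (
        spark.readStream
        .format("cloudFiles")                      # Auto Loader source
        .option("cloudFiles.format", "json")       # format of incoming files
        .option("cloudFiles.schemaLocation",       # schema inference/evolution state
                "/mnt/raw/_schemas/events")
        .load("/mnt/raw/events")                   # landing directory to watch
    )

    (
        raw_stream.writeStream
        .option("checkpointLocation", "/mnt/raw/_checkpoints/events")
        .trigger(availableNow=True)                # drain pending files, then stop
        .toTable("bronze_events")                  # append into a Delta table
    )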

Nested JSON Handling: Develop and maintain processes for handling complex nested JSON files, ensuring data integrity and accessibility.
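
As a sketch of the nested-JSON work involved, the following flattens one struct level and one array level; the input path and the customer/orders column names are hypothetical.

    # Minimal flattening sketch; the input path and the customer/orders
    # column names are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, explode_outer

    spark = SparkSession.builder.getOrCreate()

    df = spark.read.json("/mnt/raw/customers")     # nested JSON source

    flat = (
        df
        # one output row per array element; explode_outer keeps rows
        # whose array is null or empty
        .withColumn("order", explode_outer("orders"))
        .select(
            col("customer.id").alias("customer_id"),    # struct fields via dot notation
            col("customer.name").alias("customer_name"),
            col("order.order_id"),
            col("order.total"),
        )
    )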

API Integration: Integrate and manage data from various APIs, ensuring seamless data flow and consistency.
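
A minimal sketch of the API-to-lake pattern this implies, landing paginated API responses as JSON lines for Spark or Auto Loader to pick up; the endpoint URL, response shape, pagination field, and output path are all hypothetical.

    # Minimal API-landing sketch; the endpoint, response shape,
    # pagination field, and output path are hypothetical placeholders.
    import json
    import requests

    records = []
    url = "https://api.example.com/v1/items"       # hypothetical endpoint
    while url:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        records.extend(payload["items"])           # hypothetical response shape
        url = payload.get("next")                  # hypothetical cursor pagination

    # write raw JSON lines for downstream ingestion
    with open("/tmp/items.jsonl", "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")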

Data Modeling: Create and optimize data models to support analytics and reporting needs.

Performance Optimization: Optimize data processing and storage solutions for performance and cost-efficiency.

Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver effective solutions.

Data Quality: Ensure the accuracy, integrity, and security of data throughout the data lifecycle.

Qualifications

Technical Expertise: Proficiency in Databricks, PySpark, and SQL, with strong experience using Auto Loader and handling nested JSON files.

API Experience: Demonstrated experience in integrating and managing data from various APIs.

Problem-Solving Skills: Strong analytical and problem-solving abilities.

Communication Skills: Excellent communication skills to collaborate with cross-functional teams.

Experience: 3-5 years of experience in data engineering, data integration, and data modeling.

Education: A degree in Computer Science, Engineering, or a related field is preferred.

Preferred Qualifications

Experience with cloud platforms such as AWS, Azure, or Google Cloud.

Familiarity with data warehousing concepts and tools.

Knowledge of data governance and security best practices.
