Proven experience working as a data engineer in a Databricks environment;
Highly proficient in the Spark framework within Databricks;
Experience designing and delivering solutions using AWS services;
Expertise with CI/CD tools (GitLab, CodeCommit, or similar) to automate the building, testing, and deployment of data pipelines, and with infrastructure-as-code tools (Terraform or CloudFormation) to manage the infrastructure;
Understanding of relational databases (e.g., MySQL, PostgreSQL), NoSQL databases (e.g., key-value stores such as Redis and DynamoDB), and search engines (e.g., Elasticsearch);
Experience building ETL pipelines (a brief sketch of this pattern follows this list);
History of working in a team environment under an agile methodology;
Fluent in English.
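For context on the kind of ETL work referenced above, here is a minimal PySpark sketch of the read-transform-write pattern typically used on Databricks; the table names (raw_orders, clean.orders) and columns (order_id, order_ts, amount) are hypothetical placeholders, not part of the role description.

# Minimal PySpark ETL sketch (Databricks-style). Table and column names
# (raw_orders, clean.orders, order_id, order_ts, amount) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read a raw table registered in the metastore.
raw = spark.read.table("raw_orders")

# Transform: drop rows missing a key and normalize column types.
clean = (
    raw.dropna(subset=["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)

# Load: write to a Delta table, the default storage format on Databricks.
clean.write.format("delta").mode("overwrite").saveAsTable("clean.orders")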
What will your responsibilities be at #Luby?
Designing and implementing high-performance data pipelines from multiple sources using Databricks;
Integrating end-to-end data pipelines that move data from source systems to target repositories within Databricks and AWS services (SQS, RDS), ensuring data quality and consistency are maintained throughout the process (see the load-step sketch after this list);
Optimizing Spark jobs for scalability and troubleshooting performance issues;
Leveraging data sources, relevant external data, and domain expertise to achieve business objectives;
Evaluating models and applying analytical skills to provide insights;
Assisting product owners with analysis and providing recommendations based on the data;
Working with other team members to help deliver additional project components (e.g., API, Search).
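As a rough illustration of the Databricks-to-AWS load step mentioned above, the following sketch writes a curated table to an RDS PostgreSQL instance over JDBC. The JDBC URL, credentials, and table names are hypothetical placeholders; in practice these would come from a Databricks secret scope rather than being hard-coded.

# Minimal sketch of a "Databricks to AWS RDS" load step. All connection
# details and table names below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("load-to-rds").getOrCreate()

# Hypothetical curated Delta table produced by an upstream ETL job.
df = spark.read.table("clean.orders")

# Append to an RDS PostgreSQL table via Spark's built-in JDBC data source
# (the PostgreSQL JDBC driver must be available on the cluster classpath).
(df.write
   .format("jdbc")
   .option("url", "jdbc:postgresql://example-rds-host:5432/analytics")
   .option("dbtable", "public.orders")
   .option("user", "etl_user")      # placeholder credential
   .option("password", "REDACTED")  # placeholder credential
   .mode("append")
   .save())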