We're looking for motivated Data Engineers to own Data Pipelines, Feature Stores, and Inference Pipelines for some of our client projects, in particular a scalable market intelligence platform with CXO-level visibility for some of our Gaming customers.
Responsibilities
Build and manage data ingestion and transformation pipelines designed for high throughput and low latency, using Spark, Spark Streaming, and Python.
Participate in deploying Data & Machine Learning feature pipelines via Airflow.
Be impact-oriented, with a sharp focus on iterating in small cycles.
Collaborate with Data Scientists, Analysts, and ML Engineers to build these pipelines.
In the longer term, this role will also include working on POCs involving LLMs, vector stores, and RAG pipelines.
Requirements
Very good coding skills in at least one of the following: Python, Spark, or Go. Hands-on knowledge of OOP and SOLID principles.
Very good understanding of SQL (joins, optimized queries, etc.).
Good conceptual understanding of databases (row-based, column-based, key-value stores, NoSQL databases).
Alternatively, a good understanding of data structures used in row-based vs column-based stores is expected.
Exposure to Airflow is a plus.
This job was posted by Soumanta Das from Yugen.ai.