Data Engineer (Python, SQL), AdTech

Company: Grid Dynamics
Location: Kraków, małopolskie

*** Mention DataYoshi when applying ***

We’re looking for a Data Engineer to build innovative data pipelines for processing and analyzing our client’s large user datasets (250+ billion events per month). Our client is one of the fastest-growing and most promising AdTech companies in the US. Our Data team works on a data warehouse that gathers ad performance data and on data-related features aimed at improving conversions.

Responsibilities:

  • Develop ETL (Extract, Transform and Load) data pipelines using Spark, Kinesis, Kafka, and custom Python apps to transfer massive amounts of data (over 20 TB/month) between systems as efficiently as possible
  • Engineer complex, efficient, distributed data transformation solutions using Python, Java, Scala, and SQL
  • Productionize Machine Learning models, making efficient use of resources in a clustered environment
  • Research, plan, design, develop, document, test, implement and support proprietary software applications
  • Perform analytical data validation to confirm the accuracy and completeness of reported business metrics
  • Be open to taking on, learning, and implementing engineering projects outside your core competency
  • Monitor system performance after implementation and iteratively devise solutions to improve performance and user experience

Requirements:

  • 3+ years of experience developing in Python to transform large datasets on distributed and clustered infrastructure
  • 3+ years of experience engineering ETL data pipelines for Big Data systems
  • Proficiency in SQL, with some experience performing data transformations and data analysis in SQL
  • Comfortable juggling multiple technologies and high-priority tasks

Nice to have:

  • BS or higher degree in computer science, engineering or other related field
  • 5+ years of object-oriented programming experience in a language such as Java, Scala, or C++
  • Prior experience designing and building ETL infrastructure involving streaming systems such as Kafka, Spark, or AWS Kinesis
  • Experience implementing clustered/distributed/multi-threaded infrastructure to support Machine Learning processing on Spark or SageMaker
  • Experience with distributed columnar databases such as Vertica, Greenplum, Redshift, or Snowflake

We offer:

  • Opportunity to work on bleeding-edge projects
  • Work with a highly motivated and dedicated team
  • Competitive salary
  • Flexible schedule
  • Medical insurance
  • Benefits program
  • Corporate social events

About us:

Grid Dynamics is an engineering services company known for transformative, mission-critical cloud solutions for the retail, finance, and technology sectors. We architected some of the busiest e-commerce services on the Internet and have never had an outage during the peak season. Founded in 2006 and headquartered in San Ramon, California, with offices throughout the US and Eastern Europe, we focus on big data analytics, scalable omnichannel services, DevOps, and cloud enablement.
