Lead Data Engineer (Data Platform team)

Company: Disney Streaming Services
Location: New York, NY

*** Mention DataYoshi when applying ***

Job Summary:

Data Engineering's mission at Disney Streaming is to provide a robust, self-service data ecosystem that enables data producers and data consumers to provide operational and analytic transparency for our rapidly growing business. To that end, the Data Platform team is responsible for building the tools and services that empower our users to make data-driven technical and business decisions and democratize data across the organization.

Responsibilities:

  • Collaborate with product teams, software engineers and data scientists to gather requirements and design self-service data solutions for our stakeholders’ needs
  • Build and maintain tools and services to support operating on data in the platform, including ETL, ingress, and egress patterns
  • Build and maintain tools and services to support data discovery, lineage, governance, and privacy compliance across the data platform
  • Develop data catalogs and data validations to ensure clarity and correctness of key business metrics
  • Drive and maintain a culture of quality, innovation, and experimentation
  • Coach data engineers on best practices and technical concepts behind large-scale data platforms
  • Provide the technical vision for the Disney Streaming Services (DSS) data platform, leading all technology and system design and architecture decisions
  • Lead the implementation of that technical vision
  • Work in an Agile environment that focuses on collaboration and teamwork
  • Evangelize the Data Platform throughout the Disney Streaming Services organization


Basic Qualifications:

  • 6-10 years of experience developing in one or more of the following: Python, C++, or any JVM language
  • 3-5 years of experience deploying and running cloud-based data solutions (AWS, GCP, Azure), and familiarity with tools such as CloudFormation, Kinesis, ECS, and S3 (or equivalents)
  • 3-5 years of experience engineering big-data solutions using technologies like Databricks, EMR, and Spark
  • In-depth understanding of data partitioning and sharding techniques
  • Familiarity with metadata management, data lineage, and principles of data governance
  • Experience loading and querying cloud-hosted databases such as Redshift and Snowflake
  • Experience building streaming data pipelines using Kafka, Spark, Flink, or Samza
  • Experience with API design


Preferred Qualifications:

  • Experience with functional programming in Scala
  • Experience building distributed systems with microservices and/or service-oriented architectures
  • Familiarity with containerization/virtualization, e.g., Docker, Kubernetes
  • Familiarity with workflow scheduler tools, e.g., Airflow, Argo
  • Familiarity with common ML tooling, e.g., MLflow, TensorFlow, scikit-learn
  • Knowledge of CI/CD best practices
  • AWS experience
  • Familiarity with binary data serialization formats such as Parquet, Avro, and Thrift
  • Experience deploying data notebook and analytic environments such as Jupyter and Databricks


Required Education:

Bachelor’s degree in Computer Science or related field or equivalent work experience
