Data Engineer

Company:
Location: New York, NY


As a Senior Data Engineer here at Horizon, your objective will be to help build out Horizon’s data platform. Using your skills with Python and AWS, you will create automated, highly robust, and performant data pipelines that scale with the ever-increasing data needs of data analysts and client teams. These data pipelines will serve vital business operations such as client reporting, analytics/data science, and activation.

Job Duties

  • (45%) Build robust and scalable data integration (ETL) pipelines using SQL, EMR, Python and Spark
    • Design solutions based on needs gathered from discussions with end users/stakeholders
    • Code/implement solutions based on the design, adhering to department best practices and processes
    • Drive solutions from initial implementation through testing/QA to final delivery to end users/stakeholders
    • Maintain delivered solutions in response to changing requirements or unexpected failures
  • (30%) Mentor and manage junior members of the team
    • Perform code reviews on work submitted by junior developers
    • Advocate and enforce MTD coding standards (e.g. Clean Code) and industry best practices
    • Ensure junior developers are on target with deliverables and do not have obstacles blocking them
  • (15%) Technical knowledge and leadership
    • Participate with other senior engineers and tech leaders to drive the evolution of tech standards/processes
    • Work to understand new practices and technologies relevant to the data engineering field and help drive adoption within the team
  • (10%) Collaborate with Software Solution team members and other staff to validate desired outcomes for code prior to, during, and post development
    • Train other technical staff to access and use delivered solutions
    • Flesh out test cases and edge/outlier cases to test for
  • (5%) Help onboard new engineers

Preferred Skills:

  • Bachelor’s degree in Computer Science or a related major from a 4-year university is required; a master’s is preferred
  • 3 years of experience building data pipelines and implementing feeds for a data warehouse
  • Strong communication skills, both written and verbal
    • Strong technical understanding, sufficient to contribute to meetings on best practices and/or technical solutions to business problems
    • Able to communicate effectively with non-technical co-workers and stakeholders, explaining technical concepts and issues at a level they can understand
    • Able to understand requirements and business needs from client teams and stakeholders and translate them into technical requirements
  • Python 3
    • Experience using the Pandas library to explore, cleanse, and standardize data
    • Experience using the PySpark library to process large datasets in a distributed, scalable fashion
  • Advanced SQL
  • Code repositories (GitHub, Bitbucket)
  • Linux/Shell scripting

Nice to Haves

  • Experience with Snowflake
  • Experience with Airflow
  • Experience with AWS
    • S3
    • EMR
    • EC2
    • CloudFormation
    • Athena
    • Lambda

The statements herein are intended to describe the general nature and level of work being performed by employees and are not to be construed as an exhaustive list of responsibilities, duties and skills required of personnel so classified. Furthermore, they do not establish a contract for employment and are subject to change at the discretion of the employer.
