Data Engineer

Location: Washington, DC

*** Mention DataYoshi when applying ***


  • Primary Responsibilities:
    • Integrate data using SQL and REST APIs, and author data transformation pipelines for new data sources using PySpark and Pandas;
    • Maintain existing PySpark and Pandas data pipelines, including monitoring pipeline health, debugging pipeline failures, and making ad-hoc changes to pipeline code to support user requests;
    • Write code to convert data from its native format to the Palantir model format;
    • Implement ad-hoc changes to dashboards written in JavaScript and SQL to support user requests;
    • Be on call every other weekend to support workflows related to the HHS and CDC COVID-19 response.

  • Required Abilities:
    • Experience writing production data pipelines using the following languages and libraries:
      • PySpark
      • Pandas
      • SQL
    • Experience with frontend development in JavaScript
    • Experience working with REST APIs
    • Prior experience with Apache Spark is also desirable
  • Requirements:
    • At least three (3) years of relevant technical experience in software engineering, systems engineering, and deployment of complex computer systems; and
    • Bachelor’s Degree in a Science, Technology, Engineering, or Math (STEM) field.
