Data Engineer

Location: New York, NY 10013

*** Mention DataYoshi when applying ***

About Dorilton Capital:

Dorilton is a private investment firm that invests in businesses across a range of industry sectors, working in partnership with management to build value over the long term. By providing funding and expertise to drive growth, we help our companies and our people achieve their full potential.

About the Role:

We are seeking a highly motivated and goal-oriented Data Engineer to join our rapidly expanding organization. The Data Engineer will be part of a growing Advanced Analytics team and will develop, deploy, and maintain scalable software solutions and services. The ideal candidate is capable of leveraging cutting-edge technologies to optimize long-term digital solutions. As a Data Engineer, you will contribute to the design and implementation of data pipelines for the Dorilton platform. In this role, you will partner closely with Advanced Analytics team members to help collect, stream, transform, and effectively manage data for integration into critical reporting, data visualizations, and data products. You will take their findings and create robust, reusable, scalable code that addresses the needs of our growing business. The candidate will be focused and driven, with strong communication skills. They should be able to think critically, manage their time well, and have a demonstrated ability to execute in fast-paced environments.

Position Responsibilities:

  • Ingest raw data: assess its quality, cleanse it, and structure it for downstream processing.
  • Facilitate and execute the collection, processing, and analysis of virtually all business data.
  • Combine and correlate large datasets from multiple data sources and analyze for integrity.
  • Build the infrastructure required for efficient ETL of data from a wide variety of data sources.
  • Write code to interact with multiple external APIs and systems.
  • Design complex queries to crunch data and produce summary and aggregate datasets.
  • Help maintain the cloud-based Data Warehouse central to our efforts.
  • Build data expertise and leverage data controls to ensure privacy, security, compliance, data quality, and operations for allocated areas of ownership.
  • Assist stakeholders with data-related technical issues and support their data infrastructure needs.
  • Create scalable architecture to support the management, deployment, and usage of machine learning and mathematical models.
  • Collaborate with the analytics team members to optimize data pipelines and bring analytical prototypes to production.
  • Partner with our scientists to manipulate datasets and to provide new features tailored to our business requirements.
  • Identify, design, and implement internal process improvements: finding data sources relevant to our business needs, automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

Position Requirements:

  • A bachelor's degree, preferably in computer science or a related field.
  • Working knowledge of SQL, including query authoring, and experience with relational databases, as well as working familiarity with a variety of other databases.
  • Experience with schema design and dimensional data modeling.
  • Experience in custom ETL design, implementation, and maintenance.
  • Experience building and optimizing data pipelines.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and find opportunities for improvement.
  • A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
  • Experience analyzing data to identify deliverables, gaps, and inconsistencies.
  • Experience supporting and working with multi-functional teams in a dynamic environment.
  • Solid software testing, documentation, and debugging practices in the context of distributed systems.
  • Good knowledge of Unix/Linux including scripting.

Key Technologies:

  • Experience with cloud environments such as GCP, Azure, or AWS.
  • Experience with object-oriented or functional scripting languages such as Python or Java.
  • Experience with Snowflake.
  • Experience with visualization tools (Power BI or Tableau).
  • Bonus: Experience with Spark.
  • Bonus: Experience working with Kubernetes.
  • Bonus: Experience implementing and tuning machine learning models.

