Lead Data Engineer, Data Lake

Location: Kirkland, WA

*** Mention DataYoshi when applying ***


AuthenticID (https://authenticid.co/) is solving the biggest business issue in the world today: fraud.

Fraud losses are skyrocketing for companies because the old model of verifying identity (“something you know”) no longer works in the face of so many data breaches, hacks, dark web data sales, and sophisticated fake IDs.

AuthenticID uses a next-generation model of verifying identity based on combining a government-issued ID (“something you have”) with biometric data of face/voice/fingerprint (“something you are”). Our solution provides identity verification in seconds using a mobile phone and is able to defeat most fraud attempts using advanced AI capabilities.

With many new Fortune 500 clients, AuthenticID is going through a rapid growth phase and needs people who understand this type of scaling and bring a passion for solving big problems that affect millions of people daily.

Okay, you might be asking: it's great to talk about the business of privacy, but what exactly are these solutions you're building? We can't tell you quite yet.

Our job is to keep our work out of the wrong hands. But more than that, we need people on our team who believe in the mission to safeguard privacy in 21st-century interactions. It's more than a job; it's a movement on the front lines of privacy and commerce, affecting tens of millions of people every year.


The Lead Data Engineer reports directly to the Manager of ML Engineering and will be part of the Research and Development team. The R&D team fills the mission-critical role of ensuring that accurate data is securely accessible to the Research team and the Decisioning team at scale, with consistent performance, integrity, and extensive monitoring and logging. Without accurate data, the right frameworks, and reliable workflow systems in place, we cannot rapidly advance machine learning research or make the right decisions in alignment with the company's strategic vision.

In this role, you will build and supervise quality data systems in the cloud, identify opportunities for improvement, and implement best practices to help us accomplish that vision. This role contributes directly to R&D's transformation initiative and is key to continuously improving our data processes and technologies. The role also works closely with the Lead ML Engineer and Lead Data Scientist to ensure a high level of accuracy and quality in the curated data required by different teams and departments.

Top 3 key outcomes in the first year include:

  • Develop - Understand the data and its sources and consolidate them into a data lake solution. Develop a data catalog as an index to the location, schema, and runtime metrics of the data. Design automated ETL processes to populate a data warehouse as needed.
  • Support - Partner with the Research, ML Engineering, and Decisioning teams to identify their data requirements and work with them to build solutions that collect the required data. Design dashboards and metrics to monitor data flow and lineage throughout the data life cycle.
  • Lead - Strive to continuously apply best practices to our data systems and technologies that efficiently improve our scalability and agility as a research and data science organization. Mentor junior data engineers.


Responsibilities:

  • Implement data lake, data catalog, and data warehouse solutions
  • Build end-to-end scalable, reliable, and maintainable data pipelines
  • Partner with Data Scientists and ML Engineers to understand data, automate data retrieval and labelling, and develop benchmarking services and data quality assurance metrics
  • Design, implement, and maintain dashboards and monitoring systems to provide visibility and feedback throughout the data life cycle
  • Work with the infrastructure and product teams to provide input to improve our data platform


Qualifications:

  • Track record of success as a senior/lead data engineer in an AWS or other cloud environment
  • Proficiency in AWS data processing and analytics systems
  • Solid understanding of SQL
  • Software engineering proficiency in at least one of the common big data programming languages: Python, Java, or Scala
  • Experience with reporting, analytics, and dashboarding tools
  • Experience in building both batch and stream data processing systems
  • Experience in building datasets for computer vision tasks or curating data for data science consumers, preferred
  • Understanding of machine learning and deep learning concepts is an asset
  • Scale-up experience at a company of $100M or more
  • AuthenticID company values and culture fit
  • STEM Bachelor's Degree or equivalent
  • Background check and drug screen required


  • 8+ years of Data Engineering experience working with distributed data technologies for building scalable, reliable, and maintainable data pipelines
  • AWS certifications, preferred


Benefits:

  • Competitive salary and option grants
  • Flexible hours and recovery days
  • Medical, dental, and life insurance
  • Once-in-a-lifetime experience taking a startup into scale mode, working directly with experienced founders and a diverse, fun-loving and hardworking team

LOCATION: Seattle, WA region, or remote within the United States

AuthenticID is an equal-opportunity employer, and we welcome applicants from all backgrounds. If you're passionate about consumer identity privacy and are a team player who wants to join a growing, diverse, and dynamic team, we look forward to hearing from you!
