Senior Data Engineer

Location: New York State

*** Mention DataYoshi when applying ***

Location: Fully Remote, or NYC Office

About Vic.ai

Vic.ai is creating the Intelligent Accounting era, using artificial intelligence to automate accounting and provide advisory, business insight, and eventually business foresight.

We're a Series A stage start-up, founded by Norwegian entrepreneurs and backed by renowned Silicon Valley investors (including Costanoa Ventures, Cowboy Ventures, and GGV Capital). We're US-based, but our team is global, from New Zealand to California. We're bringing AI to Finance and Accounting because the industry is ripe for automation and big-data insight and the market is huge: $200B just in the US.

We are the first company to develop a fully Autonomous Accounting solution for processing invoices, automating all required accounting tasks using AI, without human intervention. This solution won the Accounting Tech of the Year category at the 2021 US Fintech Awards.

We have a relaxed, professional culture, and most of our engineering team works remotely and asynchronously on a permanent basis. We're a team of builders: when we aren't building Vic.ai, we're tinkering with a personal project, contributing to open source, modding a drone, building a computer from components, etc.

About you

You've been a software engineer for 5+ years, but you've been a tinkerer and a builder your whole life. You know a great data pipeline from a standard one, because you started out building textbook ones, and learned through experience all the things that make a pipeline robust, transparent, and scalable.

You're ready for the next step in your career, ready to take on fast-moving challenges. You're enthusiastic about AI and the possibilities it opens for software development and transforming traditional work. You have experience working with machine learning models, and understand how cloud-based software development and DevOps work in the machine learning context. You aim for greatness and exceptional outcomes in your work.

As a team player, you are not afraid of reaching out to your colleagues to discuss development challenges, especially when you are stuck trying to solve a specific issue. You understand that software engineering requires a culture of code ownership, taking initiative when needed, and flexible collaboration across the wider company.

You are fluent in English. You can work remotely if required. Preferred locations are the EU and the US.

Role details

You will become part of our core ML team, responsible for the systems that tie everything together into a production environment for our customers. The main focus of the role is to develop and scale our core cloud-based processing service, which includes our OCR technology, extracting data from various sources into databases, generating datasets for machine learning, distributed model training, and structuring communication with our AI models.

We use the following extensively throughout our ML technology stack, so advanced knowledge of these is required:

  • Python
  • Pandas, NumPy
  • GitHub
  • Docker, Rancher
  • Redis, PostgreSQL
  • AWS (EC2, RDS, KMS, SNS, Lambda, etc.)

In addition, our stack uses the following tooling, knowledge of which is desired:

  • Celery, Django
  • Tesseract, Textract
  • LightGBM, CatBoost, Keras
  • Elixir
  • Ray

Key areas of responsibility

  • You will own the data pipelines that feed and interact with our AI models! You will build, monitor, and scale them
  • Our AI eats large amounts of data. You will work closely with the AI developers to set up scalable data storage solutions for managing our datasets
  • You will collaborate with the ML team on various MLOps-related tasks, such as distributed model training
  • As part of continuous improvement, you will take ownership of relevant system components to improve their functionality, stability, and/or capability

Qualifications

Bachelor's degree in Computer Science, Software Engineering, or a related field. 5+ years of commercial software development experience.

Experience

  • Experience (5+ years) with Python. In addition, experience with Python data engineering and ML tools (Pandas, NumPy, …)
  • Experience operating, scaling, and optimizing databases and storage systems in the cloud. Strong AWS experience preferred
  • You know that deploying software is as much about the environment as it is about the code: experience with CI systems (Travis CI or CircleCI preferred), performance monitoring, log monitoring, and security

What we offer

  • An exciting work environment operating at the forefront of AI technology development
  • A company full of talented, curious, and friendly people
  • A competitive compensation package
  • Company-paid benefits for employees such as medical, dental, vision, disability, and life insurance
  • The opportunity to work fully remotely, with flexible time schedules
  • A workstation and tools of your choice
