Job description

Lead Data Engineer


Come join our diverse Advance AI Team and start reimagining the future of the automotive aftermarket. We are a highly motivated tech-focused organization, excited to be in the midst of dynamic innovation and transformational change.

Driven by Advance’s top-down commitment to empowering our team members, we are focused on delighting our Customers with Care and Speed, through delivery of world class technology solutions and products.

We value and cultivate our culture by always seeking to be collaborative, intellectually curious, fun, open, and diverse. As a Data Engineer on the Advance AI team, you will be a key member of a growing, passionate group that collaborates across business and technology teams to drive forward key programs and projects, building enterprise data and analytics capabilities across Advance Auto Parts.


As the Lead Data Engineer, you will be hands-on using AWS, SQL, and Python daily with a team of Software, Data, and DevOps Engineers. This position has access to massive amounts of customer data and will be responsible for the end-to-end management of data from various sources. You will work with structured and unstructured data to solve complex problems in the aftermarket auto care industry.

Essential Duties and Responsibilities include the following; other duties may be assigned:


  • Build processes supporting data transformation, data structures, metadata management, dependency, and workload management.
  • Build, test, and debug website tags, and support digital marketing campaigns by automating the process
  • Implement cloud services such as AWS EMR, EC2, Snowflake, Elasticsearch, and Jupyter notebooks
  • Develop stream-processing systems: Storm, Spark Streaming, Kafka, etc.
  • Deploy data pipeline and workflow management tools: Azkaban, Luigi, Airflow, Jenkins
  • Scale and deploy AI products for customer-facing applications impacting millions of end users
  • Develop the architecture for deploying AI products that can scale and meet enterprise security and SLA standards
  • Develop, construct, test, and maintain optimal data architectures.
  • Identify, design, and implement internal process improvements such as automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
  • Perform hardware provisioning, forecast hardware usage, and manage to a budget
  • Apply security standards such as symmetric and asymmetric encryption, virtual private clouds, IP management, LDAP authentication, and other methods
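To give a concrete flavor of the data-transformation work described above, here is a purely illustrative sketch using only the Python standard library. The record fields and values are hypothetical, not Advance's actual data model:

```python
# Illustrative only: a minimal batch data-transformation step of the kind
# described above. Field names (part_id, price, store) are hypothetical.
import csv
import io

RAW = """part_id,price,store
1001, 19.99 ,NC01
1002,,VA02
"""

def transform(rows):
    """Normalize raw part records: strip whitespace, coerce types,
    and drop rows with missing prices."""
    for row in rows:
        price = row["price"].strip()
        if not price:
            continue  # skip incomplete records
        yield {
            "part_id": int(row["part_id"]),
            "price": float(price),
            "store": row["store"].strip(),
        }

records = list(transform(csv.DictReader(io.StringIO(RAW))))
print(records)  # the row with a missing price is dropped
```

In a production pipeline this shape would typically run inside a workflow tool such as Airflow, with the raw input read from S3 or a message queue rather than an inline string.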


  • Share outcomes through written communication, communicating effectively with both business and technical teams
  • Succinctly communicate timelines, updates, and changes to existing and new projects and deliverables in a timely fashion


  • Be a self-starter who is comfortable with ambiguity and enjoys working in a fast-paced, dynamic environment.
  • Advanced SQL knowledge and experience with relational databases and query authoring, as well as working familiarity with a variety of databases.
  • Working knowledge of a tag management platform. Preferred: Tealium
  • Strong analytic skills related to working with unstructured datasets.
  • Experience building and optimizing ‘big data’ data pipelines, architectures, and data sets.
  • Experience with cloud services such as AWS EMR, EC2, and Jupyter notebooks
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
  • Solid understanding of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • Knowledge or experience with Agile methodologies is a plus.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • Experience using the following software/tools:
    • Strong experience with AWS cloud services: EC2, EMR, Snowflake, Elasticsearch
    • Experience with stream-processing systems: Storm, Spark-Streaming, Kafka etc.
    • Experience with object-oriented/functional scripting languages: Python, JavaScript, etc.
    • Experience with big data tools: Spark, Kafka, Kubernetes, etc.
    • Experience with relational SQL and NoSQL databases, including Postgres and DynamoDB.
    • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
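
As context for the stream-processing and message-queuing experience listed above, the following sketch reduces the producer/consumer pattern behind systems like Kafka and Spark Streaming to the Python standard library. It is illustrative only; the event shape and store codes are hypothetical:

```python
# Illustrative only: the producer/consumer shape behind stream processors,
# using a stdlib queue as a stand-in for a Kafka topic.
import queue
import threading

topic = queue.Queue()   # stands in for a Kafka topic
SENTINEL = object()     # end-of-stream marker

def producer(events):
    """Publish events to the topic, then signal end of stream."""
    for event in events:
        topic.put(event)
    topic.put(SENTINEL)

def consumer(out):
    """Consume events and keep a running count per store."""
    while True:
        event = topic.get()
        if event is SENTINEL:
            break
        out[event["store"]] = out.get(event["store"], 0) + 1

counts = {}
events = [{"store": "NC01"}, {"store": "VA02"}, {"store": "NC01"}]
t = threading.Thread(target=consumer, args=(counts,))
t.start()
producer(events)
t.join()
print(counts)  # {'NC01': 2, 'VA02': 1}
```

Real stream processors add what this sketch omits: partitioning, fault tolerance, and windowed aggregation, which is why tools like Kafka and Spark Streaming exist.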


  • Bachelor’s Degree in Computer Science or a related field required; Master’s preferred
  • 6-8+ years of direct experience
  • Or an equivalent combination of experience and education

Supervisory Responsibilities

  • This position will not be responsible for managing a team
