Job Description

Project Length: 4 months

North Highland is seeking a Data Engineer with a strong technical background and practical business experience to design and implement data infrastructure for a conversational analytics project in a Google Cloud Platform (GCP) environment. The ideal candidate will have a proven track record of delivering data solutions in client environments while effectively communicating and collaborating with cross-functional teams.

Key Responsibilities:

  • Design and implement scalable data pipelines in GCP for processing large volumes of conversational data
  • Build and maintain data lakes and data warehouses using GCP services such as BigQuery and Cloud Storage
  • Develop ETL processes to integrate data from various sources, including speech-to-text APIs and natural language processing tools
  • Implement real-time data streaming solutions using tools like Cloud Pub/Sub and Dataflow (illustrated in the sketch after this list)
  • Optimize data models and query structures for improved performance and cost-efficiency in BigQuery
  • Collaborate with AI/ML engineers to prepare data for model training and deployment
  • Ensure data quality, security, and compliance with data governance policies
  • Implement logging and monitoring solutions for data pipelines and infrastructure
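
To illustrate the kind of work described above, here is a minimal sketch of a streaming pipeline in Apache Beam (Python) that reads conversation events from Cloud Pub/Sub and appends them to a BigQuery table. The project, topic, table, and field names are hypothetical placeholders, and the sketch assumes JSON-encoded messages; it is not the project's actual pipeline.

    # Minimal Apache Beam streaming sketch: Pub/Sub -> parse -> BigQuery.
    # All project, topic, table, and field names below are hypothetical.
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def parse_message(message: bytes) -> dict:
        """Decode one Pub/Sub message carrying a JSON conversation event."""
        event = json.loads(message.decode("utf-8"))
        return {
            "conversation_id": event["conversation_id"],
            "utterance": event["utterance"],
            "timestamp": event["timestamp"],
        }


    def run() -> None:
        # Runner flags (e.g. --runner=DataflowRunner --streaming) come from the CLI.
        options = PipelineOptions(streaming=True)

        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                    topic="projects/my-project/topics/conversation-events"  # hypothetical
                )
                | "ParseJson" >> beam.Map(parse_message)
                | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                    table="my-project:analytics.conversation_events",  # hypothetical
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                )
            )


    if __name__ == "__main__":
        run()

The same code runs unchanged on Dataflow, where the service handles autoscaling; the target BigQuery table is assumed to exist already, which is why the sketch uses CREATE_NEVER.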

Required Skills and Experience:

  • 5+ years of experience in a Data Engineering role, with at least 2 years working on GCP
  • Strong expertise in building and optimizing big data pipelines and architectures in cloud environments
  • Proficiency in SQL and experience with BigQuery
  • Experience with GCP data services such as Dataflow, Dataproc, and Cloud Composer
  • Knowledge of data modeling, data warehousing, and dimensional modeling techniques
  • Familiarity with NoSQL databases and data lakes
  • Experience with stream processing technologies like Apache Beam or Apache Kafka
  • Strong programming skills in Python or Java

Required Technologies:

  • Google Cloud Platform (GCP) services: BigQuery, Cloud Storage, Dataflow, Dataproc, Cloud Pub/Sub, Dataplex
  • Data processing frameworks: Apache Beam, Apache Spark
  • ETL tools: Cloud Data Fusion, Dataflow templates
  • Containerization: Docker, Kubernetes (GKE)
  • CI/CD tools: Cloud Build, Jenkins
  • Version control: Git
  • Languages: Python, SQL

Education and Certifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • Google Cloud Professional Data Engineer certification (preferred)
  • Additional certifications in data engineering or cloud technologies (preferred)
