Senior Data Engineer - Remote

Company: Johnson Controls
Location: Milwaukee, WI


Senior Data Engineer - Remote - WD30120363186


The future is being built today, and Johnson Controls is making that future more productive, more secure, and more sustainable. We are harnessing the power of cloud, AI/ML and data analytics, the Internet of Things (IoT), and user design thinking to deliver on the promise of intelligent buildings and smart cities that connect communities in ways that make people's lives and the world better.

What you will do

The Johnson Controls AI Hub's mission is to infuse AI capabilities into products through a collaborative approach, working alongside multiple business units. One of the Hub's charters is to create end-to-end enablers that streamline AI/ML operations, from data supply strategy and data discovery through model training and development to deployment of AI services in the cloud as well as at the edge.

At the AI Hub, we recognize that AI/ML solutions for smart buildings are significantly more accurate when built with a data-centric rather than a model-centric development approach. Having the right type and amount of labelled data available at the right time and place in the MLOps pipeline is crucial to building more accurate AI models. To this end, we are looking for a stellar, hands-on Senior Data Engineer to envision and architect data and analytics pipelines, create a common data catalog, and ensure data governance and security throughout the data lifecycle.

In this vital role, you will work with solution engineers, data scientists, product managers, and domain experts across JCI to implement a data supply and management strategy, including identifying data quality issues, in order to achieve specific business outcomes.

How you will do it

  • Using a combination of PaaS services and in-house microservices, build and own a highly scalable, secure, and cost-effective data platform for various AI solutions
  • Working with Product and Engineering teams, architect and implement end-to-end data pipelines that account for variability in data sources and collection policies across cloud and edge, transformations for better feature extraction, and serving infrastructure
  • Work with data scientists, MLOps engineers, and domain SMEs to understand how data availability and quality affect AI model performance
  • Evaluate vendors and open-source and proprietary technologies, present recommendations for onboarding potential vendors, and automate workflows with versioned experimentation, digital feedback, and monitoring

Qualifications

What we look for

Required

  • BS in Computer Science, Electrical or Computer Engineering, or Statistics, or a comparable degree with demonstrated technical abilities in similar areas
  • 5+ years of total experience building and managing data warehouses, data lakes, and big data platforms and applications
  • 1+ years of experience building high-throughput data pipelines, both edge-to-cloud and cloud-to-cloud
  • Deep experience in modern data warehousing methodologies and in designing RDBMS, time-series, NoSQL, and columnar data stores in distributed environments, taking into account sharding, throughput, and backup strategies
  • Hands-on experience building data platforms using PaaS services from public clouds such as Microsoft Azure, Amazon Web Services, or Google Cloud Platform
  • Experience working with message brokers, caches, queues, and pub/sub concepts, and implementing ETL using a microservices architecture
  • Fluency in languages and frameworks such as Python, Java, Scala, Spark/PySpark, Bash scripting, and SQL
  • Container experience with technologies such as Docker, Kubernetes, AKS, OpenShift, and Service Fabric
  • Knowledge of the Scrum/Agile development methodology
  • A passion for all things data, including analytics, RPA, AI/ML, and IoT
  • Strong spoken and written communication skills

Preferred Qualifications

  • 7+ years of total experience building and managing data warehouses, data lakes, and big data platforms and applications
  • 3+ years of experience building high-throughput data pipelines, both edge-to-cloud and cloud-to-cloud
  • Experience with Azure Data Catalog, Purview, or equivalent data governance tools
  • Experience working with schema registries and formats such as Avro and Parquet
  • Experience with Azure Synapse: serverless SQL pools, dedicated SQL pools, Spark pools, building data marts and materialized views, performance improvements, and query optimization
  • Experience with Kafka, ksqlDB, Azure Databricks, Azure Synapse, SQL Database, Data Factory, ADLS Gen2, Azure Functions, and Snowflake is a plus
  • Strong Azure DevOps skills: Git and complex deployment pipelines, at both the IaC level and the application/data pipeline/database deployment level

Johnson Controls is an equal employment opportunity and affirmative action employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, age, protected veteran status, status as a qualified individual with a disability, or any other characteristic protected by law. For more information, please view EEO is the Law. If you are an individual with a disability and you require an accommodation during the application process, please visit www.johnsoncontrols.com/tomorrowneedsyou.

Job: Engineering

Primary Location: US-WI-Milwaukee

Organization: Bldg Technologies & Solutions

