Big Data Engineer
  • Python
  • Spark
  • SQL
  • Linux
  • Big Data
  • ETL
  • Apache Spark
Schneider Electric
Barcelona, Barcelona provincia
109 days ago
Schneider Electric is leading the Digital Transformation of Energy Management and Automation in Homes, Buildings, Data Centers, Infrastructure and Industries.


We believe that great people and partners make Schneider a great company and that our commitment to Innovation, Diversity and Sustainability ensures that Life Is On everywhere, for everyone and at every moment.

https://youtu.be/VbldHPFltQQ



In the Global Data Hub team, we are building the Intel Data Store (IntelDS), a global Data Lake for all enterprise data. It is a Big Data platform fully hosted on AWS and connected today to more than 40 data sources. The purpose of the job is to support the big data engineering team in building and improving IntelDS by:
  • Connecting new sources to enrich the data scope of the platform
  • Designing and developing new features, based on consumer application requests, to ingest data into the different layers of IntelDS
  • Automating the integration and delivery of data objects and data pipelines
The duties and responsibilities of this job are to prepare data and make it available in an efficient and optimized format for consumer analytics, BI, or data science applications. It requires working with the technologies currently used by IntelDS, in particular Spark, Presto, and RedShift in an AWS environment. This includes:
  • Design and develop new data ingestion patterns into the IntelDS raw and/or unified data layers, based on the requirements and needs of connecting new data sources or building new data objects. Working with ingestion patterns allows the data pipelines to be automated.
  • Participate in and apply DevSecOps practices by automating the integration and delivery of data pipelines in a cloud environment. This can include the design and implementation of end-to-end data integration tests and/or CI/CD pipelines.
  • Analyse existing data models, and identify and implement performance optimizations for data ingestion and data consumption. The objective is to accelerate data availability within the platform and to consumer applications. Target technologies are Apache Spark, Presto, and RedShift.
  • Support client applications in connecting to and consuming data from the platform, and ensure they follow our guidelines and best practices.
  • Participate in the monitoring of the platform and the debugging of detected issues and bugs.
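As a purely illustrative sketch of the "ingestion pattern" idea described above (all names, fields, and layout here are hypothetical, not actual IntelDS code), a config-driven pattern lets a new source be connected by adding configuration rather than writing a new pipeline each time:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class SourceConfig:
    # Hypothetical per-source configuration; a real platform's config will differ.
    name: str
    key_field: str

def ingest_to_raw(config: SourceConfig, records: list[dict]) -> dict[str, list[dict]]:
    """Land records in a 'raw' layer, partitioned by source and load date.

    The same function serves every source: only the SourceConfig changes,
    which is what makes the pipeline easy to automate.
    """
    partition = f"raw/{config.name}/load_date={date.today().isoformat()}"
    # Drop records missing the primary key; tag the rest with lineage metadata.
    landed = [
        {**r, "_source": config.name}
        for r in records
        if r.get(config.key_field) is not None
    ]
    return {partition: landed}

# Usage: connecting a new source is just supplying its config.
cfg = SourceConfig(name="crm_accounts", key_field="account_id")
raw = ingest_to_raw(cfg, [{"account_id": 1, "country": "ES"}, {"country": "FR"}])
```

In a production setting the same shape would typically be expressed with Spark DataFrames writing partitioned Parquet to S3, but the pattern (config in, partitioned raw data out) is the same.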

Qualifications

A minimum of 3 years' prior experience as a data engineer, with proven experience with Big Data and Data Lakes in a cloud environment. Bachelor's or Master's degree in computer science or applied mathematics (or equivalent).
Qualifications include:
  • Proven experience working with data pipelines / ETL / BI, regardless of the technology
  • Proven experience working with AWS, including at least 4 of: RedShift, S3, EMR, CloudFormation, DynamoDB, RDS, Lambda
  • Big Data technologies: one of Spark, Presto, or Hive
  • Python language: scripting and object-oriented programming
  • Familiar with SQL; data warehousing (RedShift in particular) is a plus
  • Familiarity with Git, Linux, and CI/CD pipelines is a plus
  • Autonomous, agile, shows initiative, and a team player

Primary Location: ES-Catalonia-Barcelona

Schedule: Full-time

Unposting Date: Ongoing
