The Staff Data Engineer is responsible for designing, implementing, and managing robust data architecture and infrastructure. This role plays a key part in driving the organization's data strategy, ensuring the efficient flow and storage of data, and supporting data-driven decision-making.
Duties & Responsibilities:
Data Architecture Design:
Develop and maintain a scalable and efficient data architecture that meets the organization's current and future needs.
Design data models and schemas that support the storage, processing, and retrieval of data.
Data Integration:
Implement robust ETL (Extract, Transform, Load) processes to integrate data from various sources into a unified and consistent format.
Ensure seamless data flow between different systems, databases, and applications.
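The ETL duties above can be sketched in miniature. This is an illustrative example only, assuming a CSV source and a SQLite target (both stand-ins for whatever sources and warehouse the organization actually uses); the field names and dedup rule are hypothetical.

```python
import csv
import io
import sqlite3

# Illustrative raw input; a real pipeline would extract from APIs, files, or databases.
RAW_CSV = """id,name,signup_date
1, Alice ,2024-01-15
2,Bob,2024-02-01
2,Bob,2024-02-01
"""

def extract(text):
    """Extract: read rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace and drop duplicate ids to reach a consistent format."""
    seen, clean = set(), []
    for row in rows:
        rid = int(row["id"])
        if rid in seen:
            continue
        seen.add(rid)
        clean.append((rid, row["name"].strip(), row["signup_date"]))
    return clean

def load(rows, conn):
    """Load: write the unified rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The same extract/transform/load separation is what managed tools such as AWS Glue or SSIS orchestrate at scale.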
Data Quality and Governance:
Establish and enforce data quality standards and governance policies.
Implement data validation and cleansing processes to ensure the accuracy and reliability of data.
Implement and adhere to data tracking, lineage, and logging strategies that provide a single-pane-of-glass view of each data flow.
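The validation and cleansing duty above amounts to checking records against declared quality rules before they enter downstream systems. A minimal sketch, assuming a hypothetical customer record whose field names and rules are purely illustrative:

```python
from datetime import datetime

def _is_iso_date(value):
    """Return True if value parses as a YYYY-MM-DD date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False

# Hypothetical quality rules; a real deployment would source these from a governance catalog.
RULES = {
    "email": lambda v: isinstance(v, str) and "@" in v,
    "age": lambda v: isinstance(v, int) and 0 <= v <= 130,
    "signup_date": _is_iso_date,
}

def validate(record):
    """Return the list of fields that fail their quality rule."""
    return [field for field, ok in RULES.items() if not ok(record.get(field))]

good = {"email": "a@example.com", "age": 34, "signup_date": "2024-03-01"}
bad = {"email": "not-an-email", "age": 34, "signup_date": "2024-03-01"}
```

Records with a non-empty failure list would be quarantined or cleansed rather than loaded.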
Stay abreast of emerging technologies and trends in data engineering, machine learning (ML), and artificial intelligence (AI), and assess their applicability to the organization's data engineering needs.
Database Management:
Manage and optimize database systems, including relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
Implement database security measures and access controls.
Performance Optimization:
Identify and address performance bottlenecks in data processing and storage systems.
Optimize queries, indexing strategies, and data partitioning for improved performance.
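Index tuning of the kind described above can be demonstrated with SQLite's `EXPLAIN QUERY PLAN`; the table, column, and index names here are invented for the example, and production work would target the organization's actual warehouse (e.g., Redshift) instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, ts TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, ts) VALUES (?, ?)",
    [(i % 100, f"2024-01-{i % 28 + 1:02d}") for i in range(1000)],
)

def plan(sql):
    """Return the query planner's description of how it will execute sql."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM events WHERE user_id = 42"
before = plan(query)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
after = plan(query)   # with the index: an index search
```

Comparing the two plans is the basic workflow for diagnosing the bottlenecks mentioned above, whatever the engine.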
Scalability and Reliability:
Ensure that data systems are scalable to handle growing volumes of data.
Implement redundancy and fault-tolerance measures to enhance data system reliability.
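One common fault-tolerance measure of the sort listed above is retrying transient failures with exponential backoff. A minimal sketch; the helper name and the simulated flaky operation are illustrative, not a prescribed implementation:

```python
import time

def with_retries(fn, attempts=3, base_delay=0.01):
    """Call fn, retrying with exponential backoff on failure (a simple fault-tolerance measure)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # exhausted retries: surface the failure
            time.sleep(base_delay * 2 ** attempt)

# Simulated transient failure: succeeds on the third call.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky)
```

In practice this pattern is paired with redundancy (replicas, multi-AZ storage) so that retries have somewhere healthy to land.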
Security and Compliance:
Implement and monitor security measures to protect sensitive data.
Ensure compliance with data privacy regulations and industry standards.
Qualifications:
Experience with SQL (preferably Amazon Redshift SQL).
Experience with cloud-based data platforms (e.g., AWS).
Develop, optimize, and maintain ETL processes for data integration.
Solid, current skills in tools such as MuleSoft, SSIS, AWS Glue, or similar.
Familiarity with real-time data processing technologies.
Understanding of data modeling concepts and techniques.
Design and implement efficient and scalable data models in the cloud.
Establish data quality monitoring and validation processes.
Proficient in writing complex SQL queries.
Familiarity with version control systems like Git for managing codebase changes.
Bachelor’s degree (B.A.) from a four-year college or university, or an equivalent combination of education and experience.
10+ years of experience in data engineering or a related role.
Proficient in data modeling, ETL development, and database management.
Experience with big data technologies and distributed computing frameworks.