As a Data Engineer, you will support GRI’s Global Data & Analytics team and be responsible for creating and maintaining databases, tables, data flows, and reporting dashboards covering every aspect of the company’s activities. You will understand the nuances of the data and ensure that each field in each table is properly named, populated, documented, and used. You will maintain high data quality and integrity by identifying and eliminating ambiguity, duplication, data errors, and inefficient data flows. The data that you maintain is a key strategic and competitive advantage for GRI and will form the backbone of the analysis that drives the company’s business decisions.
ESSENTIAL DUTIES AND RESPONSIBILITIES
- Design, develop, document, and test advanced data systems that bring together data from disparate sources, making it available to data scientists, analysts, and other users using scripting and/or programming languages (Python, Java, C, etc.)
- Evaluate structured and unstructured datasets utilizing statistics, data mining, and predictive analytics to gain additional business insights
- Design, develop, and implement data processing pipelines at scale
- Present programming documentation and design to team members and convey complex information in a clear and concise manner
- Extract data from multiple sources, integrate disparate data into a common data model, and integrate data into a target database, application, or file using efficient programming processes
- Write and refine code to ensure performance and reliability of data extraction and processing
- Communicate with all levels of stakeholders as appropriate, including executives, data modelers, application developers, business users, and customers
- Participate in requirements gathering sessions with business and technical staff to distill technical requirements from business requests
- Partner with clients to fully understand their business philosophy and IT strategy; recommend process improvements to increase efficiency and reliability in ETL development
- Collaborate with Quality Assurance resources to debug code and ensure the timely delivery of products
- Work with technologies that may include HDFS, Cassandra, Spark, Java, Scala, Informatica, SQL Server, Oracle, Ab Initio, and Kafka
KNOWLEDGE, SKILLS, AND ABILITIES
- A successful candidate will have hands-on experience in a multitude of domains, including but not limited to database design, data warehousing, business intelligence, big data, database tuning, application optimization, security, virtual computing and storage, incident tracking, and general database administration
- Three or more years of experience with enterprise data warehouses, business intelligence, and master data management (MDM)
- Familiarity with concepts in predictive analytics
- Experience building and optimizing big data pipelines, architectures, and data sets
- Expert-level knowledge of Microsoft SQL Server
- Expert-level knowledge of SQL administration, engineering, and monitoring tools
- Expert-level knowledge of designing, constructing, administering, and maintaining data warehouses
- Solid experience working with SSIS and SSRS or similar tools
- Solid experience with change control and agile methodologies
- Experience with performance tuning
- Passion for using data to effectively support business needs
- Excellent written and verbal communication skills
- Ability to juggle and prioritize multiple projects simultaneously
- Must be a motivated self-starter who can work independently but also seeks out opportunities to work collaboratively with others
- Must have experience working directly with senior business leaders to understand objectives and articulate the value of business intelligence solutions
- Must be comfortable dealing with changing priorities and timelines
- Experience with the following tools and technologies preferred:
  - Cloudera Hadoop, Spark, Kafka, NiFi, ElasticSearch, Hive, Solr
  - Relational SQL and NoSQL databases
  - Data visualization tools such as Tableau, Power BI, or similar self-service BI products
  - AWS cloud services such as EC2, EMR, RDS, and Redshift
  - Stream-processing systems such as Storm and Spark Streaming
- No management responsibilities
EDUCATION AND TRAINING
Bachelor's degree in Business Management, Computer Information Systems / Data Management, or another related field is strongly preferred