What You’ll Get to Do:
Create and support real-time data pipelines built on AWS technologies, including Glue, Redshift/Spectrum, Kinesis, EMR, and Athena
Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL and AWS big data technologies
Continually research the latest big data and visualization technologies to provide new capabilities and increase efficiency
Collaborate with other tech teams to implement advanced analytics algorithms that exploit our rich datasets for statistical analysis, prediction, clustering, and machine learning
Help continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for customers
More About the Role:
The right candidate will help design, build, and modernize an existing high-profile legacy system in a cloud DevOps environment using available C2S services. As a data engineer, you will take an existing framework and execute against it, ultimately transforming unstructured data into structured, searchable, and tagged data that is more useful to the Program. The candidate will use C2S services in combination with third-party tools such as Spark, EMR, DynamoDB, Redshift, Kinesis, Glue, and Snowflake.
You’ll Bring These Qualifications:
TS/SCI with Poly clearance is required
Demonstrated strength in data modeling, ETL development, and data warehousing
Experience using big data technologies (Hadoop, Hive, HBase, Spark, etc.)
Knowledge of data management fundamentals and data storage principles
Experience using business intelligence reporting tools (Tableau, Business Objects, Cognos etc.)
Strong analytic skills related to working with unstructured datasets
Experience building and optimizing big data pipelines, architectures, and data sets
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management
Working knowledge of message queuing, stream processing, and highly scalable big data stores
Experience with relational SQL and NoSQL databases, including Postgres
Experience with data pipeline and workflow management tools
These Qualifications Would be Nice to Have:
Experience working with AWS data technologies (Redshift, S3, EMR)
Experience building and operating highly available, distributed systems for data extraction, ingestion, and processing of large data sets
Experience working with distributed systems as they pertain to data storage and computing
Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
CACI employs a diverse range of talent to create an environment that fuels innovation and fosters continuous improvement and success. At CACI, you will have the opportunity to make an immediate impact by providing information solutions and services in support of national security missions and government transformation for Intelligence, Defense, and Federal Civilian customers. CACI is proud to provide dynamic careers for employees worldwide. CACI is an Equal Opportunity Employer - Females/Minorities/Protected Veterans/Individuals with Disabilities.