Big Data Engineer - (Amazon Deequ required skills) (Contract)
Who We Are
Teranet is Canada’s leader in the delivery and transformation of statutory registry services with extensive expertise in land and commercial registries. We also market insightful property and data solutions, as well as practice management automation to thousands of customers in the real estate, financial services, government, utilities, and legal markets.
Connect. Grow. Thrive Together.
To learn more about who we are, visit our website: www.teranet.ca
About the Role
Teranet is seeking an experienced Big Data Engineer with a focus on designing and implementing a data quality framework. In this role, you will collaborate with data product experts, data analysts, and other data engineers to design and develop a flexible and scalable data quality framework that captures data quality metrics and applies data quality rules to data pipelines, maintaining high-quality data in Teranet’s data platform. In addition, you will create and modify data pipelines and data models for data delivery, AI development, and BI data insight visualizations. You will be responsible for configuring tooling and frameworks that support data ingestion from various databases (Oracle, MS SQL Server, PostgreSQL), testing to ensure the accuracy and quality of data ingestion and data pipeline input/output, and curating data so that it is available to support various use cases. You will also collaborate with Teranet’s various infrastructure teams to ensure that proper data access controls are in place, data is properly secured, and access activities are auditable.
What You’ll Be Doing
- Participate in planning with business product owners and data analysts, and identify tasks for the data analytics team.
- Analyze requirements to identify triggers for capturing data quality metrics and assessing the quality of data ingested into the data platform.
- Design and develop a flexible and scalable data quality framework for capturing data quality metrics and running automated quality checks on ingested data.
- Design and develop programs to set up data pipelines, curate data for enterprise-wide use, and prepare data models for specific use cases.
- Develop test objectives, test plans, and success criteria (connectivity, data replication, auto fail-over, peak load performance, etc.).
- Work with infrastructure, security, and networking teams to ensure connectivity requirements are met for data pipeline sources and targets.
- Tune data ingestion and replication to meet performance targets.
- Configure the CDC framework as required to create daily/weekly/monthly data snapshots within acceptable performance targets.
- Design and implement technology best practices, guidelines, and repeatable processes. Create design, test plan, and Confluence documentation.
What You’ll Bring
- Ability to self-direct, prioritize, and perform assignments with minimal supervision
- 5+ years of experience with Hadoop, Hive, Spark, Python, Bash, Linux
- Familiarity with Cloudera CDP Private Cloud and Public Cloud
- Knowledge of and experience with Amazon Deequ, PyDeequ, or a similar data quality framework
- Knowledge of CDC-based data ingestion setup, preferably using HVR
- Deep Hive/Spark/SQL knowledge, development, and testing experience
- Expert Python/Spark/shell development skills and knowledge of coding best practices
- Excellent day-to-day working knowledge of Git with exposure to Gerrit
- Extensive experience in developing ETLs and processing large datasets
- Experience with Airflow data pipeline implementation
- Experience building data models to support BI data visualizations using Tableau
- Familiarity with AWS services such as S3 and EKS, and with kubectl, is highly beneficial
- Experience with Terraform and CI/CD tools is a plus
- Excellent written and verbal communication skills
We may be a global innovator in electronic services and solutions that operates one of the most advanced and secure registration systems in the world, but we’re so much more than that!
Our Extraordinary People.
Our Work Environment.
Company Culture & Core Values.
At Teranet, we are committed to fostering an inclusive, accessible environment, where all employees and customers feel valued, respected, and supported. We are dedicated to building a workforce that reflects the diversity of our customers and the communities in which we live and serve. If you require accommodation during the recruitment and selection process, please let us know and we will work with you to meet your needs.
Come As You Are. We Like You that Way!