Data Engineer - Remote
Role: Data Engineer
Duration of work order: 9 months
Experience: 5+ years in an enterprise data science or data engineering role
Skills:
- University degree in information technology, machine learning, data science, or a related field.
- Expert knowledge of SQL, Python and related ML packages.
- Expert knowledge of solution architecture in Amazon Web Services.
- Expert knowledge of cloud-based data engineering.
- Knowledge of CI/CD tools and analytics model deployment.
- Desirable: Knowledge of provisioning of data APIs.
- Desirable: Experience with Natural Language Processing tools and techniques.
- Desirable: Experience with data visualization tools and methods.
- Desirable: Proven ability to communicate with business experts and present ML results.
Software development languages and tools: Amazon Web Services, SQL, Python, Jupyter, Athena, Glue, Spark, CloudFormation.
Location: Nearshore or Offshore
Duration: 01/04-21/12, 2024
Project objectives:
This role focuses on enhancing the existing Enterprise Analytics Platform by deploying new services and data pipelines for the strategic application of data science, while ensuring customer privacy and security. The Data Engineer is primarily responsible for implementing data pipelines and for using data mining and predictive modelling techniques to gain insights, predict behaviors, and generate value from customer data. The Data Engineer works within the Information and Communication Technology Department, under the supervision of the Senior Data Platform Specialist.
Tasks:
The Data Engineer provides the expertise required to design and deploy data products in the Enterprise Analytics Platform, using an iterative/agile approach.
- Extend the Platform by implementing new functionality using an infrastructure-as-code approach.
- Deploy ML models and ETL jobs/workflows using a CI/CD pipeline.
- Identify and address privacy and confidentiality concerns. Collaborate on establishing data-handling practices that ensure the Organization’s information security guidelines are followed.
- Establish model drift monitoring strategies.
- Use Natural Language Processing to establish relationships between multiple datasets.
- Identify and communicate how structured and unstructured data can be used to enrich datasets and increase knowledge about the customer.
- Define relevant features for data modelling and participate in their implementation.
- Provide time-sensitive analyses of various customer and business questions.
- Prepare reproducible analyses using data science notebooks.
- Prepare concise data reports and present information using visualization techniques to management and stakeholders.
- Perform other related duties as required.
Deliverables:
- Issue reports managed via JIRA.
- Data transformation jobs and workflows.
- Advanced models and specialized analyses for data collation, migration and deduplication.
- API endpoints developed to fetch data from the data lake and retrieve inference results.
- Reproducible experiments in Jupyter notebooks.
- Detailed use case diagrams, data flow charts, and architectural schematics.
- User guides and documentation accompanying ad-hoc analyses.