The Data Engineer will design, create, manage and maintain the Adaptimmune enterprise data management environment, including all components involved in data ingestion, curation, storage, aggregation and provisioning. The company is designing and building a new data lake environment that will be the single source of truth for all cross-functional data analytics. Primary users include data scientists, functional analysts, bioinformaticians, biometricians, executives and research scientists. The role will be responsible for understanding the infrastructure, vendor architecture, data models, and data lake design and maintenance, and for working with clients to set up cross-functional data ponds that meet their business use cases. The role will be the internal expert on the data lake and data ponds and how they can best be leveraged to deliver maximum benefit to the business and Adaptimmune as a whole. There will also be scope to investigate the use of AI/ML techniques to drive more value from the data in the lake.
This is a highly technical role that blends traditional data management, database design, semantic modelling, information architecture and data analytics. Scope includes:
- Clear understanding of our current logical and physical architecture for the overall Adaptimmune data ecosystem and how the new data lake environment will deliver value
- Understand the vision, gather requirements, convert business needs into technical designs, and manage the implementation of those requirements
- Create requirements models for the business use cases and design an optimal delivery model, working closely with business SMEs in an agile approach to deliver solutions rapidly
- Create standard services that enable our self-service analytics and BI/dashboard offerings
- Work closely with our data scientists/bioinformaticians/biometricians to provide data ponds and analytics tools that meet their specialist requirements
- Vendor management of key platform providers that provide the data management capabilities, the middleware, metadata catalogue and other technology capabilities within the scope of the enterprise data management solution
- Design, creation and management of the enterprise data management environment, from the point of integration with the functional repositories to the delivery of cross-functional analytics solutions
- Hands-on database design and engineering activities, including but not limited to:
  - Design of the database models for the main validated and non-validated data lakes
  - Requirements definition, solution design and implementation of the data pipeline between the data repositories and the data lake
  - Creation of ETL and other transformation algorithms, working closely with the enterprise data architect, to align the data schemas and models of the repositories to the enterprise data model
  - Creation of aggregation routines to collect the required data to fulfil requests for specific data ponds
  - Creation of data pipelines to connect the data ponds to the analytics tools supported by IM, such as Spotfire, Tableau, R Shiny, JMP etc.
- Vendor management of providers of data lake and ponds capabilities and other solutions in the enterprise data management environment
- Working alongside data scientists, analysts, IT experts and stakeholders to build / improve the data analytics solutions
- Oversight of enterprise data processing platforms and tools (e.g. data lake, data ponds, ETL algorithms, integration technologies)
- Optimize the use of the enterprise data management capabilities to provide insight into our data assets and maintain an environment that meets the data access & security requirements
- Enterprise data strategy to drive further value from this new service, investigating areas such as leveraging the document model to deliver more value, using AI/ML techniques to provide insight into the data, and creating solutions from the cross-functional data (such as the full patient journey) that are difficult to attain at present
- Implement solutions to promote the company's policies, procedures and business rules around data governance, data access and data security
- Other duties as assigned by management in support of a rapidly growing company
QUALIFICATIONS & EXPERIENCE
- BSc or MSc degree in a relevant field, such as computer science, statistics, applied mathematics, computational biology, data management, data science, information systems, bioinformatics etc.
- Previous experience building a data warehouse or data lake environment and providing data analytics services
- Data and information modelling and architecture experience at an enterprise level
- Understanding of data lakes, semantic models, graph databases and applicability to data analytics use cases
- Familiarity with specialized commercial and open-source data visualization and analysis tools
- Knowledge of data governance essentials such as master data, ontologies, terminology standardization and data access & security requirements
- Demonstrated hands-on experience creating data models, data dictionaries, functional requirements, process maps, test plans and associated documentation
At Adaptimmune we embrace diversity and equality of opportunity. We believe that the more inclusive we are, the better our work will be. We welcome applications to join our team from all qualified candidates, regardless of age, colour, disability, marital status, national origin, race, religion, gender, sexual orientation, gender identity, veteran status or other legally protected category. It is our intent that all qualified applicants will receive equal consideration for employment.