Your mission is to build high-quality, automated data pipelines and infrastructure to serve YouGov’s syndicated data products, including YouGov Profiles — one of the largest market research datasets in the world.
Working inside the Syndicated Data Infrastructure team, you will collaborate with data-minded people to convert vast troves of raw consumer data into meaningful insight by developing automated ETL applications, streaming data processors, RESTful microservices and browser-based user interfaces.
Day to day you will:
Build new features for our automated data infrastructure and its associated web interfaces, for example to support new data sources
Optimise ETL and web applications to increase performance, reliability and visibility
Support existing data pipelines, ensuring every deliverable reaches its intended destination
Collaborate with other teams to design appropriate, automated solutions for new data delivery and integration needs
You must:
Love writing beautiful, idiomatic Python
Be comfortable in the modern realm of test-driven, version-controlled software development
Have experience building substantial ETL pipelines and web applications
Enjoy solving complex technical problems
Be eager to develop new skills and expertise
Be proactive, positive and professional
Experience with the following technologies is desirable:
Luigi, Flask, SQLAlchemy, Pandas
Docker, Kubernetes
JavaScript, HTML, CSS
SQL, Postgres
Redis, RabbitMQ
Amazon S3, EC2, Redshift
Airflow, dbt
Snowflake
This role can be based in the UK or any EU country in which we have an entity (Spain, Italy and Poland, to name a few) and is 100% remote.
To find out how we collect and use your personal data when you apply for a role at YouGov, please read our privacy notice at https://jobs.yougov.com/privacy