Data Engineer

Location: Warszawa, mazowieckie

*** Mention DataYoshi when applying ***

External Job Description

Data Engineer (Python) – Warsaw

Real-World & Analytics Solutions (RWAS Technology).

We are seeking a Python Software Engineer to join our Predictive Analytics (PA) team in Warsaw. You will play an important role in our development activities, working closely with a team that is building innovative machine learning solutions to address some of the most pressing issues in healthcare, such as under-diagnosis of rare diseases and identifying patients at high risk of disease progression.

The PA team is currently based in London, Philadelphia and Warsaw. You will be expanding the software development and engineering function in Warsaw. There will be an ongoing need to collaborate closely with colleagues in London and Philadelphia on a daily basis.

Working with petabytes of data, modern distributed systems, advanced data science models and challenging requests in an agile environment, you will help to set standards for the development of data-driven analytics products. This crucial role will involve identifying opportunities for better data modelling, including processing, scaling, internal tooling, and other data engineering activities that will help our team maximise its efficiency in delivering data science projects. You will have the opportunity to provide technical guidance to data scientists in the UK and US delivery teams, and with your team you will set the standards for code quality and software architecture in the packages you maintain.

You will be responsible for creating tools to be used by data scientists as well as non-technical experts. Amongst other things, these tools will include:

  • Data engineering tools to enable complex querying and diverse feature engineering tasks on very large datasets (hundreds of millions of patients) from a Hadoop environment using Python and PySpark.
  • Analytical pipelines to support a range of analytical functions, from data inspection through to advanced machine learning solutions.
  • Proofs of concept to validate the use of new technologies, followed, where successful, by team-wide solutions that will allow us to scale more effectively and tackle more complex problems.


You will also be expected to:

  • Make high-quality contributions to software development activities, exhibiting pragmatism to maximise the impact and value of the output that you produce.
  • Help to establish a positive, open and high-performing culture in the new Warsaw team you will be joining.
  • Promote best-practice software development in Python, Spark, Hadoop and related technologies and participate in the full software life-cycle.
  • Test your own code comprehensively.
  • Engage in the team’s agile practices such as daily stand-ups, sprint planning, sprint refinements, and retrospectives; work to fortnightly sprints and be proactive in suggesting improvements to team processes that will make the team a stronger unit.
  • Work with your team lead to maintain a healthy scrum backlog, engaging in story refinement sessions and inputting your own ideas.

Our ideal candidate will have:

  • Substantial programming experience, preferably in Python.
  • Experience working with Spark in a commercial environment.
  • Ability to develop clean and testable code.
  • Experience with the Python data science ecosystem (pandas, PySpark, scikit-learn, TensorFlow).
  • Practical experience with the big data processing ecosystem, including tools like Spark, YARN, Hive, Impala, and HDFS.
  • Proficiency with relational databases and SQL.
  • Good understanding of code versioning tools such as Git, and proficiency with Linux.
  • Experience in following Scrum best practices.
  • Fluency in English (spoken and written).

We would also appreciate it if you have some of the following:

  • Experience with machine learning workflows and MLOps tools.
  • Experience working within, or in support of, a data science team with a machine learning focus.
  • Solid understanding of designing microservices-based applications.
  • Experience with workflow managers such as Airflow or Prefect.

We thank all applicants for their interest; however, only those selected for an interview will be contacted.

IQVIA is a strong advocate of diversity and inclusion in the workplace. We believe that a work environment that embraces diversity will give us a competitive advantage in the global marketplace and enhance our success. We believe that an inclusive and respectful workplace culture fosters a sense of belonging among our employees, builds a stronger team, and allows individual employees the opportunity to maximize their personal potential.



At IQVIA, we believe in pushing the boundaries of human science and data science to make the biggest impact possible – to help our customers create a healthier world. The advanced analytics, technology solutions and contract research services we provide to the life sciences industry are made possible by our 67,000+ employees around the world who apply their insight, curiosity and intellectual courage every step of the way. Learn more at

