PriceHubble

SENIOR DATA SCIENTIST - DATA INTELLIGENCE TEAM

Job description

Data is at the core of PriceHubble. We process a wide variety of data from multiple sources. As a data scientist in the data-intelligence team, you will have three main missions:

  • First, to augment the data we have via machine learning prediction.
  • Second, to develop techniques to measure, assess, and improve the quality of the data we have.
  • Third, to develop matching algorithms for linking data from heterogeneous sources.

As a senior data scientist, you are highly motivated by the following questions:

  • Before doing standard machine learning, how do I build a strong labeled data set from scratch?
  • Garbage in, garbage out: how do I measure the quality of labels in a data set? How do I improve it when I have very few labels to start with?
  • How can I go from no labels to the point where state-of-the-art machine learning can finally be leveraged?
  • How do I plan research projects spanning 3 months to 1 year in a way that structurally mitigates risk?
  • What should be the next steps in a research project? Where should we focus research efforts: the models, the labels/training data, feature engineering, post-processing, or elsewhere?

These questions are, in our opinion, the new frontier in data science. You will be joining a team that specializes in this topic, with advanced experience in, among other things, crowd-sourcing, matching problems, ensemble modeling, and statistical estimation. Our technologies and tools are just getting started. Feeling excited about it? Want to be part of the adventure? Hop in!

Responsibilities

  • Mentor more junior team members
  • Define roadmap & approaches for research projects
  • Actively mitigate risks in machine learning projects by tackling high-risk items first and making sure projects fail fast if they are likely to hit a structural blocker
  • Apply machine learning methods to augment data sets
  • Develop and improve models for cross-linking heterogeneous data sources
  • Analyse and detect problems in our estimators
  • Correct blind spots in our data labelling
  • Deploy, validate, and fine-tune crowd-sourcing jobs for acquiring labels

Requirements

  • MSc or PhD in Computer Science, Applied Mathematics, or a related field, with strong experience in machine learning and/or data science.
  • 3-5 years of experience in a data science, research (incl. PhD), or quantitative role.
  • In-depth understanding of basic data structures and algorithms.
  • Strong analytical skills with the ability to collect, organise, and analyse significant amounts of data with attention to detail and accuracy.
  • Strong programming experience with Python, and ability to write quality production code.
  • Experience with crowd-sourcing, active learning, semi-supervised learning, ensemble modeling, or matching problems is a plus.
  • Experience with the ETL and data processing tools we use (pandas, Airflow, PySpark) is an advantage.
  • Experience with standard ML frameworks (sklearn, TensorFlow, PyTorch, ...) is also a plus.
  • Comfortable working in English, with strong reading skills and a good spoken command of the language.
  • We are interested in every qualified candidate who is eligible to work in the European Union, but we are not able to sponsor visas.

Benefits

On top of joining a team of ambitious, qualified people, you may also enjoy our benefits:

  • Flexible work hours
  • Competitive salary
  • Casual dress code
  • L&D program
  • Well-located offices
  • Free snacks, fruits, coffee, beers, sodas

ADDITIONAL INFORMATION

  • Contract Type: Full-Time
  • Location: Paris, France (75002)