Do you have a passion for data and would like to write your master's thesis at a leading global skin care company?
Come and join our Research & Development Department and write your master's thesis on “Supervised label prediction for cosmetic patents using machine learning”.
The focus of this thesis is to analyze a patent dataset in order to predict labels and automate a labeling process that is currently done manually. The goal is to identify and test different machine learning algorithms to achieve an F1 score above 97% (current standard in patent classification). The features used to label the patents will come from patent codes (e.g. IPC, CPC) and text data (e.g. patent claims, full text). A second labeling task concerns specific information (e.g. cosmetic substances) within the independent claims of these patents. You should therefore have experience in the field of NLP and be familiar with techniques such as word embeddings and named entity recognition. Given the size of the dataset, you may need to train your models on GPUs using CUDA.
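For orientation, the F1 score named as the target is the harmonic mean of precision and recall. The sketch below shows one way to compute it per label and as a macro average in plain Python. The labels ("skin", "hair"), the function names, and the choice of macro averaging are assumptions made for illustration only; they are not taken from the actual dataset or evaluation setup.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one label, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_score(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical manual labels and model predictions for five patents.
manual = ["skin", "skin", "hair", "skin", "hair"]
predicted = ["skin", "hair", "hair", "skin", "skin"]

print(f1_score(manual, predicted, "skin"))
print(macro_f1(manual, predicted))
```

With several labels, the averaging scheme (macro, micro, or weighted) changes the reported number, so the 97% target only becomes comparable once that choice is fixed.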
During your time at Beiersdorf you would work on the following tasks: