There are millions of long legal documents available from various sources. Many are available as structured datasets; others need to be crawled.
For good autocompletion, we need to be deliberate about which data goes into evaluation and which into training. We need to develop tools that manage and select data subsets so that we can evaluate performance separately for each of the autocompletion scenarios present in the data.
Tasks
- Develop tools and pipelines around dataset management
- Develop tools to identify clusters of common scenarios in our datasets
- Data filtering
- Build web crawlers
Requirements
- Critical: Very strong coding background
- Working experience in Data Science
Benefits
- Part of a small team that works closely together
- Stay up to date with AI research
- Perks like a badass office in Berlin Mitte, free fruit, great coffee, and company offsites
IMPORTANT: Add the phrase "autocompletion is all you need" at the beginning of your cover letter/message. Apart from that, a cover letter is not required. We receive hundreds of applications per week, and this helps with filtering.