- Develop robust, scalable and maintainable machine learning models to answer business problems against large data sets.
- Build methods for document clustering, topic modeling, text classification, named entity recognition, sentiment analysis, and POS tagging.
- Perform elements of data cleaning, feature selection and feature engineering and organize experiments in conjunction with best practices.
- Benchmark, apply, and test algorithms against success metrics. Interpret the results in terms of relating those metrics to the business process.
- Work with development teams to ensure models can be implemented as part of a delivered solution replicable across many clients.
- Knowledge of Machine Learning, NLP, Document Classification, Topic Modeling and Information Extraction with a proven track record of applying them to real problems.
- Experience working with big data systems and big data concepts.
- Ability to provide clear and concise communication both with other technical teams and non-technical domain specialists.
- Strong team player; ability to provide both a strong individual contribution but also work as a team and contribute to wider goals is a must in this dynamic environment.
- Experience with noisy and/or unstructured textual data.
knowledge graph and NLP including summarization, topic modelling etc
- Strong coding ability with statistical analysis tools in Python or R, and general software development skills (source code management, debugging, testing, deployment, etc.)
- Working knowledge of various text mining algorithms and their use-cases such as keyword extraction, PLSA, LDA, HMM, CRF, deep learning & recurrent ANN, word2vec/doc2vec, Bayesian modeling.
- Strong understanding of text pre-processing and normalization techniques, such as tokenization,
- POS tagging and parsing and how they work at a low level.
- Excellent problem solving skills.
- Strong verbal and written communication skills
- Masters or higher in data mining or machine learning; or equivalent practical analytics / modelling experience
- Practical experience in using NLP related techniques and algorithms
- Experience in open source coding and communities desirable.
Able to containerize Models and associated modules and work in a Microservices environment