Data Scientist - LLM

Job description

TikTok is the leading destination for short-form mobile video. At TikTok, our mission is to inspire creativity and bring joy. TikTok's global headquarters are in Los Angeles and Singapore, and its offices include New York, London, Dublin, Paris, Berlin, Dubai, Jakarta, Seoul, and Tokyo.

Why Join Us

Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible. Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve. Join us.

Our Data Science team is a diverse group of problem solvers, located in Singapore, China, Canada, and the US, who are passionate about translating complex data into clear, actionable insights. By pioneering state-of-the-art data science techniques and fostering a culture of data-driven decision-making, we aim to unlock unprecedented growth opportunities and operational excellence. We are responsible for developing innovative methods, models, and algorithms to ensure the supply of high-quality and diverse data for both supervised fine-tuning (SFT) and pretraining of LLMs/VLMs.

Responsibilities


  • Design and develop data collection pipelines to gather and preprocess diverse datasets from various sources.
  • Design and develop data processing pipelines, including data labeling, data filtering, data cleaning, data visualization, data auditing, etc.
  • Implement machine learning models to improve the quality and diversity of data.
  • Develop machine learning models and algorithms to detect issues in the current moderation system and the broader TikTok ecosystem.

Qualifications


  • Major in computer science or a related technical discipline.
  • Strong proficiency in building large-scale data processing pipelines, and familiarity with distributed workloads (e.g., multiprocessing).
  • Proficiency in at least one programming language commonly used in machine learning, such as Python, and the ability to write clean, maintainable code.
  • Proficiency in at least one deep learning framework, such as PyTorch.
  • At least 3 years of experience in at least one of the following areas: machine learning, pattern recognition, NLP, data mining, multimodality, LLM.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.
