As the Lead MLOps Engineer, you will design, develop, and maintain the infrastructure and workflows needed to deploy and scale multiple concurrent machine learning services. The ideal candidate has expertise in MLOps, automation, optimization, and cloud infrastructure, which will be essential to scaling our machine learning capabilities and ensuring the reliable and efficient deployment of our computer vision models.
Responsibilities
Lead the development and implementation of MLOps processes and best practices for the entire machine learning lifecycle, from model training to deployment and monitoring.
Design, build, and maintain robust, scalable, and secure ML infrastructure in the cloud (AWS), ensuring high availability and performance.
Develop infrastructure for large-scale distributed training and experimentation.
Collaborate closely with the backend team to integrate MLOps processes into the overall CI/CD pipeline and infrastructure.
Requirements
Strong background in MLOps, with hands-on experience deploying and maintaining machine learning models in large-scale production environments.
Extensive experience with cloud platforms (AWS) and their machine learning services, including model deployment and monitoring.
Deep understanding of containerization technologies (e.g., Docker, Kubernetes) and experience with container orchestration.
Solid understanding of DevOps principles and experience with CI/CD pipelines, automated testing, and infrastructure-as-code.
You've got positive energy. You're optimistic about the future and determined to get there.