Senior MLOps Engineer

  • Full Time
  • Toronto

DeepRec.ai


This range is provided by DeepRec.ai. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range: CA$125,000.00/yr – CA$135,000.00/yr




Senior MLOps Engineer – Real-Time AI & Video Applications (Remote, On-Site, or Hybrid)

Office Location: Toronto

Job Type: Full-time

We’re hiring on behalf of an impressive AI company focused on real-time AI and video applications. Their team is made up of leading experts in computer graphics and generative modeling, and they are on a rapid growth trajectory. We’re looking for experienced MLOps Engineers who want to work on real-time AI applications that are shaping the future of media.



The Role

We’re looking for a talented MLOps Engineer to build and maintain robust machine learning pipelines and infrastructure. You’ll be working closely with AI researchers, data scientists, and software engineers to deploy state-of-the-art models into production, optimize real-time inference, and ensure systems scale effectively.



What You’ll Do

  • Design and optimize ML pipelines for training, validation, and inference
  • Automate deployment of deep learning and generative models for real-time use
  • Implement versioning, reproducibility, and rollback capabilities
  • Deploy and manage containerized ML solutions on cloud platforms (AWS, GCP, Azure)
  • Optimize model performance using TensorRT, ONNX Runtime, and PyTorch
  • Work with GPUs, distributed computing, and parallel processing to power AI workloads
  • Build and maintain CI/CD pipelines using tools like GitHub Actions, Jenkins, and ArgoCD
  • Automate model retraining, monitoring, and performance tracking
  • Ensure compliance with privacy, security, and AI ethics standards


What You Bring

  • 3+ years of experience in MLOps, DevOps, or AI model deployment
  • Strong skills in Python and frameworks like TensorFlow, PyTorch, ONNX
  • Proficiency with Docker, Kubernetes, and serverless architectures
  • Hands-on experience with ML tools (Argo Workflows, Kubeflow, MLflow, Airflow)
  • Experience deploying and optimizing GPU-based inference (CUDA, TensorRT, DeepStream)
  • Solid grasp of CI/CD practices and scalable ML infrastructure
  • Passion for automation and clean, maintainable system design
  • Strong understanding of distributed systems
  • Bachelor’s or Master’s in Computer Science or equivalent work experience


Bonus Skills

  • Experience with CUDA programming
  • Exposure to LLMs and generative AI in production
  • Familiarity with distributed computing (Ray, Horovod, Spark)
  • Basic networking knowledge

Please apply now for more details and next steps. We look forward to hearing from you.



Seniority level: Mid-Senior level

Employment type: Full-time

Industries: Technology, Information and Media, Information Services, and Consumer Services


