Senior MLOps Engineer

  • Full Time
  • Toronto

DeepRec.ai


This range is provided by DeepRec.ai. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range: CA$125,000.00/yr – CA$135,000.00/yr




Senior MLOps Engineer – Real-Time AI & Video Applications (Remote, On-Site, or Hybrid)

Office Location: Toronto

Job Type: Full-time

We’re hiring on behalf of an impressive AI company focused on real-time AI and video applications. Their team is made up of leading experts in computer graphics and generative modeling, and they are on a rapid growth trajectory. We’re looking for experienced MLOps Engineers who want to work on real-time AI applications that are shaping the future of media.



The Role

We’re looking for a talented MLOps Engineer to build and maintain robust machine learning pipelines and infrastructure. You’ll be working closely with AI researchers, data scientists, and software engineers to deploy state-of-the-art models into production, optimize real-time inference, and ensure systems scale effectively.



What You’ll Do

  • Design and optimize ML pipelines for training, validation, and inference
  • Automate deployment of deep learning and generative models for real-time use
  • Implement versioning, reproducibility, and rollback capabilities
  • Deploy and manage containerized ML solutions on cloud platforms (AWS, GCP, Azure)
  • Optimize model performance using TensorRT, ONNX Runtime, and PyTorch
  • Work with GPUs, distributed computing, and parallel processing to power AI workloads
  • Build and maintain CI/CD pipelines using tools like GitHub Actions, Jenkins, and ArgoCD
  • Automate model retraining, monitoring, and performance tracking
  • Ensure compliance with privacy, security, and AI ethics standards


What You Bring

  • 3+ years of experience in MLOps, DevOps, or AI model deployment
  • Strong skills in Python and frameworks like TensorFlow, PyTorch, ONNX
  • Proficiency with Docker, Kubernetes, and serverless architectures
  • Hands-on experience with ML tools (Argo Workflows, Kubeflow, MLflow, Airflow)
  • Experience deploying and optimizing GPU-based inference (CUDA, TensorRT, DeepStream)
  • Solid grasp of CI/CD practices and scalable ML infrastructure
  • Passion for automation and clean, maintainable system design
  • Strong understanding of distributed systems
  • Bachelor’s or Master’s in Computer Science or equivalent work experience


Bonus Skills

  • Experience with CUDA programming
  • Exposure to LLMs and generative AI in production
  • Familiarity with distributed computing (Ray, Horovod, Spark)
  • Basic networking knowledge

Please apply now for more details and next steps. We look forward to hearing from you.



Seniority level: Mid-Senior level

Employment type: Full-time

Industries: Technology, Information and Media, Information Services, and Consumer Services


