xelys jobs

MLOps Engineer

Scale.jobs

mid, permanent, backend, devops, data | Atlanta, GA | 4 days ago via LinkedIn

Tags

MLOps, Python, Kubernetes, Terraform, CI/CD, ML Pipelines, Feature Stores, Model Monitoring, Prometheus, Grafana

About the role

Role Overview

As an MLOps Engineer, you will bridge machine learning research with robust, production-grade systems. You’ll own the infrastructure, pipelines, and CI/CD workflows needed to deploy, monitor, and scale ML models across the organization, ensuring reproducible training and production inference that meets latency and reliability SLAs.

Responsibilities

  • Design, build, and maintain ML pipelines for automated retraining and batch inference (e.g., Kubeflow, Airflow, Prefect)
  • Develop and manage feature stores (e.g., Feast or Tecton) to keep feature engineering consistent between offline training and online serving
  • Deploy models as high-throughput, low-latency microservices using NVIDIA Triton, KServe, or FastAPI
  • Implement monitoring and alerting for model drift, data quality, and system performance using Prometheus, Grafana, and Evidently AI
  • Containerize ML workloads with Docker and orchestrate them on Kubernetes across multi-tenant environments
  • Build CI/CD pipelines for automated testing, integration, and deployment (e.g., GitOps, GitHub Actions, GitLab CI)

Requirements

  • 3–6 years of experience in DevOps, MLOps, or Software Engineering, with a strong focus on ML infrastructure
  • Proficiency in Python and familiarity with shell scripting
  • Experience with Infrastructure as Code using Terraform
  • Cloud platform familiarity: AWS, GCP, or Azure
  • Hands-on experience with Kubernetes and containerized application orchestration at scale
  • Familiarity with ML lifecycle tools such as MLflow, Weights & Biases, or SageMaker Pipelines
  • Strong software engineering practices: version control, automated testing, and code review

Bonus

  • Experience deploying LLMs (e.g., vLLM, Ollama)
  • Experience with distributed training frameworks (e.g., Ray, Spark)
  • BS/MS in Computer Science or related field

About Scale.jobs

Scale.jobs is hiring for an MLOps role focused on building production-grade machine learning infrastructure. The position centers on bridging ML research and operational systems by owning pipelines, deployment, monitoring, and scaling of ML models.

Scraped 6/16/2026
