MLOps Engineer
Scale.jobs
Mid-level, Permanent | Backend, DevOps, Data | Atlanta, GA | 4 days ago via LinkedIn
Tags
MLOps, Python, Kubernetes, Terraform, CI/CD, ML Pipelines, Feature Stores, Model Monitoring, Prometheus, Grafana
About the role
Role Overview
As an MLOps Engineer, you will bridge machine learning research with robust, production-grade systems. You’ll own the infrastructure, pipelines, and CI/CD workflows needed to deploy, monitor, and scale ML models across the organization, ensuring reproducible training and production inference that meets latency and reliability SLAs.
Responsibilities
- Design, build, and maintain ML pipelines for automated retraining and batch inference (e.g., Kubeflow, Airflow, Prefect)
- Develop and manage feature stores (e.g., Feast or Tecton) to keep feature engineering consistent between offline training and online serving
- Deploy models as high-throughput, low-latency microservices using NVIDIA Triton, KServe, or FastAPI
- Implement monitoring and alerting for model drift, data quality, and system performance using Prometheus, Grafana, and Evidently AI
- Containerize ML workloads with Docker and orchestrate them on Kubernetes across multi-tenant environments
- Build CI/CD pipelines for automated testing, integration, and deployment, following GitOps practices (e.g., GitHub Actions, GitLab CI)
Requirements
- 3–6 years of experience in DevOps, MLOps, or Software Engineering, with a strong focus on ML infrastructure
- Proficiency in Python and familiarity with shell scripting
- Experience with Infrastructure as Code using Terraform
- Familiarity with at least one major cloud platform (AWS, GCP, or Azure)
- Hands-on experience with Kubernetes and containerized application orchestration at scale
- Familiarity with ML lifecycle tools such as MLflow, Weights & Biases, or SageMaker Pipelines
- Strong software engineering practices: version control, automated testing, and code review
Bonus
- Experience deploying LLMs (e.g., vLLM, Ollama)
- Experience with distributed training frameworks (e.g., Ray, Spark)
- BS/MS in Computer Science or related field
About Scale.jobs
Scale.jobs is hiring for an MLOps role focused on building production-grade machine learning infrastructure. The position centers on bridging ML research and operational systems by owning pipelines, deployment, monitoring, and scaling of ML models.