xelys jobs

Machine Learning Ops Engineer, Brand Concierge

Adobe

senior, permanent, devops, backend | San Jose, CA | 9 days ago via LinkedIn

Tags

MLOps, Machine Learning Operations, Kubernetes, Terraform, CI/CD, Python, Monitoring, LLMOps, RAG, MLflow

About the role

Role Overview

Join Adobe as a Machine Learning Ops Engineer to improve the operational reliability, scalability, and performance of AI systems across environments. You’ll automate and optimize the full machine learning lifecycle, from data pipelines and model deployment to monitoring, governance, and incident response.

What You’ll Do

Model Lifecycle Management

  • Manage model versioning, deployment strategies, rollback mechanisms, and A/B testing for LLM agents and RAG systems.
  • Coordinate model registries, artifacts, and promotion workflows with ML engineers.

Monitoring & Observability

  • Implement real-time monitoring for accuracy, latency, drift, and degradation.
  • Track conversation quality metrics and user feedback loops for production agents.

CI/CD for AI

  • Build automated pipelines for agent testing, validation, and deployment.
  • Add unit/integration tests for safe model and workflow rollouts.

Infrastructure Automation

  • Provision and manage scalable infrastructure, including Kubernetes clusters and serverless stacks, using Terraform.
  • Enable autoscaling, resource optimization, and load balancing for AI workloads.

Data Pipeline Management

  • Build and maintain ingestion pipelines for structured and unstructured sources.
  • Ensure reliable feature extraction, transformation, and data validation.

Performance Optimization

  • Optimize AI stack performance (model latency, API efficiency, GPU/compute utilization).
  • Drive cost-aware engineering across inference, retrieval, and orchestration layers.

Incident Response & Reliability

  • Create alerting and triage systems to detect and resolve production issues.
  • Maintain SLAs and develop rollback/recovery strategies.

Compliance & Governance

  • Enforce model governance, audit trails, and explainability standards.
  • Support documentation and alignment with frameworks such as GDPR and SOC 2.

What You Need To Succeed

  • 3–5+ years in MLOps, DevOps, or ML platform engineering.
  • Cloud infrastructure experience: AWS/GCP/Azure.
  • Container orchestration: Kubernetes.
  • Infrastructure as Code: Terraform (and experience with Helm).
  • ML serving/tooling familiarity: MLflow, Seldon, TorchServe, BentoML.
  • Python and CI/CD automation (e.g., GitHub Actions, Jenkins, Argo Workflows).
  • Monitoring/observability tools (e.g., Prometheus, Grafana, Datadog, ELK, Arize AI).

Preferred Qualifications

  • Experience with LLM apps, RAG pipelines, or AI agent orchestration.
  • Understanding of vector databases, embedding workflows, and retraining triggers.
  • Exposure to privacy, safety, and responsible AI operational practices.
  • BS or equivalent in Computer Science/Engineering or related field.

About Adobe

Adobe is a global software company that empowers people and businesses to create through digital platforms and creative tools. Its products span areas such as creativity, productivity, customer experiences, and enterprise experience management, increasingly powered by AI and personalized experiences.

Scraped 5/21/2026
