xelys jobs

MLOps / Infrastructure Engineer

10a Labs

full-remote, mid, permanent, backend, devops · United States · Yesterday via LinkedIn
130,000–230,000 USD per year

Tags

MLOps, Infrastructure Engineering, GCP, AWS, Terraform, Docker, Kubernetes, CI/CD, Observability, Vector Databases

About the role

Role overview

MLOps / Infrastructure Engineer (Remote, U.S.-based)

You’ll build and operate infrastructure for a real-time ML-powered content moderation system that detects and triages abuse, threats, and edge-case language. The role is hands-on and sits at the intersection of machine learning, systems, and product delivery, partnering with ML engineers, researchers, and clients.

Responsibilities

  • Design and maintain cloud infrastructure on GCP or AWS for:
    • real-time model serving
    • data ingestion and evaluation workflows
  • Deploy and optimize APIs for low-latency ML model access and embedding search systems
  • Manage the end-to-end training data flow (sourcing, cleaning, preparing for model consumption) with a focus on accuracy, scalability, and efficiency
  • Build observability tooling for production ML pipelines (latency, error rates, request volumes, drift)
  • Automate model deployment, retraining, and evaluation pipelines using CI/CD for ML
  • Work alongside ML engineers to package models for serving
  • Manage and optimize vector databases and semantic search infrastructure (e.g., Pinecone, FAISS, Vertex Matching Engine)
  • Ensure security, compliance, and uptime for safety-critical infrastructure

Requirements

  • 3–8 years of experience deploying ML systems or high-availability backend systems
  • Track record of shipping and maintaining production infrastructure at scale that supports ML workflows
  • Experience with GCP, AWS, or similar platforms (including managed ML services)
  • Proficient with Terraform, Docker, Kubernetes (or similar infrastructure tools)
  • Understanding of performance tradeoffs in model serving and embedding search pipelines
  • Ability to collaborate cross-functionally with ML, security, and product teams
  • Builder mindset and comfort working in ambiguous environments

Nice to have

  • Experience with vector databases or approximate nearest neighbor (ANN) systems, ideally on GCP or AWS
  • Experience serving LLMs or embedding-based models via API
  • Experience with monitoring, logging, and metrics tools (e.g., Prometheus, Grafana, Sentry)
  • Familiarity with trust & safety, abuse detection, or policy enforcement systems

First 3 months success criteria

  • Deployed and monitored a real-time ML inference system with clear observability
  • Implemented an API with <200 ms latency for embedding/classifier inference
  • Streamlined deployment and retraining workflows with ML engineers
  • Built logging/monitoring to understand performance and classifier behavior

Compensation & benefits

  • $130K–$230K base salary (dependent on experience and location)
  • Performance-based annual bonus
  • Professional development support (education, conferences, training)
  • Fully remote, U.S.-based
  • Comprehensive health, dental, and vision; generous PTO
  • 401(k) retirement plan

About 10a Labs

10a Labs provides a safety and threat-intelligence layer for frontier and enterprise AI teams. It supports adversarial red teaming, model evaluations, and intelligence collection to help organizations deploy AI systems safely and reliably.

Scraped 4/11/2026
