xelys jobs

Principal MLOps Engineer

Raft

lead, devops, data | United States | Yesterday via LinkedIn

Tags

MLOps, Kubernetes, Docker, AWS, Azure, CI/CD, Model Serving, LLMs, Observability, Secure Supply Chain

About the role

Role overview

Principal MLOps Engineer (U.S.-based)

Raft is building mission-critical AI and data platforms for the Department of Defense (DoD). You will help design, deploy, and mature Raft’s end-to-end ML platform and the MLOps infrastructure that supports model development, evaluation, deployment, monitoring, and lifecycle management across cloud and constrained environments.

Responsibilities

  • Design, build, and maintain secure, scalable MLOps infrastructure and deployment pipelines for production ML systems
  • Mature internal ML platform capabilities across the model lifecycle (packaging, registry/catalog workflows, deployment, monitoring, operational support)
  • Deploy and manage ML workloads on Kubernetes, including GPU-enabled clusters
  • Build/maintain model serving and inference infrastructure for multiple ML use cases (traditional ML, computer vision, speech/audio, and LLM-based systems)
  • Create and operate CI/CD workflows for ML services, model artifacts, and platform components
  • Improve observability, reliability, security, and maintainability across ML infrastructure and services
  • Standardize runtime patterns, serving frameworks, and deployment architectures for production ML workloads
  • Contribute to infrastructure decisions across edge, on-prem, and cloud deployment environments
  • Support compliance-driven deployment practices and secure software supply chain requirements (defense environment)
  • Partner with ML engineers, software engineers, and product teams to move models from experimentation to reliable production deployment

Requirements

  • 7+ years of hands-on experience in software engineering, platform engineering, DevOps, MLOps, or related technical roles
  • 5+ years of experience with Docker and Kubernetes in production
  • 5+ years of experience supporting enterprise cloud infrastructure/applications in AWS, Azure, or similar environments

Nice-to-haves / additional signals

  • Experience evaluating and standardizing deployment/serving runtime patterns for ML at scale
  • Experience with secure production operations and compliance/supply-chain practices in regulated environments (defense-oriented)
  • Familiarity with GPU infrastructure, model serving, and observability for ML systems

Location / eligibility

  • U.S.-based role; requires U.S. citizenship, and all work must be performed within the continental U.S.

About Raft

Raft is a customer-obsessed, non-traditional defense tech company building AI/ML and data solutions for U.S. military and government agencies. The company focuses on autonomous data fusion and agentic AI, delivering cloud-native platforms and mission applications that support time-sensitive decision-making.

Scraped 4/23/2026
