xelys jobs

MLOps Engineer — AI/ML Systems & Deployment (TS/SCI Preferred)

Rackner

hybrid | mid | permanent | backend | devops | Dayton, OH | Today via LinkedIn

Tags

MLOps, Python, Kubernetes, Docker, Kubeflow, Airflow, Argo, MLflow, Prometheus, OpenTelemetry

About the role

Role Overview

Rackner is seeking an MLOps Engineer to deploy and manage the full lifecycle of production-grade AI/ML systems in a secure, mission-focused environment. This is not a research role—models must become reliable, deployable, and auditable.

Responsibilities

  • Own the ML lifecycle (end-to-end)
    • Build and operate production ML pipelines
    • Orchestrate workflows with Kubeflow, Airflow, or Argo
    • Implement model versioning, lineage, and reproducibility standards
  • Operationalize AI/ML systems
    • Deploy models into secure, constrained environments
    • Move from experimentation to containerized pipelines and production systems
    • Support batch and real-time inference architectures
  • Engineer for reliability
    • Ensure reproducibility, auditability, and stability
    • Monitor model performance and system health using Prometheus, Grafana, and OpenTelemetry
    • Detect and resolve issues like model drift and system degradation
  • Build cloud-native ML infrastructure
    • Deploy and manage Kubernetes-based ML workloads
    • Containerize pipelines with Docker
    • Support scalable training and inference workflows
  • Establish data discipline
    • Perform feature engineering and dataset preparation
    • Version and govern data (e.g., with lakeFS)
    • Apply metadata and data management standards
  • Create repeatable systems
    • Produce runbooks, playbooks, and documentation for operational sustainability

Requirements

  • Strong programming skills in Python
  • Experience deploying ML systems into production environments
  • Hands-on experience with:
    • ML pipeline orchestration tools: Kubeflow, Airflow, or Argo
    • Experiment tracking: MLflow or ClearML
  • Infrastructure & systems:
    • Kubernetes and containerized systems (Docker)
    • Familiarity with CI/CD pipelines
    • Understanding of distributed systems and scalable architectures
  • ML application exposure (deployment/integration focus):
    • LLMs / transformer-based models and/or
    • Computer vision systems (e.g., YOLO, Faster R-CNN)
  • Reliability-first mindset and ability to operate in complex, evolving environments

Clearance Requirements

  • Active TS/SCI clearance strongly preferred
  • Secret clearance candidates may be considered and supported for upgrade
  • Non-cleared candidates must be U.S. citizens eligible to obtain/maintain clearance and able to work in a CAC-enabled/secure environment

Why This Role

  • Build production systems rather than prototypes
  • Work across ML, infrastructure, and deployment pipelines
  • Develop high-demand MLOps expertise in constrained, high-trust environments

About Rackner

Rackner is a software consultancy building cloud-native solutions for startups, enterprises, and public sector organizations. The company focuses on distributed systems, DevSecOps, and AI/ML, delivering mission-oriented, outcome-driven systems that scale in real-world environments.

Scraped 4/9/2026
