xelys jobs

Staff Machine Learning Systems Engineer (MLOps)

hims & hers

lead, permanent, backend, devops | United States | 7 days ago via LinkedIn

Tags

MLOps, Kubernetes, EKS, GitOps, Helm, Kustomize, CI/CD, OpenTelemetry, Observability, LLM Infrastructure

About the role

Role overview

As Staff Machine Learning Systems Engineer (MLOps) at Hims & Hers, you will design, build, and operate the production infrastructure that powers AI across the organization, ensuring systems are reliable, observable, secure, and compliant in a regulated healthcare environment.

Responsibilities

  • Own and scale the AI compute & deployment platform

    • Operate and evolve containerized deployment for AI workloads (Kubernetes/EKS), including cluster operations, node lifecycle, autoscaling (Karpenter), storage (EBS CSI), and staging/production isolation.
    • Build and maintain GitOps-based deployment pipelines using Helm/Kustomize overlays and environment promotion.
    • Design ephemeral/preview environments, feature-branched deployments, and nightly release pipelines.
    • Drive efficiency and cost management for compute, autoscaling, and inference.
  • Build and operate inference & model-serving infrastructure

    • Operate and scale inference infrastructure and a multi-provider LLM gateway (e.g., Bedrock, Vertex), including credentials, rate limits, and failover.
    • Implement reliable serving patterns for LLM workflows (routing, grounding, tool execution, context assembly).
    • Create reusable infrastructure abstractions and contracts to standardize how AI services are deployed and consumed.
  • Own observability, tracing, and reliability

    • Provision and scale AI observability and tracing (Langfuse, Datadog/dx-trace, OpenTelemetry/OTLP) and supporting datastores (e.g., ClickHouse).
    • Build monitoring/analytics pipelines for latency, errors, quality, and regression signals.
    • Define SLOs, alerting, on-call runbooks, and incident response for AI infrastructure/services.
  • Partner with stakeholders

    • Collaborate with ML engineers, product engineers, and clinical teams to ensure AI systems remain reliable, observable, secure, and trustworthy.

About hims & hers

Hims & Hers is a health and wellness platform focused on redefining access to care by making it affordable, accessible, and personalized. The company supports patients from diagnosis through treatment to delivery and uses technology to improve health outcomes. It is a public company listed on the NYSE (HIMS).

Scraped 6/18/2026
