xelys jobs

Staff Site Reliability Engineer

SimSpace

Lead, Permanent, DevOps | United States | Yesterday via LinkedIn

Tags

Site Reliability Engineering (SRE), Kubernetes, CI/CD, Distributed Systems, DevOps, DevSecOps, SLI/SLO, Error Budgets, Jsonnet, Grafana Tanka

About the role

Role Overview

As a Staff Site Reliability Engineer (SRE) at SimSpace, you will define the technical vision, lead the infrastructure architecture, and secure the systems powering the SimSpace cyber range platform. This is a staff-level, force-multiplier role focused on strategic reliability, the operability of distributed systems at global scale, and long-term automation across varied deployment models (data centers, customer hardware, and air-gapped appliances).

Responsibilities

  • Technical Strategy & Architecture: Architect an infrastructure strategy for consistent, repeatable, and secure deployments across SimSpace-hosted data centers, customer-provided hardware, and air-gapped environments.
  • Platform Evolution & Configuration Management:
    • Lead evolution of CI/CD and Kubernetes platforms.
    • Drive application packaging, templating, and configuration management using Jsonnet and Grafana Tanka (with Kustomize).
    • Architect multi-cluster, multi-environment deployment frameworks to improve developer velocity.
  • Reliability Leadership:
    • Define, measure, and govern SLIs, SLOs, and Error Budgets.
    • Provide overarching technical leadership across SRE, DevOps, and DevSecOps practices.
  • Automation for Scale & Operability: Design long-term automation frameworks to make on-prem and appliance deployments robust, secure, and repeatable.

Requirements

  • Deep, staff-level experience as an SRE and a strong software engineering background.
  • Strategic thinking about distributed systems, reliability, and operability at global scale.

Nice-to-haves (implied by responsibilities)

  • Experience operating multi-cluster Kubernetes environments.
  • Hands-on familiarity with CI/CD, configuration management/templating, and reliability engineering (SLIs/SLOs/error budgets).
  • Experience with restricted, air-gapped, or highly regulated deployment environments.

About SimSpace

SimSpace is an AI proving ground that helps organizations train, test, and validate adaptive, AI-ready cyber defenses through realistic live-fire simulations. Trusted by allied governments, militaries, enterprises, and research institutions, it compresses cyber readiness cycles and supports security investment evaluation and performance optimization. The platform unifies training, testing, and validation across distributed simulation environments.

Scraped 6/13/2026
