xelys jobs

Site Reliability Engineer | $70/hr Remote

Crossing Hurdles

full-remote, mid, contract, devops, backend | United States | 4 days ago via LinkedIn
40,000 - 84,000 USD/annual

Tags

Site Reliability Engineering, SRE, Docker, Kubernetes, Python, Bash, CI/CD, Automation, Infrastructure Troubleshooting, Version Control

About the role

Role Overview

Site Reliability Engineer (Hourly Contract) — Remote. You will help deploy, monitor, and recover containerized AI training environments, ensuring stability and strong performance.

Responsibilities

  • Deploy, monitor, and recover containerized AI training environments
  • Troubleshoot infrastructure bottlenecks and resolve system failures in real time
  • Build and manage resilient systems for stability and performance optimization
  • Collaborate with engineering teams to improve CI/CD pipelines and automation
  • Manage filesystem structures, storage, and process scheduling in containerized environments
  • Perform dynamic replanning during runtime issues and system failures
  • Document system processes, solutions, and best practices

Requirements

  • Strong terminal-based system administration and troubleshooting experience
  • Expertise with containerized environments (Docker and/or Kubernetes)
  • Strong Python skills for scripting, automation, and debugging
  • Proficiency in Bash and familiarity with additional programming languages
  • Strong understanding of infrastructure, build systems, and version control
  • Ability to manage dynamic infrastructure recovery under high-pressure scenarios
  • Excellent written and verbal communication skills

Nice-to-haves

  • Not explicitly stated

Hiring Process

  • Apply via LinkedIn Easy Apply
  • Email follow-up for next steps
  • Resume evaluation and interview stage

Scraped 6/16/2026
