Site Reliability Engineer | $70/hr Remote
Crossing Hurdles
full-remote, mid, contract, devops, backend | United States | 4 days ago via LinkedIn
40,000 - 84,000 USD/annual
Tags
Site Reliability Engineering, SRE, Docker, Kubernetes, Python, Bash, CI/CD, Automation, Infrastructure Troubleshooting, Version Control
About the role
Role Overview
Site Reliability Engineer (Hourly Contract) — Remote. You will help deploy, monitor, and recover containerized AI training environments, ensuring stability and strong performance.
Responsibilities
- Deploy, monitor, and recover containerized AI training environments
- Troubleshoot infrastructure bottlenecks and resolve system failures in real time
- Build and manage resilient systems for stability and performance optimization
- Collaborate with engineering teams to improve CI/CD pipelines and automation
- Manage filesystem structures, storage, and process scheduling in containerized environments
- Replan dynamically when runtime issues or system failures occur
- Document system processes, solutions, and best practices
Requirements
- Strong terminal-based system administration and troubleshooting experience
- Expertise with containerized environments (Docker and/or Kubernetes)
- Strong Python skills for scripting, automation, and debugging
- Proficiency in Bash and familiarity with additional programming languages
- Strong understanding of infrastructure, build systems, and version control
- Ability to manage infrastructure recovery in high-pressure scenarios
- Excellent written and verbal communication skills
Nice-to-haves
- Not explicitly stated
Hiring Process
- Apply via LinkedIn Easy Apply
- Email follow-up for next steps
- Resume evaluation and interview stage
Scraped 6/16/2026