xelys jobs

Site Reliability Engineer

Hydrolix

Full-remote, senior, permanent, DevOps. United States. Today via LinkedIn

Tags

Site Reliability Engineering (SRE), Kubernetes, CI/CD, Prometheus, Grafana, Linux, Incident Response, Root Cause Analysis, AWS, SQL

About the role

Site Reliability Engineer (SRE)

Join Hydrolix’s Services team to improve the reliability, scalability, and operational excellence of its cloud data platform.

Responsibilities

  • Infrastructure Reliability: Deploy and maintain a reliable fleet of Kubernetes clusters and Hydrolix deployments across multiple cloud platforms.
  • Service Optimization: Design and maintain systems/processes that improve reliability, availability, and performance.
  • CI/CD Management: Build and optimize CI/CD tools and deployment workflows.
  • Monitoring & Incident Response: Create and manage monitoring, alerting, and incident response to minimize downtime and speed recovery.
  • Root Cause Analysis: Perform thorough root cause analyses and implement long-term preventive measures.
  • Automation & Efficiency: Automate repetitive work and optimize system performance.
  • On-call Support: Cover weekday business hours and once-monthly weekend shifts.

Collaboration

  • Partner with software engineering, infrastructure, and product teams to bake reliability into the development lifecycle.
  • Advocate for SRE best practices and promote operational excellence.
  • Work with a distributed global team for round-the-clock support.
  • Interface with customers to resolve incidents and ensure a seamless user experience.

Requirements

  • 5+ years of experience as an SRE (or equivalent) supporting complex distributed systems.
  • Hands-on experience with observability tools such as Prometheus, Vector, Grafana, Superset, or Kibana.
  • Proficiency with a major cloud platform (AWS, GCP, Azure, or Linode).
  • SQL database experience.
  • Programming skills in Python, Go, or Rust.
  • Strong Linux expertise, including performance tuning and system-level troubleshooting.
  • Excellent written and verbal communication skills with technical clarity for diverse audiences.

Nice-to-haves

  • Familiarity with PostgreSQL.

About Hydrolix

Hydrolix builds an innovative cloud data platform for petabyte-scale data management and analytics. The company focuses on helping organizations reduce data costs while improving data retention through reliable, scalable infrastructure.

Scraped 4/8/2026
