Intermediate Site Reliability Engineer
GitLab
Full remote | Mid-level | Permanent | Backend, DevOps | Posted today via WTTJ
Tags
Site Reliability Engineering, SRE, Cloud Cost Management, FinOps, GCP, AWS, Terraform, Ansible, Grafana, Observability
About the role
Role Overview
Join GitLab as an Intermediate Site Reliability Engineer focused on cloud cost utilization. You’ll work cross-functionally with Engineering, Finance, and Product to improve cloud usage tracking, cost attribution, and ongoing optimization.
Key Missions / Responsibilities
- Collaborate with Engineering, Finance, and Product to improve cloud usage tracking and optimization.
- Design and maintain cloud resource tagging and labeling strategies across GCP and AWS for accurate cost attribution.
- Develop tooling and pipelines to ingest, normalize, and report on cloud billing data, leveraging the FinOps Open Cost and Usage Specification (FOCUS).
Profile / Requirements
- Experience designing or implementing cloud resource tagging/labeling strategies and driving adoption across teams.
- Familiarity with observability tooling, including Grafana, and an understanding of linking reliability and cost signals.
- Ability to work self-directed in a fully remote and asynchronous environment.
- Experience with infrastructure as code, including Terraform and Ansible.
- Hands-on experience in cloud cost management on GCP and/or AWS, including billing data, pricing models, and optimization approaches.
- Familiarity with (or strong interest in adopting) FinOps FOCUS for multi-cloud cost analysis.
- Ability to clearly communicate technical cost data to non-engineering audiences.
- Comfort partnering across technical and business functions (Engineering, Finance, stakeholders).
Nice-to-Haves
- Interest in the FinOps ecosystem, including FOCUS (the FinOps Open Cost and Usage Specification).
About GitLab
GitLab is a DevSecOps platform company that helps organizations plan, build, secure, and operate software using a single application lifecycle toolchain. Its offerings support cloud-native engineering workflows and continuous delivery practices across teams. The company operates with a remote-first culture and emphasizes collaboration across functions.
Scraped 5/13/2026