Senior Site Reliability Engineer (SRE) Team Lead
XP Venture Labs
About the role
Role Overview
As Senior Site Reliability Engineer (SRE) Team Lead at XP Venture Labs, you will ensure the reliability, scalability, performance, and security of production systems while leading and mentoring an SRE team. The role combines hands-on SRE work with leadership and cross-functional collaboration to prevent incidents proactively and deliver measurable reliability improvements.
Responsibilities
- Own reliability, availability, and performance of production systems
- Define and manage SLAs, SLOs, SLIs, and error budgets
- Build and evolve monitoring, logging, and observability standards and metrics
- Lead incident response, postmortems, and root cause analysis to reduce recurrence and improve MTTR
- Architect and maintain scalable, highly available cloud infrastructure
- Champion Infrastructure-as-Code (IaC), automation, and CI/CD best practices
- Establish capacity planning and performance optimization strategies
- Mentor and develop an SRE team; set on-call and operational excellence standards
- Partner with Engineering, DevOps, Security, and Product to embed reliability into the SDLC
- Evaluate and implement new tools, technologies, and frameworks to improve resilience and efficiency
Requirements
- Deep expertise in AWS, including services such as EC2, ECS/EKS, Lambda, RDS, DynamoDB, S3, IAM, VPC, and networking
- Strong experience with Docker and Kubernetes for containerized applications
- Windows Server and IIS administration experience, plus PowerShell for Windows/legacy automation
- Experience with MS SQL Server performance tuning
- Performance monitoring experience in a .NET environment, including applications built with Angular and C# and their backend services
- Advanced IaC experience with Terraform, AWS CloudFormation, and AWS SAM
- Proven ability to architect secure, scalable, highly available AWS environments
- Experience deploying and operating serverless and event-driven architectures using AWS Lambda
Leadership & Collaboration
- Lead incident and reliability processes with measurable outcomes
- Mentor a high-performing SRE team and drive operational standards
- Work closely with cross-functional teams to improve reliability across the SDLC
About XP Venture Labs
XP Venture Labs partners with ambitious companies to solve complex technology challenges and accelerate growth. The firm embeds engineering teams as strategic partners, focusing on scalable systems, platform modernization, reliability improvements, and high-impact technical decisions across cloud and distributed systems.