Principal Site Reliability Engineer (AIOps)
Palo Alto Networks
full-remote, lead, permanent, devops, backend | Full remote | Yesterday via WTTJ
Tags
Site Reliability Engineering, AIOps, Kubernetes, Docker, GCP, AWS, Python, Golang, CI/CD, Terraform
About the role
Role overview
You will join Palo Alto Networks as a Principal Site Reliability Engineer (AIOps), supporting services that run on the company's infrastructure. The focus is on automation, architecture, performance, metrics, troubleshooting, security, and reliability, using modern cloud and container technologies.
Key missions
- Contribute to the success of SRE and DevOps by building expertise in new technologies.
- Design, build, and operate reliable and secure cloud infrastructure so applications are production-ready.
- Develop automation tools and frameworks and automate robust service deployments.
Responsibilities
- Own production engineering activities across reliability, performance, and incident troubleshooting.
- Build tooling and practices for monitoring and reliability (including “monitoring as code”).
- Troubleshoot complex distributed systems that handle high-volume transactions.
Requirements
- Familiarity with CI/CD pipelines (preferred: GitLab and GitHub).
- Strong Linux administration including internals and network troubleshooting.
- Configuration management expertise with tools/frameworks such as Ansible, Terraform, and Helm.
- Ability to diagnose and troubleshoot complex distributed systems.
- Strong written and verbal communication; able to collaborate and rally support.
- Ability to quickly understand and dissect new technology stacks.
- Experience in Production Engineering, DevOps, or Site Reliability.
- Passion for infrastructure and monitoring as code.
- Expertise with private or public cloud.
- Programming skills for automation: Python, Golang, and shell scripting.
- Self-managed and self-motivated, with a strong sense of ownership and urgency.
Preferred and related technologies
- Containers and orchestration: Kubernetes, Docker
- Cloud providers: GCP, AWS
About Palo Alto Networks
Palo Alto Networks is a cybersecurity company that provides security platforms and services for protecting organizations across cloud, network, and endpoint environments. The role focuses on building and operating reliable, secure infrastructure supporting production services.
Scraped 5/14/2026