This job is closed

We regret to inform you that the job you were interested in has been closed. Although this specific position is no longer available, we encourage you to continue exploring other opportunities on our job board.

Palo Alto Networks • posted 2 months ago
$146,000 - $230,000/Yr
Full-time • Senior
Santa Clara, CA

About the position

Palo Alto Networks runs a large infrastructure and is one of the largest GCP customers. As a Principal Site Reliability Engineer for the CDL/SLS team, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, Vault, Kafka, MySQL, Python, Bash, and Go.

Responsibilities

  • Contribute to the success of SRE and DevOps
  • Develop expertise in new technologies
  • Work with developers, researchers, data scientists, and security experts
  • Design, build and operate reliable, secure Cloud infrastructure
  • Ensure that applications are production-ready, scalable, and reliable
  • Develop tools and automation frameworks
  • Automate the deployment of robust services
  • Orchestrate end-to-end monitoring and alerting
  • Participate with SRE and Dev teams in the on-call rotation
  • Lead root cause analysis of critical business and production issues

Requirements

  • 6+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering
  • 4+ years building highly available, scalable cloud-native applications on AWS and GCP
  • BS or MS in Computer Science or a related field, or equivalent professional or military experience, required
  • Expertise in configuration management with a framework such as Ansible, Terraform, or Helm
  • Passion for infrastructure and monitoring as code
  • Solid experience in container workloads and Kubernetes
  • Familiarity with PKI and networking concepts
  • In-depth knowledge of different security controls (App-ID, User-ID, security profile, URL category, content, SSL decryption, firewall MFA, etc.)
  • Linux administration, internals, and network troubleshooting
  • Proficiency with programming languages like Golang or Python along with shell scripting to automate tasks
  • Proficiency with CI/CD pipelines, ArgoCD, and GitLab CI/CD
  • Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions
  • Experience managing Kafka is a plus
  • Excellent written and verbal communication, able to collaborate and rally support
  • Self-disciplined, self-managed, self-motivated, strong sense of ownership, urgency, and drive
  • Ready to understand and dissect new technology stacks quickly

Benefits

  • FLEXBenefits wellbeing spending account with over 1,000 eligible items selected by employees
  • Mental and financial health resources
  • Personalized learning opportunities

Job Keywords

Hard Skills
  • Ansible
  • Computer Science
  • Go
  • Linux
  • Terraform