Palo Alto Networks - Santa Clara, CA

posted 2 days ago

Full-time - Senior
Santa Clara, CA
Publishing Industries

About the position

Palo Alto Networks runs a large infrastructure and is one of the largest GCP customers. As a Principal Site Reliability Engineer, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab CI, ArgoCD, Prometheus, Grafana, Loki, Docker, GCP, AWS, Vault, Kafka, MySQL, Python, Bash, and Go.

Responsibilities

  • Contribute to the success of SRE and DevOps
  • Develop expertise in new technologies
  • Work with developers, researchers, data scientists, and security experts
  • Design, build and operate reliable, secure Cloud infrastructure
  • Ensure that applications are production-ready, scalable, and reliable
  • Develop tools and automation frameworks
  • Automate the robust deployment of services
  • Orchestrate end-to-end monitoring and alerting
  • Participate with SRE and Dev teams in the on-call rotation
  • Lead root cause analysis of critical business and production issues

Requirements

  • 7+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering
  • 7+ years building highly available, scalable cloud-native applications on AWS or GCP
  • BS or MS in Computer Science or a related field, or equivalent professional or military experience, required
  • Expertise in configuration management with a framework such as Ansible, Terraform, or Helm
  • Expertise in infrastructure automation tasks using Python and shell scripting
  • Experience in Site Reliability Engineering, Production Engineering, or DevOps
  • Expertise in public or private cloud
  • Solid experience in Kubernetes and containers
  • Linux administration, internals, and network troubleshooting
  • Proficiency with programming languages like Python, Java, Golang, and shell scripting to automate tasks
  • Experience with CI/CD pipelines, GitLab and ArgoCD preferred
  • Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions
  • Excellent written and verbal communication, able to collaborate and rally support
  • Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive
  • Passion for infrastructure and monitoring as code
  • Ready to understand and dissect new technology stacks quickly

Benefits

  • FLEXBenefits wellbeing spending account with over 1,000 eligible items
  • Mental and financial health resources
  • Personalized learning opportunities