Palo Alto Networks • Posted about 1 month ago
$146,000 - $230,000/Yr
Full-time • Senior
Santa Clara, CA

About the position

Palo Alto Networks runs a large infrastructure and is one of the largest GCP customers. As a Principal Site Reliability Engineer for the CDL/SLS team, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, Vault, Kafka, MySQL, Python, Bash, and Go.

Responsibilities

  • Contribute to the success of SRE and DevOps
  • Develop expertise in new technologies
  • Work with developers, researchers, data scientists, and security experts
  • Design, build and operate reliable, secure Cloud infrastructure
  • Ensure that applications are production-ready, scalable, and reliable
  • Develop tools and automation frameworks
  • Automate the deployment of robust services
  • Orchestrate end-to-end monitoring and alerting
  • Participate with SRE and Dev teams in the on-call rotation
  • Lead root cause analysis of critical business and production issues

Requirements

  • 6+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering
  • 4+ years building highly available, scalable cloud-native applications on AWS and GCP
  • BS or MS in Computer Science or a related field, or equivalent professional or military experience, required
  • Expertise in configuration management with a framework such as Ansible, Terraform, or Helm
  • Passion for infrastructure and monitoring as code
  • Solid experience in container workloads and Kubernetes
  • Familiarity with PKI and networking concepts
  • In-depth knowledge of different security controls (App-ID, User-ID, security profile, URL category, content, SSL decryption, firewall MFA, etc.)
  • Linux administration, internals, and network troubleshooting
  • Proficiency with programming languages like Golang or Python along with shell scripting to automate tasks
  • Proficiency with CI/CD pipelines, including ArgoCD and GitLab CI/CD
  • Ability to diagnose and troubleshoot complex distributed systems handling high volume transactions
  • Experience with managing Kafka is a plus
  • Excellent written and verbal communication, able to collaborate and rally support
  • Self-disciplined, self-managed, self-motivated, strong sense of ownership, urgency, and drive
  • Ready to understand and dissect new technology stacks quickly

Benefits

  • FLEXBenefits wellbeing spending account with over 1,000 eligible items selected by employees
  • Mental and financial health resources
  • Personalized learning opportunities

Job Keywords

Hard Skills
  • Ansible
  • Computer Science
  • Go
  • Linux
  • Terraform