Zip Co Limited
Posted 19 days ago
$120,000 - $140,000/Yr
Mid Level

About the position

Join Zip's Infrastructure Engineering team and help build the reliability and scalability of our cloud-native platform, which serves millions of customers and processes billions in payments. We value pragmatic problem solvers who use code, systems design, and operational best practices to improve both developer experience and product uptime.

As a Site Reliability Engineer at Zip, you'll help ensure the reliability, performance, and scalability of our Azure-based infrastructure. You'll collaborate closely with software engineers to embed SRE best practices across the development lifecycle, define and track SLIs/SLOs, and maintain robust observability systems. This hands-on role involves building automated deployment pipelines using tools like Azure DevOps and Terraform, supporting a Kubernetes-based platform, and contributing to incident response and recovery efforts.

We're looking for someone with strong experience in cloud infrastructure, container orchestration, and infrastructure as code. In return, you'll join a fast-paced, supportive environment where you're trusted to drive impact, grow your skills, and be yourself.

Responsibilities

  • Ensure service reliability, availability, and performance in a growing Azure-based infrastructure
  • Collaborate with software engineers to integrate SRE practices across the development lifecycle
  • Define and track SLIs/SLOs and contribute to reliability goals using metrics and monitoring tools
  • Build and maintain automated deployment pipelines using Azure DevOps, Terraform, Env0, and Atlantis
  • Support a Kubernetes-based platform including service mesh technologies
  • Help design self-healing infrastructure and automated recovery using metrics and health checks
  • Participate in on-call rotations and contribute to incident response and post-incident reviews
  • Continuously improve observability, monitoring, and alerting systems

Requirements

  • 3+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering roles
  • 2+ years of hands-on experience with Kubernetes or similar container orchestration platforms
  • Strong experience working with Azure services and cloud infrastructure
  • Familiarity with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or similar
  • Understanding of CI/CD pipelines and experience with tools like Azure DevOps
  • Solid foundation in networking concepts, load balancing, and service communication
  • Experience with observability tools (e.g., Prometheus, Grafana, Azure Monitor) is a plus
  • Willingness to participate in an on-call rotation and respond to incidents

Benefits

  • Flexible working culture
  • Incentive programs
  • 20 days PTO every year
  • Generous paid parental leave
  • Leading family support policies
  • 100% employer-covered insurance
  • Beautiful Midtown office with a casual dress code
  • Learning and wellness subscription stipend
  • Company-sponsored 401(k) match

Job Keywords

Hard Skills
  • Ansible
  • Azure DevOps
  • Azure Monitor
  • Kubernetes
  • Terraform