Zscaler - Boston, MA
posted 3 months ago
As a Staff Site Reliability Engineer - Technical Duty Officer at Zscaler, you will play a pivotal role in leading the transformation of our Site Reliability Engineering (SRE) organization. This position is designed for an experienced professional who is passionate about promoting SRE principles within the Engineering Department.

You will be responsible for providing expert leadership during critical outages, coordinating multiple teams to ensure streamlined decision-making and quick resolution of issues. Your focus will be on fostering a customer-centric approach by addressing and mitigating global customer environment issues, while also promoting a culture of continuous learning and technical excellence within the SRE team.

In this role, you will develop and implement scalable process frameworks and observability strategies that ensure rapid problem diagnosis, response, and service reliability. Collaboration with product teams will be essential as you analyze failures and integrate insights to improve service reliability, scalability, and operational efficiency. Your expertise will be crucial in guiding the team through complex technical challenges and ensuring that our services remain robust and reliable for our customers.

This position requires a strong background in Site Reliability Engineering, with a minimum of 5 years of experience in an Operations or Engineering environment. You will need to have hands-on experience troubleshooting Linux-based systems, as well as a solid understanding of networking concepts and protocols. Coding experience, particularly in Python, will be beneficial as you build tools, scripts, and automation processes to enhance our operational capabilities.