Apple - San Diego, CA
posted 3 months ago
The Apple Information Apps Engineering teams power some of the most widely used applications at Apple, including Apple News, Stocks, Weather, and Books. Operating at massive global scale, we meet high expectations through a commitment to best practices, which lets us deliver a vast array of information that people in over 150 countries use every day.

We are seeking an experienced and dynamic Site Reliability Engineer (SRE) Operator to join our team, focusing on the reliability, availability, and performance of our systems. The ideal candidate will have a strong background in production monitoring, a deep understanding of development and operations, and a proven track record of managing large-scale production environments.

As part of our highly collaborative team, you will work closely with partner teams to achieve the best results for Apple. We prioritize effective, efficient solutions to each engineering challenge we encounter, and good ideas are valued and rewarded within our team culture.

In your role as an SRE at Apple, you will be responsible for operating, monitoring, and triaging all aspects of our production and non-production environments. You will pioneer and implement the next-generation telemetry system for Apple News, Stocks, Weather, and Books, prepare alert handling procedures and runbooks, and collaborate with our offshore SRE team. Additionally, you will automate the deployment and orchestration of services into the cloud environment, participate in capacity planning and disaster recovery exercises, and support partner teams, including engineering, SRE, QA, and project management, by creating self-service solutions for them. Building and maintaining relationships with internal and external third-party vendors will also be a key part of your responsibilities.