Apple - Cupertino, CA
posted 2 months ago
As a Site Reliability Engineer (SRE) at Apple, you will play a crucial role in supporting and scaling cloud services used by thousands of development and operations engineers. This position is part of Apple's Cloud Service Infrastructure team, where you will be responsible for establishing SRE practices for a private cloud service. Your work will directly impact the reliability and consistency of application delivery across the organization. This hands-on role requires a self-motivated individual with a passion for excellence, quality, and detail. You will not only support operations but also collaborate closely with developers and architects to design and implement solutions that enhance stability, security, and scalability.

In this role, you will operate, monitor, and triage all aspects of both production and non-production environments. You will pioneer and implement the next-generation compute platform, and you will prepare alert handling procedures and runbooks in collaboration with offshore SRE teams. Automation will be a key focus: you will automate the deployment and orchestration of services into the cloud environment, along with other routine processes. You will also participate in workload balancing, scale testing, and disaster recovery exercises to ensure the robustness of the cloud services.

Additionally, you will work closely with partner teams, including engineering, QA, and program management, and nurture relationships with internal partners and external third-party vendors.