Apple - Austin, TX
posted 4 months ago
As a Site Reliability Engineer (SRE) in the Retail Engineering team, you will play a crucial role in ensuring the reliability, scalability, and performance of the systems and services that integrate Apple Retail Stores and the Apple Online Store with major US carriers for iPhone activations. This position requires a talented individual who can thrive in a dynamic environment and make a meaningful impact through technical expertise and dedication to excellence. You will work closely with engineering and operations teams to design, build, and maintain robust infrastructure and automation solutions.

The role demands extensive hands-on experience as an SRE for large-scale, customer-facing cloud applications. You should possess a solid understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts. You should also have a good understanding of networking and load balancing concepts. Excellent troubleshooting and problem-solving skills are essential, as you will be expected to represent the SRE organization in design reviews and operational readiness exercises for both new and existing services.

Collaboration with technical and non-technical teams will be a key part of your responsibilities, as you analyze statistics to provide a clear picture of the current state of our systems. A passion for automating manual operations and improving processes through iteration is crucial, as is the ability to lead a small team in developing innovative solutions. Self-motivation and the ability to make business-critical decisions in a fast-paced environment are essential.

You will proactively address critical production issues and see them through to resolution, collaborating with the necessary partners. You will also participate in an on-call rotation to provide hands-on technical expertise during service-impacting events.