Apple - San Diego, CA
posted 3 months ago
The Atlassian Services Site Reliability Engineer (SRE) role sits within the Software Delivery organization at Apple, which plays a vital role in the software release process. The engineer applies Site Reliability Engineering practices to maintain Atlassian services, the tools that software engineers and project managers rely on when developing Apple software for global delivery. The Atlassian Services team ensures the reliability and performance of data center applications, enhances the observability of services, responds to incident alerts, and reports on Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to provide visibility across the organization. The role maintains the production systems of key applications such as Bitbucket, Confluence, and Jira, which are integral to delivering cutting-edge operating systems, applications, and firmware to Apple customers.

In this role, the Site Reliability Engineer will configure and monitor both on-premises and cloud-based dependencies. The engineer will also automate continuous integration (CI) and continuous delivery (CD) pipelines, maintain staging and production environments with the goal of maximizing uptime, and implement observability systems for monitoring, alerting, and metrics reporting. Additionally, the engineer will generate reports on service metrics related to performance, availability, and reliability, and champion best practices in change control management and incident response.

A successful candidate will proactively communicate the status of Atlassian services to stakeholders and follow through on time-sensitive tasks. They should be willing to seek clarification and build awareness of the larger context, explore solutions to problems while weighing risk against reward, and execute the best approach. Effective asynchronous communication with a global team across multiple time zones is essential, as is the ability to document new processes and update existing documentation. The ideal candidate will be eager and curious to learn across multiple technology stacks, contributing to the overall success of the team and the organization.