Salesforce - Denver, CO
posted 4 months ago
As a Site Reliability Engineer at Salesforce, you will play a crucial role in maintaining the performance and availability of customer-facing services. Your primary responsibility will be to ensure the constant health of the supporting systems, which involves proactive incident management and problem resolution. You will act in key support roles during major incidents, such as Sev0 and Sev1, and participate in technical reviews for problem management. This role requires a strong commitment to following internal compliance policies and directives while working collaboratively with the Site Reliability team to stay updated on industry innovations and technologies.

In this fast-paced environment, you will be expected to solve complex technical issues quickly and effectively, balancing multiple priorities. Automation will be a key focus, as you will work to automate the detection and resolution of recurring issues in the production environment. Additionally, you will help create and improve processes to reduce operational and engineering toil, contributing to the overall efficiency of the team. Your role will also involve engaging with other technical staff to address customer concerns and technical issues as they arise.

You will be part of a 24/7 team managing large data centers, which requires a willingness to work shifts and be on call when necessary. Your expertise in systems engineering, networking protocols, and Unix variants will be essential in supporting the infrastructure and ensuring its reliability.