Splunk
posted 2 months ago
Splunk is dedicated to building a safer and more resilient digital world. As a Site Reliability Engineer on the TechOps team, you will play a crucial role in maintaining and enhancing our cloud infrastructure. This position is designed for early-career professionals who are eager to contribute to the next generation of our large-scale Cloud offering. You will join a team responsible for monitoring and resolving issues that affect the availability and performance of Splunk for our cloud customers, ensuring a seamless experience for users.

In this role, you will work with various cloud providers and support the infrastructure that powers Splunk's cloud services. Your responsibilities will include providing technical support for the Splunk Cloud fleet, performing impact assessments, documenting issues and remediation steps, and leading support cases. You will also communicate with TechOps engineers and business partners, assist colleagues with complex tasks, and represent the TechOps team in meetings to recommend new procedures and processes.

The position requires a commitment to quality customer experience and the ability to work flexible hours, including nights and weekends. You will be expected to lead by example, drive the company's core values, and restore normal service operations quickly during escalated incidents. This is a fully remote position, and candidates must be U.S. citizens working on U.S. soil to be considered.

As a Site Reliability Engineer, you should have a passion for large, complex systems and a desire to automate processes across thousands of machines. You will use data to make informed decisions and strive to identify issues before they impact customers.