Cribl - Washington, DC
posted 5 months ago
Cribl is on a mission to unlock the value of all observability data, and we are seeking a Staff Site Reliability Engineer (SRE) to join our dynamic team. As a remote-first company, we empower our employees to perform their best work from anywhere. In this role, you will be part of a collaborative engineering organization dedicated to creating, deploying, testing, and shipping high-quality software that meets the needs of our customers. You will have the opportunity to work with some of the biggest names in the industry, helping them solve their most pressing data challenges.

As a Staff Site Reliability Engineer, you will engage with various teams to enhance service delivery and reliability throughout the entire lifecycle of our products. Your responsibilities will include measuring and monitoring production systems to ensure availability, latency, and overall system health. You will investigate the root causes of errors and instability in our production cloud services, driving teams towards operational excellence. Your role will also involve collaborating with product and platform teams to advocate for changes that improve reliability, resilience, and observability.

We are looking for individuals who are passionate about reliability and have strong opinions on how to improve systems. You will be involved in all aspects of our cloud services, from conception to design to development and production. If you enjoy fixing things and have a creative approach to reducing toil through innovation and automation, this position is for you. You will also have on-call responsibilities, ensuring that our systems remain reliable and efficient.