Unclassified - Salem, OR

posted 4 months ago

Full-time - Mid Level
Remote - Salem, OR

About the position

Cribl Inc is seeking a Staff Site Reliability Engineer (SRE) to join our mission to unlock the value of all observability data. As a remote-first company, we believe in empowering our employees to do their best work, wherever they are. This role is integral to our engineering organization, where you will contribute to envisioning, creating, deploying, testing, and shipping Cribl products. You will be part of a team of engineers committed to delivering high-quality software while enjoying a collaborative and fun work environment.

In this position, you will engage with various teams to improve service delivery and reliability across the entire service lifecycle. Your responsibilities will include measuring and monitoring all production systems with a focus on availability, latency, and overall system health. You will actively seek out the causes of errors and instability in our production cloud services, driving teams towards better operational excellence. You will also work closely with product and platform teams to improve and evolve systems by advocating for changes that enhance reliability, resilience, and observability, and you will help identify and reduce toil through creative innovation and automation.

This role is not just about fixing issues; it’s about being involved from conception through design and development, all the way to production and beyond. If you are passionate about reliability and have strong opinions on how to improve systems, this opportunity may be the perfect fit for you.

Responsibilities

  • Engage with teams to improve service delivery and reliability across the entire service lifecycle
  • Measure and monitor all production systems with an eye towards availability, latency, and overall system health
  • Seek out the causes of errors and instability in our production cloud services and drive teams towards better operational excellence
  • Engage with product and platform teams to improve and evolve systems by lobbying for changes that improve reliability, resilience, and observability
  • Help identify and drive down toil with creative innovation and automation

Requirements

  • At least 1 year of experience in a related field