Cribl - Clarksdale, MS

posted 22 days ago

Full-time - Senior
Remote - Clarksdale, MS

About the position

The Staff Site Reliability Engineer (SRE) position at Cribl, Inc. is a remote role focused on enhancing the reliability and performance of cloud services. The SRE will engage in all phases of product development, from conception to deployment, ensuring high-quality software delivery and operational excellence. This role is integral to improving service delivery and reliability across the organization, contributing to the evolution of systems, and driving innovation through automation.

Responsibilities

  • Engage with teams to improve service delivery and reliability across their entire lifecycle.
  • Measure and monitor all production systems, focusing on availability, latency, and overall system health.
  • Identify the causes of errors and instability in production cloud services and drive teams towards operational excellence.
  • Collaborate with product and platform teams to enhance systems by advocating for changes that improve reliability, resilience, and observability.
  • Identify and reduce toil through creative innovation and automation.
  • Participate in on-call responsibilities.

Requirements

  • Extensive experience with enterprise-scale continuous delivery environments.
  • 5+ years of experience in a DevOps or SRE role.
  • Development experience with JavaScript/Node.js/TypeScript in a Linux/Mac environment.
  • Experience with infrastructure-as-code and configuration management tools such as Terraform, Puppet, Chef, or Ansible.
  • Knowledge of sustainable incident response in a blameless environment.
  • Familiarity with cloud platforms, preferably AWS, and container orchestration technologies.
  • Experience with APM and observability tools such as New Relic, Splunk, CloudWatch, Prometheus, Grafana/Kibana, and Sentry.
  • Background in Linux Systems Engineering.
  • Experience with incident response tools like PagerDuty, FireHydrant, or Blameless.
  • Ability to work autonomously in a distributed team.

Nice-to-haves

  • Knowledge of cloud and application security.
  • Strong understanding of cloud design patterns for scale, data management, and resiliency.
  • A passion for high quality and testing.
  • Strong opinions about dashboards, metrics, and SLOs.

Benefits

  • Health insurance
  • Dental insurance
  • Vision insurance
  • Short-term disability insurance
  • Life insurance
  • Paid holidays
  • Paid time off
  • Fertility treatment benefit
  • 401(k) plan
  • Equity options
  • Eligibility for a discretionary company-wide bonus