GM Cruise - Phoenix, AZ

posted 3 months ago

Full-time - Mid Level
Phoenix, AZ
Transportation Equipment Manufacturing

About the position

The Observability team at Cruise is seeking a Staff Site Reliability Engineer to develop and improve its observability systems, tools, and the associated codebase. This role is pivotal in ensuring the reliability, scalability, performance, efficiency, and security of our systems. As a Staff Site Reliability Engineer, you will apply your software and systems engineering skills to contribute code, conduct code reviews, and create technical designs that improve the performance and reliability of observability systems. You will proactively identify and address challenges and find new opportunities to improve engineering through observability. Collaboration is key: you will partner with Software Engineering teams to understand their use cases and guide them in using existing tools effectively. You will also build tools that enable engineers to collect and act on observability signals, which in turn improves overall system performance and reliability.

Responsibilities

  • Contribute code and perform code reviews to improve observability systems.
  • Create technical designs that enhance performance and reliability of observability systems.
  • Proactively identify challenges and opportunities for improvement in engineering through observability.
  • Collaborate with Software Engineering teams to understand use cases and guide effective tool usage.
  • Build tools to enable engineers to collect and act on observability signals.

Requirements

  • Previous experience as an SRE, Production Engineer, Systems Engineer, or Software Engineer focusing on distributed systems reliability.
  • Considerable experience with container orchestration systems (e.g., Kubernetes).
  • Proficient in designing and developing sophisticated distributed systems using high-level programming languages such as Go, Python, Rust, C/C++, or JavaScript (Node.js).
  • Experience in leading or driving a multi-functional effort to implement a new technology or service.
  • Experience in designing and implementing large-scale systems.
  • Considerable Linux experience.
  • Effective collaboration skills to work closely with team members and various engineering teams.

Nice-to-haves

  • Experience with Cloud Platforms such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).
  • Experience with OpenTelemetry instrumentation.
  • Familiarity with Kubernetes, Docker, Istio, and Terraform.
  • Leadership experience.
  • Skilled in defining and instrumenting SLIs and SLOs.
  • Previous experience working with Prometheus, Grafana, TSDBs, and observability pipelines.

Benefits

  • Competitive salary and benefits
  • Medical, dental, vision, life, and AD&D insurance
  • Subsidized mental health benefits
  • Paid time off and holidays
  • Paid parental, medical, family care, and military leave of absence
  • 401(k) Cruise matching program
  • Fertility benefits
  • Dependent Care Flexible Spending Account
  • Flexible Spending Account & Health Savings Account
  • Perks Wallet program for benefits/perks
  • Pre-tax Commuter benefit plan for local employees
  • CruiseFlex, our location-flexible work policy