QuEra Computing
Posted about 1 month ago
Senior
Boston, MA
Computer and Electronic Product Manufacturing

About the position

We are seeking a Site Reliability Engineer (SRE) to lead the design, automation, and operation of reliable, scalable systems. This role combines software engineering with infrastructure expertise to ensure high availability and performance across both production and lab environments. You will work closely with development, infrastructure, and operations teams to drive reliability, observability, and continuous improvement.

Responsibilities

  • Design, build, and maintain resilient infrastructure across cloud and Kubernetes (Talos-based) environments
  • Build and maintain lab infrastructure for development, testing, and validation, including networking, hardware integration, and automation
  • Define and monitor SLIs, SLOs, and error budgets to guide reliability efforts
  • Develop automation tools and scripts in Python, Bash, or Go to reduce manual toil and improve system operations
  • Improve observability using Prometheus, Grafana, OpenTelemetry, and other monitoring/logging solutions
  • Manage incident response, perform root cause analysis, and lead postmortem processes
  • Optimize systems for performance, scalability, and fault tolerance
  • Contribute to infrastructure as code (IaC) using Terraform, Ansible, or Helm
  • Collaborate with engineering teams to ensure systems are designed for operational excellence

Requirements

  • Bachelor's degree in Software Engineering or Software Development
  • 8+ years of experience as an SRE, DevOps Engineer, or Systems Engineer
  • Strong expertise in Kubernetes (Talos preferred), cloud platforms (AWS, GCP, Azure), and Linux
  • Hands-on experience with monitoring, logging, and incident management tools
  • Proficiency in Python, Bash, or Go for scripting and automation
  • Experience building and maintaining lab environments, including physical and virtual infrastructure
  • Solid knowledge of networking, distributed systems, and performance optimization
  • Familiarity with CI/CD workflows and Infrastructure as Code practices
  • Strong communication skills and ability to work cross-functionally

Nice-to-haves

  • Experience in optical systems (e.g., optical networking, photonic devices)
  • Exposure to or interest in quantum computing platforms and environments