Senior Site Reliability Engineer Seattle

MongoDB - Palo Alto, CA

posted 8 days ago

Full-time - Mid Level

Professional, Scientific, and Technical Services

About the position

The Site Reliability Engineer (SRE) role at MongoDB involves designing and building the infrastructure for a global cloud service that supports hundreds of thousands of MongoDB clusters. The position focuses on optimizing performance, ensuring resilience, and automating processes across multiple cloud providers. The SRE team is integral to keeping request latency low and meeting data sovereignty requirements while minimizing operational burden and improving system visibility.

Responsibilities

Design and build the infrastructure for a global cloud service comprising hundreds of thousands of MongoDB clusters.
Implement and troubleshoot automation and monitoring of services across multiple cloud providers.
Become an expert in infrastructure performance, optimizing from the application level down to the firmware.
Build resilient systems that minimize operational alerts, and participate in a weekly on-call rotation.
Improve infrastructure capabilities, focusing on cost, simplicity, and maintainability.

Requirements

Experience running a mission-critical service at scale.
Understanding of information security issues.
Prior experience running critical production systems in a Linux environment.
Firm grasp of at least one modern programming language beyond basic scripting.
Solid understanding of web and network protocols and standards (HTTP, TLS, DNS, etc.).
Bachelor's degree in Computer Science or equivalent experience.
Experience writing automation tools and eagerness to automate processes.

Nice-to-haves

Experience building large applications from scratch, including CI/CD infrastructure.
Experience in networking, security, hardware, or OS performance tuning.
Experience with at least one major cloud provider (AWS, Google Cloud, Microsoft Azure).
Experience managing Kubernetes clusters or other container orchestration infrastructure.
Experience with observability of large-scale distributed systems.

Benefits

Generous compensation package including equity and benefits.
Opportunities to learn on the job and upskill in new technologies.
High level of independence in day-to-day work.
Supportive and enriching culture with employee affinity groups.
Fertility assistance and generous parental leave policy.
