Sr Site Reliability Engineer

$113,600 - $181,600/Yr

Federal Reserve Bank - San Francisco, CA

posted 3 months ago

Part-time, Full-time - Mid Level

Remote - San Francisco, CA

Monetary Authorities-Central Bank

About the position

As a Senior Site Reliability Engineer at the Federal Reserve Bank of San Francisco, you will play a crucial role on the Data & Analytics Services (DAS) Team, applying your engineering skills across a range of technology solutions. The position spans multiple aspects of product delivery, from inception through design, build, and deployment. You will collaborate with Product Managers, Architects, Engineers, and Customer teams in a dynamic environment, focusing on developing Infrastructure as Code to launch server instances and configure software. You will provide technical leadership in planning, designing, and implementing cloud-based infrastructure systems, whether traditional or non-traditional.

Your responsibilities will include improving and protecting the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of cloud-based software and systems. You will implement, manage, and scale distributed systems in public, private, or hybrid cloud environments. You will also help implement an automation strategy for cloud services, working closely with architects and developers to reduce toil, minimize human error, drive scalability, and enhance the reliability of the data platform.

You will identify and respond to service failures to ensure compliance with Service-Level Agreements, and regularly update application playbooks to expedite incident mitigation. In collaboration with development teams, you will establish Service-Level Objectives and key Service-Level Indicators, design and deploy Infrastructure-as-Code solutions, and lead postmortem exercises to improve operational readiness. The role also involves conducting Production Readiness Reviews, facilitating compliance by rehydrating infrastructure on schedule, and empowering developers with self-service capabilities. Incident response, on-call duties, and managing system activities to an error budget are also part of your responsibilities.

Responsibilities

Develop Infrastructure as Code to launch server instances and configure software.
Provide technical leadership in planning, designing, and implementing cloud-based infrastructure systems.
Improve and protect the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of cloud-based software and systems.
Implement, manage, and scale distributed systems in public, private, or hybrid cloud environments.
Help implement the automation strategy for cloud services to reduce toil and improve reliability.
Identify and respond to service failures to maintain compliance with Service-Level Agreements.
Establish Service-Level Objectives and key Service-Level Indicators with development teams.
Design, develop, engineer, and deploy Infrastructure-as-Code solutions for platform scalability and efficiency.
Conduct Production Readiness Reviews to ensure operational readiness before going live.
Lead postmortem exercises and establish postmortem procedures.

Requirements

Bachelor's degree in Computer Science, Information Systems, Computer Engineering, Systems Analysis, or a related field, or equivalent work experience.
5+ years of relevant technical work experience in software development using cloud service provider platforms.
3 or more years of experience using Terraform to manage AWS Programmable Infrastructures.
Experience architecting and implementing Cloud Infrastructure Automation scripts for various environments in AWS.
Experience managing infrastructure including security roles, permissions, and cloud networking assets.
Hands-on programming and scripting skills in languages such as Java, C++, C#, Python, and Bash.
Experience with Continuous Integration, Continuous Delivery, and Continuous Deployment software tools.
Understanding of security design for enterprise software systems.

Nice-to-haves

Experience with advanced Terraform features such as S3 backends and state file locking.
Knowledge of high-availability, load-balancing, and failover configurations.
Experience in the Financial Industry or with Government Agencies.

Benefits

Medical
Dental
Vision
Pre-tax Flexible Spending Account
Backup Childcare Program
Pre-Tax Day Care Flexible Spending Account
Paid Family Care Leave
Vacation Days
Sick Days
Paid Holidays
Pet Insurance
Matching 401(k)
Retirement/Pension
