Site Reliability Engineer

Staffworthy - Washington, DC

posted 3 months ago

Full-time - Senior

About the position

As a Site Reliability Engineer (SRE) at Staffworthy Inc., you will help improve the performance, reliability, and observability of our systems, particularly within the federal government sector. With over two decades of experience in technology services, we pride ourselves on assembling exceptional teams dedicated to delivering outstanding solutions.

You will monitor platform and containerized applications to ensure optimal performance and availability, and proactively identify and address performance and availability risks. You will also contribute to the creation and optimization of essential functions within our core platform to establish a robust infrastructure. Collaboration is key in this position: you will work with both your team and our customers on a daily basis to meet and exceed their expectations.

The role calls for a minimum of 8 years of experience as a Site Reliability Engineer, a strong understanding of SRE principles for highly scalable and reliable systems, a bachelor's degree, and an active TS/SCI clearance. Proficiency in a DevSecOps environment is crucial, including experience with source code repositories and CI/CD pipeline solutions such as Team Foundation Server/Azure DevOps, Bitbucket, and GitHub. Familiarity with container orchestration tools like Rancher and OpenShift, along with experience in Infrastructure as Code (IaC), containerization, Kubernetes (K8s), and CI/CD automation, is essential.

You will need to work on-site in downtown Washington, DC, at least three days per week to collaborate effectively with your team and clients.

Responsibilities

Monitor platform and containerized applications to ensure optimal performance and availability.
Identify and address performance and availability risks and issues proactively.
Contribute to the creation and optimization of all necessary functions within the core platform to establish a robust infrastructure.
Collaborate closely with the team and the customer on a daily basis.

Requirements

Minimum of 8 years of experience as a Site Reliability Engineer, demonstrating a strong understanding of SRE principles for highly scalable and reliable systems.
Bachelor's degree and an active TS/SCI clearance.
Proficiency in a DevSecOps environment, with experience using source code repositories and CI/CD pipeline solutions such as Team Foundation Server/Azure DevOps, Bitbucket, and GitHub.
Familiarity with container orchestration tools such as Rancher and OpenShift.
Ability to work effectively both within a team and independently.
Experience with Infrastructure as Code (IaC), containerization, Kubernetes (K8s), and CI/CD automation.
