Site Reliability Engineer

PRISM+ - Reston, VA

Posted 3 months ago

Full-time

Remote - Reston, VA

About the position

The Site Reliability Engineer (SRE) is a critical role at a Fortune 100 pharma company, focused on the reliability and performance of production systems. The role is fully remote but follows Eastern Standard Time (EST) working hours. The SRE deploys builds into production and draws on a programming background to read and understand existing code; no code remediation is required.

A significant part of the role is automating routine tasks to eliminate manual intervention, particularly in access management. The SRE also ensures platform performance and stability, establishing and enhancing operational capabilities from the ground up. This includes triaging and troubleshooting issues, such as identifying the root cause of 403 errors, and managing incidents to minimize downtime and service impact. The role also oversees development, testing, and staging environments, ensuring that systems are working properly and ready for production deployment.

The ideal candidate has a strong technical background and a mindset focused on automation and efficiency, supporting the continuous improvement of operational processes. The position contributes to the reliability of systems that support critical pharmaceutical operations.

Responsibilities

Deploy builds into production
Leverage programming background to read and understand code
Automate routine tasks to eliminate manual intervention
Ensure platform performance and stability
Establish and enhance operational capabilities from the ground up
Triage and troubleshoot issues
Manage incidents effectively
Oversee development, testing, and staging environments

Requirements

Relevant education and experience in Site Reliability Engineering
Proficiency in AWS
Experience with Kubernetes
Strong programming skills in Python
Knowledge of Shell scripting
Familiarity with GitHub Actions / Jenkins for automated test scripts
