PRISM+ - Reston, VA

posted 2 months ago

Full-time
Remote - Reston, VA

About the position

The Site Reliability Engineer (SRE) is a critical role within a Fortune 100 pharmaceutical company, focused on the reliability and performance of production systems. The role is fully remote, with working hours aligned to Eastern Standard Time (EST).

The SRE deploys builds into production and draws on a programming background to read and understand existing code; no code remediation is required. A significant part of the role is automating routine tasks to eliminate manual intervention, particularly in access management.

Beyond automation, the SRE ensures platform performance and stability, establishing and enhancing operational capabilities from the ground up. This includes triaging and troubleshooting issues, such as identifying the root cause of 403 errors, and managing incidents to minimize downtime and service impact. The role also oversees development, testing, and staging environments, ensuring that systems are ready for production deployment.

The ideal candidate has a strong technical background and a mindset focused on automation and efficiency. The position contributes to the reliability of systems that support critical pharmaceutical operations.

Responsibilities

  • Deploy builds into production
  • Leverage programming background to read and understand code
  • Automate routine tasks to eliminate manual intervention
  • Ensure platform performance and stability
  • Establish and enhance operational capabilities from the ground up
  • Triage and troubleshoot issues
  • Manage incidents effectively
  • Oversee development, testing, and staging environments

Requirements

  • Relevant education and experience in Site Reliability Engineering
  • Proficiency in AWS
  • Experience with Kubernetes
  • Strong programming skills in Python
  • Knowledge of Shell scripting
  • Familiarity with GitHub Actions or Jenkins for running automated test scripts