Sr Site Reliability Engineer

$124,350 - $155,000/Yr

McGraw Hill

posted 2 months ago

Full-time - Mid Level
Remote
Publishing Industries

About the position

As a Sr Site Reliability Engineer at McGraw Hill, you will design and maintain high-capacity systems that ensure the reliability, performance, and security of customer platforms. You will collaborate with product teams within a DevOps framework to implement automation tools and processes that improve predictability, accelerate time-to-market, and optimize costs, contributing to operational excellence and the delivery of reliable services.

Responsibilities

  • Design, deploy, and manage automation tools in a DevOps model.
  • Collaborate with product development teams to optimize systems for reliability and performance.
  • Continuously learn and stay updated on the AWS ecosystem.
  • Own the reliability, uptime, system security, cost, capacity, resiliency, and performance of applications and platforms.
  • Act as the primary contact during major incidents, resolving issues and managing on-call alarms.
  • Maintain and enhance telemetry systems to improve visibility into application performance and business metrics.
  • Support healthy software development practices and comply with agile methodology.
  • Partner with CyberSecurity to develop plans and automation for new risks and vulnerabilities.
  • Collaborate with development teams to identify system failure points and validate monitoring configurations.
  • Plan and forecast for seasonal growth and enhance infrastructure scaling plans.
  • Mentor and nurture engineers across varying levels of experience.

Requirements

  • Minimum of 5 years of applicable Site Reliability Engineering (SRE) experience.
  • Hands-on experience with AWS services including CloudFront, S3, EC2, ECS, SES, SQS, SNS, Load Balancing, VPC, Config, Systems Manager, Lambda, API Gateway, and DB services.
  • Experience with Terraform for infrastructure as code.
  • Proficiency in programming languages such as Python, Golang, and Bash.
  • Experience with configuration management and container orchestration tools such as Ansible and AWS ECS.
  • Familiarity with security and web platforms including Rapid7, WAF, Apache httpd, and Apache Tomcat.
  • Experience with telemetry tools like New Relic, CloudWatch, and Datadog.
  • Knowledge of DevSecOps tools such as Artifactory, Jenkins, CircleCI, and GitHub.

Nice-to-haves

  • Experience with automation tools and software development.

Benefits

  • Annual bonus plan based on performance.
  • Full range of medical benefits.
  • Opportunities for professional development and growth.