Site Reliability Engineer

$72,100 - $158,620/Yr

Aetna - Woonsocket, RI

posted 3 months ago

Full-time - Mid Level
Woonsocket, RI
Insurance Carriers and Related Activities

About the position

Bring your heart to CVS Health. Every one of us at CVS Health shares a single, clear purpose: Bringing our heart to every moment of your health. This purpose guides our commitment to deliver enhanced human-centric health care for a rapidly changing world. Anchored in our brand, with heart at its center, our purpose sends a personal message that how we deliver our services is just as important as what we deliver.

This position will be part of the PCW Pharmacy Technology Site Reliability Engineering team, with a focus on improving the reliability and stability of the application portfolio. The ideal candidate will be a highly technical visionary who is committed to ensuring seamless experiences for consumers and passionate about continuous improvement through automation, performance enhancements, and innovation.

As a Site Reliability Engineer, you will identify, maintain, and manage SLOs, SLIs, and operational KPIs. You will establish and maintain strong partnerships with Product, Engineering, Infrastructure, and Service Management teams, influencing key decisions. Your proactive review of the existing environment, along with engagement on enhancements and new services, will help identify and remediate stability, reliability, and performance improvement opportunities. Continuous review of system telemetry and alerting will ensure actionable engagement by operations teams. You will also identify and develop automation solutions to address potential problems before they result in a service interruption, investigate root causes of major incidents, and share knowledge across platforms. Additionally, you will provide technical coaching and direction to organizational resources, stay current with emerging technologies and market trends, and review capacity models frequently to ensure production results are within expected bounds. You will also be responsible for keeping incident response processes and their associated playbooks current and effective.

Responsibilities

  • Identify, maintain, and manage SLOs, SLIs, and operational KPIs.
  • Establish and maintain strong partnerships with Product, Engineering, Infrastructure, and Service Management teams with the ability to influence key decisions.
  • Proactively review the existing environment and engage on enhancements and new services to identify and remediate stability, reliability, and performance improvement opportunities.
  • Continuously review system telemetry and alerting to ensure actionable engagement by operations teams.
  • Identify and develop automation solutions to address potential problems before they result in a service interruption.
  • Investigate the root causes of major incidents, identify remediation plans, and share knowledge across platforms.
  • Provide technical coaching and direction to organizational resources.
  • Stay current with emerging technologies and market trends to best position the organization.
  • Review capacity models frequently to ensure production results are within expected bounds.
  • Ensure incident response processes and associated playbooks are current and effective.

Requirements

  • 3+ years of experience in a Site Reliability Engineer or Application Operations role.
  • 2+ years of demonstrated experience scripting or developing software in languages such as Java and Python.
  • 2+ years of experience managing and improving cloud-deployed services on platforms such as AKS and GCP, as well as monolithic systems.
  • 2+ years of experience configuring, customizing, and extending monitoring platforms such as AppDynamics, Splunk, Grafana, ELK, or similar.

Nice-to-haves

  • Experience managing version control systems such as Git.
  • Experience with tools such as Jenkins and Harness.
  • A continuous-improvement mindset, from ideation through implementation.
  • Ability to engage cross-functional teams to champion the resolution of issues and design solutions.
  • Strong communication, organizational, analytical, and problem-solving skills.
  • Knowledge of IT Service Management best practices such as change management and problem management.

Benefits

  • Full range of medical, dental, and vision benefits.
  • 401(k) retirement savings plan.
  • Employee Stock Purchase Plan.
  • Fully paid term life insurance plan.
  • Short-term and long-term disability benefits.
  • Numerous well-being programs.
  • Education assistance and free development courses.
  • CVS store discount and discount programs with participating partners.
  • Paid Time Off (PTO) or vacation pay, as well as paid holidays throughout the calendar year.