Site Reliability Engineer

$120,000 - $135,000/Yr

Evolent Health - Santa Fe, NM

posted 17 days ago

Full-time - Mid Level
Santa Fe, NM
Professional, Scientific, and Technical Services

About the position

The Site Reliability Engineer at Evolent Health manages the application suite and cloud infrastructure, with a focus on transforming how cloud infrastructure is managed and how reliably applications run. The role sits on a high-performance team responsible for the performance and security of containerized workloads, and it collaborates with other teams to enhance the internal developer platform.

Responsibilities

  • Implement and manage observability solutions using OpenTelemetry to monitor and trace application performance.
  • Implement and manage containerization solutions using platforms such as Docker and Kubernetes, focusing on Azure Kubernetes Service (AKS) and Azure Container Apps (ACA).
  • Monitor the health and performance of containers and resolve any issues that arise.
  • Follow security best practices to ensure containerized workloads are fully secure.
  • Partner with DevOps to advance Infrastructure as Code and Configuration as Code discipline.
  • Collaborate with the Platform Architecture team to continuously improve the Internal Developer Platform (IDP).
  • Participate in Root Cause Analysis (RCA) to identify corrective action plans (CAP).

Requirements

  • 3+ years of hands-on Azure experience and 5+ years of overall cloud-native experience.
  • Strong understanding of OpenTelemetry and experience in implementing observability solutions.
  • Proven experience in implementing and managing containerization and orchestration platforms such as Docker and Kubernetes.
  • Deep understanding of deployment methodologies for Kubernetes, preferably ArgoCD and Helm.
  • Experience with other Azure services such as Azure Functions, Azure Logic Apps, and Azure Service Fabric.
  • Passion and creativity for automation using tools such as Ansible and Terraform.
  • Experience in working with GitHub Actions or Jenkins.
  • Expertise in at least one of these scripting/configuration languages: PowerShell, YAML, HCL, Python.
  • Expertise in at least one of these APM tools: Prometheus, Dynatrace, Datadog.
  • Experience leveraging agile methodology (e.g., Scrumban) to manage project work.
  • Highly effective communicator with a strong commitment to transparency.

Nice-to-haves

  • PostgreSQL experience.
  • Fast Healthcare Interoperability Resources (FHIR) API experience.

Benefits

  • Comprehensive health insurance benefits
  • Bonus component based on performance factors