This job is closed
We regret to inform you that the job you were interested in has been closed. Although this specific position is no longer available, we encourage you to continue exploring other opportunities on our job board.
Delivers services at high scale and high availability, with resilience, by using automation and Infrastructure as Code. Builds reliability into the ecosystem by applying best practices in Resiliency Engineering, Automation, Observability, and Chaos Testing. Manages systems using Infrastructure as Code tools (IAM, ARM, Terraform, and Chef). Utilizes modern monitoring tools (Datadog, Prometheus, and Splunk). Automates with scripting languages (Python and Shell). Helps teams scale through production insights, operational automation, developer guidance, and real-time metrics.