Staff Cloud DevOps/Site Reliability Engineer (SRE) - USA

$180,000 - $280,000/Yr

Inworld AI - Mountain View, CA

posted 1 day ago

Full-time - Senior

Publishing Industries

About the position

Inworld is the leading AI engine for games and interactive media, recognized for its innovative technology and strong partnerships with industry leaders. The Technical Operations team manages the platform's infrastructure, DevOps, and site reliability. We are seeking a Staff Cloud DevOps/Site Reliability Engineer to join the team and contribute to our mission of delivering exceptional AI gaming experiences.

Responsibilities

Maintain and contribute to Infrastructure-as-Code (Terraform)
Orchestrate pipelines using GitHub Actions, Helm, and ArgoCD
Administer Kubernetes for microservices scalability
Manage cloud infrastructure
Measure and monitor service availability, latency, and overall health; drive incident management and post-mortem analysis

Requirements

Bachelor's degree in Computer Science, Engineering, or a related field
7+ years of experience as a DevOps, Infrastructure, Operations, or Site Reliability Engineer (or as a software engineer with relevant experience)
At least 2 years of experience with Terraform
At least 2 years of experience with Helm
At least 2 years of experience with Kubernetes
At least 2 years of experience with AWS, Azure, or GCP
At least 2 years of experience with CI/CD using modern tools (GitOps)

Nice-to-haves

Experience with MLOps (building, orchestrating, and maintaining machine learning pipelines)
Experience with Prometheus / Grafana
Experience with multi-cloud deployments (two or more cloud providers)
Experience with ArgoCD
Knowledge of network management and VPNs

Benefits

Equity
Comprehensive benefits package
