Intrepid Studios - San Diego, CA

posted about 1 month ago

Full-time
San Diego, CA
Professional, Scientific, and Technical Services

About the position

Intrepid Studios' Ashes of Creation team is seeking a highly motivated and talented DevOps Engineer to enhance our MMORPG experience. This role involves managing cloud services and Kubernetes clusters, ensuring system performance, and implementing best practices in a collaborative environment.

Responsibilities

  • Manage existing Kubernetes clusters deployed on both GCP (GKE) and AWS (EKS).
  • Manage deployment of cloud services, including databases, message brokers, caching components, and custom applications.
  • Manage Kubernetes and cloud service updates.
  • Maintain and improve local development environment and cloud deployment scripts.
  • Manage cloud monitoring, metrics, and reporting systems for observability and alerting.
  • Manage system backup and failure recovery.
  • Deploy data analysis tools and visual dashboards to assess system health.
  • Identify and troubleshoot system issues, providing timely updates to stakeholders and ensuring system uptime.
  • Provide analysis of system performance during playtests and other tests.
  • Document deployment, operating, and troubleshooting procedures.
  • Coordinate internal and third-party maintenance windows, providing status updates.
  • Implement DevOps best practices with a focus on automation.

Requirements

  • BS or MS degree in Computer Science, Engineering, or a related field, or equivalent industry experience.
  • Expertise managing complex production Kubernetes clusters at scale.
  • Passion for Kubernetes and expertise in modern usage.
  • Experience with both GCP and AWS cloud providers.
  • Experience with Helm and Docker.
  • Experience administering SQL and NoSQL databases like CockroachDB and MongoDB.
  • Experience with metrics frameworks like Prometheus and Grafana.
  • Experience with enterprise logging services like Datadog.
  • Experience with common programming languages like Python, Go, and JavaScript.
  • Experience with advanced Linux administration, including network stack, TCP/IP, DNS, filesystems, resource scheduling, and process management.
  • Ability to solve service issues under pressure on a production system.
  • Excellent verbal and written communication skills.