Cloudcover - Mountain View, CA

posted 5 days ago

Full-time - Senior
Administrative and Support Services

About the position

The Staff Cloud DevOps/Site Reliability Engineer is a key member of the Technical Operations team, responsible for the platform's infrastructure, DevOps practices, and site reliability. The role involves maintaining and contributing to Infrastructure-as-Code, orchestrating CI/CD pipelines, and keeping services reliable and performant through monitoring and incident management.

Responsibilities

  • Maintain and contribute to Infrastructure-as-Code using Terraform.
  • Orchestrate CI/CD pipelines using modern tools such as GitHub Actions, Helm, and ArgoCD.
  • Measure and monitor service availability, latency, and overall health.
  • Drive incident management and conduct post-mortem analysis.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 7+ years of experience as a DevOps, Infrastructure, Operations, or Site Reliability Engineer, or as a software engineer with relevant experience.
  • At least 2 years of experience with Terraform.
  • At least 2 years of experience with CI/CD using modern, GitOps-based tools.

Nice-to-haves

  • Experience with MLOps, including building, orchestrating, and maintaining Machine Learning Pipelines.
  • Experience with multi-cloud deployments (two or more cloud providers).
  • Familiarity with ArgoCD.
  • Knowledge of network management and VPNs.

Benefits

  • Equity compensation
  • Comprehensive benefits package