Tbwa Chiat/Day - Walnut Creek, CA

posted 5 days ago

Full-time - Senior
Walnut Creek, CA

About the position

The Senior Site Reliability Engineer (SRE) at Network Optix will lead the deployment, support, and optimization of the NX Private Cloud across various infrastructures, including on-premises, hybrid, and major public cloud environments. This role focuses on developing solutions for deploying NX Cloud services, managing CI/CD pipelines, and ensuring the reliability and scalability of NX deployments. The ideal candidate will have extensive experience with Kubernetes and cloud platforms, and will be responsible for maintaining a robust infrastructure that supports business-critical services.

Responsibilities

  • Lead the deployment, scaling, and optimization of Kubernetes infrastructure across any cloud or hybrid environment.
  • Employ Kubernetes networking principles to ensure robust and secure interactions within our infrastructure.
  • Design, implement, and optimize CI/CD pipelines (Jenkins, GitLab) to automate build, test, and deployment processes across various environments.
  • Leverage service tracing tools to monitor, analyze, and optimize microservices performance and interactions.
  • Manage and maintain a diverse infrastructure (bare metal and cloud environments) comprising hundreds of servers, multiple Kubernetes clusters, and dozens of business-critical services.
  • Collaborate on the development and testing of NX Private Cloud, ensuring it can be deployed seamlessly in customer environments.
  • Rapidly respond to unexpected downtime, perform root-cause analysis, and ensure timely customer notifications.
  • Conduct post-mortem reviews and design preventative measures to avoid future incidents.

Requirements

  • Ability to work seamlessly across multiple cloud platforms, without reliance on any single provider.
  • Proven ability to design and implement Kubernetes networking and microservices architectures.
  • Experience with service mesh technologies like Istio and service tracing tools for enhanced observability.
  • Ability to design and maintain cloud-agnostic Helm charts for Kubernetes environments.
  • Experience with CI/CD pipelines in hybrid or multi-cloud environments, particularly with Jenkins, GitLab, Artifactory, OpenSearch, and Graylog.
  • Experience with Ansible, Terraform, or other tools for multi-cloud infrastructure automation.
  • Hands-on experience with cross-platform infrastructure (Linux, Windows).
  • Commitment to continuous learning and staying updated with the latest industry trends and best practices.

Benefits

  • Competitive compensation
  • Paid time off
  • Onsite work in our brand-new comfortable office
  • Employer-sponsored health coverage
  • Working with top industry experts in our international team
  • Hybrid or remote work options