SRE Engineer -58301

$135,200 - $145,600/Yr

Primus Global Services - Sunnyvale, CA

posted 2 months ago

Full-time - Mid Level
Sunnyvale, CA
Professional, Scientific, and Technical Services

About the position

The SRE Engineer position is a long-term opportunity with one of our largest clients, based in either Durham, NC or Sunnyvale, CA. The role is critical to the reliability and performance of our systems and requires a strong background in site reliability engineering principles. The ideal candidate has extensive experience with Linux distributions, particularly RHEL and CentOS, and is adept at shell scripting, filesystem management, and the utilities used to maintain system health and performance.

You will work with distributed computing systems and container orchestration frameworks, including Kubernetes and Rancher. A solid understanding of Kubernetes objects is essential, since you will deploy and manage applications in a cloud-native environment. Experience with storage solutions, particularly ONTAP, is preferred; the work involves managing volumes and aggregates, backups, and disaster recovery planning.

Automation is a key focus of this position. You will create and support automation scripts in shell, Ansible, and Python to streamline infrastructure deployments, validations, and monitoring. Familiarity with scheduling monitoring scripts using cron and Airflow is also required. You will use monitoring tools such as Dynatrace, Apica, and Grafana to track system performance and reliability.

A good understanding of both SQL and NoSQL databases is necessary, as is experience building CI/CD pipelines, particularly in cloud environments such as AWS. Incident handling and problem management are also part of the role, so that issues are resolved promptly and effectively.

Responsibilities

  • Ensure the reliability and performance of systems.
  • Work with various Linux distributions, particularly RHEL and CentOS.
  • Utilize shell scripting, manage filesystems, and use utilities for system maintenance.
  • Manage distributed computing systems and container orchestration frameworks, including Kubernetes and Rancher.
  • Deploy and manage applications in a cloud-native environment using Kubernetes objects.
  • Handle storage solutions, particularly ONTAP, including volumes, aggregates, backups, and disaster recovery planning.
  • Create and support automation scripts using shell, Ansible, and Python for infrastructure deployments and monitoring.
  • Schedule monitoring scripts using cron and Airflow.
  • Utilize monitoring tools such as Dynatrace, Apica, and Grafana to ensure system performance.
  • Work with SQL and NoSQL databases as part of system management.
  • Build CI/CD pipelines in cloud environments, specifically AWS.
  • Handle incidents and manage problems effectively.

Requirements

  • Extensive experience with Linux distributions such as RHEL and CentOS.
  • Strong knowledge of distributed computing and container orchestration frameworks.
  • Experience with Kubernetes and Rancher, including knowledge of Kubernetes objects.
  • Familiarity with storage solutions, preferably ONTAP, including volume and aggregate management.
  • Experience in creating automation scripts using shell, Ansible, and Python.
  • Knowledge of scheduling monitoring scripts using cron and Airflow.
  • Experience with monitoring tools such as Dynatrace, Apica, and Grafana.
  • Database knowledge, including SQL and NoSQL databases.
  • Experience building CI/CD pipelines, preferably in AWS.
  • Strong incident handling and problem management skills.