VDart - San Jose, CA

posted 4 days ago

Full-time - Senior
San Jose, CA
1,001-5,000 employees
Professional, Scientific, and Technical Services

About the position

The Lead Site Reliability Engineer will play a crucial role in maintaining and enhancing the reliability, performance, and availability of software systems. This position acts as a bridge between traditional IT operations and software development, applying a software engineering approach to system administration. The role requires extensive experience in coding, scripting, and cloud computing, particularly in automation and monitoring tools.

Responsibilities

  • Maintain and improve the reliability, performance, and availability of software systems.
  • Create and support automation scripts (shell, Ansible, Python) for infrastructure deployments, validations, and monitoring.
  • Schedule monitoring scripts using cron and Airflow.
  • Monitor systems using tools including Dynatrace, Apica, and Grafana.
  • Administer databases and ensure their performance and reliability.
  • Build CI/CD pipelines for efficient software delivery.
  • Manage incidents and perform problem management.

Requirements

  • 14+ years of IT Infrastructure experience.
  • Experience in Ansible and Python for automation.
  • Extensive experience with Linux distributions such as RHEL and CentOS, including shells, filesystems, and utilities.
  • Knowledge of distributed computing and experience with container orchestration frameworks, including Kubernetes.
  • Experience with storage solutions, preferably ONTAP, including volumes, aggregates, backups, and DR planning.
  • Experience scheduling monitoring scripts using cron and Airflow.
  • Proficiency with monitoring tools such as Dynatrace, Apica, and Grafana.
  • Database knowledge, including SQL and NoSQL databases.
  • Experience building CI/CD pipelines is preferred.
  • Cloud platform knowledge, specifically AWS, is required.

Nice-to-haves

  • Experience with additional programming languages or tools related to SRE.
  • Familiarity with other cloud platforms beyond AWS.

Benefits

  • Contract position with potential for future opportunities.
  • Access to a global network and industry expertise through VDart Group.