This job is closed

ARM • Posted 9 months ago
Full-time • Mid Level
Hybrid • Austin, TX

About the position

We are looking for a highly skilled Staff Infrastructure Automation Engineer to join our distributed team of Cloud and DevOps engineers. In this role, you will be responsible for designing, implementing, and maintaining scalable and reliable environments that optimize operations and ensure system uptime. You will collaborate with software and hardware engineering, as well as IT teams, to build robust systems that support our technology initiatives and solutions. Your expertise will be crucial in creating infrastructures that are not only efficient but also secure, as you will be working across various cloud platforms including AWS, Azure, and Google Cloud Platform.

As a Staff Engineer, you will take on the role of a Technical Lead on quarterly prioritized features, supporting project managers and coordinating with IT teams, scrum masters, and the wider business to deliver projects effectively. You will also be expected to adopt a continuous learning mentality, staying updated with industry trends and new technologies to improve operational performance. Your contributions will directly impact the efficiency and reliability of our systems, making this a vital role within our organization.

In addition to your technical responsibilities, you will participate in on-call rotations to ensure 24/7 system availability, and maintain detailed documentation of infrastructure, processes, and procedures to facilitate learning and operational continuity. Your ability to analyze system performance and implement improvements will enhance both cost efficiency and user experience, making you an integral part of our team.

Responsibilities

  • Design, implement, and run scalable, reliable, and secure infrastructure on-premises and in AWS, Azure, and Google Cloud Platform, including multi-cluster/multi-regional Kubernetes environments.
  • Develop and maintain automation scripts (Python, Bash/shell, etc.) and tools (GitLab, HashiCorp Terraform, HashiCorp Vault, etc.) to streamline and improve deployment, monitoring, and management processes, using Infrastructure as Code (IaC).
  • Define and maintain infrastructure automation principles, collaborating with infrastructure teams to embrace and cultivate continuous integration and continuous delivery/deployment (CI/CD).
  • Implement and integrate with monitoring and observability solutions, such as AIOps, to proactively detect and respond to system issues.
  • Analyze system performance and implement improvements to enhance cost efficiency and user experience.
  • Participate in on-call rotations to ensure 24/7 system availability.
  • Maintain detailed documentation (HLDs and LLDs) of infrastructure, processes, and procedures to facilitate learning and operational continuity.
  • Act as a Technical Lead on quarterly prioritized features, supporting project managers and coordinating with IT teams, scrum masters, and the wider business to deliver projects.
  • Adopt a continuous learning mentality to stay updated with industry trends and new technologies to improve operational performance.

Requirements

  • Experience in a DevOps or SRE role, with a confirmed focus on infrastructure.
  • Extensive knowledge of cloud platforms (AWS, Azure, or GCP), containerization technologies (Docker, Kubernetes, Rancher, CloudBees, etc.), automation tools (Terraform, Ansible), and monitoring solutions (Prometheus, Grafana).
  • Strong scripting and programming skills (Bash, Python, and Go).
  • Experience in deploying, maintaining, and integrating HashiCorp Vault, GitLab, Jenkins, Ansible, and Terraform Enterprise platforms with automation pipelines.
  • Experience in implementing IAM controls using SPIFFE and SPIRE to securely integrate authentication and authorization within engineering workflows, ensuring a secure environment.
  • Excellent analytical and problem-solving abilities with a proactive approach to identifying and resolving issues.
  • Good communication and collaboration skills, with the ability to work efficiently in a team-oriented environment.
  • Experience working in an Agile delivery environment integrated with Atlassian Jira and Confluence applications.

Nice-to-haves

  • Experience with microservices architecture and serverless computing in a large, enterprise-scale environment, including the automated deployment and management of EKS clusters.
  • Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) certification, and Specialist- or Architect-level certifications in AWS, GCP, and Azure.
  • Familiarity with ITIL practices and incident management frameworks.

Benefits

  • Hybrid working environment that supports high performance and personal wellbeing.
  • Equal opportunity employer committed to diversity and inclusion.
  • Support for accommodations during the recruitment process.