Tyler Technologies
Posted about 2 months ago
Yarmouth, ME
Publishing Industries

About the position

The position involves implementing tooling to monitor AWS EKS-based systems, with a focus on performance, reliability, and scalability. The candidate will ensure that architecture and deployment models are sufficient to support SLA commitments and are well prepared for future problems of scale, and will leverage cloud technology and platform capabilities to provide operationally sustainable solutions that are robust and cost-effective. Applying software engineering best practices to comprehensively address and resolve problems is essential. The role also includes collaborating with product support teams to drive efficiency and enhance customer experience through self-service tools and automation, ensuring timely response to incidents and support requests, and conducting root cause analysis to implement preventative measures that minimize toil and impact on customers. Other key responsibilities are leading and participating in incident retrospectives to enhance future response efforts, and participating in on-call rotations to provide critical support as needed.

Responsibilities

  • Implement tooling to monitor AWS EKS-based systems focusing on performance, reliability, and scalability.
  • Ensure that architecture and deployment models are sufficient to support SLA commitments and are well prepared for future problems of scale.
  • Leverage cloud technology and platform capabilities to provide operationally sustainable solutions that are robust and cost-effective.
  • Apply software engineering best practices to comprehensively address and resolve problems.
  • Collaborate with product support teams to drive efficiency and enhance customer experience through self-service tools and automation.
  • Ensure timely response to incidents and support requests, collaborating effectively on solutions.
  • Conduct root cause analysis and implement preventative measures to minimize toil and impact on customers.
  • Lead and participate in incident retrospectives to enhance future response efforts.
  • Participate in on-call rotations, providing critical support as needed.

Requirements

  • A successful technical career within reputable technology firms, particularly with large-scale cloud applications.
  • Expertise in Site Reliability Engineering concepts and practices, including the use of observability platforms and monitoring tools.
  • Experience deploying and supporting containerized applications on cloud platforms, preferably EKS on AWS.
  • Proficiency in infrastructure as code technologies, such as Terraform.
  • Strong software engineering skills in languages like Python, JavaScript, or Go.
  • Familiarity with DevOps and CI/CD methodologies.
  • Bachelor's degree in Computer Science or related field.