Amazon - Seattle, WA

posted about 2 months ago

Full-time - Senior
Seattle, WA
1,001-5,000 employees
Sporting Goods, Hobby, Musical Instrument, Book, and Miscellaneous Retailers

About the position

The System Development Manager for AWS Resilience and Incident Response owns the roadmap and delivery of automated tooling that detects and resolves issues within AWS and Amazon infrastructure. The role involves leading a team that ensures efficient incident resolution, driving improvements in automation and tooling, and coordinating across project teams to improve service monitoring and alarming processes. The position requires a strong focus on performance management and team health, and contributes to the strategic goals of the global AWS Incident Response team.

Responsibilities

  • Define and deliver business priorities for the global AWS Incident Response team.
  • Coordinate with counterparts to ensure clear communication between AWS Operations teams.
  • Work closely with systems and product teams to maintain processes for monitoring and alarming on services.
  • Act as the point of contact for inquiries regarding engagement processes and issues within the global Amazon platform.
  • Delegate emergent engagement issues to team members and drive initiatives for tool and process improvements.
  • Own all facets of performance and career management for the team.

Requirements

  • 5+ years of direct experience with cloud hosting technologies (AWS, Azure, etc.).
  • 5+ years of experience managing an engineering team operating at scale.
  • Deep understanding of infrastructure delivered through the software development lifecycle in an API-enabled environment.
  • Experience in implementing, supporting, and evaluating tools and services with a security, scalability, and performance mindset.
  • Ability to handle multiple competing priorities in a fast-paced environment.
  • Excellent written and verbal communication skills.

Nice-to-haves

  • Strong understanding of fundamental operational best practices such as monitoring, alerting, deployment, and change policies (ITIL a plus).
  • Experience running agile frameworks or other workflow methodologies in a DevOps setting.
  • Experience dealing with customers during issue resolution and operating under pressure.

Benefits

  • Comprehensive medical, financial, and other benefits package.
  • Equity and sign-on payments as part of total compensation package.
  • Flexible working culture to support work-life balance.