Riot Games - Los Angeles, CA

posted 3 months ago

Full-time - Senior
Los Angeles, CA
Miscellaneous Manufacturing

About the position

Software Reliability Engineering at Riot is tasked with addressing the most complex technology challenges that arise as the company expands into a multi-game ecosystem. As a Staff Engineer on this team, you will collaborate with various engineering teams across Riot, engaging with a diverse range of technical stacks. This role demands a deep understanding of Riot's architecture, allowing you to prioritize and deploy your team effectively to ensure players enjoy consistent and reliable engagement with Riot's games.

You will be responsible for building alignment among multiple technology stakeholders and fostering the growth of your engineers. Your role will involve coordinating with technical leads across the organization while aligning your priorities with Riot's strategic objectives. If you thrive on tackling high-scale service development challenges and enjoy seeing plans come to fruition, this position is designed for you.

You will be expected to maintain and evolve Riot's technical understanding of its multifaceted architectures, ensuring that central technology teams have the necessary insights into the performance of live services. Additionally, you will help shape and lead your team into a competent Tier 1 Site Reliability group, design and implement services to enhance reliability and visibility, and establish long-lasting standards across various technical stacks. Your responsibilities will also include providing critical support and maintenance for existing platforms, being on rotational on-call for live product support, conducting meaningful code reviews, producing comprehensive user documentation, and mentoring a junior engineering team to become subject matter experts in observability, triage, and incident response.

Responsibilities

  • Maintain and evolve Riot's technical understanding of its multifaceted architectures
  • Ensure Riot's central technology teams have the necessary visibility into how our live services are performing
  • Help shape and lead the team into a competent Tier 1 Site Reliability group
  • Design, implement and modify services to enhance reliability and visibility
  • Establish meaningful, long-lived standards across multiple technical stacks
  • Provide urgent, critical support and maintenance for existing platforms
  • Be on rotational on-call for live product support and operational assessment
  • Provide meaningful code review for other members of the team
  • Produce comprehensive user documentation around your implemented solutions
  • Mentor, guide and level up a junior engineering team to be subject matter experts in observability, triage and incident response

Requirements

  • Bachelor's or Master's degree in Computer Science or a related field, or relevant professional experience
  • 5+ years of relevant experience
  • Experience designing, prioritizing and maintaining high-capacity, high-availability, and high-performance software, especially back-end services
  • Demonstrated ability to work across multiple organizations and generate alignment on technical standards
  • Demonstrated experience mentoring engineers on your teams to grow technically
  • Demonstrated experience working in container-based ecosystems and with a container scheduler (e.g. Marathon, Mesos, Kubernetes, GKE, Amazon ECS)
  • Experience with distributed systems, specifically microservices
  • Experience with API design, preferably using REST
  • Understanding of networking, from HTTP down to the network layer (TCP/IP, routing, etc.)
  • Understanding of relational databases such as MySQL

Nice-to-haves

  • 2+ years working in a high-performance Site Reliability capacity
  • Experience building high-quality software in languages like Go, Java, Python, or JavaScript
  • Familiarity with Site Reliability best practices
  • Experience building teams from the ground up
  • Experience with CI/CD pipelines, ideally Jenkins and/or GitHub Actions
  • Understanding of software performance and the influence of latency in online games
  • Experience with AWS (or comparable cloud environments)

Benefits

  • Open paid time off policy
  • Flexible work schedules
  • Medical insurance
  • Dental insurance
  • Life insurance
  • Parental leave for you, your spouse/domestic partner and children
  • 401k with company match
  • Short and long-term disability insurance
  • Vision insurance