Lead Site Reliability Engineer

$74,000 - $160,000/Yr

ZeniMax Media - Rockville, MD

posted 5 months ago

Full-time - Mid Level
Rockville, MD
Publishing Industries

About the position

We are seeking a Lead Site Reliability Engineer to join our corporate IT Enterprise Services (ES) Team at ZeniMax Media Inc. In this pivotal role, you will be responsible for managing a team of site reliability engineers dedicated to delivering enterprise-class reliability and automation across ZeniMax Media and its studios. The environments you will work with include both Windows and Linux systems, requiring a versatile skill set and the ability to adapt to different technologies and workflows. Your leadership will be crucial in mentoring team members, fostering a culture of continuous improvement, and ensuring that our systems are robust and reliable.

As a Lead Site Reliability Engineer, you will play a key role in reducing operational toil throughout the enterprise by applying thoughtful Site Reliability Engineering (SRE) and DevOps principles. This involves not only managing your team effectively but also contributing to strategic decision-making regarding the prioritization and planning of work. You will participate in an on-call rotation, providing support for our systems and ensuring that any issues are addressed promptly. Additionally, you will help establish Service Level Agreements (SLAs) for critical business systems, ensuring that we meet the reliability expectations of our stakeholders.

Collaboration is essential in this role, as you will actively work with other enterprise services teams to identify areas for improvement in service delivery. Your ability to communicate effectively and work cross-functionally will be vital in driving initiatives that enhance our operational efficiency and reliability.

Responsibilities

  • Reduce toil throughout the enterprise through thoughtful application of SRE and DevOps principles
  • Manage and mentor team members effectively
  • Contribute to decision making on the prioritization and planning of work
  • Participate in an on-call rotation
  • Help establish SLAs for critical business systems
  • Actively work with other enterprise services teams to identify improvements to service delivery

Requirements

  • Proven experience in Site Reliability Engineering or a related field
  • Strong understanding of both Windows and Linux environments
  • Experience with SRE and DevOps principles
  • Demonstrated ability to manage and mentor a team
  • Excellent problem-solving skills and attention to detail
  • Strong communication and collaboration skills

Nice-to-haves

  • Experience with cloud services and infrastructure
  • Familiarity with automation tools and scripting languages
  • Knowledge of monitoring and logging tools
  • Experience in the gaming or entertainment industry

Benefits

  • Competitive salary
  • Health insurance
  • 401(k) plan
  • Paid time off
  • Professional development opportunities