Learfield - Syracuse, NY
posted 3 months ago
As a Senior Site Reliability Engineer at Learfield, you will play a crucial role in ensuring the reliability, availability, and performance of our services. You will work in cross-disciplinary teams, collaborating closely with our domain engineering and Site Reliability Engineering teams to architect and maintain live services. Your responsibilities will include planning and forecasting service capacity and demand, analyzing software performance, and tuning systems and software to meet our high standards.

You will also be tasked with resolving mission-critical incidents and building automation to prevent their recurrence, automating away the toil associated with operational tasks. In this role, you will identify root causes of production issues and recommend permanent solutions, ensuring that our systems are robust and resilient. You will set up and improve monitoring systems, including metrics, logs, and alerts, to quickly identify and address issues as they arise. Additionally, you will develop effective documentation, tooling, and alerts to mitigate risks and enhance our operational capabilities.

Security is a top priority, and you will actively participate in efforts to keep our environment secure by reviewing compliance and internal scans and working with development teams to stay ahead of security vulnerabilities. You will also be responsible for developing runbooks for our Level I NOC team to reduce Mean Time to Detection (MTTD) and Mean Time to Recovery (MTTR) for alerts. You will be expected to participate in an on-call rotation with other members of the Site Reliability Engineering team, ensuring that we maintain high service levels even during off-hours.

This position offers a unique opportunity to influence the technology growth that impacts millions of customers across the entertainment space, allowing you to grow both our products and your career.