OverDrive - Cleveland, OH
posted 3 months ago
The Site Reliability Engineer (SRE) position at OverDrive is a critical role focused on the availability, performance, and efficiency of our services. This is a hybrid position, with two days a week on campus in Cleveland, Ohio, and three days working from home. The SRE is responsible for change management, monitoring, emergency response, and capacity planning for both existing and future services. The role demands a proactive approach: predicting performance issues and implementing solutions before they impact end users. SREs collaborate with application developers to ensure that applications meet their uptime requirements, and they participate in an on-call rotation, which may require incident response during off-hours.

In this role, you will work on small projects and individual tasks, with regular guidance from more senior developers. You will provide day-to-day support for development teams, which includes building and running deployments, answering questions, and monitoring Service-Level Indicators for applications and systems. Continuous learning is encouraged, and you will train independently on the systems and technologies the team uses. Your feedback will help application developers meet their performance objectives from a systems perspective. You will also work with applications written in various programming languages within a Linux environment, contributing to the overall reliability and performance of our services.