Site Reliability Engineer (Hybrid)

Enova Agency - Chicago, IL

posted 2 months ago

Full-time

Credit Intermediation and Related Activities

About the position

As a Site Reliability Engineer (SRE) at Enova, you will play a crucial role in maintaining the reliability of our consumer business from both a technology and operational perspective. Your primary focus will be on driving rapid improvements and enhancing efficiency through the implementation of automated tools, evaluation of processes, and troubleshooting of complex problems. You will collaborate closely with IT, Software Engineering, and Product teams to resolve operational issues and develop innovative solutions that benefit our teams. By applying DevOps principles, you will manage the platform and applications, fostering functionality and adoption through continuous improvement, simplification, and automation. You will be joining a dedicated team of SRE and Observability engineers, all working together to ensure that Enova's reliability is recognized as best in class.

In this role, you will be responsible for troubleshooting incidents and service requests, working with the appropriate individuals, teams, or vendors. This will involve prioritizing tasks based on business impact and urgency, diagnosing issues, investigating root causes, restoring services, and maintaining clear communication with stakeholders throughout the process. You will handle operational requests on a daily basis and monitor Service Level Objectives (SLOs) related to the availability, latency, scalability, and efficiency of services.

Additionally, you will participate in the team's periodic on-call rotation, champion reliability best practices, and create frameworks that drive the development of stable and scalable products. Your continued curiosity about new technologies and evolving best practices will be essential as you work closely with Software Engineering and Product teams to advise on sound operational strategies.

Responsibilities

Troubleshoot incidents and service requests with the appropriate individuals, teams, or vendors, prioritizing according to business impact and urgency.
Diagnose, investigate, and restore service while communicating with stakeholders throughout the process.
Handle operational requests on a day-to-day basis.
Monitor SLOs on the availability, latency, scalability, and efficiency of services.
Participate in the team's periodic on-call rotation.
Champion reliability best practices and create frameworks to drive stable and scalable products.
Maintain a continued curiosity regarding new technologies and evolving best practices.
Work closely with Software Engineering and Product teams to advise on sound operational strategies.

Requirements

Experience working in a DevOps culture.
Working knowledge of Ruby, Python, Java, or Go.
Ability to write accurate and efficient SQL queries.
Experience troubleshooting ambiguous problems and performing root cause analysis.
Strong understanding of IT infrastructure (Linux, network technologies, relational databases, web technologies, etc.).
A structured approach to problem solving and excellent written and verbal communication skills.
Ability to read and understand software application code.
