LPL Financial - Fort Mill, SC
posted 6 months ago
LPL Financial is seeking a Lead Site Reliability Engineer (SRE) to join our Technology organization. In this pivotal role, you will be responsible for ensuring the stability, availability, and reliability of our applications and platforms. As a Lead SRE, you will provide engineering leadership in key areas such as Observability, Resilience Architecture, Resilience Testing, and Incident Management across LPL systems. Your expertise will be crucial in identifying opportunities to enhance system performance, efficiency, scalability, fault tolerance, and self-healing capabilities.

You will apply Chaos Engineering principles to experiment creatively with LPL systems and uncover hidden weaknesses, and you will improve overall observability by implementing monitoring, metrics, logs, and Service Level Objectives (SLOs). In addition, you will work closely with developers to foster a NoOps culture, empowering them with next-generation self-service options.

Your role will also involve activating a high-performing, continuously improving 24x7 L2 organization, ensuring that customer experience is fanatically protected through capabilities and processes that swiftly detect system issues and restore service. You will relentlessly drive towards identifying real root causes, reducing incidents, and improving application availability. Furthermore, you will develop strategies, roadmaps, and metrics that provide transparency into operational KPIs, key risks, and progress. Leading, coaching, and developing a team of high-performing, outcome-oriented engineers, including onshore, nearshore, and offshore personnel, will be a key aspect of your responsibilities.