Comcast - York, PA
posted 2 months ago
FreeWheel, a Comcast company, is seeking a Site Reliability Engineer (SRE) to join our team. In this role, you will be responsible for the availability, latency, performance, and efficiency of the FreeWheel platforms, as well as their change management, monitoring, emergency response, and capacity planning. You will design, analyze, and troubleshoot large-scale distributed systems, debug and optimize code, and automate routine tasks. As part of a team with a mix of software and technology infrastructure backgrounds, you will provide subject matter expertise, resolve complex break/fix scenarios, and collaborate with engineering, vendors, and client services to deliver successful technical solutions. This position requires a high degree of independence and the ability to develop approaches to non-routine problems while following operational practices.
Your core responsibilities will include overseeing the reliability and technical operations of the FreeWheel TV Platform Ad-Serving components and leading technical solutions to measure and improve the reliability, quality, and efficiency of FreeWheel platforms. You will perform complex analytical work in the planning, deployment, testing, and evaluation of FreeWheel products, and support high-profile live events such as the Super Bowl, Olympic Games, March Madness, and FIFA World Cup. You will also be involved in the software release cycle, working closely with developers and tech leads to ensure that software releases are well designed, planned, implemented, released, and monitored. Additionally, you will lead the design and implementation of infrastructure as code, focusing on best practices, tool use, and quality assurance.
As an SRE, you will advocate for engineering and technical operations procedures, policies, processes, and SRE best practices. You will partner with developers and vendors to identify and drive improvements in production quality, operational efficiency, and engineering productivity.
You will also provide support for the Cybersecurity program, including patching, vulnerability cleanup, secure server configuration, and incident remediation efforts. Training and coaching of peers and junior SRE team members will also be part of your responsibilities. This role requires regular attendance and the ability to work nights and weekends as necessary, including on-call shifts.