Comcast - York, PA
posted 3 months ago
As the Lead Software Engineer - Site Reliability Engineering (SRE) at FreeWheel, a Comcast company, you will play a crucial role in the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of the FreeWheel platforms. This position involves designing, analyzing, and troubleshooting large-scale distributed systems, debugging and optimizing code, and automating routine tasks. You will be part of a diverse team that combines software and technology infrastructure expertise, providing subject matter expertise and resolving complex break/fix scenarios. Collaboration with engineering, vendors, and client services will be essential to deliver successful technical solutions. You will work with limited supervision, following operational practices and independently determining approaches for non-routine solutions.
In this role, you will be responsible for the reliability and technical operations of the FreeWheel TV Platform Ad-Serving components. You will lead technical solutions aimed at measuring and improving the reliability, quality, and efficiency of FreeWheel platforms. Your responsibilities will include complex analytical work in the planning, deployment, testing, and evaluation of FreeWheel products. You will need in-depth working knowledge of FreeWheel platforms, infrastructure, internal processes, and teams/partners. Additionally, you will support FreeWheel-powered live events such as the Super Bowl, Olympic Games, March Madness, and FIFA World Cup.
You will be involved in the software release cycle, working closely with developers and tech leads to ensure that software releases are well designed, planned, implemented, released, and monitored. Approximately 30% of your time will be dedicated to tools development, primarily using Python or Golang. This may include tools for Continuous Delivery, Infrastructure Scaling, and more.
You will also lead technical solutions for infrastructure and application management, monitoring, and operations with a focus on standardization and automation. Your role will require you to leverage engineering methodologies and technical knowledge in specific areas of focus, lead code-level debugging on escalated issues, and manage on-call shifts, incident prevention, response, and retrospectives. Advocacy for engineering and technical operations procedures, policies, processes, and SRE best practices will be a key part of your responsibilities.