Comcast - York Haven, PA

posted 3 months ago

Full-time - Mid Level
York Haven, PA
Broadcasting and Content Providers

About the position

As the Sr. Software Engineer in Site Reliability Engineering (SRE) at FreeWheel, a Comcast company, you will play a crucial role in the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of the FreeWheel platforms. Your responsibilities will include designing, analyzing, and troubleshooting large-scale distributed systems, debugging and optimizing code, and automating routine tasks. You will be part of a diverse team that combines software and technology infrastructure expertise, providing subject matter expertise and resolving complex break/fix scenarios. Collaboration with engineering, vendors, and client services will be essential to deliver successful technical solutions. You will work with limited supervision, following operational practices while independently determining approaches for non-routine solutions.

In this role, you will be responsible for the reliability and technical operation of the FreeWheel TV Platform UI and API components. You will implement technical solutions aimed at measuring and improving the reliability, quality, and efficiency of FreeWheel platforms. Your duties will involve complex analytical tasks in the planning, deployment, testing, and evaluation of FreeWheel products, drawing on your in-depth knowledge of the platforms, infrastructure, and internal processes. You will support high-profile live events powered by FreeWheel, such as the Super Bowl, Olympic Games, March Madness, and FIFA World Cup, ensuring that technical operations run smoothly.

You will also engage in the software release cycle, collaborating closely with developers to ensure that software releases are well designed, planned, implemented, released, and monitored. Approximately 30% of your time will be dedicated to tools development in Python or Golang, focusing on Continuous Delivery and Infrastructure Scaling. You will engineer technical solutions for infrastructure and application management, monitoring, and operations, with an emphasis on standardization and automation. You will perform code-level debugging on issues escalated to the team and participate in on-call shifts to support incident prevention, response, and retrospectives.

As an advocate for engineering and technical operations procedures, you will work closely with developers and vendors to identify and drive improvements in production quality, operational efficiency, and engineering productivity. Additionally, you will provide support for the Cybersecurity program, including patching, vulnerability cleanup, secure server configuration, and incident remediation efforts. Training and coaching peers and junior SRE team members will also be part of your responsibilities. The role requires consistent exercise of independent judgment and discretion in significant matters.

Responsibilities

  • Be responsible for reliability and technical operation of FreeWheel TV Platform UI & API component(s).
  • Implement technical solutions to measure and improve the reliability, quality, and efficiency of FreeWheel platforms.
  • Perform a variety of complex analytical duties in the planning, deployment, testing, and evaluation of FreeWheel products.
  • Support FreeWheel-powered live events such as the Super Bowl, Olympic Games, March Madness, and FIFA World Cup.
  • Plug into the software release cycle, working closely with developers to ensure software releases are well designed, planned, implemented, released, and monitored.
  • Participate in the design and implementation of infrastructure as code, following best practices for tooling and quality assurance.
  • Dedicate approximately 30% of time to tools development, written in Python or Golang.
  • Engineer technical solutions for infrastructure and application management, monitoring, and operations with a focus on standardization and automation.
  • Perform code-level debugging on issues escalated to the team.
  • Work on-call shifts, supporting incident prevention, response, and retrospectives.
  • Act as an advocate for engineering and technical operations procedures, policies, processes, and SRE best practices.
  • Work closely with developers and vendors to identify and drive improvements in production quality, operational efficiency, and engineering productivity.
  • Support Cybersecurity program needs such as patching, vulnerability cleanup, secure server configuration, testing and validation, technical controls implementation, and cybersecurity incident remediation efforts.
  • Provide training and coaching to peers and more junior SRE team members.

Requirements

  • Bachelor's degree in computer science, a related engineering field, or equivalent practical experience.
  • 5 years of experience in software engineering with programming languages such as Python, Golang, or JavaScript.
  • 3 years of technical operations experience with business-critical applications on public cloud services, preferably AWS.
  • Experience with SDLC tools including Containers, Kubernetes, Docker, Salt/Ansible/Chef/Puppet, Jenkins, and Git.
  • Experience in Linux administration, network security, and system infrastructure.
  • Good communication and collaboration skills within and across teams.

Nice-to-haves

  • Prior experience supporting business-critical services before they go live, through system design consulting and the development of software platforms and frameworks.
  • Demonstrated technical leadership and influence in focused product/tech areas and practices.
  • Prior experience providing technical solutions at an internet company.

Benefits

  • Comprehensive health insurance coverage.
  • 401k retirement savings plan.
  • Paid time off and holidays.
  • Tuition reimbursement for further education.
  • Flexible work hours and remote work options.
  • Employee discounts on Comcast services.