Comcast - Reston, VA

posted 3 months ago

Full-time - Mid Level
Reston, VA
501-1,000 employees
Broadcasting and Content Providers

About the position

As the Lead Software Engineer - Site Reliability Engineering (SRE) at FreeWheel, a Comcast company, you will play a crucial role in ensuring the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning for the FreeWheel platforms. This position requires you to engage in designing, analyzing, and troubleshooting large-scale distributed systems, as well as debugging and optimizing code. You will also be responsible for automating routine tasks, working within a team that combines software and technology infrastructure expertise. Your role will involve providing subject matter expertise, resolving complex break/fix scenarios, and collaborating with broader teams as necessary. You will partner with engineering, vendors, and client services to deliver successful technical solutions, all while working with limited supervision and direction.
In this position, you will be responsible for the reliability and technical operations of the FreeWheel TV Platform Ad-Serving components. You will lead technical solutions aimed at measuring and improving the reliability, quality, and efficiency of FreeWheel platforms. Your duties will include conducting complex analytical tasks in the planning, deployment, testing, and evaluation of FreeWheel products. You will need to possess an in-depth working knowledge of FreeWheel platforms, infrastructure, internal processes, and teams/partners. Additionally, you will support FreeWheel-powered live events such as the Super Bowl, Olympic Games, March Madness, and FIFA World Cup. You will be plugged into the software release cycle, working closely with developers and tech leads to ensure that software releases are well designed, planned, implemented, released, and monitored. A significant portion of your time, approximately 30%, will be dedicated to tools development, primarily using Python or Golang. This may include tools for Continuous Delivery, Infrastructure Scaling, and more.
You will also lead technical solutions for infrastructure and application management, monitoring, and operations with a focus on standardization and automation. Your role will require you to leverage engineering methodologies and technical knowledge in specific areas of focus, lead code-level debugging on issues escalated to the team, and take on-call shifts for incident prevention, response, and retrospection. Furthermore, you will advocate for engineering and technical operations procedures, policies, processes, and SRE best practices, while partnering with developers and vendors to identify and drive improvements in production quality, operational efficiency, and engineering productivity. You will also provide support for the Cybersecurity program needs, including patching, vulnerability cleanup, secure server configuration, testing and validation, and technical controls implementation. Training and coaching peers and junior SRE team members will also be part of your responsibilities, along with exercising independent judgment and discretion in significant matters.

Responsibilities

  • Be responsible for reliability and technical operations of FreeWheel TV Platform Ad-Serving component(s).
  • Lead technical solutions in measuring and improving reliability, quality, and efficiency of FreeWheel platforms.
  • Conduct complex analytical duties in the planning, deployment, testing, and evaluation of FreeWheel products.
  • Possess in-depth working knowledge of FreeWheel platforms, infrastructure, internal processes, and teams/partners.
  • Support FreeWheel-powered live events such as the Super Bowl, Olympic Games, March Madness, and FIFA World Cup.
  • Engage in the software release cycle, working closely with developers and tech leads to ensure software releases are well designed, planned, implemented, released, and monitored.
  • Lead in design and implementation of infrastructure as code with best practices, tool use, and quality assurance.
  • Dedicate approximately 30% of time to tools development, primarily in Python or Golang.
  • Lead technical solutions for infrastructure and application management, monitoring, and operations with a focus on standardization and automation.
  • Leverage engineering methodologies and technical knowledge in specific areas of focus.
  • Lead code-level debugging on issues escalated to the team.
  • Lead on-call shifts, incident prevention, response, and retrospection.
  • Advocate for engineering and technical operations procedures, policies, processes, and SRE best practices.
  • Partner with developers and vendors to identify and drive improvements including production quality, operational efficiency, and engineering productivity.
  • Support and influence Cybersecurity program needs such as patching, vulnerability cleanup, secure server configuration, testing and validation, and technical controls implementation.
  • Provide training and coaching to peers and more junior SRE team members.

Requirements

  • Bachelor's degree in computer science, a related engineering field, or equivalent practical experience.
  • 7 years of prior experience in software engineering with programming languages such as Python, Golang, or JavaScript.
  • 5 years of prior technical operations experience with business-critical applications on public cloud services, preferably AWS.
  • 5 years of prior experience with SDLC tools including Containers, Kubernetes, Docker, Salt/Ansible/Chef/Puppet, Jenkins, and Git.
  • Prior experience in Linux administration, network security, and system infrastructure.
  • Excellent communication and collaboration skills within and across teams.

Nice-to-haves

  • Prior experience in supporting business-critical services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
  • Demonstrated technical leadership and influence in focused product/tech areas and practices.
  • Prior experience in providing technical solutions at an internet company.

Benefits

  • Comprehensive health insurance coverage.
  • 401(k) retirement savings plan with company matching.
  • Paid time off and holidays.
  • Tuition reimbursement for further education.
  • Flexible work hours and remote work options.
  • Employee discounts on Comcast services.