Lead Site Reliability Engineer

$112,151 - $262,854/Yr

Comcast - Philadelphia, PA

posted 6 days ago

Full-time - Mid Level
Philadelphia, PA
Broadcasting and Content Providers

About the position

The Site Reliability Engineer (SRE) at FreeWheel is responsible for ensuring the availability, performance, and reliability of the FreeWheel platforms. This role involves designing, analyzing, and troubleshooting large-scale distributed systems, as well as automating routine tasks. The SRE will work closely with engineering teams, vendors, and client services to deliver effective technical solutions, while also leading efforts in incident response, capacity planning, and operational efficiency.

Responsibilities

  • Own the reliability and technical operations of FreeWheel TV Platform Ad-Serving components.
  • Lead technical solutions in measuring and improving reliability, quality, and efficiency of FreeWheel platforms.
  • Lead a variety of complex analytical duties in the planning, deployment, testing, and evaluation of FreeWheel products.
  • Support FreeWheel-powered live events such as the Super Bowl, Olympic Games, March Madness, and FIFA World Cup.
  • Participate in the software release cycle, working closely with developers and tech leads to ensure software releases are well designed, planned, implemented, released, and monitored.
  • Lead the design and implementation of infrastructure as code, applying best practices in tooling and quality assurance.
  • Lead technical solutions for infrastructure and application management, monitoring, and operations with standardization and automation focus.
  • Leverage engineering methodologies and technical knowledge in specific areas of focus.
  • Lead code-level debugging on issues escalated to the team.
  • Lead on-call shifts and incident prevention, response, and retrospectives.
  • Advocate for engineering and technical operations procedures, policies, processes, and SRE best practices.
  • Partner with developers and vendors to identify and drive improvements including production quality, operational efficiency, and engineering productivity.
  • Support and influence the cybersecurity program's needs, such as patching, vulnerability cleanup, secure server configuration, testing and validation, technical controls implementation, and cybersecurity incident remediation.
  • Provide training and coaching to peers and more junior SRE team members.

Requirements

  • Bachelor's degree in computer science, a related engineering field, or equivalent practical experience.
  • 7 years of experience in software engineering with programming languages such as Python, Golang, or JavaScript.
  • 5 years of technical operations experience supporting business-critical applications on public cloud services, preferably AWS.
  • 5 years of experience with SDLC tools including Containers, Kubernetes, Docker, Salt/Ansible/Chef/Puppet, Jenkins, and Git.
  • Experience in Linux administration, network security, and system infrastructure.
  • Excellent communication and collaboration skills.

Nice-to-haves

  • Prior experience in supporting business-critical services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
  • Technical leadership and influence demonstrated in focused product/tech areas and practices.
  • Prior experience in providing technical solutions at an internet company.

Benefits

  • Comprehensive health insurance coverage
  • 401k retirement savings plan
  • Paid time off and holidays
  • Flexible scheduling options
  • Professional development opportunities
  • Employee discounts on services and products