Site Reliability Engineer

$130,000 - $155,000/Yr

Major League Soccer - New York, NY

posted 2 months ago

Full-time - Senior
Remote - New York, NY
Performing Arts, Spectator Sports, and Related Industries

About the position

We are seeking a Site Reliability Engineer (SRE) to lead and mentor our SRE and TechOps teams with a focus on automation to drive accountability, efficiency, and continuous improvement. This role involves building and maintaining observability frameworks to ensure system reliability, performance, and scalability, while fostering a culture of innovation through iterative enhancements. The SRE will ensure smooth platform operations, drive automation, and streamline workflows to reduce manual interventions, supporting both gameday and non-gameday activities.

Responsibilities

  • Develop and implement observability frameworks to monitor the health and performance of services, ensuring uptime and reliability.
  • Serve as the first line of defense for incidents, troubleshooting and resolving them through strong problem-solving skills rather than relying on runbooks.
  • Perform thorough API testing for published content using tools like Postman and Cypress to ensure accuracy and performance.
  • Use Terraform to manage infrastructure, including ServiceNow integrations, and to automate workflows.
  • Leverage Datadog or equivalent tools to set up monitoring, logging, and alerting systems.
  • Work closely with cross-functional teams to ensure seamless integration and deployment of services.
  • Manage and optimize AWS resources, including EKS and ECS, to ensure scalability and cost-efficiency.
  • Use GitLab pipelines for continuous integration and deployment, ensuring smooth and automated delivery of code changes.
  • Integrate tools like ServiceNow with Slack or Asana to streamline workflows and enhance team communication.
  • Lead and manage a team of highly skilled consultants and full-time professionals, cultivating a culture of innovation, accountability, and continuous improvement.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • 7+ years of experience, including 5+ years in cloud and technical operations.
  • Proven background in architecting and managing cloud solutions (AWS, Azure, Google Cloud).
  • Hands-on experience in complex technology operations environments, including infrastructure, network, security, and incident management.
  • 2+ years in managing or mentoring roles within technology operations (ITSM/ITOM) or a related field.
  • Proficiency in implementing automation tools and driving automation excellence within the organization.

Nice-to-haves

  • Advanced degrees or certifications (e.g., ITIL, AWS, Azure).
  • Familiarity with GCP and Azure.
  • Experience with Go and React/React Native.
  • Experience with ETL between third-party systems.

Benefits

  • Comprehensive and competitive medical, dental, and vision benefits.
  • $500 Wellness Reimbursement.
  • Generous PTO offering.
  • Hybrid Office/Remote Work Schedule.
  • On-the-job training and ongoing educational opportunities.
  • Office perks, discounts, and employee events.