Sirius XM Canada - Irving, TX

posted 3 months ago

Full-time - Mid Level
Irving, TX

About the position

SiriusXM is seeking a Senior DevOps Engineer to join our Operational Engineering (OE) Team, which is responsible for maintaining the stability, performance, and reliability of our 24x7 streaming infrastructure. This role is crucial in ensuring that our systems operate at peak efficiency and are available to our users at all times. The ideal candidate will possess a strong background in Linux, AWS, and Python programming, along with experience using APM tools such as Datadog. The position requires proactive monitoring, optimization, and troubleshooting of our systems to maintain operational excellence.

As part of the OE Team, you will be expected to participate in an on-call rotation, ensuring coverage for after-hours deployments, maintenance windows, and incident responses. This is an individual contributor role that focuses on streaming infrastructure primarily within AWS, and it includes responsibilities for application monitoring, software build machines, and software support servers. The successful candidate will have a passion for solving complex technical problems and a commitment to ensuring high availability of our services.

Responsibilities

  • Provide design, development and support of API deployments, infrastructure changes and automated validations
  • Maintain and manage Linux-based infrastructure using best practices for system configuration, security and performance
  • Manage and optimize cloud-based services, ensuring high availability through monitoring and proper scaling
  • Utilize APM tools to identify and address service performance issues in a timely manner
  • Collaborate with development teams to ensure the reliability of RESTful APIs
  • Create automated tasks using scripts (Bash, batch, Python)
  • Monitor security events involving WAFs, IDS/IPS and access logs and manage network (TCP/IP) configurations including firewall ACLs
  • Document and maintain all technical design, code, build and release artifacts

Requirements

  • Minimum of 5 years of IT/Engineering experience; 1 year in a 24x7 HA support environment
  • AWS Certified DevOps Engineer, AWS Certified Solutions Architect, or AWS Certified Data Analytics Specialty certification is preferred, to demonstrate a solid AWS knowledge base
  • BS Computer Science/Engineering, Information Sciences Technology or equivalent experience
  • Proven experience as a Site Reliability Engineer or DevOps Engineer
  • Excellent knowledge of Linux systems and administration
  • Expertise in Python or shell script programming for automation and tool development
  • In-depth understanding of APM tools, particularly Datadog
  • Thorough knowledge of RESTful service architecture and best practices
  • Exceptional troubleshooting and problem-solving skills
  • Strong communication and collaboration skills to work effectively and professionally with cross-functional teams
  • Familiarity with containerization and orchestration tools (e.g., Docker)
  • Excellent time management skills, with the ability to prioritize and multitask with high attention to detail under shifting deadlines in a fast-paced environment
  • This position requires 24x7 availability for support and after-hours work to meet the availability and uptime requirements of the business
  • Must have the legal right to work in the U.S.