DraftKings • Posted 20 days ago
$148,000 - $185,000/Yr
Full-time • Senior
Boston, MA
Performing Arts, Spectator Sports, and Related Industries

About the position

We're defining what it means to build and deliver the most extraordinary sports and entertainment experiences. Our global team is trailblazing new markets, developing cutting-edge products, and shaping the future of responsible gaming. Here, "impossible" isn't part of our vocabulary. You'll face some of the toughest but most rewarding challenges of your career. They're worth it. Channeling your inner grit will accelerate your growth, help us win as a team, and create unforgettable moments for our customers.

As a Lead Site Reliability Engineer, you will drive key initiatives to enhance the reliability, scalability, and efficiency of our infrastructure. You'll collaborate across teams to architect infrastructure automation while mentoring other Engineers to foster a culture of continuous learning and innovation. In this role, you will shape deployment strategies, performance tuning, and monitoring frameworks to support our rapid growth.

Responsibilities

  • Lead SRE initiatives across multiple projects and products, collaborating with cross-functional teams to shape platform and infrastructure engineering efforts across the organization.
  • Drive technical excellence by mentoring and guiding engineers, fostering a culture of continuous learning and innovation.
  • Architect and automate self-healing, fault-tolerant infrastructure with declarative configurations, GitOps, and event-driven automation for scalable deployments across public clouds and on-premises environments.
  • Design, develop, and maintain software-driven infrastructure automation to build internal tools and eliminate repetitive operational tasks.
  • Own and drive decisions on product deployment, performance tuning, monitoring, and alerting to ensure high availability and system efficiency in production.
  • Define key metrics and SLAs for new web services built to support our rapid traffic growth.
  • Design and implement monitoring and alerting strategies to enforce application SLAs.

Requirements

  • At least 6 years of experience managing distributed cloud environments (GCP, AWS, vSphere, Nutanix) and platform automation at scale.
  • Deep expertise in container orchestration (Kubernetes) and container runtimes (Docker, containerd), with the ability to design, scale, and troubleshoot complex workloads.
  • Expert-level understanding of networking and web concepts, with the ability to debug issues down to the packet level.
  • Strong experience developing software for automation and infrastructure tooling (Go, Python).
  • Strong understanding of Linux-based operating systems, including performance tuning, kernel debugging, and low-level system optimizations.
  • Experience with Infrastructure as Code (IaC) and configuration management tools (Terraform, Ansible, Chef, etc.), ensuring scalable and repeatable infrastructure provisioning.
  • Understanding of applications written in object-oriented languages (C#/.NET, Java).
  • Experience leading engineering teams and guiding technology roadmaps in large-scale, distributed environments.

Benefits

  • Bonus
  • Equity

Job Keywords

Hard Skills
  • Ansible
  • Chef
  • Docker
  • Go
  • Java