Senior Software Engineer, DevOps

$358,000 - $408,000/Yr

Nextpit Gmbh - San Jose, CA

posted 5 months ago

Full-time - Mid Level
San Jose, CA
Broadcasting and Content Providers

About the position

Roku is changing how the world watches TV, and we are looking for a skilled Engineer with exceptional DevOps skills to join our team. As the #1 TV streaming platform in the US, Roku has set its sights on powering every television in the world. Our mission is to connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers. From your first day at Roku, you'll make a valuable contribution in a fast-growing public company where no one is a bystander. This role offers the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.

In this position, you will be responsible for the automation and scaling of Big Data and Analytics tech stacks on Cloud infrastructure. Your responsibilities will include building CI/CD pipelines, setting up monitoring and alerting for production infrastructure, and keeping our tech stacks up to date. You will develop best practices around cloud infrastructure provisioning and disaster recovery, guiding developers on adoption.

Collaboration with developers on system architecture will be essential for optimal scaling, resource utilization, fault tolerance, reliability, and availability. You will also conduct low-level systems debugging, performance measurement, and optimization on large production clusters and low latency services. Additionally, you will create scripts and automation that can react quickly to infrastructure issues and take corrective actions. Participation in architecture discussions, influencing the product roadmap, and taking ownership of and responsibility for new projects will be key aspects of your role. You will collaborate and communicate with a geographically distributed team, ensuring that all systems are running smoothly and efficiently.

Responsibilities

  • Develop best practices around cloud infrastructure provisioning and disaster recovery, guiding developers on adoption.
  • Collaborate on system architecture with developers for optimal scaling, resource utilization, fault tolerance, reliability, and availability.
  • Conduct low-level systems debugging, performance measurement, and optimization on large production clusters and low latency services.
  • Create scripts and automation that can react quickly to infrastructure issues and take corrective actions.
  • Participate in architecture discussions, influence the product roadmap, and take ownership of and responsibility for new projects.
  • Collaborate and communicate with a geographically distributed team.

Requirements

  • Bachelor's degree or equivalent experience.
  • 4+ years of experience in DevOps and/or Reliability Engineering.
  • Experience working with monitoring and alerting tools (such as Datadog and PagerDuty) and participating in on-call rotations.
  • Experience with system engineering around edge cases, failure modes, and disaster recovery (GCP background preferred).
  • Strong background in Linux/Unix shell scripting (or equivalent programming skills in Python).
  • Experience scaling production systems running Big Data tools like Spark, Hadoop, Apache Druid, and Looker.
  • Understanding of automation tools like Ansible, Terraform, AWS OpsWorks, and Apache Airflow.

Nice-to-haves

  • Experience with Kubernetes and Terraform.
  • Experience with at least three of the following technologies/tools: HAProxy, Kafka, Big Data/Hadoop, Presto, Spark, Airflow, Pinot, Druid, OpenSearch, GCP, Dataproc.

Benefits

  • Health insurance (medical, dental, and vision)
  • Life insurance
  • Disability benefits
  • Parental leave
  • Wellness benefits
  • Paid time off
  • 401(k)/pension options
  • Mental health and financial wellness support and resources