GitLab • posted about 2 months ago
$103,600 - $222,000/Yr
Full-time • Mid Level

About the position

As a Site Reliability Engineer at GitLab, you are responsible for keeping all user-facing services and other GitLab production systems running smoothly. Our SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our operating environments and the GitLab codebase. As an SRE within the US Public Sector Services team, you primarily focus on operating a large number of GitLab environments by automating all workflows, from provisioning new environments to daily operating tasks. This is the main difference between an Environment Automation SRE and other SREs at GitLab: your day-to-day work consists of automating tasks for a large number of GitLab environments (and the environments used by GitLab) and handling operational tasks across many environments.

Responsibilities

  • Automate every operational task, such as package updates and configuration changes, across all customer platforms without service interruptions.
  • Develop an effective early warning system and reliable maintenance tasks, such as library upgrades and version migrations.
  • Develop monitoring and alerting systems that predict capacity needs based on customer usage patterns.
  • Respond to user emergencies, platform alerts, and support requests.
  • Implement new security measures and update existing ones to protect GitLab infrastructure.
  • Collaborate with other engineering stakeholders to resolve larger architectural bottlenecks and establish strong operational readiness across teams.

Requirements

  • US Citizen or Permanent Resident.
  • Experience in running and operating production workloads.
  • Strong programming skills, preferably with Ruby and/or Go.
  • Strong background with Infrastructure as Code technologies, such as Terraform and Ansible.
  • Ability to reason about large systems and their operational behaviors.
  • Enjoy working with peers and collaborating across teams.
  • Experience regularly interacting with customers and a focus on resolving their requests with urgency.

Nice-to-haves

  • Experience with data templating tools such as Jsonnet.
  • Familiarity with cloud provider systems like GCP and AWS.

Benefits

  • All remote, asynchronous work environment.
  • Flexible Paid Time Off.
  • Team Member Resource Groups.
  • Equity Compensation & Employee Stock Purchase Plan.
  • Growth and development budget.
  • Parental leave.
  • Home office support.