Senior Site Reliability Engineer

$135,400 - $250,600/Yr

Apple - Seattle, WA

posted 5 days ago

Full-time - Mid Level
Computer and Electronic Product Manufacturing

About the position

The Site Reliability Engineer (SRE) role on the Apple Services Engineering Cloud Service Infrastructure team focuses on supporting and scaling cloud services for millions of Apple users. The position involves building and maintaining critical infrastructure systems that support services such as storage, caching, and queueing at massive scale. The engineer will have significant responsibility for, and influence over, the core platform that powers many of Apple's internet services, which reach hundreds of millions of users.

Responsibilities

  • Support and scale cloud services for millions of Apple users.
  • Build and maintain critical infrastructure systems and frameworks.
  • Develop automations and instrument reliability tools.
  • Respond to alerts and incidents that may affect platform reliability.
  • Improve the reliability and efficiency of systems at scale.

Requirements

  • Bachelor's or Master's in Computer Science, Computer Engineering, or equivalent experience.
  • 5+ years of experience developing platform services.
  • Experience with large-scale server provisioning and maintenance (OpenStack Ironic, Metal3, MAAS, xCAT, NetBox, Tinkerbell).
  • Experience with development within the Kubernetes ecosystem, including operator framework, controllers, and CRDs.
  • Understanding of core internet infrastructure services, including DNS, DHCP, LDAP, server virtualization, and server monitoring, in critical, large-scale distributed systems.
  • Understanding of SRE principles, including monitoring, alerting, error budgets, and fault analysis.

Nice-to-haves

  • Experience with hardware bootstrap and associated security (PXE, BIOS, TPM, secure boot, trusted computing).
  • Experience with hyperscale server provisioning and maintenance.
  • Knowledge of structured or unstructured storage and caching.
  • Experience in automating operations processes via services and tools.
  • Familiarity with configuration management and fleet orchestration via Puppet, Chef, Ansible, or others.
  • Experience with cloud services (AWS S3/EC2/CloudFront or equivalent).

Benefits

  • Comprehensive medical and dental coverage.
  • Retirement benefits.
  • Discounted products and free services.
  • Reimbursement for certain educational expenses, including tuition.
  • Discretionary bonuses or commission payments.
  • Relocation assistance.