Apple - San Francisco, CA

posted about 2 months ago

Full-time - Senior
San Francisco, CA
Computer and Electronic Product Manufacturing

About the position

The Lead Systems Engineer / Systems Architect role at Apple focuses on enhancing Machine Learning (ML) infrastructure services and applications. This position involves collaborating with global teams to support ML compute platforms and model lifecycles, addressing both frontend customer stories and backend infrastructure challenges in a multi-cloud environment. The engineer will design architectural solutions, improve system reliability, and enhance operational stacks, ultimately contributing to the high-quality user experience that Apple is known for.

Responsibilities

  • Support internal customers of ML platforms and tools to improve development velocity.
  • Respond to interruptions or delays caused by failures in the system.
  • Create performance profiles for ML platforms and services, defining key performance indicators (KPIs).
  • Drive engineering implementation and deployment for visibility of KPIs.
  • Design and enhance automation of operations for infrastructure and platforms.
  • Analyze network, load balancing, and throughput issues to design improvements.
  • Engage with platform and compute infrastructure engineering teams to enhance system reliability.
  • Collaborate with Apple Infrastructure teams to improve infrastructure primitives across environments.

Requirements

  • 15+ years of software development experience in platform operations or operational stack components.
  • Design-level understanding of system architecture and large-scale service operations.
  • Proficiency in scripting and programming languages such as Bash, Python, Golang, and Rust.
  • High-level networking knowledge and practical application across multiple cloud environments.
  • Demonstrable team lead or management experience.
  • Ability to identify problems independently and collaborate with partner teams.

Nice-to-haves

  • Demonstrated understanding of service level management, including configuration, security, performance, and troubleshooting.
  • Experience developing large-scale ML jobs and knowledge of ML, including large language models (LLMs).
  • Experience with analytics methods and pipelines for visualizing platform KPIs.
  • Experience in designing and implementing systems to support ML applications.
  • Experience with orchestration frameworks like Kubernetes for large-scale projects.

Benefits

  • Comprehensive medical and dental coverage.
  • Retirement benefits.
  • Discounted products and free services.
  • Reimbursement for certain educational expenses, including tuition.
  • Opportunity to participate in Apple's discretionary employee stock programs.