Apple - Austin, TX

posted 3 months ago

Full-time - Mid Level
Austin, TX
Computer and Electronic Product Manufacturing

About the position

Apple's Applied Machine Learning team is at the forefront of building platforms for large-scale data science applications that have a significant impact across various lines of business within the company. This role involves working with a talented group of software engineers, data scientists, SRE/DevOps engineers, and managers to manage and extract value from Apple-scale data. The team is dedicated to pushing the envelope by utilizing the latest open-source technologies and contributing to these projects.

We are seeking a passionate and dedicated senior engineer who will focus on infrastructure and distributed systems to develop world-class data platforms and products across cloud environments at a very large scale. In this position, you will be responsible for automating processes and documenting them for the benefit of the team. You will need to be an independent problem-solver, capable of managing multiple competing priorities and delivering timely solutions.

Your role will include providing incident resolution for technical production issues, maintaining accurate documentation, and training users on complex topics. You will also provide guidance to enhance the stability, security, efficiency, and scalability of systems, while determining future capacity needs and investigating new products or features. Strong troubleshooting skills will be essential, as you will be expected to isolate issues and resolve root causes through investigative analysis, even in environments where you may have limited prior knowledge or documentation. Additionally, you will administer backup systems and provide 24x7 on-call support for urgent critical issues.

The team is committed to building a culturally diverse and pluralistic environment that reflects the multicultural variety of our customers.

Responsibilities

  • Automate processes and document them for team benefit.
  • Provide incident resolution for all technical production issues.
  • Create and maintain accurate documentation reflecting configuration.
  • Write justifications and status reports, and document procedures.
  • Interact with other Apple staff and management.
  • Provide guidance to improve system stability, security, efficiency, and scalability.
  • Determine future capacity needs and investigate new products/features.
  • Administer and ensure proper execution of backup systems.
  • Provide 24x7 on-call support for urgent critical issues.

Requirements

  • Experience operating and developing infrastructure and services in public cloud environments (AWS or GCP).
  • Experience with containers and container orchestration platforms such as Docker, Kubernetes, or equivalent.
  • Strong proficiency with Helm and Kustomize for managing Kubernetes applications and configurations through GitOps practices.
  • Experience with configuration management or Infrastructure as Code (IaC) tools such as Ansible, Terraform, and Crossplane is desired.
  • Passionate about operational excellence through proper automation and engineering processes using programming languages such as Go, Python, Java, or other JVM languages.
  • Proficient in working with Linux or other POSIX operating systems, shell scripting, and networking technologies.
  • BS in computer science with 5-7 years of experience, MS with 3-5 years of experience, or equivalent related experience.

Nice-to-haves

  • Familiarity with logging and observability technologies such as Splunk and Prometheus or similar.
  • Validated software engineering experience in design, testing, source code management, and CI/CD practices.
  • Experience in the design, implementation, and benchmarking of ML/deep learning algorithms.