Apple - Cupertino, CA

Full-time - Senior
Cupertino, CA
Computer and Electronic Product Manufacturing

About the position

At Apple, the AIML - On-Device Machine Learning group is responsible for accelerating the creation of amazing on-device ML experiences, and we are looking for a tenured software engineer to help define and implement features that accelerate and compress large state-of-the-art (SoTA) models (e.g., LLMs) in our on-device inference stack. We are a dedicated team working on groundbreaking technology in the fields of natural language processing, computer vision, and artificial intelligence. We design, develop, and optimize large-scale language/vision/multi-modal models that power on-device inference across various Apple products and services. This is a unique opportunity to work on powerful new technologies and contribute to Apple's ecosystem, with a commitment to privacy and a user experience that impacts millions of users worldwide. Are you someone who can write high-quality, well-tested code and collaborate cross-functionally with partner HW, SW, and ML teams across the company? If so, come join us and be part of the team that is helping machine learning developers innovate and ship enriching experiences on Apple devices!

Responsibilities

  • Build features for our on-device inference stack to support the most relevant accuracy-preserving, general-purpose techniques that empower model developers to compress and accelerate SoTA models (e.g., LLMs) in apps.
  • Convert models from a high-level ML framework to a target device (CPU, GPU, Neural Engine) for optimal functional accuracy and performance.
  • Diagnose performance bottlenecks and work with HW Arch teams to co-design solutions that further improve latency, power, and memory footprint of neural network workloads.
  • Analyze the impact of model optimizations (compression, quantization, etc.) on model quality by partnering with modeling and adaptation teams across diverse product use cases.
  • Focus on optimizing our software stack for efficient execution on Apple GPUs, ANEs, and CPUs.

Requirements

  • 5+ years of proven programming experience with standard ML tools and languages such as C/C++, CUDA/Metal, PyTorch, and TensorFlow.
  • Hands-on experience with LLVM, compiler technologies, and optimization techniques such as quantization and sparsity induction is a huge plus.
  • Solid understanding of state-of-the-art DNN optimization techniques and how they translate to hardware acceleration architectures, and a general ability to reason about system performance (compute/memory) tradeoffs.
  • Experience building APIs and/or core components of ML frameworks and strong attention to detail.
  • Capacity to iterate on ideas and work with a variety of partners from all parts of the stack, from apps to compilation, HW architecture, and power/performance analysis.
  • Excellent problem-solving skills (e.g., building forward-looking prototype systems), critical thinking, and strong communication and collaboration skills.

Nice-to-haves

  • Experience with additional ML frameworks and tools beyond those listed in the requirements.
  • Familiarity with Apple's hardware architecture and performance optimization techniques.

Benefits

  • Comprehensive medical and dental coverage
  • Retirement benefits
  • Discounted products and free services
  • Reimbursement for certain educational expenses, including tuition
  • Discretionary bonuses or commission payments
  • Relocation assistance
  • Employee stock purchase plan with discounted stock options
  • Stock grants for employees at all levels