The AIML - On-Device Machine Learning group is at the forefront of creating exceptional on-device machine learning experiences. The team builds foundational machine learning frameworks and tools that optimize large language, vision, and multi-modal models, which power on-device ML features across Apple's range of products and services. We are seeking a senior software engineer to play a pivotal role in defining and implementing features that compress and accelerate state-of-the-art (SoTA) models, such as large language models (LLMs), within our on-device inference stack. This position offers a unique opportunity to work with cutting-edge technologies and contribute to Apple's ecosystem, all while maintaining a strong commitment to user privacy and delivering experiences that reach millions of users globally.

In this role, you will build features that support accuracy-preserving, general-purpose techniques that enable model developers to compress and accelerate SoTA models in applications. This includes developing machine learning compilers, runtimes, execution kernels, and optimization tools for ML models, as well as tooling for debugging and visualizing them. You will convert models from high-level ML frameworks to run on target devices such as CPUs, GPUs, and Neural Engines, ensuring optimal functional accuracy and performance, and write unit and system integration tests to ensure functional correctness and prevent performance regressions. You will also diagnose performance bottlenecks and collaborate with hardware and software architecture teams to co-design solutions that improve the latency, power consumption, and memory footprint of neural network workloads. Finally, you will analyze the impact of model optimization techniques, such as compression and quantization, on model quality, partnering closely with modeling and adaptation teams across a range of product use cases.
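As a rough illustration of the kind of workflow described above, the sketch below converts a PyTorch model to Core ML, applies 8-bit weight quantization, and spot-checks the accuracy impact, using only the public coremltools and torchvision packages. The specific model, parameters, and APIs shown are illustrative assumptions; the team's internal inference stack and tooling may look quite different.

import numpy as np
import torch
import torchvision
import coremltools as ct
from coremltools.optimize.coreml import (
    OpLinearQuantizerConfig,
    OptimizationConfig,
    linear_quantize_weights,
)

# Trace a reference model from a high-level ML framework (hypothetical choice).
model = torchvision.models.mobilenet_v3_small(weights="DEFAULT").eval()
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

# Convert to a Core ML program, letting the runtime schedule work across
# the CPU, GPU, and Neural Engine.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input", shape=tuple(example.shape))],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.ALL,
)

# Apply linear (symmetric, 8-bit) weight quantization to shrink the
# on-device memory footprint.
config = OptimizationConfig(
    global_config=OpLinearQuantizerConfig(mode="linear_symmetric")
)
compressed = linear_quantize_weights(mlmodel, config=config)

# Spot-check the accuracy impact of compression against the float model.
x = {"input": example.numpy()}
ref = list(mlmodel.predict(x).values())[0]
out = list(compressed.predict(x).values())[0]
print("max abs diff after quantization:", np.abs(ref - out).max())

compressed.save("MobileNetV3_w8.mlpackage")

In practice, a quick elementwise comparison like the one above would be followed by task-level evaluation, since quantization error on individual activations does not always translate directly into a loss of model quality.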