Apple - Cupertino, CA
Posted 4 months ago
The System Intelligence and Machine Learning team at Apple creates and refines the datasets that power many of Apple's intelligent software applications. These datasets range from small, targeted sets to massive petabyte-scale collections. We are seeking an expert machine learning engineer or data scientist with a deep understanding of machine learning and statistics to improve the datasets used in generative AI.

As a senior member of the team, you will use Apple technologies to refine datasets, perform machine learning-based quality assurance, eliminate toxicity, and select appropriate images, videos, or text through active selection and model-in-the-loop methodologies. Your focus areas will include text processing across multiple languages, toxic language detection and removal, and the understanding and processing of images and videos.

In this position, you will play a pivotal role in shaping Apple's datasets for generative AI by removing irrelevant or toxic assets, selecting the right assets using various asset selection algorithms, and synthesizing new datasets using proprietary Apple machine learning models. Your statistical and machine learning expertise will be essential in building models and algorithms that select the right assets for machine learning experiences from a vast pool of available assets.

You will collaborate with data engineers to integrate your models into data pipelines for large-scale datasets. You will also work closely with other AIML product stakeholders and partners to understand their needs and to design machine learning models that improve our understanding of data and automate the selection of assets for machine learning training. You will regularly evaluate and present the progress of your work, and you will apply creative decision-making daily.