Apple - Cupertino, CA

posted 4 months ago

Full-time - Senior
Cupertino, CA
Computer and Electronic Product Manufacturing

About the position

The System Intelligence and Machine Learning team at Apple creates and refines the datasets that power many of Apple's intelligent software applications. These datasets range from small, targeted sets to petabyte-scale collections. We are seeking an expert Machine Learning Engineer or Data Scientist with a deep understanding of machine learning and statistics to improve the datasets used in Generative AI.

As a senior member of the team, you will use Apple technologies to refine datasets, perform machine learning-based quality assurance, eliminate toxicity, and select appropriate images, videos, or texts through active selection and model-in-the-loop methodologies. Your focus areas will include text processing across multiple languages, toxic language detection and removal, and the understanding and processing of images and videos.

In this position, you will shape Apple's datasets for Generative AI by removing irrelevant or toxic assets, selecting the right assets using various asset selection algorithms, and synthesizing new datasets with proprietary Apple machine learning models. Your statistical and machine learning expertise will be essential for building models and algorithms that select the right assets for machine learning experiences from a vast pool of available assets.

You will collaborate with data engineers to integrate your models into data pipelines for large-scale datasets. You will also work closely with other AIML product stakeholders and partners to understand their needs and to design machine learning models that improve our understanding of data and automate the selection of assets for machine learning training. You will regularly evaluate and present the progress of your work, and you will apply creative decision-making daily.

Responsibilities

  • Refine datasets used in Generative AI through machine learning techniques.
  • Perform ML-based quality assurance to ensure dataset integrity.
  • Remove toxic content from datasets and select appropriate assets for training.
  • Synthesize new datasets across various modalities including image, text, video, and audio.
  • Collaborate with data engineers to integrate models into data pipelines for large-scale datasets.
  • Work with AIML product stakeholders to understand needs and design machine learning models.
  • Evaluate your work regularly and present its progress to stakeholders.

Requirements

  • Proven track record in a Machine Learning Engineering or Applied Scientist role, preferably in a technology company.
  • Familiarity with a broad range of Machine Learning techniques and relevant statistical packages.
  • Experience contributing to production code and rapidly prototyping algorithmic ideas.
  • Proficiency in state-of-the-art ML techniques, particularly in Generative AI and Large Language Models.
  • Strong proficiency with Python, PyTorch, SQL-based languages, and Git.
  • Proven experience in data science and analytics, including statistical data analysis.
  • Outstanding communication and presentation skills.

Nice-to-haves

  • Experience in synthetic data generation for videos, images, text, and audio.
  • Strong analytical product intuition to guide product development using data.
  • Ability to understand complex technical products and collaborate with engineering leads.

Benefits

  • Comprehensive medical and dental coverage
  • Retirement benefits
  • Discounted products and free services
  • Reimbursement for certain educational expenses including tuition
  • Discretionary bonuses or commission payments
  • Relocation assistance
  • Participation in Apple's Employee Stock Purchase Plan