This job is closed

Amazon.com • posted 4 days ago
$151,300 - $261,500/Yr
Full-time • Senior
Sunnyvale, CA
General Merchandise Retailers

About the position

Our Machine Learning training infrastructure (ML Infra) team is responsible for designing, implementing, and optimizing large-scale computing infrastructure that powers our cutting-edge AI and machine learning initiatives. We leverage advanced hardware, innovative software architectures, and distributed computing techniques to enable breakthrough research and product development across the company. We are seeking a Senior Machine Learning Engineer to join our team and lead the development of our next-generation ML training infrastructure. This is a high impact, high visibility role that will shape the future of our machine learning capabilities and contribute to the advancement of AI technology across the industry.

Responsibilities

  • Lead the definition, design, architecture quality, implementation, and delivery of the most advanced, most difficult, most cross-cutting, and/or most ambiguous challenges spanning across our ML infrastructure.
  • Align the teams in ML Infrastructure and related organizations to a coherent technical vision and deliver systems that fit well together.
  • Exert influence over multiple teams, increasing their productivity and effectiveness.
  • Hold peers and teams to a high bar for performance and efficiency, and aid teams through expert guidance and example.
  • Guide difficult trade-off decisions and drive awareness about the impact and consequences of technical decisions on AI research and product development.
  • Demonstrate significant innovation, creativity, and judgement when solving challenging AI/ML infrastructure problems.
  • Identify future skills needed across the organization and advocate for the development and/or acquisition of those skills to senior leaders.
  • Scout top talent and recruit them to the company.
  • Actively mentor senior and principal engineers, and scale yourself by developing and institutionalizing best practices in AI/ML infrastructure and distributed computing across the organization.

Requirements

  • 8+ years of professional software development experience in distributed systems with emphasis on ML infrastructure.
  • 8+ years of current programming experience building ML infrastructure using languages such as Python, C++ or Rust.
  • Hands-on experience with parallel computing platforms such as CUDA, OpenMP, etc.
  • Deep understanding of AI frameworks such as PyTorch, TensorFlow, and JAX, and their demands on underlying compute infrastructure, memory bandwidth, network interconnect, and storage as scale goes up.
  • Knowledge of emerging AI hardware accelerators and architectures.
  • Experience with containerization and orchestration technologies (Docker, Kubernetes).
  • Experience with cloud computing platforms (AWS, Azure, GCP) and their offerings.

Nice-to-haves

  • 5+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
  • Bachelor's degree in computer science or equivalent.

Benefits

  • Equity and sign-on payments as part of a total compensation package.
  • Full range of medical, financial, and/or other benefits.

Job Keywords

Hard Skills
  • Docker
  • Kubernetes
  • OpenMP
  • Python
  • PyTorch