Annapurna Labs - Pflugerville, TX

posted 3 months ago

Full-time
Pflugerville, TX
Professional, Scientific, and Technical Services

About the position

Amazon Web Services (AWS) is seeking highly experienced Hardware Test Engineers, System Test Engineers, Manufacturing Test Engineers, and System Validation Engineers to join the Machine Learning Acceleration team. This team is responsible for enabling high-quality and efficient testing for the next generation of cloud server platforms. As a member of this team, you will play a crucial role in developing tests that ensure the functionality and capability of custom hardware used in the AWS server fleet. Your work will involve developing expertise in the entire system's functionality, as well as understanding the intended customer applications to stress the system from a customer perspective.

In this role, you will collaborate with other engineering teams to develop, maintain, and improve manufacturing test code for both new and existing products. You will work with high-level and low-level operating system constructs to create first-boot images for products in manufacturing. Additionally, you will be responsible for developing and maintaining the deployment and distribution system to ensure that manufacturing partners have access to the appropriate versions of software as soon as they are available. You will also respond to new issues raised by manufacturing partners, analyze logs and failures, and develop and deploy solutions to those issues. Furthermore, you will create documentation and testing/debug procedures for manufacturing partners to follow.

Key responsibilities include enabling and maintaining mass volume production testing, working with Original Design Manufacturers (ODMs) and Joint Design Manufacturers (JDMs) to verify stable high-quality execution, driving ODM and JDM deliveries to ensure production manufacturing quality, identifying and developing tests to enhance coverage and increase failure granularity, debugging test hardware and software used for system-level and server-level mass production, and developing manufacturing tests to exercise hardware components and collect data for large-scale analysis.

Responsibilities

  • Enable and maintain mass volume production testing, working with our ODMs and JDMs to verify stable high-quality execution
  • Drive ODM and JDM deliveries to ensure production manufacturing quality
  • Identify and develop tests needed to enhance coverage and increase failure granularity
  • Debug test hardware and software used for system-level and server-level mass production
  • Develop manufacturing tests to exercise H/W components and collect data for large-scale analysis

Requirements

  • Bachelor's degree in Electrical Engineering or Computer Engineering
  • 4+ years of experience developing embedded systems code and hardware interfaces (I2C, UART, SPI, JTAG, PCIe, etc.)
  • Experience with Python, Bash, or another scripting language
  • Experience analyzing yield and bin Pareto data
  • Experience working with system management components (BMC, BIOS, CPLD, etc.)
  • Experience with debugging and root cause investigations using hardware schematics and tools such as logic analyzers
  • Strong background working in UNIX environments

Nice-to-haves

  • Experience with C/C++
  • Experience working with fully integrated software/hardware systems
  • Experience exercising server-level, PCB-level, and SoC-level components
  • Experience with embedded Linux device drivers and operating systems
  • Experience interfacing with JDM or ODM operations
  • Experience debugging high speed interfaces such as PCIe
  • Experience flashing H/W components in server chassis (BMC, BIOS, CPLD, etc.)
  • Experience working with the U-Boot environment
  • Ability to travel internationally up to 10% of the time