NVIDIA - Santa Clara, CA

posted 24 days ago

Full-time - Senior
Santa Clara, CA
Computer and Electronic Product Manufacturing

About the position

The Senior Firmware Architect - Server Manageability at NVIDIA is responsible for designing, implementing, and delivering innovative solutions for managing GPU-based AI servers. This role focuses on out-of-band management, firmware development, server architecture, and building systems for enterprise applications. The architect will lead design concepts, contribute to open standards, and work closely with global teams to ensure high-quality program success.

Responsibilities

  • Designing, implementing, and delivering innovations for managing GPU-based AI servers.
  • Leading the design of server manageability and security concepts.
  • Designing system level solutions including complex hardware and firmware interactions.
  • Developing solutions using industry-standard APIs and specifications such as Redfish, OpenBMC, DMTF PLDM/MCTP, and OCP standards.
  • Driving a global team of firmware developers to deliver high-quality work and program success.
  • Presenting to partners on current and future design concepts.
  • Providing hands-on technical oversight and support to early NVIDIA technology adopters.
  • Working with the security team to ensure that developed code aligns with product security goals.
  • Influencing hardware design and reviewing hardware architecture & schematics.
  • Collaborating with QA/Test architects to develop proper test tools and automation for qualifying the system software and firmware stack.

Requirements

  • Domain expertise in firmware development on x86 or ARM platforms, including BMC-BIOS communication, thermal management, power management, firmware update, device monitoring, and firmware security.
  • Solid experience with end-to-end delivery of high-end enterprise servers, from definition to customer deployment.
  • Solid understanding of the low-level interfaces between SBIOS, BMC, and OS, such as I2C, SPI, PCIe, and JTAG.
  • Experience with PCIe enumeration and IO at platform level for enterprise systems.
  • Expertise in designing and developing solutions using industry-standard APIs and specifications such as Redfish, OpenBMC, DMTF PLDM/MCTP, and OCP standards.
  • Experience working closely with global partners and customers.
  • Proficiency in C/C++ development, Bash/Python scripting, and debugging in embedded Linux operating environments.
  • Excellent written and oral communication skills, a strong work ethic, and a commitment to quality work.

Nice-to-haves

  • Contributions to industry standards such as Open Compute, IPMI, and DMTF standards, and to open source projects.
  • Proven record of delivering a BMC or equivalent manageability stack for enterprise servers.

Benefits

  • Equity options
  • Comprehensive health benefits
  • Flexible work hours
  • Diversity and inclusion programs