NVIDIA - Santa Clara, CA

posted about 1 month ago

Full-time - Senior
Remote - Santa Clara, CA
Computer and Electronic Product Manufacturing

About the position

The Software Architect for Data Center Platform Simulation and Virtualization at NVIDIA is responsible for designing and owning the system architecture of simulators for DGX and HGX server platforms. The role involves collaborating with engineering teams and cloud service providers to improve product offerings and ensure the scalability and performance of data center systems.

Responsibilities

  • Drive requirements, architecture, and roadmap of NVIDIA DGX Simulation platforms.
  • Engage with major customers to understand their requirements and align with their roadmap and adoption strategy.
  • Work closely with hardware modeling, kernel & platform driver teams distributed globally.
  • Build and deliver a full server simulation platform to internal and external NVIDIA partners.
  • Mentor architects and engineering teams to grow them into future leaders.
  • Make key technical decisions and mitigate execution risks by following a shift-left strategy.

Requirements

  • BS degree or higher in Computer Science or related field, or equivalent experience.
  • 10+ years of relevant experience in virtualization and HW simulation/emulation technologies.
  • Proven experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.
  • Previous experience with hardware interfaces such as PCIe, SPI, and I3C, and with Linux boot solutions on x86 and ARM platforms.
  • Good understanding of hypervisors and HW emulators such as QEMU, KVM, VDK, and Simics.
  • Experience in out-of-band and in-band management architectures.
  • Proficient in C/C++, with strong software development, optimization, and user- and kernel-mode debugging skills.
  • Strong interpersonal & communication skills to work with a globally distributed engineering team.

Nice-to-haves

  • Experience building a shift-left strategy around HW and SW stack bring-up using simulators and emulators.
  • Contributions to the QEMU/KVM open-source repositories.
  • Experience in Verilog and SystemC.
  • Knowledge of device management protocols such as MCTP, PLDM, and RDE.
  • Knowledge of system management protocols such as Redfish and IPMI.

Benefits

  • Equity options
  • Comprehensive health benefits
  • Diversity and inclusion programs