NVIDIA - Santa Clara, CA
posted 3 months ago
NVIDIA is seeking an outstanding individual to join our platform SWQA team, where you will be responsible for developing and executing test plans for the NVIDIA HGX/DGX/MGX platforms. This role involves working with the OS, firmware, and CUDA software stack, starting from design documentation. You will install and test various systems, including operating systems and system firmware, while ensuring the reliability and performance of our products. Your responsibilities will include driving root cause analysis of reliability and validation test failures and implementing mitigation strategies. In this position, you will build, develop, and debug system-level and OS-level automation frameworks and tests. You will also review test results from partners and suppliers, prescribing additional reliability testing on components, systems, and packaging as necessary. Working within an agile software development team, you will uphold very high production quality standards and manage the bug lifecycle, collaborating across groups to drive solutions. This role is ideal for someone who thrives in a diverse work environment and has strong interpersonal skills and a commitment to continuous process improvement.