NVIDIA - Santa Clara, CA
posted 3 months ago
NVIDIA is seeking a Senior Software QA Test Development Engineer to join our platform SWQA team. This role is pivotal in ensuring the reliability and performance of our cutting-edge HGX/DGX/MGX platforms. The successful candidate will develop and execute comprehensive test plans, derived from design documentation, that cover the OS, firmware, and CUDA software stack. The position requires a deep understanding of enterprise system integration, a strong background in operating systems, and experience in reliability testing using various telemetry methods. The ideal candidate will thrive in a diverse work environment, possess exceptional interpersonal skills, and demonstrate a commitment to continuous process improvement.

In this role, you will install and test various systems, including operating systems and firmware, and drive support for root cause analysis on reliability and validation test failures, identifying root causes and implementing effective mitigation strategies. You will also build and develop both front-end and back-end automation frameworks and tests at the system and OS levels. Collaboration is key: you will review partner and supplier test results and recommend additional reliability testing for components, systems, and packaging as necessary. Working within an agile software development team, you will uphold high production quality standards and manage the bug lifecycle, collaborating across groups to drive solutions.

This position is ideal for a dedicated, forward-thinking individual who is passionate about technology and eager to contribute to NVIDIA's mission as the leading AI computing company. If you are looking for a challenging and rewarding opportunity to work with some of the most experienced professionals in the industry, this role is for you.