Microsoft - Raleigh, NC

posted 10 days ago

Full-time - Principal
Remote - Raleigh, NC
Publishing Industries

About the position

The Principal Hardware Quality Engineer will be a key member of the Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) team, responsible for ensuring the quality and reliability of hardware infrastructure that supports Microsoft's cloud services. This role focuses on leading quality initiatives, conducting failure analysis, and driving continuous improvement processes to enhance product quality and operational efficiency in hardware manufacturing.

Responsibilities

  • Develop and implement a robust supplier quality management strategy to ensure high-quality standards in data center hardware manufacturing.
  • Lead the resolution of system-level quality issues in the Azure fleet, including GPU issues, by conducting debug and failure analysis and driving fixes with partners and suppliers.
  • Provide system-level technical guidance to stakeholders and lead them through complex problems.
  • Drive continuous improvement processes based on Root Cause Analysis (RCA) and identified opportunities.
  • Deliver quality readouts based on telemetry data analysis, clarifying status, actions, and next steps for issue resolution.
  • Establish Critical-to-Quality performance metrics to measure and improve product quality.
  • Act as the voice of quality in the hardware change management process, ensuring quality requirements are met and improved.
  • Mentor and develop team members, fostering a culture of excellence and innovation.

Requirements

  • Bachelor's Degree in Reliability Engineering, Electrical Engineering, or related field AND 8+ years technical engineering experience OR Master's Degree in Reliability Engineering, Electrical Engineering, or related field AND 7+ years technical engineering experience OR Doctorate Degree in Reliability Engineering, Electrical Engineering, or related field AND 5+ years technical engineering experience.
  • 5+ years of experience working with modern server architectures and/or their subsystems, including GPU, CPU, AI hardware, Memory, Motherboards, and methods for root cause analysis and debugging.
  • 3+ years of experience leading a large-scale taskforce to resolve technical problems.
  • Ability to meet Microsoft, customer, and/or government security screening requirements.

Nice-to-haves

  • Master's degree in Electrical Engineering, Computer HW, or System Engineering.
  • Leadership skills and ability to collaborate with diverse teams and drive a call to action.
  • 10+ years of experience working with modern server architectures and/or their subsystems, including GPU, CPU, AI hardware, Memory, and methods for root cause analysis and debugging.
  • 5+ years of experience leading a large-scale taskforce to resolve technical problems.

Benefits

  • Competitive salary range of USD $137,600 - $267,000 per year, with higher ranges in specific locations.
  • Ongoing professional development opportunities.
  • Flexible work arrangements, including up to 100% work from home.