NVIDIA - Santa Clara, CA
posted 12 days ago
The Site Reliability Engineering (SRE) position at NVIDIA focuses on designing, building, and maintaining large-scale production systems with high efficiency and availability. This role combines software and systems engineering practices to ensure maximum reliability and uptime of GPU cloud services, while enabling developers to implement changes effectively. SREs at NVIDIA are responsible for automating processes, optimizing performance, and fostering a culture of continuous improvement and collaboration.