NVIDIA - Santa Clara, CA

posted 20 days ago

Full-time - Senior
Santa Clara, CA
Computer and Electronic Product Manufacturing

About the position

The position involves crafting the strategic vision for NVIDIA's internal cloud services and database platforms, focusing on optimizing operations and fostering innovation. The role requires collaboration with various teams to improve and oversee multi-cloud platforms, modernize internal database services, and ensure high performance and reliability across these services.

Responsibilities

  • Craft the strategic vision for internal cloud services and database platforms.
  • Collaborate closely with Engineering, IT, Security, and Product teams to optimize operations.
  • Improve, expand, and oversee the multi-cloud Cloud Foundation Platform to support NVIDIA's internal applications.
  • Build and modernize internal database services including MySQL, PostgreSQL, MongoDB, Microsoft SQL Server, Oracle, and Milvus DB.
  • Lead efforts to automate and improve the scalability, reliability, and performance of cloud and database platforms.
  • Attract, develop, and retain top engineering talent while encouraging a culture of innovation and operational excellence.
  • Drive product execution plans and align cross-functional teams to deliver technical achievements.
  • Ensure high levels of performance, efficiency, and quality across database and cloud services.
  • Establish and lead processes for incident management, root cause analysis, and continuous improvement.

Requirements

  • A Bachelor's or Master's degree in a computationally focused science field (or equivalent experience).
  • 15+ years of overall experience in software development, distributed systems, and cloud computing.
  • 5+ years in a senior leadership role managing SRE, DevOps, databases, or related teams.
  • Experience in people management, hiring outstanding talent, and building great infrastructure.
  • Deep understanding of public cloud computing challenges with an emphasis on security, operational excellence, and cost efficiency.
  • Background with incident management, post-incident analysis, and continuous improvement processes.
  • Experience in Python, C/C++, or a similar language.
  • Excellent verbal and written communication skills.
  • Strategic vision with good problem-solving skills.
  • Strong leadership and management skills.

Nice-to-haves

  • Background in building and operating enterprise-grade database services at scale.
  • Experience building and managing services in a multi-cloud environment with Kubernetes.
  • Proven track record of driving cultural and organizational change, using automation, toward a reliability-focused approach.

Benefits

  • Equity and benefits eligibility based on location and experience.