Site Reliability Engineer

$127,200 - $208,800/Yr

Microsoft - Redmond, WA

posted 2 months ago

Full-time - Mid Level
Redmond, WA
Publishing Industries

About the position

We are looking to hire a Site Reliability Engineer to join our team. Are you a customer-obsessed, AI-curious problem-solver who thrives in an inclusive, collaborative global team?

The Azure Customer Experience Platform (CXP) team's mission is to transform Microsoft Cloud customers into fans. Through our deep engineering engagements with customers and teams across Microsoft, we analyze and amplify customer needs and drive the vision to improve Cloud quality, security, and reliability. Our culture of growth mindset and empowerment is central to who we are and how we work. We are part of the Azure engineering organization and consider great customer experiences critical to the overall success of Azure. We create, define, and lead product offerings that set our customers up for success, empower them to solve problems, and ensure they have a phenomenal experience if they need support. We empower 200+ product groups across Azure with apps, platforms, intelligent insights, and all the capabilities needed to enable consistent and excellent customer experiences across Azure services. Every day, our customers stake their business and reputation on our cloud. You can help CXP provide our customers with the world-class cloud services they need to succeed.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Responsibilities

  • Participate in the on-call coverage rotation (approx. 15% of time) for platform incident/crisis management.
  • Collaborate closely with Engineering/Program Management (PM) to drive product improvements based on customer signals.
  • Improve customer experience by analyzing signals from various sources and driving Root Cause Analyses (RCAs) and service improvements, including bug fixes.
  • Continuously improve the Azure platform by incorporating feedback from internal and external customers.
  • Identify requirements for increased customer resiliency and platform reliability.
  • Identify and implement customer-centric mitigation levers and playbooks for Operations.
  • Participate in the design of the next architecture for Cloud infrastructure services, focusing on strategic customer scenarios.
  • Embody our culture and values.

Requirements

  • 4+ years technical experience in software engineering, network engineering, or systems administration.
  • Bachelor's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration.
  • Master's Degree in Computer Science, Information Technology, or related field.
  • Ability to meet Microsoft, customer and/or government security screening requirements.

Nice-to-haves

  • 5+ years technical experience in software engineering, network engineering, or systems administration.
  • Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.
  • Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, and 1+ year(s) people management experience.
  • Proven Service Engineering experience in 24x7x365 enterprise environments.
  • Technical experience with Azure services and capabilities and/or other cloud platforms.
  • Fluency in one or more automation languages (PowerShell, CLI, etc.).
  • Understanding of High Availability, Disaster Recovery, Business Continuity, and Performance Tuning.
  • Proven knowledge of Windows or Linux platforms and developer tools, and the ability to diagnose and debug user code.

Benefits

  • Health insurance coverage
  • Dental insurance coverage
  • 401(k) retirement savings plan
  • Paid holidays
  • Flexible scheduling
  • Professional development opportunities
  • Employee discount programs
  • Life insurance coverage
  • Mental health days
  • Paid volunteer time
  • Tuition reimbursement