Talentburst - San Diego, CA

posted 3 months ago

Full-time - Mid Level
San Diego, CA
Administrative and Support Services

About the position

As part of SIE's Platform Experience group, the Platform Support team is a tight-knit group that operates and supports the core infrastructure foundation of PSN. The team works directly with software engineering teams to deliver the services and configurations that let the company bring new experiences and functionality to millions of PlayStation customers.

This Site Reliability Engineer (SRE) role focuses on providing direct Tier 1/2 support to internal engineering teams. It requires collaborating with multiple global teams to ensure each customer request is addressed in a way that is reliable, secure, and supportable. The SRE will build, deploy, and operate a combination of open-source, custom-written, and vendor-provided software to support the PlayStation Network platform infrastructure, and will contribute to automation and testing for service deployments, working towards 100% automation. Engaging directly with engineering customers on troubleshooting requests and guiding them toward solutions is a key aspect of this position. The SRE will also identify process improvements that reduce customer queue time, perform monthly service deployments for cloud platform services, provide Tier 1/2 support for all foundational platform services, and take on-call duties for general troubleshooting of core services.

Responsibilities

  • Build, deploy, and operate a combination of open-source, custom-written, and vendor-provided software to support the PlayStation Network platform infrastructure
  • Contribute to additional automation and testing for service deployments to improve deployment processes, working towards 100% automation
  • Engage directly with engineering customers on troubleshooting requests and guide them toward solutions
  • Identify opportunities for process improvement to reduce customer queue time
  • Perform monthly service deployments for cloud platform services
  • Perform on-call duties for general troubleshooting of core services
  • Provide Tier 1/2 support for all foundational platform services

Requirements

  • Ability to design and provide operational and infrastructure requirements that promote uptime, speed, and security at all phases of the software lifecycle on a global scale
  • Excellent troubleshooting skills that span code, system, and network
  • Hands-on experience with distributed systems, including their availability, reliability, and scalability
  • Proven experience building, deploying, and operating services at scale in public cloud environments
  • Strong ability to troubleshoot complex issues ranging from system resources to application stack traces
  • Technical certifications or other demonstrations of passion for security and technology (e.g., CISSP, AWS Associate, open source projects, or equivalent)
  • Experience in developing tools for system configuration, deployment, and monitoring
  • Solid grounding in information security principles
  • Experience building and operating core infrastructure services (experience with several of these or similar technologies preferred): Cloud Networking, Certificate Management, Software Delivery, Configuration Management, DNS, Traffic Management, Identity & Access Management, Network Access Management, Observability, Remote Access Solutions, Secure Images
  • Experience in public cloud services and deployment (AWS experience preferred)
  • Strong software development experience in Python, JavaScript, or Go (Python preferred)
  • Experience operating in regulated environments such as SOX/PCI

Nice-to-haves

  • Experience with open source projects
  • Experience in regulated environments such as SOX/PCI

Benefits

  • Inclusive work environment
  • Diversity and empowerment initiatives