Vizio Group - Denver, CO
posted 4 months ago
We live and breathe big data. On a daily basis, we ingest and extract useful information from hundreds of live TV channels, and we collect, analyze, and report on information from millions of TVs. Today, with over 23 million devices and a platform operating at massive scale on modern architecture, design, and technologies, there is a lot to manage and an appetite for tech modernization. This means you will have the opportunity to propose, design, and influence our current stack while helping to reimagine how we approach specific operational challenges such as administration, monitoring, logging, configuration management, and automation.

We are actively seeking an experienced and highly skilled Senior Site Reliability Engineer (SRE) to join our dynamic team. As a key player on the Operations team, the ideal candidate will demonstrate senior-level proficiency in leveraging cloud technologies to ensure the reliability, scalability, and performance of our platform. In this role, you will play a crucial part in designing, building, and reviewing key SRE metrics, managing on-call responsibilities, and exceeding expectations for platform availability and incident response.

You are graceful under pressure and ready to jump in when needed. You should have a foundational understanding of end-to-end architecture, including modern microservice-based systems and pipelines. You've been in the trenches designing, building, and instrumenting highly scalable, resilient modern applications for observability, from code to backend systems. Using the golden signals as key indicators of system health and performance, you will establish and drive SLOs/SLIs across cross-functional teams and all levels of the organization. You'll use your ability to influence to bring site reliability best practices to cross-functional teams and gain their buy-in and support.
Your responsibility will be to help anticipate every postmortem question about "whose job was that?" or "why don't we have this operational capability?" You should exhibit senior-level expertise in estimating, building, and deploying large-scale systems in a cloud environment, demonstrating a deep understanding of cloud architecture and infrastructure. This requires extensive experience designing, implementing, and managing complex AWS solutions using services such as Route53, SQS, EC2, S3, and EKS.