TikTok - San Jose, CA
posted 3 days ago
TikTok is the leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. As a Site Reliability Engineer (SRE) within the Server Architecture team, you will play a crucial role in ensuring the reliability and performance of our services. The SRE team at TikTok combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.

In this position, you will tackle complex challenges of scale, drawing on your expertise in coding, algorithms, complexity analysis, and large-scale system design. Your responsibilities will span the entire service lifecycle, from inception and design through development, capacity planning, launch reviews, deployment, operation, and refinement. You will design and implement software platforms and monitoring frameworks that enable efficient, automated, and intelligent service-oriented architecture (SOA) governance. You will also scale systems sustainably through automation and improve their reliability, efficiency, and velocity by advocating for necessary changes. Finally, you will practice sustainable user support and incident response, and conduct blameless postmortems so that we learn from incidents and continuously improve our systems.

At TikTok, we believe that every challenge is an opportunity to learn, innovate, and grow as a team. We are committed to creating an inclusive environment where employees are valued for their skills, experiences, and unique perspectives. Join us in our mission to inspire creativity and bring joy to our users around the globe.