TikTok - Mountain View, CA
posted 3 days ago
As a Site Reliability Engineer (SRE) within TikTok's U.S. Data Security (USDS) division, you will play a crucial role in ensuring the reliability and performance of the TikTok platform. This position is designed for people who are passionate about maintaining high service levels and improving the user experience through robust system management. You will gain a comprehensive understanding of the components and services that power TikTok, so that you can monitor and maintain these systems to meet established service-level agreements (SLAs) and service-level objectives (SLOs).

Your responsibilities will include measuring and monitoring system availability, performance, and overall health, ensuring that services are reliable, fault-tolerant, and efficiently scalable. You will collaborate with a global team to address site-up issues, providing user support and incident response and conducting postmortems to learn from incidents. You will also be tasked with scaling systems sustainably through automation and advocating for changes that improve system reliability, efficiency, and velocity. The role requires a proactive approach to problem-solving and a commitment to continuous improvement, as you will be expected to evolve the systems you manage to better serve TikTok's user base.

TikTok's mission is to inspire creativity and bring joy. As part of the USDS team, you will contribute to this mission by ensuring that the platform remains a safe and enjoyable space for millions of users.

The work environment is hybrid: employees are required to be in the office three days a week, which fosters collaboration and cross-functional partnerships. This role is subject to strict national security-related screening due to the sensitive nature of the data and information you will be working with.