TikTok - Seattle, WA
posted 3 days ago
TikTok is the leading destination for short-form mobile video, with a mission to inspire creativity and bring joy to over 1 billion users globally. The company is seeking a Site Reliability Engineer (SRE) for its Data Platform Team, which addresses challenges in data infrastructure and data products. The team manages the Query Engine, the Logging and Data Ingestion Infrastructure, the Experimentation Platform, and the Workflow Management Platform. Its primary goal is to support ad-hoc and interactive queries, batch pipelines, logging, and ingestion of large volumes of real-time data, and to facilitate A/B testing for product feature launches.

As a Site Reliability Engineer in the data platform area, you will play a crucial role in managing one of the largest data platforms in the world. You will ensure that data, services, and infrastructure are reliable, fault-tolerant, efficiently scalable, and cost-effective. You will engage in the entire service lifecycle, from inception and design to deployment, operation, and refinement. You will also maintain live services by measuring and monitoring their availability, latency, and overall system health, while practicing sustainable incident response and conducting blameless postmortems. Establishing best engineering practices for both technical and non-technical personnel is also a key part of the role.

The position offers the opportunity to design, build, and deliver systems as a software engineer, contributing to reliable, scalable, robust, and extensible big data systems that support TikTok's core products and business objectives. This role is ideal for individuals who are passionate about using their technical skills to improve the reliability and performance of large-scale data systems.