TikTok - Mountain View, CA

posted 4 days ago

Full-time - Mid Level
Mountain View, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

As a Site Reliability Engineer (SRE) on the Data Platform team at TikTok, you will help manage and enhance one of the largest data platforms in the world, which directly supports the TikTok app. Your primary responsibility will be to ensure that data, services, and infrastructure are reliable, fault-tolerant, efficiently scalable, and cost-effective. This position is part of the U.S. Data Security (USDS) division, which focuses on data protection policies and content assurance protocols to keep U.S. users safe.

In this role, you will engage in and improve the entire lifecycle of services, from inception and design through deployment, operation, and refinement. Once services are live, you will maintain them by measuring and monitoring availability, latency, and overall system health, practicing sustainable incident response, and conducting blameless postmortems to learn from incidents and improve future performance. You will also establish engineering best practices for both technical and non-technical team members, so that everyone is aligned on the goals of reliability and efficiency, and you will design and implement reliable, scalable, robust, and extensible big data systems that support core products and business objectives.

This role requires a strong foundation in software and systems engineering, as well as a commitment to collaboration and cross-functional partnerships. TikTok promotes a hybrid work schedule that requires employees to work in the office three days a week, which may be adjusted based on departmental needs.

Responsibilities

  • Engage in and improve the entire lifecycle of services, from inception and design through deployment, operation, and refinement.
  • Ensure that data, services, and infrastructure are reliable, fault-tolerant, efficiently scalable, and cost-effective.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Practice sustainable incident response and blameless postmortems.
  • Establish engineering best practices for engineers as well as non-technical team members.
  • Design and implement reliable, scalable, robust, and extensible big data systems that support core products and business objectives.

Requirements

  • Bachelor's degree in Computer Science, a related technical field involving software or systems engineering, or equivalent practical experience.
  • Experience with algorithms and data structures.

Nice-to-haves

  • Solid communication and collaboration skills.
  • Experience with big data technologies such as Hadoop, MapReduce (M/R), Hive, Spark, Metastore, Presto, Flume, Kafka, ClickHouse, and Flink.

Benefits

  • 100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents.
  • Health Savings Account (HSA) with a company match.
  • Dental, Vision, Short/Long term Disability, Basic Life, Voluntary Life and AD&D insurance plans.
  • Flexible Spending Account (FSA) options, including Health Care, Limited Purpose, and Dependent Care.
  • 10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) and 10 paid sick days per year.
  • 12 weeks of paid parental leave and 8 weeks of paid supplemental disability.
  • Mental and emotional health benefits through EAP and Lyra.
  • 401(k) company match, plus gym and cellphone service reimbursements.