Software Engineer, Data Engine

$136,800 - $280,000/Yr

TikTok - San Jose, CA

posted about 2 months ago

Full-time - Mid Level
San Jose, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

TikTok is the leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. The Data Engine team supports core business lines within TikTok as well as external enterprise customers through Volcano Engine. The team addresses big data architecture challenges for a 10 EB-scale data set, with the aim of creating an industry-leading big data infrastructure. It also provides cloud-native real-time data lake and data warehouse services to business customers through its LAS (LakeHouse Analytics Service) product.

As a member of the Data Engine team, you will collaborate with a highly skilled team to build big data infrastructure and architecture. You will dive deep into source code optimizations of major big data systems and represent TikTok at top-level conferences in the big data field, sharing the team's technical milestones and achievements. This role is pivotal in building a long-term competitive advantage for the data engine and keeping TikTok at the forefront of big data technology.

The ideal candidate will be familiar with the principles and source code of one or more mainstream big data systems such as Spark, Presto, Flink, Hive, and HUDI. You will also need a strong understanding of data lake technologies, including Iceberg, HUDI, and Delta Lake, as well as the ability to diagnose failures and optimize performance in large-scale systems. Being a committer in major database projects like Spark, Flink, HUDI, Iceberg, Presto, StarRocks, Kafka, or Calcite is preferred.

Responsibilities

  • Build the industry-leading 10 EB-level big data platform that supports core products and businesses
  • Optimize and enhance the kernels of big data systems such as Spark SQL, Presto, Flink, Hive, and HUDI
  • Build the long-term competitive advantage of the data engine

Requirements

  • Familiar with principles and source code of one or more mainstream big data systems such as Spark, Presto, Flink, Hive, and HUDI
  • Familiar with data lake technologies including Iceberg, HUDI, and Delta Lake
  • Ability to diagnose failures and optimize performance in large-scale systems
  • Committer status in major database projects (Spark, Flink, HUDI, Iceberg, Presto, StarRocks, Kafka, Calcite, etc.) preferred

Benefits

  • 100% premium coverage for employee medical insurance
  • Approximately 75% premium coverage for dependents
  • Health Savings Account (HSA) with a company match
  • Dental, Vision, Short/Long-Term Disability, Basic Life, Voluntary Life and AD&D insurance plans
  • Flexible Spending Account (FSA) options such as Health Care, Limited Purpose and Dependent Care
  • 10 paid holidays per year
  • 17 days of Paid Personal Time Off (PPTO)
  • 10 paid sick days per year
  • 12 weeks of paid Parental leave
  • 8 weeks of paid Supplemental Disability
  • Mental and emotional health benefits through EAP and Lyra
  • 401(k) company match
  • Gym and cellphone service reimbursements