Site Reliability Engineer, Infrastructures - Seattle

$129,960 - $194,750/Yr

TikTok - Seattle, WA

posted 3 months ago

Full-time - Mid Level

Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

TikTok is the leading destination for short-form mobile video, with a mission to inspire creativity and bring joy to over 1 billion users globally. Our Infrastructure Engineering team supports the company's rapid growth by building and operating hyper-scale datacenters, managing the lifecycle of server fleets, providing cloud solutions, and developing infrastructure services that are scalable and reliable.

As a Site Reliability Engineer (SRE), you will combine software and systems engineering to build and run large-scale, massively distributed infrastructures. Your primary responsibility will be to ensure that our infrastructure services are reliable, fault-tolerant, efficiently scalable, and cost-effective. You will manage a variety of complex systems at scale, including those that administer hyper-scale datacenters, public cloud, global content delivery networks (CDNs), and load balancers that handle terabits of traffic.

In this role, you will build, expand, and operate ByteDance's global infrastructures, which include large-scale systems in both public and private clouds, data centers, and content delivery networks. You will also build tools, automations, visualizations, and monitors to facilitate the operation and optimization of the global infrastructure. Working in a fast-paced environment, you will participate in technical operations and rotations in response to performance and reliability issues. Additionally, you will help improve the entire lifecycle of infrastructure services, from inception and design through development, deployment, user support, and refinement.

Responsibilities

Build, expand, and operate ByteDance's global infrastructures, including large-scale systems in public and private clouds, data centers, and content delivery networks.
Build tools, automations, visualizations, and monitors to facilitate the operation and optimization of the global infrastructure.
Participate in technical operations and rotations in response to performance and reliability issues in a fast-paced environment.
Help improve the whole lifecycle of infrastructure services, from inception and design through development, deployment, user support, and refinement.

Requirements

Master's degree (or Bachelor's degree with 3+ years of experience) in Computer Engineering, Electrical Engineering, Computer Science or related major.
3+ years of experience working with Unix/Linux systems from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
3+ years of experience in one or more programming languages such as Java, C++, or Go, or scripting experience in Shell and Python.

Nice-to-haves

Self-driven and capable of coping with ambiguity and moving projects from concept to delivery.
Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.
Experience in designing, analyzing, and building automation and tools for large-scale systems.
Experience in building solutions with AWS, Google, Azure and other cloud services.
Experience in networking technologies such as TCP/IP, BGP, DNS, etc. in a carrier-grade environment.
Experience in developing and operating systems such as OpenStack, Kubernetes, Nginx, IPVS, ELK stack, Hadoop, etc.

Benefits

100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents, and a Health Savings Account (HSA) with a company match.
Dental, Vision, Short/Long term Disability, Basic Life, Voluntary Life and AD&D insurance plans.
Flexible Spending Account (FSA) options for Health Care, Limited Purpose, and Dependent Care.
10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) (prorated upon hire and increased by tenure) and 10 paid sick days per year.
12 weeks of paid Parental leave and 8 weeks of paid Supplemental Disability.
Mental and emotional health benefits through EAP and Lyra.
401(k) company match, plus gym and cellphone service reimbursements.
