ByteDance - San Jose, CA

posted 19 days ago

Full-time - Senior
San Jose, CA
Professional, Scientific, and Technical Services

About the position

The Senior Site Reliability Architect in Security Engineering at ByteDance is responsible for designing and implementing reliable and secure systems that support the company's mission to inspire creativity and enrich life. This role involves building security services and platforms, enhancing system reliability, and ensuring high availability of security products and services. The architect will work closely with cross-functional teams to develop scalable solutions and improve incident response capabilities.

Responsibilities

  • Design the roadmap for improving the reliability and stability of security building blocks.
  • Drive the design and implementation of SRE architecture or frameworks for high availability and reliability of existing security products and services.
  • Build an SRE framework for system deployment, upgrade, rapid troubleshooting, and disaster recovery.
  • Promote the design and development of SRE infrastructure and maintenance tools for full lifecycle security system development.
  • Own capacity planning for security building blocks by analyzing historical and projected business growth.
  • Assess changes in demand for system resources and proactively expand and optimize resources.
  • Drive the establishment and improvement of the system monitoring framework.
  • Improve visibility into system operating status.
  • Develop and implement incident response processes and contingency plans.
  • Improve the team's emergency handling capabilities.

Requirements

  • A bachelor's degree or higher in computer science, information security, or a related field.
  • 5 years of relevant experience developing and maintaining large-scale distributed systems and SRE platforms or tools.
  • Solid programming skills, proficient in at least one programming language among Go/Java/Python/Shell.
  • Familiar with at least one Web framework such as Gin/Django/Spring, with a considerable understanding of its design principles.
  • Solid background in operating systems, networks, storage, and computer architectures; familiar with cloud-native frameworks like Kubernetes.
  • Experienced in designing highly reliable and available system architectures, including distributed systems and microservice architectures.
  • Proficient in system troubleshooting and capable of identifying potential problems in the system.
  • Familiar with various popular and classic SRE technologies and tools (such as Ansible, ELK, Prometheus, and Grafana).

Nice-to-haves

  • A master's degree or PhD in computer science, information security, or a related field.
  • An international degree or international working experience.
  • R&D experience in software, especially large-scale distributed systems.

Benefits

  • Medical, dental, and vision insurance from day one.
  • 401(k) savings plan with company match.
  • Paid parental leave.
  • Short-term and long-term disability coverage.
  • Life insurance.
  • Wellbeing benefits.
  • 10 paid holidays per year.
  • 10 paid sick days per year.
  • 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).