Squarespace - New York, NY

posted 13 days ago

Full-time - Mid Level
New York, NY
251-500 employees
Professional, Scientific, and Technical Services

About the position

The Senior Site Reliability Engineer (SRE) for Compute is responsible for ensuring the reliability, availability, and performance of the company's compute infrastructure. This role involves collaborating with development teams to design and implement scalable systems, automating processes, and troubleshooting complex issues. The SRE will also play a key role in capacity planning and performance tuning, ensuring that the infrastructure can support the growing demands of the business.

Responsibilities

  • Design and implement scalable and reliable compute infrastructure.
  • Collaborate with development teams to improve system reliability and performance.
  • Automate operational processes to enhance efficiency.
  • Monitor system performance and troubleshoot issues as they arise.
  • Participate in capacity planning and performance tuning activities.
  • Develop and maintain documentation for systems and processes.

Requirements

  • Bachelor's degree in Computer Science or related field.
  • 5+ years of experience in site reliability engineering or related roles.
  • Strong knowledge of cloud computing platforms (AWS, Azure, GCP).
  • Experience with containers and container orchestration (Docker, Kubernetes).
  • Proficiency in scripting languages (Python, Bash, etc.).
  • Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK stack).

Nice-to-haves

  • Experience with infrastructure as code (Terraform, CloudFormation).
  • Knowledge of networking concepts and protocols.
  • Familiarity with CI/CD pipelines and tools (Jenkins, GitLab CI).
  • Experience in a DevOps environment.

Benefits

  • Health insurance
  • 401(k) plan
  • Flexible working hours
  • Professional development opportunities
  • Paid time off