Fireblocks (posted about 2 months ago)
$132,000 - $174,000/Yr
San Francisco Bay Area, CA
Professional, Scientific, and Technical Services

About the position

Want to build something new? Join us as one of the first members of our SRE team and help shape its future. You'll define processes, set best practices, and ensure scalability and reliability across our infrastructure. Working closely with R&D, you'll design and implement solutions to support the high-throughput needs of our platform. You'll have the opportunity to enhance the reliability and performance of our systems and make a significant impact as we continue to grow.

Responsibilities

  • Own the production infrastructure on AWS and Azure. Implement sustainable, scalable solutions with the goal of improving availability and performance
  • Help identify the root cause of every incident and prevent it from recurring
  • Alert on symptoms rather than on outages. Ensure every infrastructure and application alert is actionable or triggers self-healing automation
  • Work closely with R&D and Support, offering education and guidance on integration, support, and monitoring across the toolset
  • Everything-as-code approach: run our infrastructure with Ansible, Terraform, and Kubernetes
  • Document every action, turn it into a repeatable procedure, and then automate it
  • Focus on the system's observability, availability, reliability, performance/latency, and monitoring
  • Take part in periodic on-call duty and emergency response

Requirements

  • 3+ years of experience as a DevOps engineer or SRE in a SaaS environment
  • Experience with programming languages such as Python / JavaScript / Bash, or similar
  • 3+ years of experience with alerting and monitoring systems such as Datadog / Splunk / New Relic / Prometheus, or similar
  • Experience working with Linux systems from kernel to shell and beyond
  • Experience with cloud platforms such as AWS / Google Cloud / Azure
  • Experience with configuration management tools such as Ansible / Chef / Puppet
  • Experience with Docker, Kubernetes, and Helm
  • SCM: Git / Bitbucket / GitLab / Phabricator / Gerrit
  • Strong analytical and troubleshooting skills, with the ability to solve complex problems
  • Strong verbal and written communication skills and a collaborative mindset
  • Ability to dive into detail while understanding the big picture

Nice-to-haves

  • Extensive Datadog experience, including monitoring and dashboard expertise
  • Participation in Kubernetes migration projects
  • Previous experience as a C++ or Node.js developer
  • BSc in Computer Science or related technical certifications
  • Previous experience in cryptocurrencies / blockchains (a big advantage)

Benefits

  • Competitive base salary range of $132,000 to $174,000
  • Target bonus
  • Competitive equity grant
  • Very generous benefits

Job Keywords

Hard Skills
  • Ansible
  • Bash
  • Chef
  • Datadog
  • Kubernetes