Unclassified - Phoenix, AZ

posted 5 months ago

Full-time - Senior
Phoenix, AZ

About the position

As a Senior Site Reliability Engineer at Circle, you will design, build, and maintain the infrastructure that supports Circle's growing worldwide customer base. The position requires a deep understanding of public cloud providers and the ability to keep Circle's products and core systems running consistently and with good performance. You will work in a fast-paced environment where collaboration with cross-functional teams is essential, helping deliver exceptional customer experiences while continuing to learn and develop your skills.

In this role, you will support multiple development teams by providing an agile and responsive CI/CD platform that enables high-quality builds with measurable performance. You will build, maintain, improve, scale, and secure cloud infrastructure and resources using Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, and Ansible. Automation is a key focus: you will automate operational tasks using programming languages such as Go and Python, as well as serverless solutions like AWS Lambda and Kubernetes Jobs. You will also design, manage, and monitor Kubernetes clusters for various production workloads, and contribute to Circle's blockchain infrastructure by creating and managing blockchain nodes across multiple blockchains, including Algorand, Ethereum, and Solana.

You will participate in an on-call rotation to mitigate disruptions to production systems and conduct root cause analysis when issues arise. You will also plan and test disaster recovery scenarios for a highly available microservices architecture and collaborate with the Security team to maintain a strong security posture. Mentoring and engaging with team members is an important part of the role as you help grow and scale the team.

This position offers an opportunity to work in a collaborative and innovative environment, making a significant impact on the company's infrastructure and customer experience.

Responsibilities

  • Support multiple development teams with an agile, responsive CI/CD platform that delivers high-quality builds with measurable performance
  • Build, maintain, improve, scale, and secure cloud infrastructure and resources using IaC tools (Terraform, CloudFormation, Ansible)
  • Automate operational tasks via Go, Python, and serverless solutions (AWS Lambda, Kubernetes Jobs)
  • Design, manage, and monitor Kubernetes clusters for multiple production workloads
  • Drive forward blockchain infrastructure by creating and managing blockchain nodes across various blockchains (Algorand, Ethereum, Hedera, Flow, Solana, Stellar)
  • Participate in an on-call rotation to mitigate disruptions to production systems and conduct root cause analysis
  • Plan and test disaster recovery scenarios for a highly available microservices architecture
  • Collaborate with the Security team to create and maintain security-focused tools and frameworks
  • Engage and mentor team members and help grow and scale the team

Requirements

  • 4+ years in DevOps or SRE roles, with a focus on tooling, automation, and infrastructure on a major public cloud provider
  • Proficiency with coding and/or scripting in Go, Python, and Shell
  • At least 3 years of combined experience in building and maintaining CI/CD platforms and supporting agile engineering teams in building microservices
  • Experience with building Docker images and deploying containers in Kubernetes clusters
  • Familiarity with modern CI/CD platforms with complex gates and workflows
  • Knowledge of Blue-Green, Canary, and A/B Testing deployment strategies
  • Experience with distributed blockchain systems and maintaining blockchain full nodes
  • Familiarity with database technologies (PostgreSQL, Redis, OpenSearch)
  • Experience in migrating and transforming large, complex datasets from diverse sources
  • Knowledge of data warehousing tooling and services (Apache Airflow, AWS DMS, Snowflake)
  • Understanding of network routing, DNS, load balancing, and edge networking
  • Familiarity with APM, RUM, monitoring, and telemetry tools
  • Experience with Helm charts and maintaining Kubernetes clusters
  • Ability to author and maintain IaC with Terraform and deploy resources in public cloud providers (AWS, Azure, GCP)
  • Strong skills in observability, troubleshooting, and performance solutions
  • Excellent communication skills and ability to explain technical concepts to peers and stakeholders

Nice-to-haves

  • 7+ years in DevOps or SRE roles, with a focus on tooling, automation, and infrastructure on a major public cloud provider
  • Experience leading teams technically on architecture and system design
  • Deep understanding of API design and REST principles
  • Experience with cloud services (AWS, Google Cloud, Microsoft Azure)
  • Familiarity with containers and Kubernetes
  • Strong focus on coding standards and code quality with a desire for excellent test coverage