Insulet Corporation - posted 9 months ago
$148,200 - $222,300/Yr
Full-time • Senior
Remote • Acton, MA
Miscellaneous Manufacturing

As a Staff Site Reliability Engineer (SRE) at Insulet, you will play a critical role in architecting, implementing, and maintaining highly available and scalable infrastructure and systems. You will lead a team of SREs, driving best practices, developing a culture of automation, and ensuring the reliability of our services. This role requires a hands-on approach to solving complex technical challenges while providing technical leadership to the team.

  • Provide technical guidance and mentorship to the SRE team.
  • Drive the implementation of best practices in reliability, scalability, and performance.
  • Lead by example, demonstrating excellence in technical skills and problem-solving.
  • Collaborate with cross-functional teams to design scalable, resilient, and efficient systems.
  • Architect and implement infrastructure solutions that meet the requirements of high availability and performance.
  • Drive the adoption of modern technologies and tools to improve system reliability and efficiency.
  • Develop and maintain automation tools for provisioning, deployment, and monitoring.
  • Automate routine tasks to improve operational efficiency and reduce manual intervention.
  • Design and implement monitoring solutions to proactively identify issues and prevent service disruptions.
  • Lead incident response efforts, conduct post-mortem analysis, and implement measures to prevent recurrence.
  • Develop and automate runbooks and playbooks to streamline incident resolution.
  • Conduct capacity planning exercises to ensure systems can handle current and future loads.
  • Identify performance bottlenecks and improve system performance through tuning.
  • Collaborate with development teams to design and implement scalable architectures.
  • Document system architectures, configurations, and procedures.
  • Promote knowledge sharing within the team through technical presentations, workshops, and documentation.
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 9+ years of experience in the field, including 5+ years in Site Reliability Engineering, DevOps, or a similar role.
  • Proven experience architecting and managing highly available, scalable, and fault-tolerant systems.
  • Strong understanding of cloud computing platforms (e.g., AWS, Azure, GCP) and container orchestration technologies (e.g., Kubernetes).
  • In-depth knowledge of AWS services, including VPC, Lambda, IAM, ELB, EC2, ECS, CloudWatch, API Gateway, S3, SQS, SNS, WAF, X-Ray, and Route 53, or of GCP services, including VPC, Cloud Functions, IAM, Cloud Load Balancing, Compute Engine, Google Kubernetes Engine (GKE), Stackdriver, API Gateway, Cloud Storage, Pub/Sub, Firebase Cloud Messaging, Cloud Armor, Cloud Trace, and Cloud DNS.
  • Experience with infrastructure as code tools such as Terraform, Ansible, or similar.
  • Excellent troubleshooting and problem-solving skills.
  • Strong communication and leadership skills, with the ability to collaborate effectively with cross-functional teams.
  • Experience leading and mentoring engineering teams is highly desirable.
  • 100% remote working arrangements (may work from home/virtually 100%; may also work hybrid on-site/virtual as desired).