Intuit - Atlanta, GA

posted 9 days ago

Full-time - Mid Level
Atlanta, GA
1,001-5,000 employees
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

The Engineering Leader for Site Reliability Engineering (SRE) at Mailchimp will oversee the reliability, scalability, and performance of applications used by both internal engineers and external customers. This role involves collaborating with cross-functional teams to design, implement, and maintain robust systems while driving a cultural shift toward operational excellence within the organization. The ideal candidate will have extensive technical experience and a proven track record of managing high-scale, highly available systems.

Responsibilities

  • Drive a mindset of operational excellence across the Mailchimp Engineering organization.
  • Design and implement strategies for site reliability operations, including automation, monitoring, and maintenance processes.
  • Coach and develop engineers responsible for site reliability and performance.
  • Stay up to date with industry trends and emerging technologies to drive continuous improvement.
  • Coordinate with cross-functional teams, including engineering, operations, support, and product teams, to ensure the reliability and consistency of our services.
  • Collaborate with other operational excellence teams across Intuit on shared best practices and learnings.
  • Provide technical guidance and mentorship to team members and stakeholders.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • 8+ years of experience in Site Reliability Management, with 3+ years in a management role.
  • Proven track record of managing teams of engineers and developing strategies for site reliability and performance operations.
  • Excellent communication skills and ability to lead cross-functional teams and stakeholders.
  • Proactive and results-driven attitude, with a passion for building reliable, scalable, and performant systems.
  • Proficiency in programming languages such as PHP, Go, Python, and Java.
  • Strong understanding of Linux/Unix systems and network protocols.
  • Experience with cloud platforms such as AWS and/or Google Cloud.
  • Expertise in containerization and orchestration technologies like Docker and Kubernetes.
  • Proficient in using monitoring and observability tools (e.g., Prometheus, Grafana, Splunk).
  • Experience with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI, GitHub Actions).
  • Knowledge of SQL database management systems (e.g., MySQL) and caching technologies.
  • Familiarity with infrastructure as code (IaC) and configuration management tools (e.g., Terraform, Puppet).