LE002 Duck Creek Technologies, LLC - Remote, OR

posted 2 months ago

Full-time - Mid Level
Remote - Remote, OR
1,001-5,000 employees

About the position

Duck Creek Technologies is seeking a Senior Site Reliability Engineer (SRE) to join our dynamic team. In this role, you will be part of a group of technical engineers dedicated to delivering high-quality, production-ready software for our Cloud platform. This is a remote position in which you will engage with various engineering teams across the Duck Creek Cloud product suite, leading and supporting the design and implementation of core components to ensure optimal performance on Duck Creek Cloud.

Your contributions will be critical: you will provide insights and recommendations, contribute to design and creation, and train engineers to build world-class cloud solutions that are highly available, fault-tolerant, and secure. You will be responsible for evaluating new technologies that enhance Duck Creek's cloud solutions. Collaborating closely with cloud teams, product managers, and engineers, you will work to improve service delivery and reliability throughout the entire lifecycle of our services. Your role will involve driving technical efforts to adopt innovative solutions, identifying the root causes of errors and instability in our production cloud services, and advocating for changes that enhance reliability, resilience, and observability. You will leverage your deep technical leadership and interpersonal skills to foster results-based collaboration, identify and reduce toil through creative innovation and automation, and help Duck Creek lead the insurance industry in cloud solutions.

This position requires a strong background in software solutions and infrastructure, with a focus on enterprise-scale continuous delivery environments. You will need to demonstrate a proven record of implementing infrastructure at enterprise scale, experience with on-premises-to-cloud migrations, and a deep understanding of both Windows Server and Linux environments. Your expertise in cloud platforms, particularly Azure, and container orchestration technologies will be essential, as will your familiarity with APM and observability tools such as Dynatrace, Splunk, Prometheus, Grafana, and Datadog. You will also need to be comfortable working autonomously within a distributed team and possess knowledge of cloud and application security, as well as cloud design patterns for scalability and resiliency.

Responsibilities

  • Engage with engineering teams to design and implement core components for Duck Creek Cloud.
  • Provide insights and recommendations for building high-quality cloud solutions.
  • Evaluate new technologies to enhance Duck Creek's cloud offerings.
  • Collaborate with product managers and engineers to improve service delivery and reliability.
  • Drive technical efforts to adopt innovative solutions and improve operational excellence.
  • Identify root causes of errors and instability in production cloud services.
  • Advocate for changes that enhance reliability, resilience, and observability in systems.
  • Lead initiatives to reduce toil through automation and creative innovation.

Requirements

  • Degree preferred, or equivalent years of practical job experience in a similar function or role.
  • 6 years of experience in the design, architecture, creation, and troubleshooting of software solutions and infrastructure, with 3 years of SRE experience or equivalent.
  • Experience with enterprise-scale continuous delivery environments.
  • Deep knowledge of all parts of infrastructure, with a proven record of implementing it at enterprise scale.
  • Proven background in on-premises-to-cloud migrations.
  • Deep background in Windows Server and Linux environments.
  • Experience with sustainable incident response in a blameless environment.
  • Knowledge of cloud platforms (Azure preferred) and container orchestration technologies.
  • Experience with APM and observability tools such as Dynatrace, Splunk, Prometheus, Grafana, and Datadog.
  • Experience with incident-response tools and methodologies, for example PagerDuty and blameless postmortems.

Nice-to-haves

  • Knowledge of cloud and application security preferred.
  • Strong knowledge of cloud design patterns for scale, data management, resiliency, etc., preferred.
  • A love for high quality and a knack for testing preferred.
  • Implementable opinions about design, cloud, dashboards, metrics, and SLOs preferred.

Benefits

  • Flexible work environment allowing remote, hybrid, or office work options.
  • Inclusive culture promoting diversity and equal opportunity.
  • Opportunities for professional development and continuous learning.