LE002 Duck Creek Technologies, LLC - Remote, OR
posted 2 months ago
Duck Creek Technologies is seeking a Senior Site Reliability Engineer (SRE) to join our dynamic team. In this role, you will be part of a group of engineers dedicated to delivering high-quality, production-ready software for our Cloud platform. In this remote position, you will engage with engineering teams across the Duck Creek Cloud product suite, leading and supporting the design and implementation of core components to ensure optimal performance on Duck Creek Cloud.
Your contributions will be critical: you will provide insights and recommendations, design and create world-class cloud solutions that are highly available, fault-tolerant, and secure, and train engineers to build them. You will also evaluate new technologies that could enhance Duck Creek's cloud solutions. Collaborating closely with cloud teams, product managers, and engineers, you will work to improve service delivery and reliability throughout the entire lifecycle of our services. You will drive technical efforts to adopt innovative solutions, identify the root causes of errors and instability in our production cloud services, and advocate for changes that enhance reliability, resilience, and observability. You will draw on deep technical leadership and interpersonal skills to foster results-based collaboration, identify and reduce toil through innovation and automation, and help Duck Creek lead the insurance industry in cloud solutions.
This position requires a strong background in software solutions and infrastructure, with a focus on enterprise-scale continuous delivery environments. You will need a proven record of implementing infrastructure at enterprise scale, experience with on-premises-to-cloud migrations, and a deep understanding of both Windows Server and Linux environments.
Your expertise in cloud platforms, particularly Azure, and in container orchestration technologies will be essential, as will your familiarity with APM and observability tools such as Dynatrace, Splunk, Prometheus, Grafana, and Datadog. You will also need to be comfortable working autonomously within a distributed team, and you should have knowledge of cloud and application security as well as cloud design patterns for scalability and resiliency.