Abbott Laboratories - Pleasanton, CA
posted 4 months ago
As a Staff Site Reliability Engineer at Abbott, you will be a senior member of the Site Reliability Engineering (SRE) team, playing a pivotal role in establishing and executing a site reliability strategy specifically for the Heart Failure Division Medical Device Mobile and Cloud Digital Software portfolio, which includes both Class II and Class III devices. Your primary responsibility will be to partner with and influence our Architecture and Engineering teams to deliver highly resilient software solutions that meet the needs of our customers.
In this role, you will implement SRE improvement processes and procedures, driving change within the organization. A strong software engineering background in a highly secured environment is essential, along with experience in DevOps, formal test automation, load testing, or SRE practices. You will leverage your extensive technical knowledge in the development, delivery, and implementation of complex and critical software systems. Your expertise in SRE principles, including Service Level Indicators (SLIs), Service Level Objectives (SLOs), Error Budgets, Toil, Observability, and Release Engineering, will be critical to your success.
You will be expected to develop, communicate, and execute a vision that fosters the adoption of practices and tooling, thereby strengthening Abbott's position as a leader in the Heart Failure business. Your responsibilities will include developing a culture of SRE within our software development and operational practices, implementing a comprehensive SRE strategy in collaboration with the HF Digital team, and identifying critical KPIs and metrics to execute on the SRE roadmap. You will assist software engineering teams and business stakeholders in establishing and evolving reliability goals, automating manual processes, and managing continuous execution of tests.
Additionally, you will work closely with various teams to resolve critical issues, evaluate service tiers, and participate in blameless postmortems to enhance future incident responses. Your role will also involve building robust CI/CD pipelines and partnering with customer support for rapid issue resolution.