Senior Staff Site Reliability Engineer

$156,000 - $208,000/Yr

WEX Corporate Payments - Chicago, IL

posted 15 days ago

Full-time - Senior

Funds, Trusts, and Other Financial Vehicles

About the position

The Senior Staff Site Reliability Engineer (SRE) at WEX is a technical leadership role focused on enhancing the reliability, performance, and observability of the company's Benefits systems. This position involves designing and implementing complex systems, leading incident response efforts, and driving the adoption of SRE best practices across the organization. The ideal candidate will have extensive experience in software development, cloud computing, and operational excellence, and will work closely with engineering teams to ensure systems are secure and efficient.

Responsibilities

Provide technical guidance and mentorship to other SREs and engineers.
Lead the design and implementation of complex systems and solutions.
Drive the adoption of SRE best practices across the organization.
Architect and implement highly available, scalable, and fault-tolerant systems.
Optimize system performance and resource utilization.
Proactively identify and mitigate risks to system reliability.
Lead incident response efforts, driving efficient resolution and post-incident analysis.
Develop and implement processes to improve incident response capabilities.
Design and develop automation tools to streamline operational tasks and improve system reliability.
Utilize monitoring and observability tools to gain deep insights into system behavior.
Work closely with development teams to ensure software design meets operational requirements.
Foster a culture of collaboration and knowledge sharing across teams.
Forecast future capacity needs and implement strategies to ensure systems scale efficiently.
Continuously identify performance bottlenecks and lead efforts to optimize system performance.
Champion security best practices and ensure compliance with industry standards and regulations.
Stay current with emerging technologies and industry trends, evaluating and introducing new tools and techniques.

Requirements

7+ years of hands-on experience as a Site Reliability Engineer or equivalent role
7+ years of development experience with at least one major programming language
Expert-level knowledge of cloud computing platforms (AWS and Azure)
Proven ability to lead complex technical projects and initiatives
Strong communication and collaboration skills, with the ability to influence and build consensus
Deep understanding of observability, logging, and monitoring technologies
Experience with a variety of RDBMS and NoSQL data stores
Expertise in containerization technologies such as Docker and Kubernetes
Expertise in infrastructure as code
Experience designing and building RESTful APIs
Extensive hands-on experience with monitoring and observability tooling (Datadog, Splunk, or similar)
Familiarity with Agile methodologies and practices
Extensive experience in providing and leading critical application support in a 24/7/365 high-availability environment
Experience with GitOps
BA/BS degree in Computer Science or related technical field, or equivalent job experience

Benefits

Health insurance
Dental insurance
Vision insurance
Retirement savings plan
Paid time off
Health savings account
Flexible spending accounts
Life insurance
Disability insurance
Tuition reimbursement
