Site Reliability Engineer III

JPMorgan Chase - Chicago, IL

posted 3 months ago

Full-time - Mid Level

Chicago, IL

Credit Intermediation and Related Activities

About the position

As a Site Reliability Engineer III at JPMorgan Chase within the Corporate Technology division, you will be at the forefront of a rapidly evolving technology landscape. Your role will involve solving complex and broad business problems by developing simple and effective solutions. You will leverage your skills in code and cloud infrastructure to configure, maintain, monitor, and optimize applications and their associated infrastructure. This position requires you to independently decompose and iteratively improve existing solutions, ensuring that they meet the high standards expected in a mission-critical environment.

In this role, you will be a significant contributor to your team, sharing your expertise in end-to-end operations, availability, reliability, and scalability of applications or platforms. You will guide and assist others in building appropriate level designs and achieving consensus among peers.

Collaboration is key, as you will work closely with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines. Your responsibilities will also include implementing infrastructure, configuration, and network as code for the applications and platforms under your purview. You will collaborate with technical experts, key stakeholders, and team members to resolve complex problems, ensuring that service level indicators are understood and utilized to proactively address issues before they impact customers. Additionally, you will support the adoption of site reliability engineering best practices within your team, fostering a culture of reliability and efficiency.

Responsibilities

Guides and assists others in the areas of building appropriate level designs and gaining consensus from peers where appropriate
Collaborates with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines
Collaborates with other software engineers and teams to design, develop, test, and implement availability, reliability, and scalability solutions in their applications
Implements infrastructure, configuration, and network as code for the applications and platforms in your remit
Collaborates with technical experts, key stakeholders, and team members to resolve complex problems
Understands service level indicators and utilizes service level objectives to proactively resolve issues before they impact customers
Supports the adoption of site reliability engineering best practices within your team

Requirements

Formal training or certification in site reliability engineering concepts and 3+ years of applied experience
Proficiency in site reliability culture and principles, and familiarity with how to implement site reliability within an application or platform
Proficient in at least one programming language or framework such as Python, Java/Spring Boot, or .NET
Proficient knowledge of software applications and technical processes within a given technical discipline (e.g., Cloud, artificial intelligence, Android, etc.)
Experience in observability such as white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform
Familiarity with containers and container orchestration technologies such as Docker, ECS, and Kubernetes
Familiarity with troubleshooting common networking technologies and issues

Nice-to-haves

Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner
