First Citizens Bank - Scottsdale, AZ

posted 29 days ago

Full-time
Scottsdale, AZ
Credit Intermediation and Related Activities

About the position

As a Site Reliability Engineer at First Citizens Bank, you will play a crucial role in ensuring the performance, reliability, and availability of critical applications. This position involves working with customer-facing systems and driving adherence to service level objectives (SLOs) through effective monitoring and scaling. You will be responsible for maintaining and troubleshooting large-scale application deployments while collaborating with both technical and non-technical teams to promote best practices in Site Reliability Engineering (SRE).

Responsibilities

  • Own the availability, performance, and reliability of customer-facing systems.
  • Drive adherence to SLOs through monitoring, alerting, and scaling.
  • Engage in software development within an Enterprise Java Environment, utilizing Spring Boot and Python for CI/CD pipelines.
  • Maintain, support, and troubleshoot critical, large-scale application and infrastructure deployments.
  • Analyze and troubleshoot application, operating system, networking, configuration, and performance issues.
  • Understand and apply Site Reliability Engineering concepts and best practices.
  • Execute system deployments in AWS, private cloud, and OpenShift environments.
  • Design, document, and implement automated procedures.
  • Automate system administrative tasks using scripting tools, preferably Python or shell.
  • Understand Internet networking protocols such as TCP/IP, TLS, DNS, HTTP, and SMTP.
  • Utilize monitoring and automation tools like Ansible, GitLab, Splunk, Grafana, and Prometheus.
  • Champion SRE best practices and communicate effectively with technical and non-technical staff.
  • Be familiar with system hardening and security best practices.

Requirements

  • Bachelor's Degree and 2 years of experience in Application Engineering, or High School Diploma/GED and 6 years of experience in Application Engineering.
  • Experience in software development in an Enterprise Java Environment, including Spring Boot and Python for CI/CD pipelines.
  • Aptitude for analyzing and troubleshooting application, operating system, networking, configuration, and performance problems.
  • Understanding of Site Reliability Engineering concepts and best practices.
  • Experience executing system deployments in AWS, private cloud, and OpenShift.
  • Experience automating system administrative tasks with scripting tools (Python or shell preferred).
  • Fundamental understanding of Internet networking protocols: TCP/IP, TLS, DNS, HTTP, SMTP.
  • Extensive experience with monitoring and automation tools such as Ansible, GitLab, Splunk, Grafana, and Prometheus.

Nice-to-haves

  • 4+ years of experience in software engineering.
  • 2+ years of experience implementing/following SRE practices.
  • Experience working in a large financial institution or an environment of similar scope and complexity.
  • Hands-on experience with deploying and maintaining systems in a containerized environment (public or private cloud).
  • Ability to create meaningful metrics and alerting for service health monitoring.
  • Skill with configuration management and automation frameworks.
  • Proficiency in driving root cause analyses that lead to meaningful improvements.
  • Experience leading troubleshooting efforts on production and non-production systems.

Benefits

  • Comprehensive benefits program for full-time associates (20+ hours).
  • Customized offerings designed to support families.
  • Access to various benefits programs as detailed on the company's benefits page.