American Express - Sunrise, FL

posted 2 months ago

Full-time - Senior
Sunrise, FL
Credit Intermediation and Related Activities

About the position

As a Senior Engineer in Site Reliability at American Express, you will play a pivotal role in ensuring the reliability and performance of our software systems. This position is part of a global Site Reliability Engineering (SRE) organization, where you will collaborate with Core Engineering and Platform Teams to align engineering efforts with strategic goals. Your work will involve managing complex challenges unique to American Express, drawing on your expertise in coding, algorithms, complexity analysis, and large-scale system design. You will be part of a diverse tech team that values collaboration, intellectual curiosity, and problem-solving, and that recognizes and values your contributions.

In this role, you will be responsible for the technical aspects of software reliability for assigned applications, including design, prototype development, and coding assignments. You will function as a leader on an agile team, contributing to software builds through consistent development practices, participating in architectural decisions, and leading code reviews and automated testing. Your responsibilities will also include debugging software components, consulting with teams to build standards for high availability, and implementing orchestration and automation solutions to improve accuracy and reduce defects.

You will drive monitoring requirements to ensure business-service level visibility and mentor software engineers on design patterns that resist failure. You will also introduce new technologies to the production support toolchain, helping to minimize friction during production releases and improve incident recovery. Finally, you will facilitate the resolution of non-application issues, ensure operational readiness throughout the application lifecycle, and act as an efficiency multiplier for your team by analyzing workflows and driving productivity.

Responsibilities

  • Perform technical aspects of software reliability for assigned applications, including design, prototype development, and coding assignments.
  • Function as a leader on an agile team by contributing to software builds through consistent development practices (tools, common components, and documentation).
  • Participate in architectural decisions to ensure software transaction flows are appropriately supported and designed.
  • Lead code reviews and automated testing.
  • Debug software components and identify code defects for remediation.
  • Consult with teams to build standards that drive the highest levels of availability.
  • Evaluate and implement orchestration, automation, and tooling solutions so that processes stay consistent and repetitive tasks are performed with greater accuracy and fewer defects.
  • Drive monitoring requirements to ensure business-service level visibility for all support teams.
  • Provide mentorship to software engineers related to design patterns that are resistant to failure.
  • Build, implement, and advise on recovery tooling that adheres to enterprise standards and/or frameworks.
  • Influence team members with creative changes and improvements by challenging the status quo and demonstrating risk-taking.
  • Introduce new and impactful technologies to the production support toolchain that minimize friction for production releases and support, and that help diagnose and recover from production incidents more quickly.
  • Partner with appropriate supporting teams to ensure operational readiness throughout the application lifecycle.
  • Facilitate the resolution of non-application issues (3rd-party upstream and downstream issues, infrastructure issues, storage, database, network, file transfer, etc.).

Requirements

  • At least 8 years of proven experience with system design, algorithms, data structures, analysis, and software design.
  • Bachelor's degree or equivalent experience in computer science, technology, or engineering.
  • Experience working in a 24/7 environment with on-call responsibilities, supporting production on an as-needed basis.
  • Proven understanding of cloud native principles: service discovery, circuit breakers, observability, distributed tracing, automation and monitoring tools.
  • Demonstrated leadership and management experience in working with multi-functional, geographically dispersed teams on complex projects.
  • Understanding of team dynamics and experience building teams that deliver results.
  • Relentless drive to innovate in process and software to better meet the needs of our customers.
  • Good understanding of monitoring technologies, including logging, time-series, or machine-learning products, from a product owner's point of view.
  • Knowledge of configuration management, release automation, and orchestration technologies.

Nice-to-haves

  • Experience in a broad range of software development and operations technologies such as cloud infrastructure, virtualization, load balancing, containers, JVMs, web servers, application debugging, queueing technologies, caching technologies, databases (RDBMS and NoSQL), routing and switching, etc.
  • Experience in modeling and architecting complicated business domains and associated methodologies/paradigms, e.g. Domain-Driven Design, Event Sourcing, CQRS.
  • Proven track record implementing minimalistic event-driven microservice chassis (not just Spring), e.g. Quarkus/Vert.x, Micronaut, Javalin, Ktor, or non-JVM options such as JavaScript or Go.
  • Excellent understanding of application development languages/platforms (Java, .NET, Go, Python, etc.) and of the importance of APIs and REST-based services.
  • Excellent problem-solving, written, interpersonal and communication skills that drive executional impact at scale.
  • Combines deep technical expertise, a continuous improvement and automation approach, and systematic and rational root cause analysis to find opportunities to make things faster and better.
  • Appetite for trying new things and motivating change in a large and sometimes slow-moving organization.

Benefits

  • Competitive base salaries
  • Bonus incentives
  • 6% Company Match on retirement savings plan
  • Free financial coaching and financial well-being support
  • Comprehensive medical, dental, vision, life insurance, and disability benefits
  • Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
  • 20+ weeks paid parental leave for all parents, regardless of gender, offered for pregnancy, adoption or surrogacy
  • Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
  • Free and confidential counseling support through our Healthy Minds program
  • Career development and training opportunities