Visa - Austin, TX

posted 3 months ago

Full-time - Mid Level
Hybrid - Austin, TX
Credit Intermediation and Related Activities

About the position

As a Site Reliability Engineer at Visa, you will play a crucial role in ensuring the security and availability of our systems and applications, which handle an immense volume of transactions across the globe. Our operations are designed to be exceptionally reliable, with a focus on maintaining a secure environment for transactions that occur in fractions of a second. With millions of cyberattacks targeting our infrastructure, your primary responsibility will be to remediate any security findings promptly and ensure that our environments remain operational without any outages. You will be expected to conduct root cause analyses within hours of any incidents and ensure that all findings are addressed in the production environment after thorough testing in lower environments.

In this role, you will take ownership of the environment, tracking all planned and ongoing activities. You will be responsible for deploying new code and regularly analyzing the environment for potential improvements. Automation will be a key focus, as you will work to increase self-healing capabilities within our systems. Collaboration with product developers will be essential when new services are introduced or when migrating existing environments to new technologies.

Given the nature of our business, which operates around the clock, you will work in shifts and coordinate with multiple teams across various locations. Documentation will also be a critical aspect of your role, as you will prepare technical run books and ensure compliance with incident and change management processes.

This position is hybrid, allowing you to alternate between remote work and office attendance, with an expectation of being in the office 50% or more of the time based on business needs. You will be part of a dynamic team that is committed to continuous improvement and innovation in the face of evolving technology and security challenges.

Responsibilities

  • Ensure the security and safety of the environment by remediating all security findings within the required resolution dates defined by governance.
  • Prevent outages, ensuring that environments are operational at all times, and conduct root cause analysis within hours of any issues.
  • Track all activities planned or occurring in the environments as the owner of the environment.
  • Deploy new code in the environment and regularly analyze the environment for improvements.
  • Automate manual tasks to increase self-healing capabilities of the environments.
  • Collaborate with product developers for planning when new services are introduced or when migrating old environments to new technologies.
  • Document all activities as per incident or change management processes and prepare technical run books for the team.
  • Work in shifts to support the 24/7 operations of the business, synchronizing with multiple locations and teams.

Requirements

  • 2 or more years of work experience with a Bachelor's Degree; OR 5 years relevant work experience.
  • 3 or more years of work experience with a Bachelor's Degree or more than 2 years of work experience with an Advanced Degree (e.g., Master's, MBA, JD, MD).
  • Engineering degree in IT or Computer Science.
  • 5 years of IT experience with expertise in DevOps, Build and Release Engineering, Cloud Infrastructure, and Automation.
  • Ability to work as a team player by collaborating with different cross-functional teams.
  • Good written and verbal communication skills.
  • Great problem-solving and troubleshooting skills.
  • Ability to effectively prioritize and coordinate while multi-tasking.
  • Ability to learn fast and implement the latest technology trends in the industry.
  • Good understanding of different cloud providers (AWS, GCP, Azure) and their operating models.
  • Core skills in Docker, Kubernetes, and Linux.
  • DevOps experience with Jenkins, Ansible, Docker, and Kubernetes.
  • Experience with applications written in Go and Rust, including troubleshooting system integration issues.
  • Experience implementing CI/CD processes for seamless deployments.
  • Expertise in troubleshooting applications in middleware stacks like Tomcat, Apache, Kafka, MQ, and streaming services like Flink and Spark.
  • Ability to query Big Data Systems like Hadoop for reporting and alerting.
  • Ability to write deployment and build scripts using scripting languages such as Bash, JavaScript, or Python.
  • Good understanding of monitoring solutions like Prometheus, Splunk, and Grafana.
  • Intermediate knowledge of load balancer and TCP-layer architecture.

Nice-to-haves

  • Experience in creating deployments, services, and ingress flows for applications in Kubernetes clusters.
  • Participation in release-level discussions and familiarity with the full SDLC and Agile methodology.

Benefits

  • Medical
  • Dental
  • Vision
  • 401(k)
  • FSA/HSA
  • Life Insurance
  • Paid Time Off
  • Wellness Program