Senior Site Reliability Engineer

Experian • Telangana, IN

About The Position

The position involves identifying, designing, and implementing changes to existing services to enhance reliability, performance, and standardization across all AWS services, microservices, and serverless services. The role requires the ability to proactively identify inefficient resource utilization and remediate resources to improve platform stability and cost efficiency. You will integrate with product development teams to understand their services and support them while working on SRE-related platforms. Troubleshooting production issues, providing root cause analysis, and designing solutions to prevent future occurrences are key responsibilities. Additionally, you will plan and test for capacity growth, monitor services, create intelligent alarming for quicker incident detection and resolution, and build automations and internal tools to improve processes. Hands-on experience with Datadog and Splunk, as well as proficiency in Python, .NET, and Java, is essential.

Requirements

Degree in computer science or related field
10 years of programming experience
3+ years of application development experience in AWS
Experience working with cloud environments and technologies (AWS)
Expertise writing and optimizing multi-threaded applications
Proficient in developing service-oriented (SOA) and REST architectures
Solid experience with microservices architecture and domain-driven design
Experience with web application security
Familiarity with creating Docker containers
Working knowledge of NoSQL and relational databases
Attention to quality and detail
Experience in Agile/Scrum methodology
AWS optimization techniques
Amazon Web Services deployment and integration
Client integration security and vulnerabilities
Ability to integrate quickly with new teams and work effectively with them

Responsibilities

Identify, design, and implement changes to existing services to improve reliability, performance, and standardization across all AWS services, microservices, and serverless services.
Proactively identify inefficient resource utilization and remediate resources to improve platform stability and cost efficiency.
Integrate with product development teams to understand their services and support them on SRE-related platforms.
Troubleshoot production issues, providing root cause analysis and designing solutions to prevent future occurrences.
Plan and test for capacity growth.
Monitor services and create intelligent alarming for quicker incident detection and resolution.
Build automations and internal tools to improve processes.

Benefits

Best family well-being benefits
Enhanced medical benefits
Paid time off

What This Job Offers

Job Type: Full-time
Career Level: Senior
Industry: Administrative and Support Services
Education Level: Bachelor's degree
Number of Employees: 501-1,000 employees
