Senior Associate - AWS Cloud Data Platform Site Reliability Engineer (SRE)

$82,500 - $140,000/Yr

New York Life - Lebanon, NJ

posted 2 months ago

Full-time - Mid Level

5,001-10,000 employees

Insurance Carriers and Related Activities

About the position

The Senior Associate - AWS Cloud Data Platform Site Reliability Engineer (SRE) role at New York Life involves building and maintaining a core data, reporting, and analytics platform for the Insurance & Agency Group. The position focuses on ensuring the reliability, performance, and scalability of cloud-based data infrastructure using AWS services, while contributing to innovative initiatives that enhance the company's digital landscape.

Responsibilities

Develop and maintain monitoring, alerting, and logging systems to proactively detect and resolve incidents.
Perform root cause analysis and implement solutions to prevent recurrence.
Manage incident response, including on-call rotations, triaging, and escalation.
Create and manage Infrastructure as Code (IaC) using tools like Terraform.
Automate deployments, scaling, backups, and disaster recovery processes.
Develop and maintain CI/CD pipelines to ensure smooth deployment and rollback processes.
Analyze performance metrics and optimize infrastructure and application performance.
Define and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Conduct capacity planning and scaling to manage anticipated loads.
Implement security best practices, including network security, IAM policies, and encryption.
Conduct security audits and compliance checks to ensure regulatory adherence.
Respond to security incidents and implement remediation measures.
Work with development teams to ensure services are reliable, scalable, and easily monitored.
Collaborate with cross-functional teams to design, build, and maintain cloud infrastructure.
Identify and implement improvements to operational processes and workflows.
Design, implement, and test disaster recovery and business continuity plans.
Ensure regular backups and replication to minimize data loss and downtime.

Requirements

3+ years of experience as a Cloud Site Reliability Engineer.
1+ years of experience with AWS services (S3, EC2, Glue, Redshift, RDS) in shared-service or hybrid environments.
Proficiency in AWS services (EC2, S3, RDS, Lambda, VPC, CloudWatch, IAM, etc.).
Strong knowledge of scripting languages (Python, Bash, etc.) and automation tools (Terraform).
Experience with CI/CD tools and DevOps practices.
Familiarity with monitoring and logging tools.
Strong troubleshooting and problem-solving skills, plus exposure to Machine Learning (ML) and Artificial Intelligence (AI).
Exposure to industry-standard Data Governance processes and procedures.
Bachelor's degree in Computer Engineering, Computer Science, MIS, or a related field is preferred but not required.

Benefits

Leave programs
Adoption assistance
Student loan repayment programs
Comprehensive benefit options
