Motion Recruitment - Phoenix, AZ
posted 2 months ago
The Site Reliability Engineer (SRE) position is a fully remote role focused on improving observability for a new cloud platform being built by a financial client. The company is moving to a cloud-first mindset, and the SRE team, still in its early stages, is tasked with identifying and closing observability gaps across shared services such as MuleSoft, Collibra, Confluent, Kafka, and APIC. The team will not handle day-to-day monitoring; instead, it will build tools and utilities that improve visibility and provide key metrics for these services. The ideal candidate is a problem solver with a strong background in Infrastructure as Code (IaC), configuration management, and cloud technologies, particularly Google Cloud Platform (GCP), although experience with other cloud providers is also acceptable. The role emphasizes hands-on technical skills, particularly with Linux systems, and requires proficiency in scripting, especially in Python. The SRE will help shape the observability strategy for the infrastructure and platform so that its systems remain robust and reliable.