Apollo Solutions - San Francisco, CA
posted 3 months ago
Apollo Solutions is seeking a Site Reliability Engineer (SRE) to join a pioneering artificial intelligence company that is revolutionizing the use of AI and machine learning in the gaming and security sectors. The organization is engaged in significant projects involving government contracts and gaming console companies, making this an exciting time to join the team. The SRE will collaborate closely with other engineers to implement best practices, establish monitoring and alerting systems, and manage incident response effectively.

In this role, the Site Reliability Engineer will lead the promotion of SRE best practices throughout the organization. This includes setting the technical direction for the company's cloud infrastructure, which spans both AWS and GCP. The SRE will design and build systems that prioritize high availability and scalability so that the infrastructure can support the company's ambitious goals. The SRE will also play a crucial role in fostering a DevOps culture within the engineering organization, encouraging collaboration and efficiency among team members. Mentoring junior engineers will be a key responsibility as well, helping to develop the next generation of talent within the company.