Umbra Lab - Santa Barbara, CA
posted 3 months ago
Umbra is seeking an experienced Site Reliability Engineer (SRE) to join our team in designing, building, operating, and scaling our mission-critical infrastructure. This role is pivotal in ensuring the reliability and scalability of our systems, which are essential for delivering high-quality satellite data to our customers. The SRE will work closely with cross-functional teams, including developers, product managers, and other stakeholders, to align on technical strategies and provide expert guidance. The position can be based in our Santa Barbara office or can be performed fully remotely, offering flexibility to the right candidate.

The ideal candidate will possess a deep understanding of the entire technology stack and architecture, enabling informed decisions regarding technical debt and trade-offs. They will demonstrate leadership in technical innovation, advocating for new technologies and best practices while continuously refining existing processes to enhance efficiency and effectiveness across projects and services. Effective communication with both technical and non-technical stakeholders is crucial, as the SRE will foster collaboration and understanding across diverse teams. The role also involves driving impactful changes that benefit the entire team and extend beyond individual contributions.

In this position, the SRE will ensure that critical systems meet service level agreements (SLAs) through proactive monitoring and effective incident response. They will develop and promote new technologies and tools, conducting research and creating proofs of concept to introduce solutions that enhance team capabilities. The SRE will lead by example in fostering a culture of excellence and reliability, continuously evaluating and improving team processes and workflows to increase efficiency and reduce complexity. Participation in on-call rotations to provide support and resolve complex technical issues is also a key responsibility.