Motion Recruitment - Arlington, VA
posted 3 months ago
This company is looking for a Site Reliability Engineer to lead a team responsible for building, managing, maintaining, and scaling the centralized infrastructure services that support its mission-critical operations. The role is based in Herndon, VA, and is remote-friendly, requiring a couple of days on-site each month.

As a Site Reliability Engineer, you will oversee the design of software solutions that integrate Open Source, Commercial Off-The-Shelf (COTS), and custom-developed components. You will deploy, configure, and manage services across production, QA, and development environments on platforms such as OpenStack and Docker. You will build and manage infrastructure using Terraform, develop deployment automation tools using Ansible, create automation and configuration management solutions with SaltStack and Jenkins, and implement encryption solutions with HashiCorp Vault. You will also contribute to the development of a large-scale Software-Defined Network (SDN) using Guardicore.

Beyond hands-on engineering, you will document processes, procedures, configurations, and deployment plans, and collaborate with technical teams to implement systems and software. You will occasionally provide operational support, including troubleshooting and problem resolution. You will offer technical leadership in operational processes and change management, mentor less experienced engineers, and give management regular progress updates. You will also participate in a 24x7 on-call rotation.