Site Reliability Engineer

$81,250 - $146,875/Yr

Leidos - Oklahoma City, OK

posted 6 days ago

Full-time
Oklahoma City, OK
Professional, Scientific, and Technical Services

About the position

The Site Reliability Engineer at Leidos will support the Air Force Life Cycle Management Center by delivering comprehensive IT and support services that ensure mission success while adhering to DoD standards. This role focuses on the design, development, and deployment of systems and components that meet high reliability, availability, and maintainability standards.

Responsibilities

  • Develop and implement reliability test plans (e.g., FMEA, FTA, accelerated life testing).
  • Analyze test data, identify failure trends, and recommend improvements.
  • Develop and implement risk management strategies to mitigate system reliability risks.
  • Support lifecycle management of DoD systems, ensuring reliability throughout design, deployment, and sustainment.
  • Support design and deployment of a Business Continuity / Disaster Recovery (BC/DR) plan.
  • Plan and conduct regular BC/DR test exercises.
  • Contribute to reliability design reviews and system improvements.
  • Provide technical expertise to teams, resolving reliability issues and supporting investigations.
  • Prepare and present technical reports on reliability assessments and improvements.
  • Ensure compliance with DoD reliability standards and guidelines.
  • Document and report reliability activities, aligning with contractual and regulatory requirements.

Requirements

  • US Citizen with a Top Secret clearance
  • Bachelor's degree with 4+ years of experience or Master's degree with 2+ years of experience
  • Experience implementing hybrid cloud infrastructure, networking, and architectures
  • Strong experience with Microsoft Azure
  • Experience implementing instrumentation schemes to support Service Level Agreement (SLA) monitoring
  • Familiarity with DoD reliability standards (MIL-STD, DoD-STD) and system safety practices
  • Strong analytical, problem-solving, and communication skills
  • Ability to work in a team-oriented, collaborative environment with both government and contractor personnel

Nice-to-haves

  • Experience in embedded systems, hardware-software integration, or complex aerospace systems.
  • Experience with Cloud Edge Devices, especially Azure Stack Hub or Azure Stack Edge systems.
  • Experience operating or implementing complex multi-site networks.
  • Experience operating infrastructure in a DevSecOps environment through Infrastructure as Code, Configuration as Code, or other 'as-code' implementations.
  • Experience performing in a systems administration role.
  • Relevant cloud certifications, especially Azure certifications.
  • Certification in Reliability Engineering (CRE or similar).
  • Familiarity with Lean Six Sigma, Agile, Scrum, or other continuous improvement methodologies.
  • Experience in a DoD program environment or with military stakeholders.