Striveworks - posted 7 days ago
$150,000 - $190,000/Yr
Full-time - Senior
Austin, TX
Publishing Industries

About the position

As a Senior Site Reliability Engineer (SRE) at Striveworks, you will be challenged, and trusted, on day one to take ownership of specific product deployments by maintaining, optimizing, and enhancing our on-premises and cloud computing environments. You will play a crucial role in deploying our software solutions to clients: you will execute the technical aspects of implementation projects and ensure the seamless integration, customization, and configuration of our software. Your expertise will be critical to the company as we deploy new instances of Striveworks' machine learning operations (MLOps) capabilities to customer infrastructure, and you will maintain Striveworks' software deployments using Infrastructure-as-Code (IaC) methodologies. You are right for this opportunity if you value and possess technical expertise and enjoy pushing the boundaries of your capabilities.

Responsibilities

  • Automating IaC to manage virtual machines and deploy containers, services, and other infrastructure; leaning on expertise to deploy custom Kubernetes clusters in AWS, Azure, GCP, on-premises, or hybrid cloud environments
  • Working with platform developers, DevOps, and customer-facing teams to define requirements and build solutions for customer use cases of the platform
  • Deploying software to commercial and, later, unclassified, CUI, Secret, and Top Secret Department of Defense (DoD) networks
  • Responding to incidents and performing initial triage of critical system faults
  • Monitoring, automating, and improving software reliability, performance, and availability for various projects
  • Providing guidance and leadership to junior SRE team members

Requirements

  • 6+ years of direct, hands-on experience in microservice deployment in Kubernetes
  • Diagnosing and resolving issues within containerized environments
  • Helm Chart and Kustomizations development/deployment
  • Python and Bash programming
  • Automation and IaC (e.g., Terraform, Ansible)
  • Cloud infrastructure (e.g., AWS, Azure, GCP, or OpenStack)
  • Managing and troubleshooting Linux systems (e.g., RHEL, Ubuntu, CentOS)
  • The ability to work cross-functionally to define requirements and build solutions for customer use cases of the platform
  • The ability to respond professionally and competently to incident reports and triage critical system faults
  • US person (Permanent Resident or US Citizen), or otherwise able to obtain a Top Secret security clearance
  • Willingness and ability to obtain and maintain a Top Secret security clearance

Nice-to-haves

  • Active Top Secret security clearance and intimate familiarity with DoD networking, tools, infrastructure, security requirements, and policies
  • Experience with software deployments to on-premises and cloud-based unclassified, CUI, Secret, or Top Secret networks within the DoD
  • Deep knowledge of DevOps principles and practices for deploying and managing service mesh in cloud environments
  • Experience with DevSecOps/DevOps and CI/CD for the administration and deployment of GPU-enabled servers
  • Experience designing, managing, and optimizing workloads across multiple cloud providers
  • Experience deploying, maintaining, or contributing to Cloud Native Computing Foundation (CNCF) projects
  • Proficiency with US federal information system security policies, including Security Technical Implementation Guides (STIGs), NIST 800-171, NIST 800-53, CMMC, and ICD 503
  • Experience with network-attached storage (NAS) and storage area network (SAN) technologies
  • Experience with Kubernetes and cloud-native applications and services in denied, disrupted, intermittent, and limited (DDIL) environments
  • Experience with both blue-green and canary deployment strategies
  • DoD 8570 IAT II certification (Security+ CE); proficiency with security automation and familiarity with API security, container security, and cloud security

Benefits

  • Top-of-market salary and total compensation
  • Generous equity plan
  • Health/vision/dental insurance
  • Unlimited PTO
  • Parental leave