HiveWatch • posted 25 days ago
$120,000 - $140,000/Yr
Full-time • Mid Level
El Segundo, CA

About the position

As a Site Reliability Engineer (SRE), you will bridge the gap between development and operations by deploying updates to HiveWatch products, serving in an on-call rotation, and ensuring our services are reliable, scalable, and performant. You'll play a critical role in designing, implementing, and maintaining our infrastructure while collaborating closely with operations engineers and software engineers to solve complex technical challenges.

Responsibilities

  • Plan and execute software deployments to production environments with minimal customer impact
  • Participate in a regular on-call rotation to provide 24/7 coverage for critical systems
  • Respond to alerts and resolve incidents within defined SLA timeframes
  • Develop and maintain Terraform and GitHub Actions automation for deployment processes
  • Manage feature flags and configuration changes during deployments
  • Communicate deployment status to stakeholders when necessary
  • Maintain deployment documentation and runbooks
  • Design, build, and maintain scalable and reliable infrastructure
  • Implement and manage CI/CD pipelines for efficient and reliable deployments
  • Monitor system performance and respond to incidents in a timely manner
  • Collaborate with development teams to optimize application performance
  • Implement security best practices and ensure compliance with SOC2 and other security requirements
  • Document processes, configurations, and troubleshooting procedures
  • Continuously improve system reliability through post-incident reviews and proactive improvements
  • Develop and improve monitoring and alerting based on operational experience
  • Balance on-call responsibilities with regular work duties

Requirements

  • Bachelor's degree in Computer Science or related field, or equivalent practical experience
  • 3+ years of experience in systems administration, DevOps, or similar roles
  • Strong knowledge of the AWS cloud platform
  • Expertise managing AWS resources with Terraform
  • Proficiency in scripting languages (Python, Bash, etc.)
  • Familiarity with containerization and orchestration (Docker, Kubernetes)
  • Experience with monitoring and logging systems (Prometheus, Grafana)
  • Knowledge of networking concepts and security principles
  • Excellent problem-solving skills and attention to detail
  • Strong communication skills and ability to work in a collaborative environment

Benefits

  • Health Benefits: Medical, Vision, Dental and Life Insurance
  • Cutting-edge solutions in an emerging field with lots of growth potential
  • Generous compensation packages
  • 401(k)
  • Family-friendly and compassionate work culture
  • Hybrid work environment
  • Work with good people who CARE about making the world a better place

Job Keywords

Hard Skills
  • Bash
  • Docker
  • GitHub
  • Kubernetes
  • Terraform