Teaching Strategies - Denton, TX

posted 23 days ago

Full-time - Senior
Remote - Denton, TX
Educational Services

About the position

Teaching Strategies is seeking a highly talented and innovative Site Reliability Engineer (SRE) to join our Infrastructure Engineering group. This hands-on technical role is crucial for maintaining the reliability and performance of our customer-facing services and products. The SRE will collaborate closely with development, QA, and Core Platform Engineering teams to implement Infrastructure-as-Code, automation, and CI/CD practices, ensuring scalable and efficient technology solutions for our educational products.

Responsibilities

  • Own the uptime of, and support for, all customer-facing services and products.
  • Drive improvements to observability of service performance metrics, monitors, and alerting.
  • Provision, manage, and automate our SaaS platform across multiple production and test environments.
  • Support and enhance build and release pipelines, using process and tooling to provide self-service automation.
  • Collaborate with development teams to identify and remove potential performance bottlenecks.
  • Establish SLIs and SLOs for services in partnership with engineering teams.
  • Participate in the on-call rotation with the team.
  • Resolve incidents, perform root cause analysis, and maintain a library of runbooks.
  • Implement and automate security controls, governance processes, and compliance validation.
  • Participate in and drive infrastructure architecture decisions.
  • Mentor junior members of the team.

Requirements

  • Minimum of 10 years of build automation and release management experience in a SaaS production environment.
  • Hands-on experience with Linux system administration and engineering.
  • Comfortable in a containerized environment using Kubernetes (EKS), Helm, and ArgoCD.
  • Proficiency with configuration management tools such as Ansible, Chef, or Salt.
  • Production operations experience with mission-critical services that must be always up and always available.
  • Strong knowledge of ephemeral infrastructure, horizontal scaling, self-healing architectures, service discovery, logging, monitoring, and alerting.
  • Expert-level experience with AWS and hybrid cloud systems/designs.
  • Proficiency with IaC tools such as Terraform and AWS CloudFormation.
  • Expert understanding of troubleshooting systems at the protocol layer (TCP/IP, UDP, HTTP, SSL/TLS, DNS).
  • Proficient with scripting languages such as Bash, Python, or Go.
  • Experience developing CI/CD pipelines using Jenkins or Bitbucket Pipelines.
  • Knowledge of best-practice security, performance, and networking techniques for high-traffic customer-facing systems.
  • Experience with monitoring and logging tools such as New Relic or AWS CloudWatch.
  • Experience with relational and NoSQL databases, including Microsoft SQL Server, PostgreSQL, and MongoDB.
  • Excellent troubleshooting and testing skills.
  • A passion for learning new technologies.
  • Experience with Agile methodology and a passion for software development best practices.
  • Strong sense of collaboration, teamwork, and accountability.

Nice-to-haves

  • Experience working for a B2B SaaS company.

Benefits

  • Competitive compensation package
  • Employee Equity Appreciation Program
  • Health and wellness insurance benefits
  • 401(k) with employer match
  • Flexible work environment
  • Unlimited paid time off (including paid holidays and Winter Break)
  • Paid parental leave
  • Tuition assistance, professional development, and opportunities for career growth
  • Best-in-class technology equipment for every employee
  • Penthouse suite in downtown DC seconds away from Washington Nationals Stadium and Audi Field