Echo Global Logistics - Chicago, IL

posted 19 days ago

Full-time - Mid Level
Chicago, IL
501-1,000 employees
Support Activities for Transportation

About the position

The Site Reliability Engineer (SRE) will ensure the reliability and scalability of our services and infrastructure. This role involves managing automation and monitoring tools, collaborating with infrastructure and development teams to resolve issues efficiently, and maintaining clear documentation for our systems and processes.

Responsibilities

  • Monitor and maintain on-site and cloud infrastructure.
  • Manage object configuration in Kubernetes/EKS.
  • Develop and maintain automation scripts using Ansible, Python, and/or shell.
  • Write efficient and reusable code and documentation.
  • Collaborate with cross-functional teams to support continuous integration and delivery pipelines, observability, monitoring, and alerting.
  • Participate in an on-call rotation to resolve incidents and maintain service availability.

Requirements

  • 3-5 years of experience in complex hybrid cloud/on-site environments supporting SaaS platforms and infrastructure.
  • Familiarity with monitoring and logging tools (e.g., Prometheus, ELK stack).
  • Background in system administration, datacenter infrastructure support, or DevOps.
  • Experience with infrastructure as code (IaC) tools (e.g., Terraform), policy as code, and configuration as code.
  • Strong knowledge and hands-on experience with AWS services.
  • Experience with automation tools, particularly Ansible.
  • Fluency in at least one scripting/programming language (Python, shell, etc.).
  • Understanding of CI/CD principles and tools.
  • Familiarity with full-stack monitoring (APM, RUM, synthetic testing, load/performance testing, etc.).
  • Familiarity with the entire tech stack and the ability to reason at a systems level when diagnosing technical problems.