Net2Source - Dallas, TX

posted 3 months ago

Full-time - Mid Level
Dallas, TX
1,001-5,000 employees
Administrative and Support Services

About the position

The Site Reliability Engineer (SRE) position at Net2Source Inc. is a critical role within the Site Reliability Engineering team, which is responsible for ensuring the availability, reliability, and performance of services and platforms in a highly transactional 24x7 environment. The SRE will monitor application performance, implement improvements, and automate tasks to enhance system efficiency. This role requires troubleshooting capabilities in both cloud-based and on-premises environments, handling live production incidents, and debugging application and infrastructure issues while adhering to SRE best practices.

In this position, the SRE will coordinate with product owners and business representatives to define Service Level Objectives (SLOs) and error budgets for key functionalities of projects. Participation in design reviews of software components is essential to ensure they are built correctly. The SRE will also review products prior to production deployments to validate compliance with established SLOs.

Conducting system analysis and configuration management to develop improvements for system software performance, availability, and reliability is a key responsibility. Collaboration with software engineers and QA teams is crucial to ensure that the system meets non-functional requirements such as performance, security, and availability.

The SRE will document system knowledge, create runbooks, and ensure that critical system information is accessible to relevant stakeholders. Additionally, the role involves maintaining and monitoring the deployment of servers, Docker containers, databases, and backend infrastructure, as well as participating in production feedback sessions and problem management calls to identify opportunities for product improvement.

Responsibilities

  • Monitor application performance and implement improvements for stability.
  • Apply automation and software tooling to tasks that can benefit from them.
  • Troubleshoot OS, networking, and database issues in cloud and on-premises environments.
  • Handle live production incidents and debug application and infrastructure issues.
  • Coordinate with product owners to define Service Level Objectives and error budgets.
  • Participate in design reviews to ensure software components are built correctly.
  • Review products before production deployments for compliance with SLOs.
  • Conduct system analysis and develop improvements for software performance and reliability.
  • Collaborate with software engineers and QA to meet non-functional requirements.
  • Document system knowledge and create runbooks for critical information.
  • Maintain and monitor deployment of servers, Docker containers, and databases.
  • Participate in production feedback sessions and problem management calls.

Requirements

  • Bachelor's degree in computer science or related field, or equivalent experience.
  • 5+ years of experience in full-stack application support or an SRE role.
  • Experience in JavaScript, TypeScript, and web development technologies.
  • Proficient in scripting languages such as PowerShell and/or Python.
  • Troubleshooting experience with complex application incidents in the AWS stack.
  • Experience conducting design reviews of software components.
  • Extensive experience with observability platforms like Datadog.
  • Knowledge of DevOps methodologies and CI/CD tools such as Jenkins and AWS CodePipeline.
  • Hands-on experience with the AWS public cloud is essential.

Nice-to-haves

  • Experience with automation and configuration tools like Puppet and Ansible.
  • Project implementation experience on a public cloud.
  • Ability to adapt to new application stacks and technology concepts.
  • Excellent communication skills, both verbal and written.
  • Ability to collaborate with remote teams across different time zones.

Benefits

  • Health insurance coverage
  • 401k retirement savings plan
  • Paid holidays
  • Flexible scheduling options
  • Professional development opportunities