Komatsu - Tucson, AZ

Part-time - Mid Level
Tucson, AZ
Repair and Maintenance

About the position

As a Site Reliability Engineer (MTS) at Komatsu, you will play a crucial role in applying software engineering practices to solve operational challenges and ensure the reliability of our systems. Your primary focus will be on monitoring our running systems, which includes Continuous Application Delivery, Health Monitoring and Alerting, Operations Support, and Configuration Management. This position is essential in providing value to our customers by ensuring that our products operate efficiently and effectively in the field.

In this role, you will proactively research, select, configure, and deploy observability and deployment tools, frameworks, and processes. Your efforts will be directed towards increasing the company's efficiency in early identification and tracing of incidents that could impact production environments. You will also facilitate sustained improvements based on your findings and recommendations, ensuring that customers derive maximum value from Modular products.

You will be responsible for reviewing implementations by developers, providing constructive feedback, and documenting any issues as technical debt when identified. To avoid duplication of work, you will raise awareness when teams develop parallel solutions, ensuring that effective standards are in place. Establishing communication channels with other teams will be vital, as you will need to maintain familiarity with the department's development roadmap, identify risks, and align work to promote re-use and efficiency.

Documentation will be a key aspect of your role, as you will maintain guidelines and standards for Site Reliability Engineering tasks, sharing knowledge and documenting the work done. You will also participate in an on-call, follow-the-sun support rotation, collaborating with team members across different time zones to minimize individual exposure to after-hours shifts. Your ability to work remotely at least part-time will be beneficial due to the nature of our system deployments.

Collaboration and communication across business units within Modular will be essential to achieve our goals and objectives, while maintaining compliance with all legislative, Modular, and customer site policies, rules, and requirements. Safety is our top priority, and you will reinforce this commitment by demonstrating that “zero accidents” is achievable.

Responsibilities

  • Proactively researches, selects, configures, and deploys observability and deployment tools, frameworks, and processes.
  • Facilitates sustained improvements based on findings/recommendations.
  • Ensures customers are obtaining value from Modular products.
  • Reviews implementation by developers and provides constructive feedback, recording issues as technical debt when found.
  • Avoids duplication of work by raising awareness when teams develop parallel solutions while effective standards already exist.
  • Establishes communication channels with other teams and maintains familiarity with the department's development roadmap, identifying risks and aligning work, promoting re-use and efficiency.
  • Documents and maintains guidelines and standards for Site Reliability Engineering tasks, sharing knowledge and documenting the work done.
  • Works on an on-call, follow-the-sun support rotation, shared between team members in different time zones to minimize individual exposure to after-hours shifts.
  • Collaborates and communicates across business units within Modular to achieve goals and objectives.
  • Maintains compliance with all legislative, Modular, and customer site policies, rules, and requirements.
  • Reinforces awareness and demonstrates commitment that safety is our top priority and 'zero accidents' is achievable.

Requirements

  • Degree in Engineering or a related field, or equivalent prior work experience.
  • Proficient in object-oriented programming languages.
  • Proficient in scripting languages.
  • Extensive experience with Modular products and site installations.
  • Experience with Configuration Management concepts and tools such as Version Control, Branching Strategies, Issue and Project Tracking tools, Release Management, and Continuous Integration.
  • Strong analytical, debugging, problem-solving and root-cause analysis skills.
  • Familiarity with the TCP/IP stack.