Komatsu - Tucson, AZ
posted 3 months ago
As a Site Reliability Engineer (MTS) at Komatsu, you will play a crucial role in applying software engineering practices to solve operational challenges and ensure the reliability of our systems. Your primary focus will be on monitoring our running systems, which includes Continuous Application Delivery, Health Monitoring and Alerting, Operations Support, and Configuration Management. This position is essential in providing value to our customers by ensuring that our products operate efficiently and effectively in the field.
In this role, you will proactively research, select, configure, and deploy observability and deployment tools, frameworks, and processes. Your efforts will be directed towards increasing the company's efficiency in early identification and tracing of incidents that could impact production environments. You will also facilitate sustained improvements based on your findings and recommendations, ensuring that customers derive maximum value from Modular products.
You will be responsible for reviewing implementations by developers, providing constructive feedback, and documenting any issues as technical debt when identified. To avoid duplication of work, you will raise awareness when teams develop parallel solutions, ensuring that effective standards are in place. Establishing communication channels with other teams will be vital, as you will need to maintain familiarity with the department's development roadmap, identify risks, and align work to promote re-use and efficiency.
Documentation will be a key aspect of your role, as you will maintain guidelines and standards for Site Reliability Engineering tasks, sharing knowledge and documenting the work done. You will also participate in an on-call, follow-the-sun support rotation, collaborating with team members across different time zones to minimize individual exposure to after-hours shifts.
Your ability to work remotely at least part-time will be beneficial due to the nature of our system deployments. Collaboration and communication across business units within Modular will be essential to achieve our goals and objectives, while maintaining compliance with all legislative, Modular, and customer site policies, rules, and requirements. Safety is our top priority, and you will reinforce this commitment by demonstrating that “zero accidents” is achievable.