Hitachi Solutions • posted 21 days ago
$142,500 - $198,750/Yr
Full-time • Mid Level
Greenville, SC

About the position

This is a full-time role in our product organization for an expert in systems design with considerable skill and expertise in large-scale software development in an Azure development environment. The role designs and implements Continuous Integration/Continuous Deployment (CI/CD) tooling using GitHub Actions, Azure DevOps, and related technologies. This includes defining and implementing build and test pipelines for containerized architectures, infrastructure as code (IaC) for the stateful deployment of environments, Role-Based Access Control (RBAC), linting and other code quality controls, GitOps and Kubernetes pipelines, and management of SaaS deployment APIs. Individuals in this role will assist in the design, engineering, development, planning, and administration of Azure Kubernetes Service (AKS) clusters for a set of critical business applications. This role will work closely with application, engineering, security, and operations teams to engineer and build Kubernetes and Azure PaaS and IaaS solutions within an agile, modern, enterprise-grade operating model. Qualified applicants will have a demonstrated capability to learn new concepts quickly and/or robust domain expertise.

Responsibilities

  • Responsible for availability, latency, performance, efficiency, monitoring/observability, emergency response, and capacity planning, including setting and maintaining SLOs, SLIs, and error budgets and creating dashboards.
  • Analyze, troubleshoot, and resolve operational challenges that affect defined SLOs.
  • Manage site stability, performance, reliability, and maintain uptime for production environments.
  • Develop a fully automated multi-environment observability stack based on the existing system and extend it to predict capacity needs based on the usage patterns.
  • Strive for automation to reduce toil and increase development velocity.
  • Perform application-specific production support, incident management, change management, problem management, RCAs, and service restoration as needed.
  • Identify changes to the product architecture from a reliability, performance, and availability perspective, using a data-driven approach.
  • Analyze and address complex technical challenges and issues that arise during the software development & run lifecycle.
  • Debug, troubleshoot, and resolve technical problems efficiently.
  • Create and maintain technical documentation, including design specifications, user guides, run books and best practice guidelines.
  • Actively look for opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation.
  • Collaborate with software development teams on the release management process, help shape the future roadmap, and establish strong operational readiness across teams.
  • Participate in Agile ceremonies, such as sprint planning, stand-up meetings, and retrospectives.
  • Collaborate with product managers, designers, and other engineers to ensure alignment and efficient project execution.
  • Share your expertise and mentor engineers, helping them grow and develop their skills.
  • Foster a culture of continuous learning and improvement within the team.
  • Stay current with the latest technologies, tools, and developments in cloud computing.
  • Proactively learn and adapt to new technologies to drive innovation.
  • Collaborate with customers to understand their needs, gather feedback, and provide technical support and guidance as needed.
  • Triage incoming Web Support escalation requests and route them to the applicable internal teams.
  • Contribute to incident root cause analysis and service restoration, and serve as an incident commander during outage events.

Requirements

  • Strong background as an SRE supporting a 24x7 highly available production environment for a SaaS or cloud service provider.
  • Solid experience with monitoring/APM/observability tools (Datadog, Application Insights, Prometheus, Grafana, etc.).
  • Strong background with Azure Resources like Key Vault, Data Factory, Azure Databricks and Storage Accounts.
  • Experience implementing observability plans around logs, metrics, and traces.
  • Experience in an agile development team developing software.
  • Experience implementing and practicing CI/CD best practices.
  • Experience with cloud infrastructure environments, preferably Azure, and infrastructure as code (Terraform, Bicep, ARM).
  • Experience designing, developing, and maintaining infrastructure using popular IaC tools and technologies such as Terraform, Helm, and others.
  • Strong experience with containerization technology and/or Kubernetes.
  • Experience with release automation, system administration, and configuration management.
  • Experience with programming languages (Python, Go, etc.).
  • Strong understanding of Linux, Windows, software development, systems, networking, and cloud concepts.
  • Strong interpersonal and teaming skills, including the ability to set and enforce process and to influence engineers who are not direct reports.
  • Strong analytical and programming skills (Python, Go, etc.).

Nice-to-haves

  • Experience with MLflow and other MLOps pipeline technology.

Benefits

  • Bonus Plan
  • Medical, Dental and Vision Coverage
  • Life Insurance and Disability Programs
  • Retirement Savings with Company Match
  • Paid Time Off
  • Flexible Work Arrangements including Remote Work

Job Keywords

Hard Skills
  • Azure DevOps
  • Go
  • Kubernetes
  • Python
  • Terraform