Allscripts Healthcare - Atlanta, GA

posted 2 months ago

Full-time - Mid Level
Remote - Atlanta, GA
Publishing Industries

About the position

As a Senior Site Reliability Engineer at Veradigm, you will play a crucial role in managing and enhancing the reliability of our systems. This position requires a blend of technical expertise and leadership, as you will not only handle incidents but also mentor other engineers and foster a culture of continuous improvement. Your work will directly impact the efficiency and performance of healthcare provider solutions, contributing to the overall mission of transforming healthcare through data-driven insights.

Responsibilities

  • Serve as an on-call engineer, managing and resolving incidents affecting system availability and performance.
  • Collaborate with development, operations, and infrastructure teams to design, implement, and maintain robust systems.
  • Proactively monitor and analyze system metrics to identify and mitigate potential issues.
  • Conduct thorough root cause analysis of incidents and implement long-term solutions to prevent recurrence.
  • Automate manual processes to improve efficiency and reduce human error.
  • Participate in capacity planning and performance optimization efforts for system scalability and reliability.
  • Stay updated with industry trends and emerging technologies related to cloud services and Site Reliability Engineering.

Requirements

  • Bachelor's degree in computer science, engineering, or a related field (or equivalent work experience).
  • 4-7 years of experience in development, operations, and infrastructure, with at least 2-3 years as a Site Reliability Engineer or DevOps Engineer.
  • Coding proficiency in a high-level programming language (C# preferred) and knowledge of object-oriented programming in languages such as Java, Objective-C, C#, C/C++, or Python.
  • Proficient in scripting and automation using languages such as Python, Bash, or PowerShell.
  • 3+ years of experience with service-oriented architectures and microservices.
  • Solid understanding of Site Reliability Engineering principles, with experience applying SLAs, SLIs, and SLOs.
  • Extensive experience in incident management and on-call support in a high-availability production environment.
  • Strong knowledge of cloud services, particularly Azure and AWS.

Nice-to-haves

  • Certifications in Azure, AWS, Terraform, or Kubernetes.
  • Familiarity with DevOps practices and tools, such as CI/CD pipelines and infrastructure-as-code.
  • Experience with monitoring and logging tools, such as Splunk, Prometheus, Grafana, and the ELK stack.

Benefits

  • Holidays and vacation
  • Medical, dental, and vision insurance
  • Company-paid life insurance
  • Retirement savings plan