This job is closed

We regret to inform you that the job you were interested in has been closed. Although this specific position is no longer available, we encourage you to continue exploring other opportunities on our job board.

Adobe • Posted 2 months ago
$153,600 - $286,600/Yr
Full-time • Mid Level
Seattle, WA

About the position

Adobe is on a mission to change the world through digital experiences, empowering everyone from emerging artists to global brands to design and deliver exceptional digital experiences. We are looking for an outstanding Site Reliability Engineer for Adobe’s AI Inference Platform, Adobe Firefly. This role involves working closely with Engineering teams to build, scale, and secure the AI Platform, enabling product teams to manage and deploy Machine Learning capabilities used by Adobe client applications. The platform will support thousands of models in various lifecycle stages and will provide ML model serving at scale across multiple clouds.

Responsibilities

  • Identify and implement methodologies and solutions to increase reliability, scalability, security, and efficiency.
  • Ensure the highest uptime and Quality of Service (QoS) for Adobe’s customers through operational excellence.
  • Define service level objectives (SLOs) and indicators (SLIs) to represent and measure service quality.
  • Support and maintain globally distributed, multi-cloud environments.
  • Automate common, repeatable tasks at a large scale to streamline operational procedures.
  • Identify areas to improve service resiliency through techniques such as chaos engineering and performance/load testing.
  • Coordinate with other Adobe platform teams and service providers to innovate on Generative AI as a Service.

Requirements

  • A Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field, or equivalent industry experience.
  • Experience in building and scaling distributed systems.
  • Experience with containerization and orchestration technologies like Kubernetes.
  • Production-level expertise with container orchestration engines (e.g., Kubernetes) and an understanding of modern continuous development techniques and pipelines (IaC, CI/CD, ArgoCD, Git).
  • Fundamental programming skills, ideally practical experience in one or more of the following languages: Python, Go.
  • Good knowledge of infrastructure configuration management tools like Ansible and Terraform.
  • Experience in using observability and tracing-related tools like InfluxDB, Prometheus, and Elastic Stack.
  • An understanding of AI/ML, including ML frameworks and commercial AI/ML solutions.

Nice-to-haves

  • Familiarity with PyTorch, SageMaker, Hugging Face, NVIDIA TensorRT, or OpenAI Triton.

Benefits

  • Competitive salary range of $153,600 - $286,600 annually based on location and experience.
  • Short-term incentives in the form of the Annual Incentive Plan (AIP).
  • Long-term incentives in the form of a new hire equity award for certain roles.
  • Commitment to equal opportunity and affirmative action.

Job Keywords

Hard Skills
  • Ansible
  • Elastic Stack
  • InfluxDB
  • Kubernetes
  • Prometheus
© 2024 Teal Labs, Inc