AI Operations Engineer, Security AI

Cisco, San Jose, CA

About The Position

As an AI Operations Engineer within the Cisco Security AI team, you will be instrumental in ensuring the reliability, scalability, and performance of the infrastructure that supports AI Assistants and platforms. This role focuses on operational support, enhancing system reliability, and minimizing incidents, contributing to the overall security resilience of businesses worldwide.

Requirements

  • Bachelor's degree plus at least 5 years, or Master's degree plus at least 3 years, of industry experience programming in Python, Java, Bash, or similar languages.
  • 2+ years of experience using container technologies such as Docker or Kubernetes.
  • 2+ years of cloud computing experience with cloud providers such as AWS, GCP, or Azure.
  • Experience with Infrastructure as Code (IaC) tools like Terraform and observability platforms like Splunk, Grafana, or Prometheus.
  • Experience with source control and continuous integration tools like Git and Jenkins.

Nice To Haves

  • Experience with SaaS-based products.
  • Experience with model training and deployment in production environments.
  • Ability to work cross-functionally with engineering and data science teams.
  • Aptitude for staying up to date with advances in AI and machine learning technologies and with industry best practices.
  • Ability and passion to learn new things quickly.

Responsibilities

  • Implement and maintain all AI Ops pipelines that automate tasks such as data collection, model training, and deployment.
  • Configure advanced monitoring, alerting, and logging tools to track the health and performance of AI Platform infrastructure.
  • Write automation as part of Infrastructure as Code (IaC) to deploy, maintain, and upgrade critical infrastructure resources.
  • Collaborate with application development and infrastructure teams to design and architect AI Ops solutions.
  • Use data and telemetry to improve feature work and propose feature improvements.
  • Respond to incidents and outages, troubleshoot issues, and implement solutions to restore service.
  • Participate in the on-call rotation to monitor and quickly resolve critical production alerts.

Benefits

  • Medical, dental, and vision insurance
  • 401(k) plan with Cisco matching contribution
  • Short and long-term disability coverage
  • Basic life insurance
  • Numerous wellbeing offerings
  • Up to twelve paid holidays per calendar year
  • A floating holiday and a day off for your birthday
  • Vacation time off policy with flexible limits
  • Sick time off with carryover options
  • Paid time away for critical issues
  • Additional paid time to volunteer and give back to the community

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Industry: Computer and Electronic Product Manufacturing
  • Education Level: Bachelor's degree