Cisco • Posted about 1 year ago
Full-time • Mid Level
San Jose, CA
Computer and Electronic Product Manufacturing

As an AI Operations Engineer within the Cisco Security AI team, you will be instrumental in ensuring the reliability, scalability, and performance of the infrastructure that supports AI Assistants and platforms. This role focuses on operational support, enhancing system reliability, and minimizing incidents, contributing to the overall security resilience of businesses worldwide.

  • Implement and maintain AI Ops pipelines that automate tasks such as data collection, model training, and deployment.
  • Configure advanced monitoring, alerting, and logging tools to track the health and performance of AI Platform infrastructure.
  • Write automation as part of Infrastructure as Code (IaC) to deploy, maintain, and upgrade critical infrastructure resources.
  • Collaborate with application development and infrastructure teams to design and architect AI Ops solutions.
  • Use data and telemetry to guide feature work and propose improvements.
  • Respond to incidents and outages, troubleshoot issues, and implement solutions to restore service.
  • Participate in the on-call rotation to monitor and quickly resolve critical production alerts.
  • Bachelor's degree plus a minimum of 5 years, or Master's degree plus a minimum of 3 years, of industry experience programming in Python, Java, Bash, or similar languages.
  • 2+ years of experience using container technologies such as Docker or Kubernetes.
  • 2+ years of cloud computing experience with cloud providers such as AWS, GCP, or Azure.
  • Experience with Infrastructure as Code (IaC) tools like Terraform and observability platforms like Splunk, Grafana, or Prometheus.
  • Experience with source control and continuous integration tools like Git and Jenkins.
  • Experience with SaaS-based products.
  • Experience with model training and deployment in production environments.
  • Ability to work cross-functionally with engineering and data science teams.
  • Aptitude for staying up to date with advancing AI and machine learning technologies and industry best practices.
  • Ability and passion to learn new things quickly.
  • Medical, dental, and vision insurance
  • 401(k) plan with Cisco matching contribution
  • Short and long-term disability coverage
  • Basic life insurance
  • Numerous wellbeing offerings
  • Up to twelve paid holidays per calendar year
  • A floating holiday and a day off for your birthday
  • Vacation time off policy with flexible limits
  • Sick time off with carryover options
  • Paid time away for critical issues
  • Additional paid time to volunteer and give back to the community