Vision Square - San Jose, CA

posted 4 days ago

Full-time - Mid Level
San Jose, CA
Administrative and Support Services

About the position

The GenAI Data Engineer will work on a Generative AI project that uses Large Language Models (LLMs) to enhance the client's bug tracking systems, including JIRA. The role involves developing and optimizing data pipelines, implementing AI solutions, and leveraging multimodal data inputs to improve task management and reporting capabilities.

Responsibilities

  • Design, develop, and evaluate innovative LLMs to solve diverse challenges using MS Azure services.
  • Understand business problems and help define and implement a scalable Generative AI platform.
  • Build large-scale production systems utilizing advanced machine learning and big data technologies.
  • Develop and optimize data processing workflows for chunking, indexing, ingestion, and vectorization of both text and non-text data.
  • Benchmark and implement various vector stores, embedding techniques, and retrieval methods.
  • Create a flexible pipeline supporting multiple embedding algorithms and search types.
  • Implement and maintain auto-tagging systems and data preparation processes for LLMs.
  • Develop tools and pipelines for text and image data cleaning and refinement.
  • Integrate and optimize workflows using Azure AI Search and various vector store technologies.
  • Use advanced prompting techniques to develop and optimize prompts for LLMs.
  • Contribute to automated model testing and evaluation infrastructure.
  • Build automation to generate candidate prompts and select the best one for a given task.

Requirements

  • Demonstrated experience with generative AI models, particularly LLMs for text analysis and response generation.
  • Strong capabilities in building data pipelines for data ingestion, preprocessing, and transformation.
  • Proficiency with Python and relevant AI/ML libraries.
  • Familiarity with MS Azure AI services, including Azure AI Studio and Azure AI Search.
  • Hands-on experience with frameworks like LangChain, AutoGen, and Semantic Kernel.
  • Knowledge of embedding techniques and vector store technologies for efficient similarity searches.
  • Experience developing tools for automated data annotation and tagging.

Nice-to-haves

  • Experience with multimodal AI systems integrating text and images.
  • Familiarity with RAG systems and knowledge base construction.
  • Strong problem-solving skills in a fast-paced environment.

Benefits

  • Flexible working hours
  • Health insurance
  • Professional development opportunities
  • Remote work options