MongoDB - New York, NY

posted 5 months ago

Full-time - Mid Level
Hybrid - New York, NY
Professional, Scientific, and Technical Services

About the position

At MongoDB, we are at the forefront of innovation in the data management software market, which is projected to grow significantly in the coming years. Our Data Pipelines Engineering team plays a crucial role in this transformation by building ETL pipelines that populate our Internal Data Platform. This platform drives the analytics that make our operations more efficient. We focus on creating highly performant and scalable processes that can handle massive datasets and make them readily available for efficient querying. Additionally, we are developing a Generative AI framework that empowers teams across the company to leverage the data stored in their Retrieval-Augmented Generation (RAG)-based applications.

As a Senior Data & AI Engineer, you will be responsible for innovating strategies for building AI tools, including optimizing the chunking and retrieval of RAG-based data. You will evaluate and incorporate new concepts and tools into MongoDB's internal AI architecture while staying updated on industry trends in the AI space. Understanding the potential dangers associated with chatbots is critical, and you will build guardrails to mitigate these risks. You will also create solutions that use various frameworks to assess the content returned by AI tools, so that the evaluation results help prevent hallucinations.

Collaboration is key in this position: you will work closely with Security and Compliance teams to ensure that datasets carry appropriate permissions and comply with applicable regulations. You will also partner with our Data Platform, Architecture, and Governance teams to enhance the scalability, consumability, and discoverability of our data. This role is designed for someone who is passionate about data and AI and eager to contribute to cutting-edge solutions that drive MongoDB's success.

Responsibilities

  • Innovate strategies for building AI tools, including optimal chunking and retrieval of RAG-based data.
  • Stay updated on industry trends in the AI space and evaluate new concepts/tools for MongoDB's internal AI architecture.
  • Understand the dangers associated with chatbots and build guardrails to mitigate those risks.
  • Build solutions to evaluate content returned by AI tools using various frameworks to prevent hallucinations.
  • Collaborate with Security and Compliance teams to ensure datasets carry appropriate permissions and comply with applicable regulations.
  • Work with Data Platform, Architecture, and Governance teams to enhance data scalability, consumability, and discoverability.

Requirements

  • 1+ years building AI and RAG-based applications
  • 3+ years building ML models
  • 5+ years of Python experience
  • Experience with Hive, Iceberg, Glue, or other technologies that expose big data as tables
  • Familiarity with big data file types such as Parquet, Avro, and JSON

Nice-to-haves

  • 1+ years of Spark experience
  • 5+ years of building ETL pipelines for a Data Lake/Warehouse

Benefits

  • Equity participation
  • Employee stock purchase program
  • Flexible paid time off
  • 20 weeks fully-paid gender-neutral parental leave
  • Fertility and adoption assistance
  • 401(k) plan
  • Mental health counseling
  • Transgender-inclusive health insurance coverage
  • Health benefits offerings