Hyperspell - San Francisco, CA

posted 4 days ago

Full-time - Entry Level
San Francisco, CA

About the position

As a Founding Data Engineer at Hyperspell, you will take full ownership of the end-to-end data ingestion pipeline, playing a critical role in designing and implementing the architecture for our Data-as-a-Service platform. This position offers the opportunity to define best practices and establish scalable processes for data ingestion, transformation, and storage, ensuring that data flows securely and efficiently through our system for AI applications.

Responsibilities

  • Design and build an end-to-end data ingestion pipeline from zero to one, defining the architecture and implementing scalable ETL pipelines.
  • Manage and optimize data pipeline infrastructure using tools like Apache NiFi, Airflow, or Flink to ensure resilience and efficiency.
  • Implement robust data transformation processes to prepare data for optimal storage and retrieval.
  • Ensure data security and compliance with SOC2/GDPR standards, including token management and encryption.
  • Collaborate on architecture decisions and align the data pipeline with Hyperspell's core infrastructure and long-term goals.
  • Monitor and scale pipeline performance to handle high data volumes and complex queries.
  • Contribute to a high-performance engineering culture by establishing best practices and mentoring future team members.

Requirements

  • Strong experience building production ETL pipelines in large companies or high-volume environments.
  • Proficiency with Apache Flink, Airflow, and NiFi, with hands-on experience in managing these tools.
  • Knowledge of data transformation best practices, including data extraction, chunking, and storage optimization.
  • Experience in scaling data infrastructure and setting up monitoring for high data volumes.
  • Strong problem-solving skills to navigate complex data integration challenges.
  • Early-stage startup mindset: comfortable with ambiguity and ready to take ownership.

Nice-to-haves

  • Experience with graph databases (e.g., Neo4j) and vector databases (e.g., ChromaDB).
  • Expertise in secure data handling practices, including compliance with privacy standards like SOC2 and GDPR.
  • Enthusiasm for mentoring other engineers and contributing to a strong engineering culture.