Snowflake Computing - Bellevue, WA

posted 7 days ago

Full-time - Senior
Bellevue, WA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

As a Senior Software Engineer on the Polaris and Data Lake Catalog team at Snowflake, you will be instrumental in building and evolving an open and interoperable data lake ecosystem. This role focuses on solving complex problems in distributed systems and contributing to Snowflake's mission of providing a truly open data lake architecture, free from vendor lock-in. You will work on Polaris, an open-source implementation of the Apache Iceberg REST catalog specification, and help customers unlock the full potential of their data.

Responsibilities

  • Design and implement scalable, distributed systems to support Iceberg DML/DDL transactions, schema evolution, partitioning, and time travel.
  • Architect and build systems that integrate Snowflake queries with external Iceberg catalogs and various data lake architectures.
  • Develop high-performance, low-latency solutions for catalog federation, allowing customers to manage and query their data lake assets across multiple catalogs.
  • Collaborate with Snowflake's open-source team and the Apache Iceberg community to contribute new features and enhance the Iceberg REST specification.
  • Work on core data access control and governance features for Polaris, including fine-grained permissions and multi-cloud federated access control.
  • Contribute to the managed Polaris service, ensuring compatibility with external query engines like Spark and Trino (see the configuration sketch after this list).
  • Build tooling and services that automate data lake table maintenance for enhanced query performance.
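
For context on the interoperability described above, here is a minimal sketch of how an external engine such as Spark connects to an Iceberg REST catalog like Polaris. The catalog name (polaris), endpoint URI, warehouse, and credential below are illustrative placeholders, not details from this posting; the configuration keys are standard Apache Iceberg Spark catalog options.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: point Spark's Iceberg integration at a REST catalog.
    // Catalog name, URI, warehouse, and credential are placeholders.
    val spark = SparkSession.builder()
      .appName("iceberg-rest-catalog-example")
      .config("spark.sql.extensions",
              "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
      .config("spark.sql.catalog.polaris", "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.polaris.type", "rest")                        // Iceberg REST catalog client
      .config("spark.sql.catalog.polaris.uri", "https://<catalog-endpoint>")   // placeholder endpoint
      .config("spark.sql.catalog.polaris.warehouse", "<catalog-name>")         // placeholder warehouse
      .config("spark.sql.catalog.polaris.credential", "<client-id>:<secret>")  // placeholder OAuth credential
      .getOrCreate()

    // Tables registered in the REST catalog are then addressable from Spark SQL:
    spark.sql("SELECT * FROM polaris.db.events LIMIT 10").show()

Trino and Flink expose equivalent catalog properties for the same REST endpoint, which is what multi-engine interoperability amounts to in practice.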

Requirements

  • 8+ years of experience designing and building scalable, distributed systems.
  • Strong programming skills in Java, Scala, or C++ with an emphasis on performance and reliability.
  • Deep understanding of distributed transaction processing, concurrency control, and high-performance query engines.
  • Experience with open-source data lake formats and challenges associated with multi-engine interoperability.
  • Experience building cloud-native services and working with public cloud providers like AWS, Azure, or GCP.
  • A passion for open-source software and community engagement, particularly in the data ecosystem.
  • Familiarity with data governance, security, and access control models in distributed data systems.

Nice-to-haves

  • Experience contributing to open-source projects, especially in the data infrastructure space.
  • Experience designing or implementing REST APIs in the context of distributed systems.
  • Experience managing large-scale data lakes or data catalogs in production environments.
  • Experience working on high-performance, scalable query engines such as Spark, Flink, or Trino.

Benefits

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Life insurance
  • Disability insurance
  • 401(k) retirement plan
  • Flexible spending account
  • Health savings account
  • At least 12 paid holidays
  • Paid time off
  • Parental leave
  • Employee assistance program
  • Other company benefits