Cloud Resources - Palo Alto, CA

posted 2 months ago

Full-time
Palo Alto, CA
Professional, Scientific, and Technical Services

About the position

As a Data Architect, you will play a crucial role in designing and managing the databases that underpin our data-driven initiatives. Your primary responsibilities will include working with database technologies such as Cosmos DB, MongoDB, and PostgreSQL to create efficient, scalable data models, and ensuring data consistency and integrity across all platforms to maintain the quality of our data assets.

Beyond database management, you will develop and maintain distributed task processing systems using Celery, with robust task queue management backed by Redis or RabbitMQ to handle asynchronous tasks effectively. You will also implement and manage Kafka Streams to support real-time data processing and streaming.

Your expertise in data orchestration will be applied through Airflow to manage workflows and automate data pipelines. You will develop and manage webhooks using FastAPI, integrating them with serverless functions to enhance our data processing capabilities. Familiarity with Iceberg for data lake management and optimization will help you streamline our data storage solutions, and experience in the healthcare domain will be a significant advantage in understanding the specific data challenges and requirements of this field.

Responsibilities

  • Design and manage databases using Cosmos DB, MongoDB, and PostgreSQL.
  • Develop and maintain efficient data models, ensuring data consistency and integrity.
  • Develop and maintain distributed task processing using Celery and manage task queues with Redis or RabbitMQ.
  • Implement and manage Kafka Streams for real-time data processing and streaming.
  • Utilize Airflow for data orchestration and workflow management.
  • Develop and manage webhooks using FastAPI and integrate with serverless functions.
  • Optimize data lake management using Iceberg.

Requirements

  • Proven experience with Cosmos DB, MongoDB, and PostgreSQL.
  • Strong knowledge of data modeling and database design principles.
  • Experience with distributed task processing using Celery.
  • Familiarity with task queue management using Redis or RabbitMQ.
  • Proficient in implementing Kafka Streams for real-time data processing.
  • Experience with Airflow for data orchestration and workflow management.
  • Knowledge of developing webhooks using FastAPI.
  • Familiarity with Iceberg for data lake management and optimization.
  • Experience in the healthcare domain.