Saama Technologies - San Francisco, CA

posted 4 days ago

Full-time - Mid Level
San Francisco, CA
Professional, Scientific, and Technical Services

About the position

The AWS PySpark MDM Developer role at Saama Technologies involves leading the design, implementation, and optimization of scalable data pipelines and architectures on AWS. The position focuses on data transformation and processing, on ensuring data consistency and compliance, and on applying data governance best practices within a collaborative team environment. This role is ideal for individuals who are passionate about data analytics and want to make a significant impact in the field.

Responsibilities

  • Lead the design, implementation, and optimization of scalable data pipelines and architectures utilizing AWS Glue, Elastic MapReduce (EMR), Lambda, Redshift, Athena, DynamoDB, OpenSearch, and S3.
  • Use Spark on AWS for data transformation and processing across large datasets.
  • Develop and maintain efficient data workflows with SQS for task queueing and orchestration.
  • Integrate, transform, and manage data across systems using MuleSoft.
  • Ensure high-performance data storage, retrieval, and analytics across Redshift, DynamoDB, and Athena.
  • Oversee data consistency, integrity, and compliance through IQVIA MDM solutions.
  • Apply best practices in data governance, security, and scalability within a collaborative and cross-functional team environment.

Requirements

  • Proven expertise in AWS data engineering, specifically with Glue, EMR, Lambda, Redshift, Athena, DynamoDB, OpenSearch, and S3.
  • Some experience with data integration tools (MuleSoft, Talend).
  • Working knowledge of master data management.
  • Demonstrated ability to lead technical projects and mentor data engineering teams.
  • Exceptional analytical and communication skills.