Petadata - Seattle, WA

posted about 2 months ago

Full-time - Mid Level

About the position

The ETL Developer position at PETADATA involves designing, developing, and implementing scalable data processing solutions using Apache Spark and the Matillion tool. The role requires extensive experience in ETL processes, ensuring data quality and integrity, and collaborating with cross-functional teams to meet data integration needs. The developer will work on both streaming and batch workflows to support efficient data flow and processing for clients.

Responsibilities

  • Design, develop, and implement scalable data processing solutions using Apache Spark.
  • Ensure data quality, integrity, and consistency throughout the ETL pipeline.
  • Integrate data from different systems and sources to provide a unified view for analytical purposes.
  • Collaborate with data analysts to implement solutions that meet their data integration needs.
  • Design and implement streaming workflows using PySpark Streaming or other relevant technologies.
  • Develop batch processing workflows for large-scale data processing and analysis.
  • Analyze business requirements to determine the volume of data extracted from different sources and ensure data quality.
  • Determine the best storage medium required for the data warehouse.
  • Ensure that data is loaded into the warehouse system and meets business needs and standards.
  • Validate data flows and build a secure data warehouse.

Requirements

  • 10+ years of experience in implementing ETL processes to extract, transform, and load data.
  • Expertise in Python, PySpark, ETL processes, and CI/CD (Jenkins or GitHub).
  • Proficiency in the Matillion tool.
  • Extensive knowledge of and experience with Spark and its related technologies.
  • Hands-on experience with Apache Spark framework, including the Spark SQL module.
  • Good knowledge of data analysis and design, plus programming skills in JavaScript, SQL, XML, and DOM.
  • Experience in managing SQL databases and organizing big data.
  • Solid understanding of data warehousing schemas and dimensional modeling.

Nice-to-haves

  • Familiarity with languages used in web development, such as HTML and CSS, or proficiency in Java, Scala, or R.
  • Experience in writing clean code that's free of bugs and reproducible by other developers.

Benefits

  • Professional work environment with opportunities for growth in the Information Technology field.