Cat Services - Irving, TX

posted 6 days ago

Full-time - Mid Level
Irving, TX
Professional, Scientific, and Technical Services

About the position

The GCP Data Engineer role focuses on designing and implementing data processing frameworks and pipelines using Google Cloud Platform (GCP) technologies. The position requires strong experience with streaming data, particularly Dataflow and Java, to build efficient pipelines that support both batch and real-time processing. The role is a 12-month onsite contract in Irving, TX, with relocation assistance available from day one.

Responsibilities

  • Design and implement Dataflow pipelines using Java.
  • Optimize Dataflow pipelines for performance and cost-efficiency.
  • Integrate Google Cloud Dataflow with BigQuery.
  • Develop robust error handling in Dataflow pipelines.
  • Transform data using Apache Beam and Dataflow.
  • Ensure data quality in ETL pipelines on GCP.
  • Build data pipelines supporting both batch and real-time streams.
  • Utilize GCP services such as BigQuery, Cloud Composer (managed Airflow), Dataflow, Pub/Sub, and Dataproc.

Requirements

  • Strong experience with GCP and its services.
  • Proficiency in Java for Dataflow pipeline creation.
  • Experience with Hadoop and GCP technologies.
  • Familiarity with data management and ETL development.
  • Strong SQL background for analyzing large datasets.
  • Experience in building high-performing data processing frameworks.
  • Knowledge of Apache Spark and Kafka.

Nice-to-haves

  • Experience with Hive queries and Oozie scheduling.
  • Familiarity with Scala and Python for data processing.
  • Understanding of data sources, targets, and business rules.

Benefits

  • Relocation assistance from day one.
  • Opportunity to work on cutting-edge GCP technologies.
© 2024 Teal Labs, Inc