Net2Source - Irving, TX

posted about 2 months ago

Full-time - Mid Level
Irving, TX
Administrative and Support Services

About the position

The GCP Data Engineer position is a contract role based in Irving, TX, offering a hybrid work environment. The ideal candidate will have over 12 years of total IT experience, with a strong focus on data engineering and analytics. The role requires extensive experience in SQL, Python, and PySpark, with a minimum of 5 years of hands-on experience with Google Cloud Platform (GCP) technologies, particularly BigQuery and Cloud DataProc. The successful candidate will work closely with implementation teams, providing deep technical expertise to deploy large-scale data solutions in both on-premises and cloud environments.

In this role, you will collaborate with the data team to leverage the Google Cloud platform for data analysis, model building, and report generation. You will be responsible for integrating large datasets from various sources, formulating business problems as technical data challenges, and ensuring that key business drivers are captured in collaboration with product management. The position involves designing data processing pipelines and architectures, maintaining machine learning and statistical models, and performing data extraction, loading, transformation, cleaning, and validation.

The role demands a proactive approach to designing and operationalizing enterprise data solutions using GCP data and analytics services, along with third-party tools such as Python, PySpark, and Apache Spark. Candidates should have a proven track record of building production data pipelines within a hybrid big data architecture, demonstrating their ability to deliver end-to-end solutions at production scale.

Responsibilities

  • Work with implementation teams from concept to operations, providing deep technical subject matter expertise for deploying large scale data solutions.
  • Collaborate with the data team to efficiently utilize Google Cloud platform for data analysis, model building, and report generation.
  • Integrate massive datasets from multiple data sources for data modeling.
  • Formulate business problems as technical data problems while ensuring key business drivers are captured in collaboration with product management.
  • Design pipelines and architectures for data processing.
  • Create and maintain machine learning and statistical models.
  • Extract, Load, Transform, clean, and validate data.
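To illustrate, the extract-load-transform, clean, and validate responsibility above can be sketched in plain Python. This is a minimal, hypothetical example only; the field names (customer_id, amount) and rules are illustrative assumptions, not from the posting, and a production pipeline would use PySpark or BigQuery at scale.

```python
# Minimal sketch of a transform/clean/validate step over raw records.
# Field names and validation rules are illustrative assumptions.

def transform(record):
    """Normalize a raw record: trim the id string, coerce amount to float."""
    return {
        "customer_id": record["customer_id"].strip(),
        "amount": float(record["amount"]),
    }

def validate(record):
    """A record is valid if it has a non-empty id and a non-negative amount."""
    return bool(record["customer_id"]) and record["amount"] >= 0

def run_pipeline(raw_records):
    """Transform every record, then keep only the valid ones."""
    cleaned = [transform(r) for r in raw_records]
    return [r for r in cleaned if validate(r)]

raw = [
    {"customer_id": " C001 ", "amount": "19.99"},
    {"customer_id": "", "amount": "5.00"},       # dropped: empty id
    {"customer_id": "C002", "amount": "-3.00"},  # dropped: negative amount
]
print(run_pipeline(raw))  # [{'customer_id': 'C001', 'amount': 19.99}]
```

In a PySpark pipeline, the same shape appears as a chain of DataFrame transformations (e.g. a `select` with casts followed by a `filter`), with invalid rows often routed to a quarantine table rather than silently dropped.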

Requirements

  • 12+ years of total IT experience required.
  • 7-10 years of experience with SQL.
  • 7-10 years of experience with Python/PySpark.
  • 5+ years of hands-on experience with GCP, particularly BigQuery and Cloud DataProc.
  • Active Google Cloud Data Engineer Certification is required.
  • Minimum of 7 years of experience designing and building production data pipelines from data ingestion to consumption within a hybrid big data architecture.

Nice-to-haves

  • Experience with Teradata/Oracle, Informatica/Ab Initio/DataStage, Hadoop, Hive, Apache Spark, Cloud Pub/Sub, Cloud Spanner, Cloud SQL, Data Fusion.