InfoVision - Irving, TX

posted 29 days ago

Full-time - Senior
Professional, Scientific, and Technical Services

About the position

The Lead Big Data Engineer (GCP Cloud) is responsible for designing, developing, and maintaining scalable data processing pipelines in a cloud environment. This role focuses on both batch and real-time data processing, leveraging Google Cloud Platform tools and the Hadoop ecosystem to create efficient data solutions. The position requires strong leadership skills to mentor a team of engineers and drive innovative data engineering practices.

Responsibilities

  • Lead the design, development, and maintenance of scalable batch and real-time data pipelines in large-scale distributed systems.
  • Utilize Google Cloud Platform (GCP) tools such as BigQuery, Dataflow, Pub/Sub, and GCS.
  • Implement solutions using the Hadoop Big Data Ecosystem including HDFS, Hive, Pig, HBase, and YARN.
  • Develop data engineering scripts using Python and Scala.
  • Manipulate, transform, and analyze data using SQL.
  • Troubleshoot complex data systems and performance bottlenecks.
  • Mentor and lead a team of engineers in developing data solutions.

Requirements

  • 10+ years of experience in designing and building data pipelines in large-scale distributed systems.
  • Proficiency with Google Cloud Platform (GCP) tools, including BigQuery, Dataflow, Pub/Sub, and GCS.
  • Strong experience with the Hadoop Big Data Ecosystem.
  • Proficiency in Python and Scala for data engineering and scripting.
  • In-depth knowledge of SQL for data manipulation, transformation, and analysis.
  • Experience building both batch and real-time streaming data pipelines.
  • Solid understanding of fundamental Hadoop concepts (HDFS, Hive, Pig, HBase, YARN).
  • Strong familiarity with CI/CD pipelines.
  • Excellent problem-solving skills.