E-Solutions Group - New York, NY

posted 3 months ago

Full-time
New York, NY
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

The ETL Architect/Developer is a critical role within our data engineering team, responsible for designing, developing, and maintaining robust ETL processes that handle large volumes of data. The ideal candidate has extensive experience with database management systems (DBMS) and ETL tools, with a strong emphasis on advanced SQL. The role requires a deep understanding of database design techniques and the ability to work with extremely large datasets while ensuring data integrity and performance optimization.

In this position, you will use Python to develop data transformation scripts and automate ETL workflows. Familiarity with the Hadoop ecosystem, including HDFS and Spark, is essential, as you will work with big data technologies to enhance our data processing capabilities. You will also orchestrate ETL processes using tools such as Airflow, ensuring that data pipelines are efficient and reliable.

Strong problem-solving and troubleshooting skills are required, as you will identify and resolve data-related issues that arise during the ETL process. You will collaborate closely with data analysts and other stakeholders to understand data requirements and deliver high-quality data solutions that meet business needs. A background in a UNIX or Linux development environment is preferred, as you will deploy and manage ETL processes on these systems.
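
To illustrate the kind of work described above, here is a minimal sketch of a daily Airflow DAG that chains Python extract, transform, and load steps. It is not part of our stack; the DAG name, task names, and placeholder logic are hypothetical, and it assumes Airflow 2.x.

    # Minimal sketch, assuming Airflow 2.x: a daily DAG with hypothetical
    # extract -> transform -> load tasks wired together with PythonOperator.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract():
        # Placeholder: pull source rows (e.g., from an RDBMS or an HDFS landing zone).
        return [{"id": 1, "amount": "42.50"}]


    def transform(ti):
        # Placeholder transformation: cast string amounts to floats.
        rows = ti.xcom_pull(task_ids="extract")
        return [{**row, "amount": float(row["amount"])} for row in rows]


    def load(ti):
        # Placeholder: write transformed rows to the target warehouse table.
        rows = ti.xcom_pull(task_ids="transform")
        print(f"Loading {len(rows)} rows")


    with DAG(
        dag_id="example_etl",           # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t1 = PythonOperator(task_id="extract", python_callable=extract)
        t2 = PythonOperator(task_id="transform", python_callable=transform)
        t3 = PythonOperator(task_id="load", python_callable=load)
        t1 >> t2 >> t3

In practice the transform step would more likely dispatch Spark jobs or run SQL against an RDBMS or MPP system, but the task-dependency pattern (extract >> transform >> load) is the same.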

Responsibilities

  • Design and develop ETL processes to handle large volumes of data.
  • Maintain and optimize existing ETL workflows for performance and reliability.
  • Collaborate with data analysts to understand data requirements and deliver solutions.
  • Utilize Python for data transformation and automation of ETL tasks.
  • Orchestrate ETL processes using Airflow to ensure efficient data pipelines.
  • Troubleshoot and resolve data-related issues in ETL processes.
  • Work with the Hadoop ecosystem, including HDFS and Spark, for big data processing.
  • Implement database design techniques to ensure data integrity and performance.

Requirements

  • 10+ years of experience with DBMS and ETL tools.
  • Advanced SQL capabilities and in-depth knowledge of database design techniques.
  • Programming experience in Python.
  • Scripting experience using Shell.
  • Familiarity with the Hadoop ecosystem (HDFS, Spark).
  • Strong problem-solving and troubleshooting skills.
  • BA, BS, MS, or PhD in Computer Science, Engineering, or a related technology field.
  • Knowledge of RDBMS and MPP systems.
  • Experience with ETL tools like Informatica.
  • Experience with Airflow for orchestration.
  • Experience creating and maintaining BAS services.
  • Experience working in a UNIX or Linux development environment.