NTT DATA - Dallas, TX

posted 3 months ago

Full-time - Mid Level
Remote - Dallas, TX
10,001+ employees
Professional, Scientific, and Technical Services

About the position

As a Python PySpark ETL Developer at NTT DATA, you will play a crucial role in designing, developing, and deploying scalable data processing applications. The position is fully remote, allowing you to work from anywhere while contributing to our projects. You will collaborate closely with data scientists and analysts to understand their requirements and translate them into effective technical solutions, with a primary focus on writing efficient, optimized code to process and analyze large volumes of data, ensuring that our data processing applications are robust and reliable.

In this role, you will implement data ingestion from a variety of sources into our data processing platform, and you will create and maintain the data pipelines and workflows that support processing and analytics. Performing data quality checks will be essential to ensuring data integrity throughout the system, and you will troubleshoot and debug production issues to identify and resolve technical problems as they arise.

Staying current with the latest data processing technologies and tools is vital, as you will drive innovation and performance improvements within the team. You will also collaborate with cross-functional teams to ensure seamless integration of data processing applications with other systems, making your role integral to the success of our data initiatives.

Responsibilities

  • Design, develop, and deploy scalable data processing applications using Python and PySpark.
  • Collaborate with data scientists and analysts to understand requirements and translate them into technical solutions.
  • Write efficient and optimized code to process and analyze large volumes of data.
  • Implement data ingestion processes from various data sources to the data processing platform.
  • Create and maintain data pipelines and workflows for data processing and analytics.
  • Perform data quality checks and ensure data integrity throughout the system.
  • Troubleshoot and debug production issues to identify and resolve technical problems.
  • Stay updated with the latest technologies and tools in data processing to drive innovation and improve performance.
  • Collaborate with cross-functional teams to ensure seamless integration of data processing applications with other systems.
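The day-to-day shape of this work might look like the sketch below: ingest from a source, run a data quality check, and write the result. This is an illustrative example only, not NTT DATA's actual codebase; the paths, column name `order_id`, and the 1% null-rate threshold are hypothetical assumptions. The quality-check helpers are plain Python so they can be unit-tested without a Spark cluster.

```python
# Hedged sketch of an ETL step with a data quality gate.
# All names (load paths, "order_id", MAX_NULL_RATE) are hypothetical.

def null_rate(null_count: int, total: int) -> float:
    """Fraction of rows missing a value in a key column."""
    return 0.0 if total == 0 else null_count / total


def passes_quality_check(null_count: int, total: int, max_rate: float = 0.01) -> bool:
    """Gate a pipeline stage: fail when too many rows lack a key field."""
    return null_rate(null_count, total) <= max_rate


def run_pipeline(source_path: str, target_path: str) -> None:
    # Deferred import so the pure-Python helpers above remain usable
    # (and testable) without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    df = spark.read.parquet(source_path)                      # ingestion
    nulls = df.filter(F.col("order_id").isNull()).count()     # quality check
    if not passes_quality_check(nulls, df.count()):
        raise ValueError("order_id null rate exceeds threshold")

    # Deduplicate and land the cleaned data for downstream analytics.
    df.dropDuplicates(["order_id"]).write.mode("overwrite").parquet(target_path)
```

In a production pipeline the threshold and key columns would typically come from configuration, and a failed check would alert rather than silently continue, so that bad data never reaches downstream consumers.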

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 4+ years of experience in Python development; PySpark experience preferred.
  • 5+ years of experience with data integration and ETL processes.

Nice-to-haves

  • Knowledge of data processing and analytics techniques.
  • Experience with distributed computing frameworks such as Apache Spark, including its PySpark API.
  • Familiarity with data storage and querying systems like SQL and NoSQL databases.
  • Understanding of data structures, algorithms, and distributed systems.
  • Excellent problem-solving and analytical skills.
  • Strong communication and interpersonal skills.
  • Ability to work independently and in a team environment.
  • Proactive attitude towards learning and professional development.

Benefits

  • Competitive salary based on experience and location.
  • Opportunities for professional development and learning.
  • Flexible working hours and remote work options.
  • Inclusive and diverse work environment.
© 2024 Teal Labs, Inc