Tanisha Systems - Plano, TX

Posted 10 days ago

Full-time
Plano, TX
Professional, Scientific, and Technical Services

About the position

The Java Spark Developer role at Tanisha Systems involves designing, developing, and maintaining scalable data pipelines using Apache Spark and Java. The position requires collaboration with data scientists and analysts to deliver high-quality data solutions while ensuring data quality, integrity, and security across all data pipelines. The developer will also optimize data processing jobs for performance and cost-efficiency, and monitor and troubleshoot any issues that arise in the data pipelines.

Responsibilities

  • Design, develop, and maintain scalable data pipelines using Apache Spark and Java.
  • Implement data processing workflows and ETL processes to ingest, transform, and store large volumes of data.
  • Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver high-quality data solutions.
  • Optimize and tune data processing jobs for performance and cost-efficiency.
  • Ensure data quality, integrity, and security across all data pipelines and storage solutions.
  • Develop and maintain data models, schemas, and documentation.
  • Monitor and troubleshoot data pipeline issues, ensuring high availability and reliability.

Requirements

  • Hands-on experience with Apache Spark in Java and with AWS services, including S3, EMR, Lambda, and Glue.
  • Good knowledge of, or experience with, Snowflake.
  • Experience with SQL and NoSQL databases.
  • Familiarity with CI/CD tools such as Jules and Spinnaker.