DNI Delaware Nation Industries - Atlanta, GA

posted 20 days ago

Full-time - Mid Level
Atlanta, GA

About the position

The Databricks Developer position at DNI Delaware Nation Industries involves developing and maintaining data pipelines using Databricks, PySpark, and SQL. The role requires collaboration with cross-functional teams to deliver scalable data solutions while ensuring data governance and performance optimization. The ideal candidate will have a strong background in cloud-based data warehousing and DevOps practices, particularly within the Azure environment.

Responsibilities

  • Design, develop, and maintain data pipelines using Databricks with PySpark and Spark SQL.
  • Write PySpark code and SQL queries for efficient data transformation, manipulation, and analysis.
  • Write to Delta Lake storage for data persistence, ensuring optimized data processing and storage (a brief illustrative sketch follows this list).
  • Collaborate with cross-functional teams to understand business requirements and translate them into scalable data pipeline solutions.
  • Work with Azure Synapse for data warehousing and analytical processing, optimizing performance and ensuring data accessibility.
  • Implement CI/CD pipelines for automated deployment and integration of data pipelines in collaboration with DevOps teams.
  • Ensure best practices for data governance, security, and performance optimization are followed.
  • Troubleshoot, debug, and optimize complex data pipelines to ensure smooth operation and minimal downtime.
  • Provide technical guidance to team members and participate in code reviews to maintain high coding standards and quality.
  • Collaborate with stakeholders to deliver high-quality, data-driven solutions that meet business objectives.
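
For illustration only, here is a minimal sketch of the kind of pipeline work described above, assuming a Databricks runtime where Delta Lake is available; the paths, columns, and schema are hypothetical placeholders, not details from this posting.

```python
# Minimal illustrative pipeline: read raw data, transform with PySpark and
# Spark SQL, and persist the result to Delta Lake. Assumes a Databricks
# runtime; all paths and column names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-pipeline").getOrCreate()

# Read raw source data from a hypothetical landing zone.
raw = spark.read.json("/mnt/landing/claims/")

# Transform with PySpark: cast types, derive columns, drop bad records.
cleaned = (
    raw.withColumn("claim_amount", F.col("claim_amount").cast("double"))
       .withColumn("ingest_date", F.current_date())
       .filter(F.col("claim_id").isNotNull())
)

# Equivalent Spark SQL step for an aggregate view.
cleaned.createOrReplaceTempView("claims_clean")
daily_totals = spark.sql("""
    SELECT ingest_date, provider_id, SUM(claim_amount) AS total_amount
    FROM claims_clean
    GROUP BY ingest_date, provider_id
""")

# Persist to Delta Lake, partitioned for downstream query performance.
(daily_totals.write
    .format("delta")
    .mode("append")
    .partitionBy("ingest_date")
    .save("/mnt/curated/claims_daily_totals"))
```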

Requirements

  • 3-5 years of experience as a Databricks Developer or in a similar role, with a strong focus on PySpark coding and SQL development.
  • Proficiency in PySpark and SQL for data transformation, querying, and analysis.
  • Experience writing to Delta Lake storage and optimizing the performance of data stored in Delta format (a brief illustration follows this list).
  • Strong experience with Spark SQL for querying and managing large datasets in a distributed computing environment.
  • Experience working with Azure Synapse for cloud-based data warehousing and analytics.
  • Solid understanding of DevOps principles and experience with CI/CD pipelines using tools such as Azure DevOps.
  • Strong problem-solving skills and the ability to work independently as well as part of a team.
  • Excellent communication skills to effectively collaborate with stakeholders.
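
As a brief, hypothetical illustration of the Delta Lake performance work noted above (not a prescribed approach), the sketch below assumes a Databricks runtime with the delta-spark API available; the table path and columns are placeholders.

```python
# Illustrative Delta maintenance: upsert with MERGE, compact files with
# OPTIMIZE/ZORDER, then query with Spark SQL. Paths and columns are
# hypothetical placeholders.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-maintenance-example").getOrCreate()

table_path = "/mnt/curated/claims_daily_totals"  # hypothetical path

# Upsert new records into the Delta table with MERGE.
updates = spark.read.parquet("/mnt/staging/claims_updates/")  # hypothetical source
target = DeltaTable.forPath(spark, table_path)
(target.alias("t")
    .merge(updates.alias("s"),
           "t.provider_id = s.provider_id AND t.ingest_date = s.ingest_date")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())

# Compact small files and co-locate a frequently filtered column.
spark.sql(f"OPTIMIZE delta.`{table_path}` ZORDER BY (provider_id)")

# Query the optimized table with Spark SQL.
spark.sql(f"""
    SELECT provider_id, SUM(total_amount) AS amount
    FROM delta.`{table_path}`
    GROUP BY provider_id
""").show()
```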

Nice-to-haves

  • Prior exposure to healthcare or scientific domains.
  • Familiarity with Azure Data Factory for orchestrating data workflows in the cloud.
  • Exposure to other big data technologies such as Apache Hadoop or Apache Flink.
  • Experience with data governance tools and techniques, including data lineage, auditing, and security best practices in cloud environments.
  • Familiarity with containerization and orchestration technologies such as Docker or Kubernetes.