NYU Langone Health - New York, NY

Posted about 1 month ago

Full-time - Senior
Hospitals

About the position

The Lead Data Engineer at NYU Langone Health is a pivotal role within the Enterprise Data and Analytics department, focused on enhancing decision-making, optimizing operations, and improving patient outcomes through data. The position involves designing and developing robust ETL pipelines, integrating diverse data sources, and ensuring data integrity throughout the ETL lifecycle. The engineer will apply deep expertise in the Databricks platform to deliver high-quality, data-driven insights and solutions, while also mentoring other data engineers on the team.

Responsibilities

  • Lead the strategy, design, and development of automated cloud-infrastructure and DevOps solutions, working with a team of data engineers.
  • Design and develop ETL code to integrate various data sources.
  • Create and maintain efficient data pipelines on the Databricks platform.
  • Develop, maintain, and optimize ETL pipelines for both streaming and batch data processing (see the batch sketch after this list).
  • Ensure data integrity and consistency throughout the ETL lifecycle.
  • Provide comprehensive support for ETL processes from data ingestion to final output.
  • Write and execute unit tests and integration test cases for ETL code to ensure high-quality outcomes.
  • Implement monitoring solutions for ETL jobs to ensure timely and successful data processing.
  • Proactively identify, troubleshoot, and resolve issues in ETL workflows.
  • Optimize ETL jobs for maximum performance and efficiency through performance tuning and troubleshooting.
  • Mentor data engineers on the team, cross-train, and provide guidance.
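
To make the pipeline and testing duties above concrete, here is a minimal batch ETL sketch in PySpark for Databricks with a Delta Lake sink. It is illustrative only, not NYU Langone code; the table name, source path, and column names are hypothetical.

```python
# Minimal batch ETL sketch for Databricks (PySpark + Delta Lake).
# All paths, table names, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # supplied automatically in Databricks

def transform_encounters(df):
    """Pure transformation, so it can be unit-tested on a small local DataFrame."""
    return (
        df.filter(F.col("discharge_ts").isNotNull())
          .withColumn("length_of_stay_days",
                      F.datediff(F.col("discharge_ts"), F.col("admit_ts")))
          .dropDuplicates(["encounter_id"])
    )

# Ingest raw data, transform it, and load it into a Delta table.
raw = spark.read.parquet("/mnt/raw/encounters/")
(transform_encounters(raw)
    .write.format("delta")
    .mode("overwrite")
    .saveAsTable("analytics.encounters_clean"))
```

Because the transformation is a pure function of its input DataFrame, the unit and integration tests called for above can exercise it with a handful of constructed rows on a local SparkSession, without touching production data.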

Requirements

  • Bachelor's degree in Computer Science, Information Systems, Engineering, or Data Science.
  • Minimum 10 years of experience in designing, developing, and optimizing ETL processes.
  • 5 years of experience developing and supporting a data platform on Azure Databricks.
  • Proficiency in creating and maintaining efficient data pipelines on the Databricks platform.
  • Experience working in a DevOps environment, supporting data-platform processes.
  • Strong verbal and written communication skills to explain complex technical concepts to non-technical stakeholders.
  • Ability to troubleshoot and resolve issues in a timely manner.
  • Experience working in an Agile/Scrum environment.
  • Strong analytical and problem-solving skills.
  • Ability to work independently, handle multiple tasks simultaneously, and adapt quickly to change.

Nice-to-haves

  • In-depth knowledge of Databricks platform and technologies including Delta Lake, Databricks SQL, and Databricks Workflows.
  • Experience with Azure cloud platforms and Azure Data Lake cloud storage.
  • Knowledge of data warehousing, data modeling, and best practices.
  • Proficiency in programming languages such as Python, SQL, Scala, or R.
  • Experience with big data technologies such as Apache Spark, Hadoop, or Kafka (see the streaming sketch after this list).
  • Familiarity with DevOps practices and tools such as CI/CD, Git, etc.
  • Knowledge of Infrastructure as Code (IaC) tools like Terraform.
  • Experience with implementing data governance and security measures in a cloud environment.
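
As a sketch of the streaming side of this stack (Kafka feeding a Delta table via Structured Streaming), the following is one plausible shape; the broker address, topic, event schema, and checkpoint path are assumptions, not details from the posting.

```python
# Sketch: continuous ingestion from Kafka into a Delta table via Structured Streaming.
# Broker, topic, schema, and paths are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.getOrCreate()

event_schema = StructType([
    StructField("patient_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
         .option("subscribe", "adt-events")                 # assumed topic
         .load()
         # Kafka delivers the payload as bytes; parse the JSON in the value column.
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
         .select("e.*")
)

# The checkpoint gives the job exactly-once delivery into the Delta sink
# and lets a restarted job resume where it left off.
(events.writeStream
       .format("delta")
       .option("checkpointLocation", "/mnt/checkpoints/adt_events")  # assumed path
       .toTable("analytics.adt_events"))
```

The monitoring responsibility typically hangs off the same job: Spark exposes per-batch progress (row counts, batch durations) on the streaming query handle, e.g. via StreamingQuery.lastProgress, which is the usual signal fed into job alerting.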

Benefits

  • Competitive salary range of $81,325.15 to $135,541.73 annually, based on experience and qualifications.
  • Opportunities for professional development and continuous learning.
  • A supportive and inclusive work environment that values diversity and equity.