UBS - Raleigh, NC

posted about 2 months ago

Part-time - Mid Level
Raleigh, NC
Securities, Commodity Contracts, and Other Financial Investments and Related Activities

About the position

As a Senior Data Engineer at UBS, you will play a pivotal role in building complex, secure platforms that facilitate the development of automated infrastructure as code. Your primary focus will be on engineering reliable data pipelines that source, process, distribute, and store data efficiently using Azure and Databricks, particularly with Scala and Python. You will be responsible for crafting intricate transformation pipelines across multiple datasets, generating valuable insights that inform business decisions. This involves utilizing internal data platforms and educating team members on best practices for analytics and big data.

In this role, you will develop and implement data engineering techniques to automate manual processes, tackle challenging business problems, and ensure the quality, security, reliability, and compliance of our solutions. You will adhere to digital principles while implementing both functional and non-functional requirements. Additionally, you will build observability into our solutions, monitor production health, resolve incidents, and address the root causes of risks and issues. Leveraging Airflow, you will create complex branching, data-driven pipelines and use Databricks to establish the Spark layer of these pipelines. Your expertise in Python and Scala will be crucial for performing low-level, complex data operations, and you will be expected to codify best practices and methodologies to share knowledge with other engineers at UBS.

You will be part of the Group Compliance Regulatory Governance (GCRG) Technology stream, focusing on data analytics and engineering and utilizing the latest data platforms to enhance the group's data strategy. This includes building the central GCRG data lake, developing data pipelines for strategic data sourcing, creating a data virtualization layer, and enabling advanced analytics and elastic search capabilities through cloud computing.
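
To make the Airflow aspect of the role concrete, the snippet below is a minimal sketch of a branching, data-driven DAG, assuming a recent Airflow 2.x release. The DAG id, task ids, and the volume threshold are illustrative placeholders, not details taken from this posting.

    # Minimal sketch of a branching, data-driven Airflow DAG (illustrative only).
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.empty import EmptyOperator
    from airflow.operators.python import BranchPythonOperator, PythonOperator


    def choose_branch(**context):
        """Pick the next task based on a data-driven condition (placeholder logic)."""
        row_count = context["ti"].xcom_pull(task_ids="check_source_volume") or 0
        return "full_reload" if row_count > 1_000_000 else "incremental_load"


    with DAG(
        dag_id="gcrg_daily_load",          # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        check_source_volume = PythonOperator(
            task_id="check_source_volume",
            python_callable=lambda: 42,    # stand-in for a real source-volume check
        )
        branch = BranchPythonOperator(task_id="branch", python_callable=choose_branch)
        full_reload = EmptyOperator(task_id="full_reload")
        incremental_load = EmptyOperator(task_id="incremental_load")
        # One of the two branches is skipped, so the final task must tolerate skips.
        publish = EmptyOperator(
            task_id="publish", trigger_rule="none_failed_min_one_success"
        )

        check_source_volume >> branch >> [full_reload, incremental_load] >> publish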

Responsibilities

  • Engineer reliable data pipelines for sourcing, processing, distributing, and storing data using Databricks and Airflow.
  • Craft complex transformation pipelines on multiple datasets to produce valuable insights for business decisions (a Spark sketch follows this list).
  • Develop and implement data engineering techniques to automate manual processes and solve challenging business problems.
  • Ensure the quality, security, reliability, and compliance of solutions by adhering to digital principles and implementing functional and non-functional requirements.
  • Build observability into solutions, monitor production health, resolve incidents, and remediate root causes of risks and issues.
  • Leverage Airflow to build complex branching data-driven pipelines and utilize Databricks for the Spark layer of data pipelines.
  • Utilize Python and Scala for low-level complex data operations and share best practices with other engineers.
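
As a rough illustration of the multi-dataset transformations and the Spark layer described above, here is a minimal PySpark sketch. The table names (raw.trades, raw.accounts, curated.exposure_by_desk) and columns are hypothetical; on Databricks the Spark session is provided by the runtime.

    # Minimal sketch of a multi-dataset transformation in the Spark layer.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    trades = spark.read.table("raw.trades")       # hypothetical source table
    accounts = spark.read.table("raw.accounts")   # hypothetical source table

    # Join the datasets and derive a per-desk exposure summary.
    exposure_by_desk = (
        trades.join(accounts, on="account_id", how="inner")
        .where(F.col("trade_date") >= "2024-01-01")
        .groupBy("desk", "trade_date")
        .agg(
            F.sum("notional").alias("total_notional"),
            F.countDistinct("account_id").alias("active_accounts"),
        )
    )

    # Persist the result for downstream consumers (e.g. the analytics layer).
    exposure_by_desk.write.mode("overwrite").saveAsTable("curated.exposure_by_desk")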

Requirements

  • Bachelor's or master's degree in computer science or a similar engineering field is highly desired.
  • 5+ years of total IT experience in software development or engineering, with 3+ years of hands-on experience designing and building scalable data pipelines for large datasets on cloud data platforms.
  • 3+ years of hands-on experience in distributed processing using Databricks, Apache Spark (Python), and Kafka, and leveraging the Airflow scheduler/executor framework.
  • 2+ years of hands-on programming experience in Scala (must have), with Python and Java as preferred languages.
  • Experience with monitoring solutions such as Spark Cluster Logs, Azure Logs, AppInsights, and Grafana to optimize pipelines.
  • Proficiency in managing large, complex codebases using systems such as GitHub/GitLab and Gitflow, and working with tools like IntelliJ/Azure Studio.
  • Experience working with Agile development methodologies and delivering within Azure DevOps, including automated testing and CI/release management.
  • Expertise in optimized dataset structures in Parquet and Delta Lake formats, with the ability to design and implement complex transformations between datasets (a brief sketch follows this list).
  • Expertise in optimized Airflow DAGs and branching logic for tasks to implement complex pipelines and outcomes, along with expertise in writing both traditional SQL and NoSQL queries.
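
As one illustration of the Parquet and Delta Lake expertise listed above, the sketch below lands a Parquet drop as a partitioned Delta table and applies an incremental upsert. Paths, table names, and the merge key are hypothetical, and it assumes a Databricks runtime (or delta-spark) where delta.tables is available.

    # Minimal sketch: Parquet landing data to an optimized, partitioned Delta table.
    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Initial load: land the Parquet source as a Delta table partitioned by date.
    source = spark.read.parquet("/mnt/landing/positions/")        # hypothetical path
    (
        source.write.format("delta")
        .mode("overwrite")
        .partitionBy("business_date")
        .saveAsTable("curated.positions")                          # hypothetical table
    )

    # Incremental run: upsert the latest batch on the business key.
    updates = spark.read.parquet("/mnt/landing/positions_delta/")  # hypothetical path
    target = DeltaTable.forName(spark, "curated.positions")
    (
        target.alias("t")
        .merge(updates.alias("u"), "t.position_id = u.position_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )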

Nice-to-haves

  • Experience with cloud computing platforms beyond Azure.
  • Familiarity with machine learning frameworks and libraries.
  • Knowledge of data governance and data quality best practices.

Benefits

  • Flexible working arrangements including part-time, job-sharing, and hybrid working options.
  • Opportunities for career development and gaining new experiences in different roles.
  • A purpose-led culture that promotes collaboration and connection among employees.