Nextgenpros Inc - Minneapolis, MN

Posted 5 days ago

Full-time - Mid Level

About the position

The Azure Data Engineer role involves leading the development and management of ETL pipelines and data processing solutions that support AI/ML initiatives. The position requires advanced skills in Python, Apache Spark, and Azure services, with a focus on building scalable data solutions and collaborating with cross-functional teams to translate business requirements into effective data strategies.

Responsibilities

  • Design, develop, and optimize scalable ETL processes using Python, Apache Spark, and Azure Synapse.
  • Build and manage Azure Data Factory pipelines to orchestrate complex data workflows.
  • Use SQL Pools and Spark Pools within Synapse to manage and process large datasets efficiently.
  • Implement Data Warehousing solutions using Azure Synapse Analytics to provide structured and queryable data layers.
  • Ensure the data platform supports real-time and batch AI/ML data requirements.
  • Build, configure, and manage CI/CD pipelines on Azure DevOps for ETL and data processing tasks.
  • Automate infrastructure provisioning, testing, and deployment using Infrastructure-as-Code (IaC) tools like ARM templates or Terraform.
  • Optimize Azure Data Lake Storage (ADLS Gen2) to store and manage raw and processed data efficiently, ensuring proper access control and data security.
  • Collaborate with Data Scientists, Data Engineers, ML Engineers, and Business Analysts to translate business requirements into data solutions.
  • Work with the DevOps and Security teams to ensure smooth and secure deployment of applications and pipelines.
  • Act as the technical lead in designing, developing, and implementing data solutions, mentoring junior team members.
  • Develop integrations with external and internal APIs for data ingestion and exchange.
  • Build, test, and deploy RESTful APIs for secure data access.
  • Containerize data processing applications and deploy them on Kubernetes.
  • Manage data storage and transformation to support advanced Data Science and AI/ML models.
  • Participate in and lead Agile ceremonies, such as sprint planning, daily stand-ups, and retrospectives.
  • Collaborate with cross-functional teams in iterative development to ensure high-quality and timely feature delivery.
  • Adapt to changing project priorities and business needs in an Agile environment.

Requirements

  • Expertise in Python and Apache Spark for large-scale data processing.
  • Strong experience in Azure Synapse Analytics, including SQL Pools and Spark Pools.
  • Advanced proficiency in Azure Data Factory for ETL pipeline orchestration and management.
  • Knowledge of Data Warehousing principles, with hands-on experience building solutions on Azure.
  • Experience with SQL, including complex queries, optimization, and performance tuning.
  • Familiarity with CI/CD tools like Azure DevOps and managing infrastructure in Azure Cloud.
  • Experience in Java for API integration and microservices architecture.
  • Hands-on knowledge of Kubernetes for containerized data processing environments.
  • Proficiency in working with Azure Data Lake Storage (ADLS) Gen2 for data storage and management.
  • Experience working with APIs (REST, SOAP) and building API-based data integrations.
  • Experience working in an Agile environment, using Scrum or Kanban.
  • Ability to lead, mentor, and coach junior developers on the team.
  • Strong collaboration skills to work with data scientists, analysts, and cross-functional teams to deliver end-to-end data solutions.

Nice-to-haves

  • Azure certifications in data engineering or cloud architecture.
  • Experience deploying AI/ML models on cloud platforms.
  • Familiarity with Data Governance best practices, ensuring compliance with data privacy regulations.