Data Engineer - Python

$104,000 - $120,640/Yr

Beacon Hill Staffing Group - Montgomery, AL

posted about 2 months ago

Full-time - Mid Level
Remote - Montgomery, AL
Administrative and Support Services

About the position

The Data Engineer is responsible for designing, developing, and maintaining data pipelines and infrastructure to support data-driven decision-making within the organization. The role requires strong proficiency in Python, SQL, cloud technologies, and Kubernetes to ensure efficient processing, storage, and retrieval of data, and plays a crucial part in building scalable, reliable data solutions that meet the needs of stakeholders across the organization.

The position focuses on several key areas: data pipeline development, cloud data infrastructure, containerization and orchestration, and performance optimization. The engineer will design and build data pipelines using Python and SQL, ensuring data quality and consistency across sources; implement and manage data storage solutions on cloud platforms such as AWS, Azure, or Google Cloud, using services like BigQuery, Snowflake, or Redshift to analyze large datasets; and deploy and manage applications with Docker and Kubernetes, developing CI/CD pipelines to automate workflows.

Performance optimization is a critical aspect of the role and includes tuning SQL queries and configuring data storage for efficiency. Close collaboration with data scientists, analysts, and DevOps teams is essential to understand data requirements and integrate solutions effectively, and documenting processes and workflows is a key responsibility that supports transparency and knowledge sharing within the team.
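
As a rough sketch of the pipeline work described above (assuming a hypothetical orders.csv source file, an orders_clean staging table, and a placeholder SQLite connection string), a minimal extract-transform-load step in Python might look like the following:

    # Minimal ETL sketch (illustrative only): extract a CSV, apply a simple
    # transformation, and load the result into a SQL table. The file name,
    # table name, and connection string are hypothetical placeholders.
    import pandas as pd
    from sqlalchemy import create_engine

    def extract(path: str) -> pd.DataFrame:
        # Extract: read raw records from a CSV source.
        return pd.read_csv(path)

    def transform(df: pd.DataFrame) -> pd.DataFrame:
        # Transform: drop duplicates and normalize column names for consistency.
        df = df.drop_duplicates()
        df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
        return df

    def load(df: pd.DataFrame, table: str, conn_str: str) -> None:
        # Load: append the cleaned records into a warehouse/staging table.
        engine = create_engine(conn_str)
        df.to_sql(table, engine, if_exists="append", index=False)

    if __name__ == "__main__":
        raw = extract("orders.csv")                             # hypothetical source file
        clean = transform(raw)
        load(clean, "orders_clean", "sqlite:///warehouse.db")   # placeholder connection

A production pipeline would typically add schema validation, incremental loading, and orchestration, but the extract/transform/load separation shown here is the basic shape of the work.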

Responsibilities

  • Design, build, and maintain scalable and reliable data pipelines using Python and SQL.
  • Develop ETL (Extract, Transform, Load) processes to integrate data from various sources into data warehouses and databases.
  • Ensure data quality and consistency across different data sources and systems.
  • Implement and manage data storage and processing solutions on cloud platforms (e.g., AWS, Azure, Google Cloud).
  • Utilize cloud data services such as BigQuery, Snowflake, Redshift, or similar to store and analyze large datasets.
  • Configure and manage cloud resources for optimal performance and cost-efficiency.
  • Deploy and manage data applications and services using Docker and Kubernetes.
  • Develop and maintain Kubernetes manifests, Helm charts, and CI/CD pipelines to automate data workflows.
  • Monitor and troubleshoot containerized applications to ensure high availability and reliability.
  • Optimize data processing pipelines for performance and scalability.
  • Tune SQL queries and data storage configurations to handle large volumes of data efficiently.
  • Implement monitoring and logging solutions to track data pipeline performance and identify issues (a minimal illustration follows this list).
  • Work closely with data scientists, analysts, and other stakeholders to understand data requirements and provide support.
  • Collaborate with DevOps and infrastructure teams to integrate data solutions with existing systems.
  • Document data processes, workflows, and configurations for transparency and knowledge sharing.
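
As a minimal sketch of the monitoring and logging responsibility referenced above, the snippet below wraps a hypothetical pipeline step (load_orders is a stand-in name) with timing and structured log output using only Python's standard library; a real pipeline would forward these logs to a monitoring backend.

    # Illustrative sketch: log duration and failures for each pipeline step.
    import logging
    import time
    from functools import wraps

    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")
    log = logging.getLogger("pipeline")

    def monitored(step_name):
        # Decorator that records how long a step takes and whether it failed.
        def decorator(fn):
            @wraps(fn)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    result = fn(*args, **kwargs)
                    log.info("step=%s status=ok duration_s=%.2f",
                             step_name, time.perf_counter() - start)
                    return result
                except Exception:
                    log.exception("step=%s status=failed duration_s=%.2f",
                                  step_name, time.perf_counter() - start)
                    raise
            return wrapper
        return decorator

    @monitored("load_orders")      # hypothetical step name
    def load_orders():
        time.sleep(0.1)            # stand-in for real load work

    if __name__ == "__main__":
        load_orders()

Keeping timing and error reporting in a decorator keeps monitoring concerns out of the step logic itself, so the same pattern can be applied consistently across pipeline stages.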

Requirements

  • Experience as a Data Engineer or in a similar role with a strong focus on Python, SQL, and cloud technologies.
  • Proficiency in Python for data engineering tasks, including scripting and automation.
  • Advanced SQL skills for querying and manipulating data.
  • Hands-on experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and related data services.
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes.