Data Engineer - Python

$51,000 - $106,000/Yr

Beacon Hill Staffing Group - Santa Fe, NM

posted about 2 months ago

Full-time - Mid Level
Remote - Santa Fe, NM
Administrative and Support Services

About the position

The Data Engineer position at Beacon Hill Staffing Group is a remote role focused on designing, developing, and maintaining data pipelines and infrastructure to support data-driven decision-making within the organization. This ongoing multi-year contract requires strong proficiency in Python, SQL, cloud technologies, and Kubernetes to ensure efficient processing, storage, and retrieval of data. The successful candidate will be responsible for building scalable and reliable data pipelines, developing ETL processes, and ensuring data quality across various sources.

In this role, the Data Engineer will implement and manage data storage and processing solutions on cloud platforms such as AWS, Azure, or Google Cloud. The position involves using cloud data services like BigQuery, Snowflake, or Redshift to store and analyze large datasets, as well as configuring cloud resources for optimal performance and cost-efficiency. Additionally, the Data Engineer will deploy and manage data applications using Docker and Kubernetes, develop CI/CD pipelines to automate workflows, and ensure high availability of containerized applications.

Performance optimization is a key aspect of this role: the Data Engineer will tune SQL queries and data storage configurations to handle large volumes of data efficiently. Collaboration with data scientists, analysts, and other stakeholders is essential to understand data requirements and provide the necessary support. The Data Engineer will also document data processes and workflows to promote transparency and knowledge sharing within the team.

Responsibilities

  • Design, build, and maintain scalable and reliable data pipelines using Python and SQL.
  • Develop ETL (Extract, Transform, Load) processes to integrate data from various sources into data warehouses and databases.
  • Ensure data quality and consistency across different data sources and systems.
  • Implement and manage data storage and processing solutions on cloud platforms (e.g., AWS, Azure, Google Cloud).
  • Utilize cloud data services such as BigQuery, Snowflake, Redshift, or similar to store and analyze large datasets.
  • Configure and manage cloud resources for optimal performance and cost-efficiency.
  • Deploy and manage data applications and services using Docker and Kubernetes.
  • Develop and maintain Kubernetes manifests, Helm charts, and CI/CD pipelines to automate data workflows.
  • Monitor and troubleshoot containerized applications to ensure high availability and reliability.
  • Optimize data processing pipelines for performance and scalability.
  • Tune SQL queries and data storage configurations to handle large volumes of data efficiently.
  • Implement monitoring and logging solutions to track data pipeline performance and identify issues.
  • Work closely with data scientists, analysts, and other stakeholders to understand data requirements and provide support.
  • Collaborate with DevOps and infrastructure teams to integrate data solutions with existing systems.
  • Document data processes, workflows, and configurations for transparency and knowledge sharing.

Requirements

  • Experience as a Data Engineer or in a similar role with a strong focus on Python, SQL, and cloud technologies.
  • Proficiency in Python for data engineering tasks, including scripting and automation.
  • Advanced SQL skills for querying and manipulating data.
  • Hands-on experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and related data services.
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes.