Bayforce - Boulder, CO

posted 2 months ago

Full-time - Mid Level
Onsite - Boulder, CO
Professional, Scientific, and Technical Services

About the position

This client is searching for a Senior Data Engineer with strong skills in developing, managing, and optimizing data workflows, handling complex data environments, and using contemporary data platforms. The ideal candidate will have expertise in Databricks, Airflow, Snowflake, PySpark, Python, and SQL, along with the ability to design and maintain data ecosystems and operational data systems. This position is critical to ensuring that data processes are efficient, scalable, and reliable, which underpins the organization's data-driven decision-making. The role combines hands-on workflow development in Databricks, PySpark, and SQL on large datasets with technical leadership: guiding the engineering team's Airflow implementation, partnering with the environment manager on development, testing, and production environments optimized for performance and cost, and setting the team's standards for coding, testing, documentation, and deployment.

Responsibilities

  • Design and implement efficient, scalable, and reliable data workflows using Databricks, PySpark, and SQL to handle large datasets.
  • Lead efforts to set up, maintain, and optimize Airflow, providing the engineering team with guidance on best practices.
  • Work closely with the environment manager to set up and maintain various environments (development, testing, and production) optimized for both performance and cost.
  • Collaborate on the design and development of operational data systems, ensuring they meet performance, scalability, and availability requirements.
  • Continuously monitor and tune data processes to optimize resource utilization and minimize data-handling delays.
  • Develop and enforce best practices for coding, testing, documentation, and deployment within the data engineering team.

Requirements

  • 5+ years in data engineering, focusing on managing large-scale environments and building complex data workflows.
  • Extensive hands-on experience with Databricks.
  • Strong expertise in job scheduling and automation with Airflow.
  • Deep knowledge of Snowflake as a data warehousing tool.
  • Advanced proficiency in PySpark for distributed data operations.
  • Solid Python skills for scripting and building data workflows.
  • Expert-level SQL for managing and querying data.
  • Proven experience managing cloud-based environments.
  • Experience in creating and managing operational data systems.
  • Bachelor's degree in Computer Science, Information Systems, or an equivalent field of study, or equivalent work experience.

Nice-to-haves

  • Experience with AWS, Azure, or Google Cloud Platform.
  • Familiarity with designing and implementing serverless solutions.
  • Hands-on experience with Data Lake/Delta Lake.
  • Experience in designing event-driven data workflows.
  • Knowledge of continuous integration and deployment in a data engineering context.
  • Experience with additional tools in the big data space is a plus.