Data Engineer

$102,500 - $116,500/Yr

Collective Health - San Francisco, CA

posted about 2 months ago

Full-time - Mid Level

Hybrid - San Francisco, CA

Ambulatory Health Care Services

About the position

At Collective Health, we are on a mission to transform the healthcare experience for our members. As a Data Engineer on the Data Platform team, you will play a crucial role in building and delivering data assets that provide our customers with visibility into their plan costs and clinical outcomes. This position requires a collaborative spirit, as you will work closely with various departments, including Data Science and Analytics, to ensure that our data solutions meet the needs of our stakeholders. You will leverage industry best practices to create a next-generation data ecosystem that efficiently collects, moves, stores, and analyzes data.

Your responsibilities will include creating new data pipelines and improving existing ones using technologies such as Spark (PySpark, Spark SQL). You will partner with analytic consumers to design logical and physical schemas, enhance existing data models, and build new ones that align with business requirements. Cross-functional collaboration is key in this role, as you will interface with teams across Product, Engineering, Data Science, Analytics/BI, and Operations to understand their data needs and provide consultative and engineering solutions. Additionally, you will be responsible for building data expertise and ensuring data quality across various business domains, including healthcare claims and member experience.

To excel in this role, you will need a strong foundation in computer science or a related technical field, along with proven experience as a data engineer. Your technical skills should include proficiency in at least one programming language (such as Scala or Python/PySpark) and SQL, as well as experience with schema design, dimensional data modeling, and large-scale data warehousing architecture. You should also have a background in building data pipelines through efficient ETL design and implementation, and familiarity with distributed data systems like Spark and Databricks.

Excellent communication skills are essential for collaborating with stakeholders and ensuring that their data needs are met effectively.

Responsibilities

Leverage industry best practices to build the next-generation data ecosystem to collect, move, store, and analyze data.
Create new data pipelines and improve and maintain existing pipelines using Spark (PySpark, Spark SQL).
Partner with analytic consumers to design logical and physical schemas, improve existing data models and build new ones.
Interface with Product, Engineering, Data Science, Analytics/BI, and Operations to understand their data needs, providing both consultative and data engineering solutions for consumers.
Build data expertise and own data quality across various business domains including healthcare claims and member experience.

Requirements

BS degree in Computer Science or related technical field, or equivalent practical experience.
2+ years of proven work experience as a data engineer, working with at least one programming language (e.g. Scala, Python/PySpark) plus SQL expertise.
2+ years of experience with schema design, dimensional data modeling, and large-scale data warehousing architecture.
Expertise in building data pipelines through efficient ETL design, implementation, and maintenance.
Background working with distributed data systems such as Spark and Databricks.
Experience with schedulers/workflow management tools is a plus.
Excellent communication skills to collaborate with stakeholders in Engineering, Product, Data Science, Analytics/BI, and Operations.

Benefits

Health insurance
401k
Paid time off
Stock options
