Collective Health - San Francisco, CA
posted about 2 months ago
At Collective Health, we are on a mission to transform the healthcare experience for our members. As a Data Engineer on the Data Platform team, you will play a crucial role in building and delivering data assets that provide our customers with visibility into their plan costs and clinical outcomes. This position requires a collaborative spirit, as you will work closely with various departments, including Data Science and Analytics, to ensure that our data solutions meet the needs of our stakeholders. You will leverage industry best practices to create a next-generation data ecosystem that efficiently collects, moves, stores, and analyzes data.

Your responsibilities will include creating new data pipelines and improving existing ones using technologies such as Spark (PySpark, Spark SQL). You will partner with analytic consumers to design logical and physical schemas, enhance existing data models, and build new ones that align with business requirements. Cross-functional collaboration is key in this role, as you will interface with teams across Product, Engineering, Data Science, Analytics/BI, and Operations to understand their data needs and provide consultative and engineering solutions. Additionally, you will be responsible for building data expertise and ensuring data quality across various business domains, including healthcare claims and member experience.

To excel in this role, you will need a strong foundation in computer science or a related technical field, along with proven experience as a data engineer. Your technical skills should include proficiency in at least one programming language (such as Scala or Python/PySpark) and SQL, as well as experience with schema design, dimensional data modeling, and large-scale data warehousing architecture. You should also have a background in building data pipelines through efficient ETL design and implementation, and familiarity with distributed data systems like Spark and Databricks. Excellent communication skills are essential for collaborating with stakeholders and ensuring that their data needs are met effectively.