Moveworks - Mountain View, CA
posted 16 days ago
In this role, you will design, build, and operate high-performance, scalable batch and stream data processing infrastructure to support day-to-day machine learning operations, including training, serving, evaluation, and experimentation systems. You will design and develop Moveworks' foundational data models, data warehouse, and real-time and offline processing pipelines using technologies such as Spark on AWS EMR, Apache Kafka, AWS Athena, Snowflake, Airflow, and Apache Hudi. You will collaborate closely with machine learning and data science teams to understand their data needs, influence the data team's roadmap, and lead and execute projects. You will also build a data governance platform for secure and compliant data management, including services for data cataloging, lineage, audit, deletion, and masking. In addition, you will build and operate an orchestration platform based on Temporal and Airflow that enables other teams to develop features and workflows. Finally, you will build platform and data services and APIs that make data available to internal stakeholders and to customer-facing data products.