Data Engineer II, RISC

$118,900 - $205,600/Yr

Amazon - Seattle, WA

posted about 2 months ago

Full-time - Mid Level

Sporting Goods, Hobby, Musical Instrument, Book, and Miscellaneous Retailers

About the position

Amazon's Regulatory Intelligence, Safety, and Compliance (RISC) team is dedicated to protecting customers from unsafe, illegal, or non-compliant products while enabling our Selling Partners to offer a wide selection of safe products. The RISC Data Engineering team is looking for a Data Engineer II with strong engineering skills and a background in machine learning operations (MLOps). In this role, you will design, build, and maintain large-scale data pipelines and infrastructure that support our machine learning, data science, and analytics initiatives. You will work closely with Applied Scientists, Machine Learning Scientists, and business stakeholders to understand their requirements and deliver cutting-edge AI/ML solutions. Your contributions will help improve Amazon's business efficiency and simplify the compliance journey for our Selling Partners.
As a Data Engineer II, you will design and implement scalable, fault-tolerant data pipelines and infrastructure using AWS technologies such as Lambda, Glue, EMR/Spark, Step Functions, Airflow, DynamoDB, and AWS Batch. You will automate infrastructure deployment and maintenance processes, applying CI/CD principles to strengthen the MLOps ecosystem. You will also develop optimized data models, ETL/ELT processes, and data transformations to ensure high-quality data for machine learning and analytics, and collaborate with Applied Scientists and analytics teams to provide scalable data solutions that meet their needs. You will continuously monitor and optimize data pipelines and infrastructure, ensuring compliance with data governance and security standards. You will also mentor junior engineers, promote best practices in data engineering and MLOps, and stay current with emerging MLOps technologies and trends to keep improving our data solutions.
Join our expert team to make a significant impact on Amazon's operations and the experience of our Selling Partners.

Responsibilities

Design, build, and maintain scalable, fault-tolerant, and efficient data pipelines and infrastructure for machine learning operations (MLOps) leveraging AWS technologies.
Automate infrastructure deployment and maintenance processes, and incorporate CI/CD principles to streamline the MLOps ecosystem.
Develop optimized data models, ETL/ELT processes, data transformations, and data warehouses to ensure high-quality, well-structured data for ML and analytics.
Collaborate closely with Applied Scientists, Machine Learning Scientists, and analytics teams to understand data requirements and provide scalable data solutions.
Continuously monitor, optimize, and enhance data pipelines, processes, and infrastructure to support ML and analytics.
Implement and enforce rigorous data governance, security, and compliance standards for our data, including data validation, cleansing, and lineage tracking.
Mentor junior engineers, promoting best practices and knowledge sharing in data engineering and MLOps.
Stay updated with emerging MLOps technologies, tools, and trends, incorporating them into the existing ecosystem for continuous improvement.

Requirements

Bachelor's degree in computer science, engineering, mathematics, statistics or a related field
3+ years of data engineering experience
Experience with ML
Experience with data modeling, warehousing and building ETL pipelines
Knowledge of distributed systems
Knowledge of professional software engineering and best practices for the full software development life cycle, including coding standards, software architectures, code reviews, source control management, continuous deployments, testing, and operational excellence

Nice-to-haves

Experience with AWS technologies such as Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, Step Functions, Airflow, DynamoDB, AWS Batch, SageMaker, and IAM roles and permissions
Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Experience with advanced ML system design, implementation and maintenance
Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or Node.js
Strong problem-solving and engineering skills, with the ability to translate business requirements into technical solutions

Benefits

Medical, financial, and other benefits
Equity and sign-on payments as part of total compensation package
Flexible working culture to support work-life balance
Endless knowledge-sharing and mentorship opportunities for career growth
