Amazon - New York, NY

posted 16 days ago

Full-time - Mid Level
New York, NY
Sporting Goods, Hobby, Musical Instrument, Book, and Miscellaneous Retailers

About the position

The Machine Learning Data Engineer role within Amazon's Worldwide Sustainability (WWS) organization focuses on building the data infrastructure necessary to support Amazon's sustainability initiatives. This position involves collaborating with machine learning and environmental scientists to develop and maintain data systems that facilitate the analysis and application of sustainability-related data. The role emphasizes the importance of data in driving solutions for environmental and social advancements, contributing to Amazon's long-term sustainability strategy.

Responsibilities

  • Designing, implementing, and maintaining data infrastructure to support a wide variety of large and complex data sets, ensuring high performance, availability, and integrity.
  • Identifying and addressing data needs for generative AI and foundation model training and benchmarking across the sustainability domain.
  • Developing and optimizing robust data pipelines for internal data from sources such as Product Lifecycle Management tools, Product Details, Inventory Management Platforms, and financial systems.
  • Designing and implementing web-scale data collection for images, text, and structured data across locations and sustainability domains.
  • Developing comprehensive monitoring, alarming, and data quality controls for all of the above.
  • Partnering with scientists and software engineers to create our data collection strategy and MLOps best practices.

Requirements

  • 3+ years of data engineering experience
  • Experience with data modeling, warehousing, and building ETL pipelines
  • Knowledge of professional software engineering and best practices for the full software development life cycle, including coding standards, software architectures, code reviews, source control management, continuous deployments, testing, and operational excellence

Nice-to-haves

  • Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
  • Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
  • Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or Node.js
  • Experience working on and delivering end-to-end projects independently

Benefits

  • Flexible work hours and arrangements
  • Mentorship and career growth resources
  • Employee-led affinity groups fostering inclusion
  • Ongoing events and learning experiences