Amazon - Boston, MA

posted 3 months ago

Full-time - Mid Level
Boston, MA
5,001-10,000 employees
Sporting Goods, Hobby, Musical Instrument, Book, and Miscellaneous Retailers

About the position

The Amazon Artificial General Intelligence (AGI) Data Services organization is looking for a Language Engineer with experience in dataset construction, linguistic annotation, dialog/semantic schemas, and automatic processing of large datasets. You will play a critical role in driving innovation and advancing the state of the art in natural language processing and machine learning. You will work closely with cross-functional teams, including product managers, engineers, and data scientists, to ensure that our AI systems are aligned with human policies and preferences. As a Language Engineer, you will be responsible for designing data collection and creation tasks in response to scientific needs. This includes authoring instructions, defining and implementing quality targets and mechanisms, providing day-to-day coordination of data collection efforts (including planning, scheduling, and reporting), and being responsible for the final deliverables. You will analyze and extract language-related insights from large amounts of data, build tools or tool prototypes for data analysis or data authoring using Python or another scripting language, and use modeling tools to bootstrap or test new functionalities. You will also collaborate with scientists and software engineers to evaluate the performance of language models and handle competing requests from a range of data customers.

Responsibilities

  • Design data collection/creation tasks in response to science needs: author instructions, define and implement quality targets and mechanisms, provide day-to-day coordination of data collection efforts (including planning, scheduling, and reporting), and be responsible for the final deliverables
  • Analyze and extract language-related insights from large amounts of data
  • Build tools or tool prototypes for data analysis or data authoring, using Python or another scripting language
  • Use modeling tools to bootstrap or test new functionalities
  • Collaborate with scientists and software engineers to evaluate the performance of language models
  • Handle competing requests from a range of data customers

Requirements

  • Master's or higher degree in a relevant field (computational linguistics or equivalent field with computational analysis)
  • 2+ years of experience in computational linguistics or language data processing
  • Experience with language annotation and other forms of data markup
  • Experience with scripting languages, such as Python
  • Experience working with speech and text language data in multiple languages
  • Excellent communication skills, strong organizational skills, and close attention to detail
  • Comfortable working in a fast-paced, highly collaborative, dynamic work environment

Nice-to-haves

  • PhD in Computational Linguistics (or equivalent field with computational emphasis)
  • Expertise in bootstrapping language data collections in a quickly changing environment
  • Comfortable working with speech and text language data in multiple languages
  • Experience in writing grammars and building FSTs
  • Experience with statistical language modeling
  • Practical knowledge of version control and agile development
  • Familiarity with database queries and data analysis processes (SQL, R, Matlab, etc.)
  • Willingness to support several projects at one time, and to accept reprioritization as necessary
  • Ability to think creatively, with strong analytical and problem-solving skills

Benefits

  • Employee Discount
  • Health Insurance
  • Vacation & Paid Time Off