MasterControl - Salt Lake City, UT

posted 5 months ago

Full-time - Principal
Salt Lake City, UT
Publishing Industries

About the position

MasterControl Inc. is a leading provider of cloud-based quality and compliance software for life sciences and other regulated industries. Our mission is the same as that of our customers: to bring life-changing products to more people sooner. The MasterControl Platform helps organizations digitize, automate, and connect quality and compliance processes across the regulated product development life cycle. Over 1,000 companies worldwide rely on MasterControl solutions to achieve new levels of operational excellence across product development, clinical trials, regulatory affairs, quality management, supply chain, manufacturing, and post-market surveillance.

We are looking for a Principal Machine Learning Engineer to join our Machine Learning team to help build systems that accelerate the development and deployment of machine learning models, especially large language models (LLMs). The ideal candidate is someone who has strong ML fundamentals and can also apply them in real production settings. You will partner closely with ML, application, and data engineers to understand requirements and apply your own domain expertise to build high-performance and reusable ML/LLM APIs. The role has a core focus on optimizing inference and fine-tuning of LLMs. You should also be comfortable with infrastructure and large-scale system design, as well as diagnosing both model performance and system failures.

Responsibilities

  • Build high-performance, observable, reusable, and cost-effective ML/LLM APIs
  • Optimize inference latency by quantizing LLMs or applying low-rank techniques during fine-tuning
  • Design platform infrastructure and contribute to larger platform infrastructure design efforts
  • Diagnose both model performance and system failures
  • Engage with ML researchers and stay up to date on the latest trends from industry and academia
  • Participate in the team's on-call process to ensure the availability of our services
  • Own projects end-to-end, from requirements, scoping, and design through implementation, in a highly collaborative and cross-functional environment

Requirements

  • 2+ years of experience building machine learning training pipelines or inference services in a production setting
  • Experience with LLM deployment, fine-tuning, training, and prompt engineering
  • Experience with LLM inference latency optimization techniques, e.g., kernel fusion, quantization, and dynamic batching
  • Experience with CUDA, model compilers, and other model-specific optimizations
  • Experience working with a cloud technology stack (Azure, AWS, or GCP)
  • Experience with Python, Docker, Kubernetes, and infrastructure as code (e.g., Terraform)