Flagship Pioneering - Cambridge, MA

posted 29 days ago

Full-time - Senior
Cambridge, MA
Professional, Scientific, and Technical Services

About the position

FL94 Inc. is at the forefront of biotechnology, pioneering the innovative field of Protein Editing. As a Senior Machine Learning Engineer specializing in GPU Scaling, you will play a crucial role in our mission to develop breakthrough technologies that transform healthcare and sustainability. Your primary responsibility will be to lead the efforts in efficiently training massive generative models across multiple GPUs, ensuring that our infrastructure supports robust data flow between experimental and computational platforms. This position requires a collaborative spirit, as you will work closely with ML scientists, biologists, and chemists to identify new areas where ML-driven automation can enhance our discovery processes.

In this role, you will be tasked with building scalable machine learning pipelines that include efficient parallelization and the effective use of GPUs. You will design and code data processing pipelines, taking ownership of the company-wide data platform. Your collaboration with the Platform Development and Biology teams will be essential in automating a tight design-build-test cycle for ML-generated small molecules. You will thrive in a CI/CD software development environment, utilizing tools like git, participating in code reviews, and independently developing robust code. Your contributions will help shape and execute the vision of FL94, making a significant impact on our scientific endeavors.

The ideal candidate will appreciate the importance of building integrated hardware/software systems for training and deploying ML models in biology, viewing this as equally valuable as achieving experimental breakthroughs and developing ML-focused algorithms. This position offers a unique opportunity to be part of a dynamic, interdisciplinary team that is dedicated to pushing the boundaries of science and technology.

Responsibilities

  • Building scalable machine learning pipelines, including efficient parallelization and use of GPUs.
  • Designing and coding data processing pipelines, and taking ownership of the company-wide data platform.
  • Working closely with Platform Development and Biology teams to automate a tight design-build-test cycle for ML-generated small molecules.
  • Working in a collaborative CI/CD software development environment, including using git, participating in code reviews, and independently developing robust code.
  • Contributing to creating, shaping, and executing the vision of the company within an entrepreneurial, collaborative, and interdisciplinary team.

Requirements

  • Undergraduate or Master's degree in bioinformatics, computer science, mathematics, physics, or a related field, plus 4+ years of experience.
  • Experience with MLOps practices and tools including version control, automated testing, and CI/CD.
  • Proficiency in Python and experience with deep learning libraries (e.g., PyTorch).
  • Experience with cloud computing (e.g., AWS), database scale-out (e.g., distributed PostgreSQL), and data pipelining tools (e.g., Nextflow, Airflow).
  • Experience with Slurm or other on-site HPC cluster solutions.

Nice-to-haves

  • Experience with computational chemistry or protein structure platforms and collaborative tools.