Genmab - Princeton, NJ

posted 10 days ago

Full-time - Mid Level
Princeton, NJ
Professional, Scientific, and Technical Services

About the position

The successful candidate will contribute to the mission of the global data engineering function at Genmab, focusing on the creation of bioinformatics pipelines to process genomics and transcriptomics data. This role involves a diverse set of data-related responsibilities, including data architecture, access, classification, integration, and the development of data-as-a-product. The candidate will work closely with various stakeholders to enable science to progress faster by standardizing and automating data workflows, connecting systems, and implementing data cataloging.

Responsibilities

  • Design, implement and manage ETL data pipelines that process and transform vast amounts of scientific data from public, internal and partner sources into various repositories on a cloud platform (AWS)
  • Incorporate bioinformatics tools and libraries into the processing pipelines for omics assays such as bulk and single-cell RNA-seq
  • Enhance end-to-end workflows with automation that rapidly accelerates data flow, using pipeline management tools such as Step Functions, Airflow, or Databricks
  • Implement and maintain bespoke databases for scientific data (RWE, in-house labs, CRO data) for consumption by analysis applications and AI products
  • Innovate and advise on the latest technologies and standard methodologies in Data Engineering and Data Management, including recent advancements with GenAI
  • Manage relationships and project coordination with external parties such as Contract Research Organizations (CRO) and vendor consultants/contractors
  • Define and contribute to data engineering practices for the group, establishing shareable templates and frameworks
  • Collaborate with stakeholders to determine best-suited data enablement methods to optimize the interpretation of the data
  • Apply value-balanced approaches to the development of the data ecosystem and pipeline initiatives
  • Proactively communicate data ecosystem and pipeline value propositions to partnering collaborators

Requirements

  • BS/MS in Computer Science, Bioinformatics, or a related field with 5+ years of software engineering experience (8+ years for senior role), or a PhD in Computer Science, Bioinformatics, or a related field with 2+ years of software engineering experience (5+ years for senior role)
  • Excellent skills and deep knowledge of ETL pipeline, automation, and workflow management tools such as Airflow, AWS Glue, AWS Step Functions, and CI/CD
  • Strong preference specifically for AWS Step Functions and Lambda
  • Excellent skills with bioinformatics pipeline tools such as Snakemake, WDL, and Nextflow, including troubleshooting for quality
  • Excellent skills and deep knowledge in Python, Pythonic design and object-oriented programming, including common Python libraries such as pandas
  • Experience with R is a plus
  • Excellent understanding of different bioinformatics modules and databases such as STAR, HISAT2, featureCounts, FastQC, RSeQC, and Cell Ranger
  • Solid understanding of modern data architectures and their implementation offerings such as Databricks' Delta Tables, Athena, Glue, Iceberg
  • Experience working with clinical data and understanding of GxP compliance and validation processes
  • Proficiency with modern software development methodologies such as Agile, source control, project management and issue tracking with JIRA
  • Proficiency with container strategies using Docker, Fargate, and ECR
  • Proficiency with AWS cloud computing services such as Lambda functions, ECS, Batch and Elastic Load Balancer and other compute frameworks such as Spark, EMR, and Databricks
  • Strong preference for experience with AWS Omics

Nice-to-haves

  • Experience in a fast-growing, dynamic company
  • Strong communication skills for collaboration with diverse teams
  • Ability to innovate and tackle unknown challenges

Benefits

  • Competitive salary
  • Bonus eligibility
  • Flexible working environment
  • Open community-based office spaces
  • State-of-the-art laboratories