Brown University - Providence, RI

posted 2 months ago

Full-time - Mid Level
Hybrid - Providence, RI
Educational Services

About the position

The Lead High Performance Computing (HPC) Engineer oversees the HPC team that manages the high-performance computing cluster, storage, and networking infrastructure. The role involves integrating HPC systems with the broader IT infrastructure, deploying and managing parallel file systems, and ensuring system security. The engineer also debugs low-level systems software, evaluates software for acquisition, and works with vendors to maintain high availability of production services, while providing occasional user support for system-level application debugging and optimization.

Responsibilities

  • Manage the HPC team and oversee the high-performance computing cluster, storage, and networking infrastructure.
  • Integrate HPC systems with Brown's overarching IT infrastructure.
  • Deploy and manage parallel file systems while maintaining system security.
  • Debug and potentially rewrite low-level systems software as needed.
  • Evaluate software for potential acquisition and work with vendors for support.
  • Provide user support for system-level application debugging and optimization.

Requirements

  • Bachelor's degree preferred.
  • 3 to 5 years of experience with Linux systems administration.
  • Lead or supervisory experience.
  • Familiarity with research computing environments, particularly RHEL/CentOS Linux.
  • Strong problem resolution and troubleshooting skills.
  • Excellent interpersonal and communication skills, both oral and written.
  • Ability to multi-task and prioritize activities effectively.
  • Conceptual knowledge of systems architectures, security, networking, storage systems, and parallel computing.
  • Knowledge of programming languages such as C, C++, bash, and Perl.
  • Experience with source control systems like Git and log correlation software such as Sumo Logic.
  • Familiarity with large-scale research computing platforms like Globus, HPC environments, SLURM, and GPFS.

Nice-to-haves

  • Experience with GPFS, Lustre, or BeeGFS filesystems.
  • Basic knowledge of MPI programming and RDMA interconnects.
  • Experience with InfiniBand and SLURM.

Benefits

  • Generous benefits package including health insurance and retirement plans.