Meta - Pittsburgh, PA

posted 3 days ago

Full-time - Entry Level
Pittsburgh, PA
Web Search Portals, Libraries, Archives, and Other Information Services

About the position

The Software Engineer role on the XR Codec Interactions and Avatars (XRCIA) Compute team at Meta focuses on building tools, libraries, and frameworks to support research in augmented and virtual reality. The position involves managing and optimizing compute resources for GPU superclusters, enabling innovative research and product development in full-body interactive avatars and generative AI. The team values a collaborative environment where self-motivated individuals can thrive and encourages ownership and adaptability in a research-driven context.

Responsibilities

  • Develop, optimize, and maintain automated data ingestion pipelines that move massive, petabyte-scale datasets into the GPU research supercluster.
  • Provide on-call support and lead incident root cause analysis through multiple data engineering layers (compute, storage, network) for GPU clusters and act as a final escalation point.
  • Collaborate in a diverse team environment across multiple scientific and engineering disciplines, making the architectural tradeoffs required to rapidly deliver software and infrastructure solutions.
  • Leverage the scale and complexity of the larger Meta production infrastructure to accelerate Codec Interaction and Avatars projects.
  • Influence outcomes within your immediate team, peer engineering teams, and with cross-functional stakeholders.
  • Work independently, handle large projects simultaneously, and prioritize team roadmap and deliverables by balancing required effort with resulting impact.

Requirements

  • Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta.
  • 3+ years of experience coding in at least one of the following languages: C++, Python, or Rust.
  • Experience in building large-scale, data-intensive applications.
  • Experience in building and automating web services.
  • Experience in writing system-level infrastructure, libraries, and applications.
  • Experience with software development practices such as source control, code reviews, unit testing, debugging and profiling.
  • Proven track record of shipping data processing pipelines for computer vision, computer graphics, or machine learning applications.
  • Experience in crafting and maintaining large-scale machine learning datasets.
  • Experience in developing performant software and systems.

Nice-to-haves

  • Thorough understanding of the Linux operating system, including the networking subsystem.
  • Experience in distributed system performance measurement, logging, and optimization.
  • Experience with Python library management systems such as Conda.
  • Experience with managing HPC schedulers and orchestration systems such as Slurm or Kubernetes.
  • Prior experience in cluster oncall operations, including troubleshooting server/scheduler/storage errors, maintaining compute/storage environments/libraries/tools, helping onboard users to the cluster, and answering general questions from users.
  • Prior experience in cluster coordination and strategy planning, including collecting/understanding needs of users, developing tools to improve user experience, providing guidance on best practices, forecasting compute/storage needs, and developing long-term user experience/compute/storage strategies.
  • Prior experience building tooling for monitoring and telemetry.
  • Prior experience building PaaS or internal clouds.
  • Prior experience in developing/managing distributed network file systems.
  • Prior experience in network security.
  • Experience in database and data management systems at scale.

Benefits

  • $56.25/hour to $173,000/year + bonus + equity + benefits
  • Individual compensation is determined by skills, qualifications, experience, and location.