Allegis Group - Ashburn, VA

posted 2 months ago

Full-time - Mid Level
10,001+ employees
Administrative and Support Services

About the position

This is a significant opportunity to join the Systems Engineering team as a Staff Data Center Engineer, where you will play a vital role in the deployment and reliability of large-scale consumer online services. As a member of the Data Center Engineering and Operations team, your primary responsibility will be to oversee the daily operations of our co-location data centers, ensuring optimal uptime, performance, and capacity of the services we manage. You will leverage your extensive experience to support various services and engage in mission-critical projects within a fast-paced and collaborative environment.

In this role, you will be tasked with planning and facilitating data center expansions and build-outs for both new and existing footprints. You will monitor power consumption and environmental conditions within the data centers, leading a team of operations engineers to accomplish daily tasks effectively. Weekly planning meetings will be a part of your responsibilities, where you will collaborate with internal teams to define project requirements for hardware deployment. You will also create automated provisioning processes to streamline hardware deployments and manage vendor relations with manufacturers and value-added resellers (VARs).

Your hands-on involvement will include provisioning and deploying server and network equipment, performing initial configurations, diagnosing complex technical issues, and managing hardware life-cycle processes. You will maintain an up-to-date inventory of all hardware across our data centers and implement best practices for maintaining a data center environment. Documentation and tracking of all assigned issues and tasks will be essential, ensuring timely resolution through our internal ticketing system.

Responsibilities

  • Plan and facilitate datacenter expansions and build-outs for new and existing footprints.
  • Monitor power consumption and environmental conditions across existing datacenter footprints.
  • Lead a team of datacenter operations engineers in accomplishing various day-to-day tasks.
  • Facilitate weekly planning meetings with the datacenter team.
  • Collaborate with internal teams to define project requirements for hardware deployment.
  • Create automated provisioning processes for streamlining hardware deployments.
  • Plan, schedule and perform upgrades/maintenance on infrastructure hardware.
  • Manage vendor relations with manufacturers and VARs.
  • Maintain existing monitoring processes and implement new functionality/metrics.
  • Perform hands-on datacenter infrastructure provisioning and server/network equipment deployments.
  • Rack, cable, and provision a large inventory of servers, switches, PDUs, and consoles.
  • Perform initial configuration of systems as defined by our standard operating procedures.
  • Diagnose complex technical problems and provide detailed root-cause analysis along with remediation/mitigation recommendations.
  • Plan and assist with hardware life-cycle management, from provisioning through retirement and decommissioning.
  • Manage RMA processes with various vendors.
  • Maintain an up-to-date inventory list of all hardware equipment across our datacenters.
  • Implement best-practice methodology for maintaining a datacenter environment.
  • Document and track all assigned datacenter-related issues and tasks in our internal ticketing system in a timely fashion.

Requirements

  • BA/BS in Information Technology, Computer Science or a related field (or equivalent experience).
  • IT Certifications such as Server+, RHCSA/RHCE, CCNA or similar are a plus.
  • Minimum 8 years of datacenter-related experience, with at least 5 years of hands-on experience in an enterprise-scale datacenter environment.
  • Self-motivated, continuous learner, comfortable and effective in new areas requiring experimentation and rapid problem solving.
  • Excellent time management skills, with the ability to prioritize and multitask under shifting deadlines in a fast-paced environment.
  • Strong understanding of x86 server hardware architecture and subsystems as they relate to configuration, triage, and certification in a large-scale server environment.
  • Knowledgeable in datacenter best practices including cabling, power balancing, cooling and airflow optimization, inventory tracking, capacity planning and host/service diversity.
  • Strong interpersonal skills with the ability to lead as well as work in a team environment.
  • Meticulous attention to detail and strong organization skills.
  • Past experience as a team lead or as a people manager is a plus.
  • Ability to lift and carry equipment up to 75 pounds safely and reliably on a regular basis.
  • Excellent written and verbal communication skills.
  • Demonstrated proficiency in monitoring stacks such as Prometheus, Alertmanager, and Grafana.
  • Hands-on experience with PXE boot, UEFI, AMI BIOS distributions, and BMC/iDRAC implementation.
  • Experience creating and executing Ansible playbooks.
  • Practical professional knowledge of Linux and full network stack from NIC firmware to TCP/IP.
  • Expertise with SAN and NAS platforms such as NetApp, Isilon, Pure Storage, and Brocade.
  • Familiarity with version control tools such as Git and Bitbucket.
  • Familiarity with performance testing and reporting tools such as Phoronix Test Suite, fio, STREAM, and others.
  • Experience with ISC DHCP and BIND DNS operations.
  • Intermediate scripting skills in Python and familiarity with Bash.
  • Significant knowledge of Linux kernel drivers, kernel tuning, and debugging hardware compatibility issues.
  • Basic understanding of subnetting, DHCP relays, load balancing, and ARP.
  • Working knowledge of package management tools such as APT and RPM.

Nice-to-haves

  • Experience with Docker.
  • Familiarity with cloud services and virtualization technologies.

Benefits

  • Health insurance coverage.
  • 401(k) retirement savings plan.
  • Paid time off and holidays.
  • Professional development opportunities.
  • Flexible scheduling options.