[Remote] Senior Data Center Connectivity Engineer
Note: This is a remote position open to candidates in the USA. NVIDIA, a leading technology company specializing in AI and computing solutions, is seeking a Senior Data Center Connectivity Engineer. This role involves translating product reference architectures into physical builds for AI deployments, leading cabling and layout optimizations, and collaborating with various teams to ensure successful cluster operations.
Responsibilities
- Own the development of connectivity reference designs based on requirements from cluster architecture, network engineering, infrastructure software and product hardware teams
- Build and maintain comprehensive documentation, including detailed rack elevations, network architecture diagrams, and point-to-point cabling lists. Support projects throughout the design and deployment phases
- Serve as the primary engineering support, closely collaborating with deployment and field teams to ensure successful cluster build-out and operation
- Strategically co-design the cluster with power and cooling infrastructure teams, ensuring a thorough understanding of all facility requirements (architectural, power, cooling)
- Work with hardware, network and security teams to translate software stack requirements into physical requirements: hardware selection, fault domains, and network architecture
- Develop new solutions and products in the connectivity space to accelerate the deployment of large scale AI Factories
Skills
- 12+ years of experience in a connectivity, network architecture or engineering role within a Hyperscale Cloud Provider, large-scale enterprise data center, or High-Performance Computing (HPC) environment
- BA or BS (or equivalent experience)
- Consistent record of designing, deploying, and operating network fabrics for thousands of GPU/CPU nodes
- Deep expertise in high-speed interconnect technologies, including InfiniBand, RoCE, and RDMA
- Proven experience designing connectivity solutions for high-density GPU clusters (100 kW+ per rack) and an understanding of the distinct front-end and back-end requirements for AI training vs. inference
- Deep understanding of data center infrastructure, including rack power/cooling, cable management, and physical density constraints
- Demonstrated ability to lead multidisciplinary teams and complete sophisticated technical initiatives
- Deep expertise with NVIDIA's compute and network product families and deployment standards
- Comfortable operating at the intersection of network engineering, MEP systems, and the Infrastructure as a Service (IaaS) software layer
- Experienced with field deployments and/or global reference design documentation, ideally both
Benefits
- Equity
- Benefits