How to Defend: Disadvantages
No. 1
Question: The G5500 supports PCIe GPU cards and does not support NVLink GPU cards.
Main Considerations:
1. NVLink is mainly used in specific scenarios to improve performance, and its price is high.
2. The specifications are being developed.

No. 2
Question: The G5500 G560 is a 4U-8GPU design, and the density is not high.
Main Considerations:
1. Because cabinet power supply in normal equipment rooms is limited, heat dissipation for some components is difficult if the density is too high.
2. When the density is too high, the chassis power modules are constrained and it is difficult to support N+N redundancy.

No. 3
Question: The G5500 G560 does not support the Scalable Processors.
Main Considerations:
1. The GPU server focuses on GPU acceleration.
2. The specifications are being developed.

No. 4
Question: When the G5500 G560 is configured with 4 or more SATA SSDs, the performance cannot reach the line rate.
Main Considerations:
1. For configurations of up to 4 drives, consider using SATA SSDs to form RAID 5.
2. Consider using NVMe SSDs.

No. 5
Question: The G5500 G560 does not support the 2 GB cache 3108 card.
Main Considerations:
1. Only a few customers have such requirements. Guide these customers to use the 1 GB cache 3108 card.
How to Defend
No. 1: Currently, the roadmap and guidance are based mainly on PCIe GPUs, for the following reasons:
(1) NVLink GPUs add high-speed NVLink interconnects between GPUs, increasing the GPU-to-GPU transmission bandwidth and improving the performance of specific applications that require high-bandwidth communication between GPUs; in other applications, the performance improvement is not obvious.
(2) NVLink GPU cards cost more than PCIe GPU cards (an estimated 30% more) and their application scope is narrow. Therefore, the PCIe GPU card specifications are developed first. The 8*NVLink card specifications are expected to be delivered in Q2 2018.
No. 2:
1. The power consumption of a GPU server is high (up to 250–300 W per GPU). Preliminary tests show that with eight P100 GPUs and 145 W CPUs pushed to maximum load, the power consumption exceeds 3200 W. In most equipment rooms, the power supply capability of a single cabinet is 6–15 kW. Even at 15 kW, at most 4 such GPU servers can be configured per cabinet. High density therefore brings little benefit for GPU servers, while making cabinet heat dissipation and engineering design more difficult.
2. When a 2U chassis supports 8 GPUs, it is difficult to power the entire chassis. For example, a 2U-8GPU system generally uses two power modules. Even with a 16 A input, the maximum output of one power module is 3000 W at 220 V AC. When the power consumption exceeds 3000 W, the power supply cannot work in 1+1 redundant mode and reliability cannot be ensured. An alternative is to separate the power modules from the chassis and use external power modules, but these occupy cabinet space when deployed, so the actual density decreases. (The sketch after this list illustrates the arithmetic behind points 1 and 2.)
3. When 8 GPUs fit in a 2U height, the CPUs and GPUs must be arranged in a front-and-rear layout because of the GPU card size. As a result, the heat dissipation channels are cascaded, and under full load the CPUs and GPUs cannot be supported at a 35°C ambient temperature. (Some vendors explicitly reduce the temperature specification to 25°C when the GPU server is fully configured.)
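The cabinet and chassis power arithmetic in points 1 and 2 can be sanity-checked with a short calculation. The following Python sketch uses only the figures quoted above (3200 W full-load draw, 3000 W per power module, 6–15 kW cabinets); the helper names are illustrative, not part of any Huawei tool.

# Power-budget sketch using the figures quoted in points 1 and 2 above.
# All numbers come from the text; the helpers are illustrative only.

SERVER_LOAD_W = 3200   # full-load draw of one 8 x P100 server (point 1)
PSU_MAX_W = 3000       # one power module's maximum output at 16 A / 220 V AC (point 2)

def servers_per_cabinet(cabinet_kw: float) -> int:
    """Whole servers a cabinet's power budget can carry."""
    return int(cabinet_kw * 1000 // SERVER_LOAD_W)

def supports_1_plus_1(load_w: float) -> bool:
    """1+1 redundancy needs one module to carry the entire load alone."""
    return load_w <= PSU_MAX_W

for kw in (6, 10, 15):
    print(f"{kw} kW cabinet -> {servers_per_cabinet(kw)} servers")
# 6 kW -> 1, 10 kW -> 3, 15 kW -> 4: even a 15 kW cabinet fits only 4 servers.

print(supports_1_plus_1(SERVER_LOAD_W))
# False: 3200 W > 3000 W, so a single module cannot carry the load alone and
# 1+1 redundancy is lost, as point 2 states.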
No. 3:
1. GPU servers target AI and HPC applications, which rely mainly on GPU acceleration and do not require high CPU performance. Therefore, the GPU server preferentially supports the latest GPU cards first.
2. Thanks to the modular design, the CPU and GPU modules can be upgraded and replaced independently. A model supporting Scalable Processors is being developed.
No. 4:
1. When 4 SATA SSDs are used to form RAID 5, the performance can reach the line rate, which meets the requirements of most applications (see the sketch below).
2. The G560 supports 6 NVMe SSDs, which deliver higher performance.
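The line-rate claim in point 1 can be illustrated with the usual ideal RAID 5 throughput estimate. The per-drive bandwidth below (roughly 550 MB/s usable on a SATA 6 Gbit/s link) is an assumed typical value, not a figure from this document.

# Ideal RAID 5 throughput estimate. The 550 MB/s per-drive figure is an
# assumed typical value for a SATA 6 Gbit/s SSD, not taken from this manual.

SATA_DRIVE_MBPS = 550  # assumed usable sequential bandwidth per SATA SSD

def raid5_read_mbps(n_drives: int, drive_mbps: float = SATA_DRIVE_MBPS) -> float:
    """Ideal sequential read: parity is rotated, so all members serve data."""
    return n_drives * drive_mbps

def raid5_write_mbps(n_drives: int, drive_mbps: float = SATA_DRIVE_MBPS) -> float:
    """Ideal full-stripe write: one drive's worth of each stripe is parity."""
    return (n_drives - 1) * drive_mbps

print(raid5_read_mbps(4), raid5_write_mbps(4))
# 2200.0 1650.0 -> with 4 drives, the array can run at the drives' line
# rate on reads and close to it on writes, consistent with point 1.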
No. 5:
1. According to the 2 GB and 1 GB cache test results, read performance does not improve with the larger cache under RAID 5. For writes larger than 16 KB, random write performance improves by 10–20%. With RAID 10, read performance improves noticeably for operations larger than 16 KB, while write performance does not improve. Confirm the customer's application scenario and guide the customer toward the 1 GB cache card; the sketch below summarizes these results.
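The test results above reduce to a simple rule of thumb. The following sketch merely encodes those results as a lookup; it is an illustration for guiding the customer conversation, not a sizing tool.

# Encodes the 2 GB vs 1 GB cache comparison quoted above as a lookup.
# Purely illustrative; the thresholds mirror the test results in the text.

def larger_cache_helps(raid_level: int, io_size_kb: int, is_write: bool) -> bool:
    """True if the 2 GB cache showed a gain over 1 GB for this I/O pattern."""
    if raid_level == 5:
        # RAID 5: reads unchanged; random writes above 16 KB gain 10-20%.
        return is_write and io_size_kb > 16
    if raid_level == 10:
        # RAID 10: reads above 16 KB improve noticeably; writes unchanged.
        return not is_write and io_size_kb > 16
    return False

print(larger_cache_helps(5, 64, True))   # True: RAID 5 random write > 16 KB
print(larger_cache_helps(5, 8, False))   # False: RAID 5 reads see no gain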