1 / 5
| Core parameters: | Core architecture: Ampere (GA100 GPU cores). Production process: 7nm. Number of transistors: 54 billion. CUDA cores: 6912. Core frequency: 1410MHz. |
| Video memory parameters: | Video memory capacity: 40GB. Video memory type: HBM2e. Memory bit width: 5120 bits. Memory bandwidth: Approx. 2039GB/s. |
| Compute Performance: | FP32 (Single Precision): ~19.5TFLOPS. FP64 (double): ~9.7TFLOPS. TF32: ~156TFLOPS. FP16/BF16: ~312TFLOPS (up to 624TFLOPS in sparse mode). INT8: ~624TOPS. |
| Interfaces & Interconnections: | Interface type: PCI Express 4.0 x16 (PCIe 3.0 compatible, but with half the bandwidth). NVLink: supports NVLink Gen 3 (600GB/s bandwidth per SIM and NVSwitch is required for multi-SIM interconnection), but NVLink bridging is not supported in PCIe (only SXM4 version supports) |
| Power Consumption & Heat Dissipation: | Maximum power consumption: 300W. Heat dissipation: Server-grade heat dissipation (passive heat dissipation, relying on chassis air ducts or dedicated heat sinks) |
| Other Features: | Multi-Instance GPU (MIG): A single card can be physically divided into up to 7 independent instances (each instance has 7 GB of video memory) to achieve multi-task isolation and improve resource utilization. Supported technologies: Acceleration libraries such as CUDA 11, cuDNN, and TensorRT, optimized for deep learning frameworks such as PyTorch and TensorFlow, and vGPUs (such as vComputeServer) |