
NVIDIA TITAN RTX

NVIDIA TITAN RTX features 576 multi-precision Turing Tensor Cores that deliver up to 130 teraFLOPS (TFLOPS) for deep learning training; 72 Turing RT Cores that provide up to 11 GigaRays per second for maximum real-time ray tracing performance; and 24 gigabytes (GB) of GDDR6 memory for training with higher batch sizes, processing larger datasets and animation models, and managing the most demanding creative workflows. Pair two TITANs together with NVIDIA NVLink and double your memory and performance.
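
As a quick sanity check on an NVLinked pair, a framework such as PyTorch should see both cards and report peer-to-peer access between them. A minimal sketch, assuming PyTorch with CUDA support and both TITAN RTX cards installed:

    import torch

    # Both TITAN RTX cards should be visible to the CUDA runtime.
    assert torch.cuda.device_count() >= 2, "expected two GPUs"

    for i in range(2):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GiB")

    # With the NVLink bridge installed, each card can access the other's
    # memory directly (peer-to-peer), which is what lets frameworks treat
    # the pair as a larger pool for bigger batches.
    print("peer access 0 -> 1:", torch.cuda.can_device_access_peer(0, 1))
    print("peer access 1 -> 0:", torch.cuda.can_device_access_peer(1, 0))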

Highlights

GPU Memory: 24 GB GDDR6
Memory Interface: 384-bit
Memory Bandwidth: Up to 672 GB/s
NVIDIA CUDA® Cores: 4,608
NVIDIA Tensor Cores: 576
NVIDIA RT Cores: 72
Single-Precision Performance: 16.3 TFLOPS
Tensor Performance: 130 TFLOPS
NVIDIA NVLink: Connects 2 TITAN RTX GPUs
NVIDIA NVLink Bandwidth: 100 GB/s (bidirectional)
System Interface: PCI Express 3.0 x16
Power Consumption: 280 W
Thermal Solution: Active
Form Factor: 4.4” H x 10.5” L, dual slot, full height
Display Connectors: 3x DisplayPort, 1x HDMI, 1x USB Type-C
Max Simultaneous Displays: 4x 4096 x 2160 @ 120 Hz, 4x 5120 x 2880 @ 60 Hz, 2x 7680 x 4320 @ 60 Hz
Encode/Decode Engines: 1x encode, 1x decode
VR Ready: Yes
Graphics APIs: Microsoft DirectX 12 API, Vulkan API, OpenGL 4.6
Compute APIs: CUDA, DirectCompute, OpenCL
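
The core numbers above can be read back at runtime. A minimal sketch, assuming PyTorch with CUDA support; the CUDA core count follows from the Turing layout of 64 FP32 cores per SM:

    import torch

    props = torch.cuda.get_device_properties(0)
    print(props.name)                                 # e.g. "TITAN RTX"
    print(f"{props.total_memory / 1024**3:.0f} GiB")  # ~24 GiB of GDDR6
    print(props.multi_processor_count)                # 72 SMs on TU102
    # Each Turing SM carries 64 FP32 CUDA cores: 72 * 64 = 4,608.
    # Peak FP32: 4,608 cores * 2 FLOPs (FMA) * ~1.77 GHz boost ≈ 16.3 TFLOPS.
    print(props.multi_processor_count * 64, "CUDA cores")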

NVIDIA A100 Tensor Core GPU

NVIDIA A100’s third-generation Tensor Cores with Tensor Float 32 (TF32) precision provide up to 20X higher performance over the prior generation with zero code changes and an additional 2X boost with automatic mixed precision and FP16. When combined with third-generation NVIDIA® NVLink®, NVIDIA NVSwitch, PCIe Gen4, NVIDIA Mellanox InfiniBand, and the NVIDIA Magnum IO software SDK, it’s possible to scale to thousands of A100 GPUs. This means that large AI models like BERT can be trained in just 37 minutes on a cluster of 1,024 A100s, offering unprecedented performance and scalability.

Highlights

Deep Learning Training:

A100’s third-generation Tensor Cores with Tensor Float 32 (TF32) precision deliver up to 20X higher training performance over the prior generation with zero code changes, and automatic mixed precision with FP16 adds a further 2X boost.
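
In practice, the "zero code changes" path means FP32 matmuls run as TF32 on the Tensor Cores automatically, while the extra 2X comes from opting in to automatic mixed precision. A minimal PyTorch training-step sketch; the model, batch, and optimizer here are placeholder assumptions:

    import torch

    # On Ampere, FP32 matmuls can run as TF32 on the Tensor Cores with
    # no changes to model code; PyTorch exposes the switch explicitly.
    torch.backends.cuda.matmul.allow_tf32 = True

    model = torch.nn.Linear(1024, 1024).cuda()  # placeholder model
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    scaler = torch.cuda.amp.GradScaler()        # loss scaling for FP16

    x = torch.randn(64, 1024, device="cuda")    # placeholder batch
    y = torch.randn(64, 1024, device="cuda")

    opt.zero_grad()
    with torch.cuda.amp.autocast():             # automatic mixed precision
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()
    scaler.step(opt)
    scaler.update()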

Deep Learning Inference:

A100 introduces groundbreaking new features to optimize inference workloads. It brings unprecedented versatility by accelerating a full range of precisions, from FP32 to FP16 to INT8 and all the way down to INT4. Multi-Instance GPU (MIG) technology allows multiple networks to operate simultaneously on a single A100 GPU for optimal utilization of compute resources. And structural sparsity support delivers up to 2X more performance on top of A100’s other inference performance gains.
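
As one illustration of the reduced-precision inference path, the same trained model can be served under FP16 autocast without retraining; INT8/INT4 normally go through a dedicated inference stack such as TensorRT, and MIG partitioning is configured at the driver level rather than in framework code. A minimal PyTorch sketch, with a placeholder model:

    import torch

    model = torch.nn.Linear(1024, 1000).cuda().eval()  # placeholder model
    x = torch.randn(32, 1024, device="cuda")

    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.float16):
        logits = model(x)  # matmuls execute on FP16 Tensor Cores
    print(logits.dtype)    # torch.float16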

High-Performance Computing:

A100 introduces double-precision Tensor Cores, marking the biggest milestone since the introduction of double-precision GPU computing for HPC. This enables researchers to reduce a 10-hour, double-precision simulation running on NVIDIA V100 Tensor Core GPUs to just four hours on A100. HPC applications can also leverage TF32 precision in A100’s Tensor Cores to achieve up to 10X higher throughput for single-precision dense matrix multiply operations.
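
Because cuBLAS routes double-precision GEMMs to the FP64 Tensor Cores on A100 automatically, existing FP64 code picks up the speedup without changes. A minimal sketch, assuming PyTorch with CUDA support:

    import torch

    # FP64 inputs: on A100, cuBLAS dispatches this GEMM to the
    # double-precision Tensor Cores automatically.
    a = torch.randn(4096, 4096, dtype=torch.float64, device="cuda")
    b = torch.randn(4096, 4096, dtype=torch.float64, device="cuda")
    c = a @ b
    print(c.dtype)  # torch.float64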
