Running the CUDA ecosystem on other AI chips: workarounds exist, but performance suffers

China has reportedly banned NVIDIA's China-specific RTX Pro 6000D, pushing its technology companies toward domestically produced AI chips. According to Chinese media reports, Alibaba's semiconductor design unit "T-Head" has developed an artificial-intelligence (AI) chip whose performance rivals NVIDIA's earlier China-specific H20. Yet even if chip performance catches up, a key question remains: can these chips tap into NVIDIA's market-dominating CUDA ecosystem? Industry professionals say that while CUDA does not officially support other vendors' AI chips, unofficial workarounds exist, though inevitably at a cost in performance.
Industry professionals explain that CUDA is NVIDIA's GPU-acceleration architecture, tightly coupled to the NVIDIA GPU driver (NVIDIA Driver) and the hardware instruction sets (PTX / SASS). Without an NVIDIA GPU, a Docker image that bundles the CUDA toolkit still cannot execute CUDA kernels, because the underlying hardware does not support them. CUDA software in a Docker image can only reach the GPU through the NVIDIA Container Runtime (nvidia-docker2 / NVIDIA Container Toolkit), which calls into the host's NVIDIA driver, so the host must have an NVIDIA GPU and driver installed.
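The driver dependency described above can be observed at runtime. The following is a minimal sketch (the function name is our own, for illustration): it tries to load libcuda.so.1, the user-space library shipped by the host's NVIDIA driver, which the NVIDIA Container Toolkit mounts into a container. If it cannot be loaded, bundling the CUDA toolkit in the image is not enough to launch kernels.

```python
import ctypes

def has_nvidia_driver() -> bool:
    """Report whether the NVIDIA driver's user-space library can be loaded.

    Inside a container, libcuda.so.1 comes from the HOST driver and is
    injected by the NVIDIA Container Toolkit; the CUDA toolkit baked into
    the image alone cannot execute kernels without it.
    """
    try:
        ctypes.CDLL("libcuda.so.1")  # provided by the host NVIDIA driver
        return True
    except OSError:  # no NVIDIA driver present or injected
        return False

print("NVIDIA driver available:", has_nvidia_driver())
```

On a machine (or container) without an NVIDIA GPU and driver, the load fails and the check returns False, which is exactly why a CUDA-equipped image is inert on non-NVIDIA hardware.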
But NVIDIA is not the only player in the market. AMD, Intel, and Chinese firms pursuing semiconductor self-sufficiency, such as Alibaba and Baidu, have products in testing or already commercialized. Where there is demand, supply follows. Even if other AI chips cannot use the CUDA ecosystem directly, feasible alternatives exist.
Professionals say that if the hardware is not an NVIDIA GPU but you still want to use the CUDA ecosystem or CUDA code in a Docker image, there are several routes:
1. Translation layer. ZLUDA translates CUDA API calls into calls to Intel/AMD GPUs (support is currently limited and not fully compatible). HIPIFY + ROCm: AMD's ROCm ecosystem provides HIP (Heterogeneous-Compute Interface for Portability), and HIPIFY tools convert CUDA code into HIP programs that run on AMD GPUs. However, this is not transparent CUDA execution inside Docker; the source code must be modified or a special runtime used.
2. CPU fallback. Use LLVM-based CUDA emulators (such as cuda-sim or a CPU-only build). The drawback is extremely poor performance; this is practical almost only for verifying program correctness, not for real computation.
3. OpenCL/SYCL/oneAPI. If the goal is cross-platform GPU acceleration, SYCL/oneAPI or OpenCL is the recommended route. A Docker image can package these runtimes without CUDA. The drawback of this method is that the original CUDA program must be rewritten.
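The HIPIFY route in option 1 is, for the common runtime APIs, largely a mechanical renaming of `cuda*` symbols to their `hip*` equivalents (AMD ships hipify-perl, which works by textual substitution, and hipify-clang, which parses the source). The following is a toy Python sketch of that substitution, using a small hand-picked subset of real renames; real HIPIFY tools cover thousands of symbols and many edge cases this sketch ignores.

```python
# Illustrative subset of the CUDA -> HIP renames that HIPIFY applies.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

def hipify(source: str) -> str:
    """Apply the renames above to a CUDA source string.

    Longest names are replaced first so that e.g. cudaMemcpyHostToDevice
    is not clobbered by the shorter cudaMemcpy substitution.
    """
    for cuda_name, hip_name in sorted(
        CUDA_TO_HIP.items(), key=lambda kv: -len(kv[0])
    ):
        source = source.replace(cuda_name, hip_name)
    return source

print(hipify("cudaMalloc(&buf, n); cudaDeviceSynchronize();"))
# -> hipMalloc(&buf, n); hipDeviceSynchronize();
```

This is why the article notes the conversion still requires modifying (and then recompiling) the program against ROCm, rather than running unmodified CUDA binaries.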
In summary, without an NVIDIA GPU it is essentially impossible to run CUDA natively from a Docker image. ZLUDA or HIP (ROCm) can run CUDA programs, but compatibility is limited. If the goal is merely to package a CUDA environment, a Docker image can certainly do that, but actually running it still requires an NVIDIA GPU and driver.