llm.sycl
August 4, 2024
A cross-architecture port of Andrej Karpathy's llm.c to SYCL/Intel oneAPI.
Quick start
Quick start (GPU, fp32 only)
Run the single-GPU, fp32 code like this:
chmod u+x ./download_starter_pack.sh
./download_starter_pack.sh
make train_gpt2_fp32
./train_gpt2_fp32
Quick start (single kernels)
The dev/sycl directory contains a number of standalone kernels that can be compiled and run independently. These are the building blocks of the full model and the train_* files.
Take attention_forward.cpp as an example. The following compiles the attention forward pass kernel for Intel hardware:
cd dev/sycl/
make attention_forward
Then run it with:
./attention_forward
Use the -DCUDA and -DCUDA_ARCH flags to enable NVIDIA support. Similarly, use the -DHIP and -DHIP_ARCH flags for AMD support.