llm.sycl

August 4, 2024

A cross-architecture port of Andrej Karpathy's llm.c to SYCL/Intel oneAPI.

Quick start

Quick start (GPU, fp32 only)

Run the single-GPU, fp32 code like this:

chmod u+x ./download_starter_pack.sh
./download_starter_pack.sh
make train_gpt2_fp32
./train_gpt2_fp32

Quick start (single kernels)

The dev/sycl directory contains a number of standalone kernels that can be compiled and run independently. These are the building blocks of the full model and the train_* files.

Take attention_forward.cpp as an example. The following commands compile the attention forward-pass kernel for Intel hardware:

cd dev/sycl/
make attention_forward

Then run it with

./attention_forward

Use the -DCUDA and -DCUDA_ARCH flags to enable NVIDIA support. Similarly, use the -DHIP and -DHIP_ARCH flags to enable AMD support.
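As a rough illustration of what such -D flags do, here is a minimal, hypothetical sketch of compile-time backend selection with the preprocessor. Only the macro names CUDA, CUDA_ARCH, HIP and HIP_ARCH come from the flags above. The helper functions, the returned strings, and the idea that the sources branch this way are assumptions made for illustration, so the repository's actual code may use these macros differently.

```cpp
#include <string>

// Turn a macro's value into a string literal, so an unquoted value
// passed as -DCUDA_ARCH=<value> can be reported at run time.
#define STR_IMPL(x) #x
#define STR(x) STR_IMPL(x)

// Hypothetical helper: names the backend chosen at compile time.
// With neither CUDA nor HIP defined, the Intel path is the default.
std::string selected_backend() {
#if defined(CUDA)
    return "nvidia";
#elif defined(HIP)
    return "amd";
#else
    return "intel";
#endif
}

// Hypothetical helper: returns the architecture value passed alongside
// the backend flag, or "default" when no architecture macro applies.
std::string selected_arch() {
#if defined(CUDA) && defined(CUDA_ARCH)
    return STR(CUDA_ARCH);
#elif defined(HIP) && defined(HIP_ARCH)
    return STR(HIP_ARCH);
#else
    return "default";
#endif
}
```

Compiled with no extra flags, selected_backend() returns "intel" and selected_arch() returns "default". Adding -DCUDA or -DHIP to the compiler command line switches the branch without any change to the source.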