diff --git a/README.md b/README.md
index 267d04b1e..c4b75a91d 100644
--- a/README.md
+++ b/README.md
@@ -48,24 +48,96 @@

 Kompute is provided as a single header file [`Kompute.hpp`](#setup). See [build-system section](#build-overview) for configurations available.

-#### Your First Kompute
+#### Your First Kompute (Simple)

-In this simple example we will:
+This simple example will show the basics of Kompute through the high level API, including:

-1. Create a set of data tensors in host memory for processing
-2. Map the tensor host data into GPU memory with Kompute Operation
-3. Define shader as string or spirv bytes (can also pass path to file)
-4. Run compute shader asynchronously with Async function
-5. Create managed sequence to submit batch operations to the CPU
-6. Map data back to host by running the sequence of batch operations
+1. Create and initialise a set of data tensors for processing
+2. Run compute shader synchronously
+3. Create managed sequence to submit batch operations to the CPU
+4. Map data back to host by running operation

 View [more examples](https://kompute.cc/overview/advanced-examples.html#simple-examples).

 ```c++
 int main() {

-    // You can allow Kompute to create the Vulkan components, or pass your existing ones
-    kp::Manager mgr; // Selects device 0 and first compute queue unless explicitly requested
+    // Default manager selects device 0 and first compute queue
+    kp::Manager mgr;
+
+    // 1. Create and initialise a set of data tensors for processing
+    auto tensorA = mgr.buildTensor({ 3., 4., 5. });
+    auto tensorB = mgr.buildTensor({ 0., 0., 0. });
+
+    // 2. Run compute shader synchronously
+    mgr.evalOpDefault>(
+        { tensorA, tensorB },
+        shader); // "shader" explained below, and can be glsl/spirv string or path to file
+
+    // 3. Sync results from GPU memory to print the results
+    mgr.evalOpDefault({ tensorA, tensorB })
+
+    // Prints the output which is A: { 0, 1, 2 } B: { 3, 4, 5 }
+    std::cout << fmt::format("A: {}, B: {}",
+        tensorA.data(), tensorB.data()) << std::endl;
+}
+```
+
+Your shader can be provided as raw glsl/hlsl string, SPIR-V bytes array (using our CLI), or string path to file containing either. Below are the examples of the valid ways of providing shader.
+
+##### Raw GLSL/HLSL as std::string
+
+```c++
+static std::string shaderString = (R"(
+    #version 450
+
+    layout (local_size_x = 1) in;
+
+    layout(set = 0, binding = 0) buffer a { float pa[]; };
+    layout(set = 0, binding = 1) buffer b { float pb[]; };
+
+    void main() {
+        uint index = gl_GlobalInvocationID.x;
+        pb[index] = pa[index];
+        pa[index] = index;
+    }
+)");
+static std::vector shader(shaderString.begin(), shaderString.end());
+```
+
+##### SPIR-V Bytes as uint8_t / char array (using our CLI)
+
+You can use the Kompute [shader-to-cpp-header CLI](https://kompute.cc/overview/shaders-to-headers.html) to convert your GLSL/HLSL or SPIRV shader into C++ header file (see documentation link for more info).
+
+```c++
+static std::vector shader = { 0x03, //... spirv bytes go here)
+```
+
+##### File path to file containing raw glsl/hlsl or SPIRV bytes
+
+```c++
+static std::string shader = "path/to/shader.glsl";
+// Or SPIR-V
+static std::string shader = "path/to/shader.glsl.spv";
+```
+
+#### Your First Kompute (Extended)
+
+We will cover the same example as above but leveraging more advanced Kompute features:
+
+1. Create a set of data tensors in host memory for processing
+2. Map the tensor host data into GPU memory with Kompute Operation
+3. Run compute shader asynchronously with Async function
+4. Create managed sequence to submit batch operations to the CPU
+5. Map data back to host by running the sequence of batch operations
+
+View [more examples](https://kompute.cc/overview/advanced-examples.html#simple-examples).
+
+```c++
+int main() {
+
+    // Creating manager with Device 0, and a single queue of familyIndex 2
+    kp::Manager mgr(0, { 2 });

     // 1. Create a set of data tensors in host memory for processing
     auto tensorA = std::make_shared(kp::Tensor({ 3., 4., 5. }));
@@ -74,44 +146,28 @@ int main() {
     // 2. Map the tensor host data into GPU memory with Kompute Operation
     mgr.evalOpDefault({ tensorA, tensorB });

-    // 3. Define shader as string or spirv bytes (can also pass path to file)
-    std::string shader(R"(
-        #version 450
-
-        layout (local_size_x = 1) in;
-
-        layout(set = 0, binding = 0) buffer a { float pa[]; };
-        layout(set = 0, binding = 1) buffer b { float pb[]; };
-
-        void main() {
-            uint index = gl_GlobalInvocationID.x;
-            pb[index] = pa[index];
-            pa[index] = index;
-        }
-    )");
-
-    // 4. Run compute shader asynchronously with Async function
+    // 3. Run compute shader Asynchronously with explicit dispatch layout
     mgr.evalOpAsyncDefault>(
         { tensorA, tensorB },
-        std::vector(shader.begin(), shader.end()));
+        shader); // Using the same shader as above

-    // 4.1. Before submitting sequence batch we wait for the async operation
+    // 3.1. Before submitting sequence batch we wait for the async operation
     mgr.evalOpAwaitDefault();

-    // 5. Create managed sequence to submit batch operations to the CPU
+    // 4. Create managed sequence to submit batch operations to the CPU
     std::shared_ptr sq = mgr.getOrCreateManagedSequence("seq").lock();

-    // 5.1. Explicitly begin recording batch commands
+    // 4.1. Explicitly begin recording batch commands
     sq->begin();

-    // 5.2. Record batch commands
+    // 4.2. Record batch commands
     sq->recordrecordend();

-    // 6. Map data back to host by running the sequence of batch operations
+    // 5. Map data back to host by running the sequence of batch operations
     sq->eval();

     // Prints the output which is A: { 0, 1, 2 } B: { 3, 4, 5 }
@@ -143,7 +199,7 @@ int main() {
 The core architecture of Kompute include the following:
 * [Kompute Manager](https://kompute.cc/overview/reference.html#manager) - Base orchestrator which creates and manages device and child components
 * [Kompute Sequence](https://kompute.cc/overview/reference.html#sequence) - Container of operations that can be sent to GPU as batch
-* [Kompute Operation (Base)](https://kompute.cc/overview/reference.html#algorithm) - Individual operation which performs actions on top of tensors and (opt) algorithms
+* [Kompute Operation (Base)](https://kompute.cc/overview/reference.html#algorithm) - Base class from which all operations inherit
 * [Kompute Tensor](https://kompute.cc/overview/reference.html#tensor) - Tensor structured data used in GPU operations
 * [Kompute Algorithm](https://kompute.cc/overview/reference.html#algorithm) - Abstraction for (shader) code executed in the GPU