Vulkan Kompute

The General Purpose Vulkan Compute Framework.

Blazing fast, lightweight, easy to set up and optimized for advanced GPU processing use cases.

🔋 Documentation | 💻 Import to your project | 💾 Tutorials

Principles & Features

  • Single header library for simple import to your project
  • Documentation leveraging doxygen and sphinx
  • BYOV: Bring-your-own-Vulkan design to play nice with existing Vulkan applications
  • Non-Vulkan core naming conventions to disambiguate Vulkan vs Kompute components
  • Fast development cycles with shader tooling, but robust static shader binary bundles for production
  • Explicit relationships for GPU and host memory ownership and memory management
  • Simple use cases as well as advanced machine learning & data processing examples

Getting Started

Setup

Kompute is provided as a single header file, Kompute.hpp, that can be included in your code and linked against the shared library.

This project is built with CMake, which provides a simple way to integrate it as a static or shared library.

Your first Kompute

Run your tensors against default operations via the Manager.

#include <iostream>
#include <memory>
#include <fmt/ranges.h> // fmt::format support for std::vector
#include "Kompute.hpp"

int main() {

    kp::Manager mgr; // Automatically selects Device 0

    // Create 3 tensors of default type float
    auto tensorLhs = std::make_shared<kp::Tensor>(kp::Tensor({ 0., 1., 2. }));
    auto tensorRhs = std::make_shared<kp::Tensor>(kp::Tensor({ 2., 4., 6. }));
    auto tensorOut = std::make_shared<kp::Tensor>(kp::Tensor({ 0., 0., 0. }));

    // Create tensor data in GPU
    mgr.evalOpDefault<kp::OpCreateTensor>({ tensorLhs, tensorRhs, tensorOut });

    // Run Kompute operation on the parameters provided with dispatch layout.
    // No custom shader is passed here because OpMult is a default operation.
    mgr.evalOpDefault<kp::OpMult<3, 1, 1>>(
        { tensorLhs, tensorRhs, tensorOut });

    // Prints the output which is { 0, 4, 12 }
    std::cout << fmt::format("Output: {}", tensorOut->data()) << std::endl;
}
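
To check what the multiplication example should produce without a GPU, the element-wise product can be sketched on the CPU. This is only a reference for the arithmetic, not how Kompute computes it; `referenceMult` is a hypothetical helper name that does not exist in the library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// CPU-only reference for an element-wise product; NOT part of Kompute.
// `referenceMult` is a hypothetical name used purely for illustration.
std::vector<float> referenceMult(const std::vector<float>& lhs,
                                 const std::vector<float>& rhs)
{
    if (lhs.size() != rhs.size()) {
        throw std::invalid_argument("tensors must have the same length");
    }
    std::vector<float> out(lhs.size());
    // Each index is independent of the others, which is what allows a GPU
    // to process them in parallel.
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        out[i] = lhs[i] * rhs[i];
    }
    return out;
}
```

Calling `referenceMult({ 0.f, 1.f, 2.f }, { 2.f, 4.f, 6.f })` yields `{ 0, 4, 12 }`, matching the output stated in the example.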

Pass compute shader data (as raw source or compiled SPIR-V) for faster dev cycles.

#include <iostream>
#include <memory>
#include <string>
#include <vector>
#include <fmt/ranges.h> // fmt::format support for std::vector
#include "Kompute.hpp"

int main() {

    kp::Manager mgr(1); // Explicitly selecting device 1

    auto tensorA = std::make_shared<kp::Tensor>(kp::Tensor({ 0, 1, 2 }));
    auto tensorB = std::make_shared<kp::Tensor>(kp::Tensor({ 2, 4, 6 }));

    // Define your shader as a string, or directly pass the compiled bytes
    std::string shader(
        "#version 450\n"
        "layout (local_size_x = 1) in;\n"
        "layout(set = 0, binding = 0) buffer bufa { uint a[]; };\n"
        "layout(set = 0, binding = 1) buffer bufb { uint b[]; };\n"
        "void main() {\n"
        "    uint index = gl_GlobalInvocationID.x;\n"
        "    b[index] = a[index];\n"
        "    a[index] = index;\n"
        "}\n"
    );

    // Create tensor data in GPU
    mgr.evalOpDefault<kp::OpCreateTensor>({ tensorA, tensorB });

    // Run Kompute operation on the parameters provided with dispatch layout
    mgr.evalOpDefault<kp::OpMult<3, 1, 1>>(
        { tensorA, tensorB },
        true, // Whether to retrieve the output from GPU memory
        std::vector<char>(shader.begin(), shader.end()));

    // The shader copies A's original values into B and writes each
    // invocation index into A
    std::cout << fmt::format("A: {}, B: {}",
        tensorA->data(), tensorB->data()) << std::endl;
}
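
To reason about what the shader body does, its two assignments can be emulated on the CPU, with one loop iteration standing in for one invocation. This is a sketch under assumptions: `emulateShader` is a hypothetical helper, the buffers are modelled as `uint32_t` because the shader declares them as `uint`, and how the float contents of a tensor map onto those buffers is not covered here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU emulation of the shader body, one loop iteration per invocation.
// `emulateShader` is a hypothetical helper, NOT a Kompute function.
void emulateShader(std::vector<std::uint32_t>& a, std::vector<std::uint32_t>& b)
{
    for (std::size_t index = 0; index < a.size() && index < b.size(); ++index) {
        b[index] = a[index];                          // b[index] = a[index];
        a[index] = static_cast<std::uint32_t>(index); // a[index] = index;
    }
}
```

With `a = { 3, 4, 5 }` and `b = { 0, 0, 0 }`, the emulation leaves `a = { 0, 1, 2 }` and `b = { 3, 4, 5 }`: `b` receives the original contents of `a`, and `a` is overwritten with the invocation indices.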

Pass a file path to the shader data (raw or compiled SPIR-V) for faster dev cycles.

#include <iostream>
#include <memory>
#include <fmt/ranges.h> // fmt::format support for std::vector
#include "Kompute.hpp"

int main() {

    kp::Manager mgr; // Automatically selects Device 0

    auto tensorA = std::make_shared<kp::Tensor>(kp::Tensor({ 0, 1, 2 }));
    auto tensorB = std::make_shared<kp::Tensor>(kp::Tensor({ 2, 4, 6 }));

    // Create tensor data in GPU
    mgr.evalOpDefault<kp::OpCreateTensor>({ tensorA, tensorB });

    // Run Kompute operation on the parameters provided with dispatch layout
    mgr.evalOpDefault<kp::OpMult<3, 1, 1>>(
        { tensorA, tensorB },
        true, // Whether to retrieve the output from GPU memory
        "path/to/shader.comp");

    // Prints both tensors after the shader at path/to/shader.comp has run
    std::cout << fmt::format("A: {}, B: {}",
        tensorA->data(), tensorB->data()) << std::endl;
}
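
If you would rather load the shader bytes yourself and hand them over as a `std::vector<char>` (the form used in the string-based example), reading a file into a byte buffer might look like the sketch below. `readShaderBytes` is a hypothetical helper written for illustration; it is not a Kompute function and says nothing about how Kompute loads files internally.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file into a byte buffer.
// Binary mode keeps compiled SPIR-V bytes unmodified on every platform.
std::vector<char> readShaderBytes(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        throw std::runtime_error("could not open shader file: " + path);
    }
    return std::vector<char>(std::istreambuf_iterator<char>(file),
                             std::istreambuf_iterator<char>());
}
```

The helper throws when the file cannot be opened, so a wrong path fails early rather than passing an empty buffer to the GPU.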

Record commands into a Sequence to send them to the GPU as a batch in a single submit.

#include <iostream>
#include <memory>
#include <fmt/ranges.h> // fmt::format support for std::vector
#include "Kompute.hpp"

int main() {

    kp::Manager mgr;

    std::shared_ptr<kp::Tensor> tensorLHS{ new kp::Tensor({ 0.0, 1.0, 2.0 }) };
    std::shared_ptr<kp::Tensor> tensorRHS{ new kp::Tensor({ 2.0, 4.0, 6.0 }) };
    std::shared_ptr<kp::Tensor> tensorOutput{ new kp::Tensor({ 0.0, 0.0, 0.0 }) };

    // Create a new sequence
    std::weak_ptr<kp::Sequence> sqWeakPtr = mgr.getOrCreateManagedSequence();

    // The manager owns the sequence, so lock the weak pointer before use
    if (std::shared_ptr<kp::Sequence> sq = sqWeakPtr.lock())
    {
        // Begin recording commands
        sq->begin();

        // Record batch commands to send to GPU
        sq->record<kp::OpCreateTensor>({ tensorLHS });
        sq->record<kp::OpCreateTensor>({ tensorRHS });
        sq->record<kp::OpCreateTensor>({ tensorOutput });
        sq->record<kp::OpMult<>>({ tensorLHS, tensorRHS, tensorOutput });

        // Stop recording
        sq->end();

        // Submit operations to GPU
        sq->eval();
    }

    std::cout << fmt::format("Output: {}", tensorOutput->data()) << std::endl;
}
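
The begin / record / end / eval flow above is a record-then-submit pattern: recording only queues work, and nothing runs until the batch is evaluated. The miniature below illustrates that idea in plain C++. It is a hypothetical sketch, not `kp::Sequence` and not how Kompute implements it; `MiniSequence` and `runMiniSequenceDemo` are made-up names.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the record-then-submit pattern; NOT kp::Sequence.
class MiniSequence
{
  public:
    void begin() { mRecording = true; mOps.clear(); }

    // Recording only stores the operation; nothing runs yet.
    // Returns false when called outside begin()/end().
    bool record(std::function<void()> op)
    {
        if (!mRecording) return false;
        mOps.push_back(std::move(op));
        return true;
    }

    void end() { mRecording = false; }

    // Runs every recorded operation in order, as one batch.
    // Returns the number of operations executed (0 while still recording).
    std::size_t eval()
    {
        if (mRecording) return 0;
        for (auto& op : mOps) op();
        return mOps.size();
    }

  private:
    bool mRecording = false;
    std::vector<std::function<void()>> mOps;
};

// Demonstrates that recorded work is deferred until eval()
std::vector<std::string> runMiniSequenceDemo()
{
    std::vector<std::string> log;
    MiniSequence sq;
    sq.begin();
    sq.record([&log] { log.push_back("create"); });
    sq.record([&log] { log.push_back("mult"); });
    log.push_back("recorded"); // logged before any operation has run
    sq.end();
    sq.eval();
    return log;
}
```

`runMiniSequenceDemo()` returns `recorded`, `create`, `mult` in that order, showing that the two operations ran only once `eval()` was called.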

Advanced Examples

We cover more advanced examples and applications of Vulkan Kompute, such as machine learning algorithms built on top of Kompute.

You can find these in the advanced examples documentation section, such as the logistic regression example.

Build Overview

Dependencies

Kompute is expected to be used across a broad range of architectures and hardware, so dependencies are kept to a minimum.

Required dependencies

The only required dependency in the build is Vulkan (vulkan.h and vulkan.hpp which are both part of the Vulkan SDK).

Optional dependencies

SPDLOG is the preferred logging library; however, by default Vulkan Kompute runs without SPDLOG by overriding the logging macros. It also provides an easy way to override the macros if you prefer to bring your own logging framework. The macro override is the following:

#ifndef KOMPUTE_LOG_OVERRIDE // Use this if you want to define custom macro overrides
#if KOMPUTE_SPDLOG_ENABLED // Use this if you want to enable SPDLOG
#include <spdlog/spdlog.h>
#endif //KOMPUTE_SPDLOG_ENABLED
// ... Otherwise it adds macros that use std::cout (and only print first element)
#endif // KOMPUTE_LOG_OVERRIDE

You can choose to build with or without SPDLOG by using the cmake flag KOMPUTE_OPT_ENABLE_SPDLOG.
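
The override mechanism relies on a standard preprocessor pattern: a header defines its default logging macros only when the user has not announced an override first. The sketch below shows that pattern in a single file. `KOMPUTE_LOG_OVERRIDE` is the flag named above; `DEMO_LOG_INFO`, `customLogSink` and `logTensorCreated` are hypothetical names chosen for illustration and are not Kompute's actual macros or functions.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical sink standing in for "your own logging framework"
inline std::ostringstream& customLogSink()
{
    static std::ostringstream sink;
    return sink;
}

// Step 1 (user code): announce the override and supply a replacement macro.
// DEMO_LOG_INFO is a made-up name, not one of Kompute's macros.
#define KOMPUTE_LOG_OVERRIDE
#define DEMO_LOG_INFO(msg) (customLogSink() << "[custom] " << msg << '\n')

// Step 2 (what a header following this pattern might contain): the default
// is only defined when no override was announced.
#ifndef KOMPUTE_LOG_OVERRIDE
#define DEMO_LOG_INFO(msg) (std::cout << "INFO: " << msg << std::endl)
#endif

// Library-style code logs through the macro without knowing which
// definition is active.
std::string logTensorCreated()
{
    DEMO_LOG_INFO("tensor created");
    return customLogSink().str();
}
```

Because the flag is defined before the guarded block, the message is routed to the custom sink instead of `std::cout`; removing the two `#define` lines in step 1 would restore the default.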

Motivations

Vulkan Kompute was created after identifying the challenge most GPU processing projects with Vulkan undergo - namely having to build extensive boilerplate for Vulkan and create abstractions and interfaces that expose the core compute capabilities. It is only after a few thousand lines of code that it's possible to start building the application-specific logic.

We believe Vulkan has an excellent design in the way it interacts with the GPU, so by no means do we aim to abstract or hide any complexity. Instead we want to provide a baseline of tools and interfaces that allow Vulkan Compute developers to focus on the higher level computational complexities of the application.

It is because of this that we have adopted development principles for the project that ensure the Vulkan API is augmented specifically for computation, whilst speeding development iterations and opening the doors to further use-cases.

Components & Architecture

The core architecture of Kompute includes the following:

  • Kompute Manager - Base orchestrator which creates and manages device and child components
  • Kompute Sequence - Container of operations that can be sent to GPU as batch
  • Kompute Operation - Individual operation which performs actions on top of tensors and (opt) algorithms
  • Kompute Tensor - Tensor structured data used in GPU operations
  • Kompute Algorithm - Abstraction for (shader) code executed in the GPU
  • Kompute ParameterGroup - Container that can group tensors to be fed into an algorithm

To see a full breakdown you can read further in the documentation.

[Diagrams: Full Vulkan Components | Simplified Kompute Components]

(The diagrams are very small; check the docs for details.)

Kompute Development

We appreciate PRs and Issues. If you want to contribute try checking the "Good first issue" tag, but even using Vulkan Kompute and reporting issues is a great contribution!

Contributing

Dev Dependencies

  • Testing
    • Catch2
  • Documentation
    • Doxygen (with Dot)
    • Sphinx

Development

  • Follows the Mozilla C++ Style Guide https://www-archive.mozilla.org/hacking/mozilla-style-guide.html
    • Uses a post-commit hook to run the linter; you can set it up so that it runs the linter before commit instead
    • All dependencies are defined in vcpkg.json
  • Uses cmake as the build system, and provides a top-level Makefile with the recommended commands
  • Uses xxd (or the xxd.exe Windows 64-bit port) to convert shader SPIR-V to header files
  • Uses doxygen and sphinx for documentation and autodocs
  • Uses vcpkg for finding the dependencies; it's the recommended setup to retrieve the libraries
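
For readers unfamiliar with the xxd step listed above: `xxd -i` turns a binary file into a C array plus a length variable, which lets compiled shaders be bundled into the library as headers. The sketch below approximates that conversion to show the idea. `bytesToHeader` is a hypothetical helper, the formatting is simplified compared with real xxd output, and it is not the project's build tooling.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper approximating the idea behind `xxd -i`:
// emit the bytes as a C array and record the length alongside it.
std::string bytesToHeader(const std::vector<unsigned char>& bytes,
                          const std::string& name)
{
    std::string out = "unsigned char " + name + "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        char hex[8];
        // Format each byte as two lowercase hex digits, e.g. 0x03
        std::snprintf(hex, sizeof(hex), "0x%02x", static_cast<unsigned>(bytes[i]));
        out += (i == 0 ? " " : ", ");
        out += hex;
    }
    out += " };\n";
    out += "unsigned int " + name + "_len = " + std::to_string(bytes.size()) + ";\n";
    return out;
}
```

For the four bytes `03 02 23 07` and the name `shader_spv`, the helper returns `unsigned char shader_spv[] = { 0x03, 0x02, 0x23, 0x07 };` followed by `unsigned int shader_spv_len = 4;`.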

Updating documentation

To update the documentation you will need to:

  • Run the gendoxygen target in the build system
  • Run the gensphynx target in the build system
  • Push to GitHub Pages with make push_docs_to_ghpages

Running tests

To run the tests you can use the top-level helper Makefile.

For Visual Studio you can run:

make vs_cmake
make vs_run_tests VS_BUILD_TYPE="Release"

For Unix you can run:

make mk_cmake MK_BUILD_TYPE="Release"
make mk_run_tests