ABIs

TODO:

  • Implement a sensible way of handling vector lengths. Manually passing the length every time does not make sense, but neither does forcing all vectors to be the same length.
  • Do some more template magic for variable-argument functions: adding multiple arguments to a kernel in a 'single' function call.
  • Automatically generate the local/global work group sizes.
  • Output the current kernel graph as a .dot file and auto-generate a kernel (.cl) file with empty function definitions.
  • Make the kernel calls asynchronous using the OpenCL command queue (cl_command_queue) and events (cl_event).
  • Template magic for the template classes. Currently, if you want to add a new type to operate on, you have to recompile the library to include it. Maybe just pass a void* and a length and cast it back in the .cl function? If everything were included in the header files, you'd lose the templating madness of explicitly instantiating every template you use, but at the cost of quite a lot of (A) readability and (B) compile speed.
  • Benchmark asynchronous vs synchronous kernel calls.
  • Detect circular dependencies between kernels.
  • Add examples with image processing, deep learning and a renderer/raytracer (tbh the main use case of GPUs).
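
As a rough illustration of the circular-dependency item above, the check could be a depth-first search over the kernel graph. This is only a sketch under stated assumptions: `KernelGraph`, `visit` and `hasCircularDependency` are hypothetical names that do not exist in ABIs, and the graph is assumed to be an adjacency list in which every edge index is smaller than the number of kernels.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical representation (not part of ABIs): graph[i] lists the indices
// of the kernels that consume the output of kernel i.
using KernelGraph = std::vector<std::vector<std::size_t>>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first visit; returns true if a cycle is reachable from `node`.
bool visit(const KernelGraph& graph, std::size_t node, std::vector<Mark>& marks) {
  if (marks[node] == Mark::InProgress) return true;  // back edge: cycle found
  if (marks[node] == Mark::Done) return false;       // already known to be cycle-free
  marks[node] = Mark::InProgress;
  for (std::size_t next : graph[node]) {
    if (visit(graph, next, marks)) return true;
  }
  marks[node] = Mark::Done;
  return false;
}

// Returns true if any kernel (transitively) depends on its own output.
bool hasCircularDependency(const KernelGraph& graph) {
  std::vector<Mark> marks(graph.size(), Mark::Unvisited);
  for (std::size_t node = 0; node < graph.size(); ++node) {
    if (visit(graph, node, marks)) return true;
  }
  return false;
}
```

With this representation, a chain such as `{{1}, {2}, {}}` is accepted, while `{{1}, {2}, {0}}` is reported as circular because kernel 2 feeds back into kernel 0.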

Features

  • No exposed low-level C, just the C++ STL - Focus on programming the GPU instead of messing about with long C-style OpenCL function calls and manual memory management.
  • CMake support for Linux and Mac - No more linking problems, provided you have installed the correct driver.
  • The kernel type is a template: choose between processing floats, ints or chars on your GPU.
  • Variable number of input vectors, output vectors and scalars.
  • Human-readable OpenCL errors for easy debugging of your program, thanks to the error checks already present on every OpenCL call.
  • Includes an example of some slightly more advanced OpenCL to help you get started - computing the sum of a vector in logarithmic time (example/sum.cl).
  • (WORK IN PROGRESS, check the multikernel branch) The specification of a kernel chain in graph form. It automatically handles interdependencies between kernels and allows you to customize which arguments act as inputs/outputs to which other kernels.
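
To give an idea of what "sum of a vector in logarithmic time" means, here is a CPU-side sketch of a pairwise (tree) reduction. It is not the contents of example/sum.cl; `treeSum` is a hypothetical name used only for illustration. On a GPU, the inner loop would be spread over parallel work-items, so only the outer loop, which runs roughly log2(n) times, is sequential.

```cpp
#include <cstddef>
#include <vector>

// Pairwise (tree) reduction, illustrated sequentially on the CPU.
// The vector is taken by value because it is overwritten with partial sums.
template <typename T>
T treeSum(std::vector<T> values) {
  if (values.empty()) return T{};
  // Outer loop: one pass per tree level, about log2(n) passes in total.
  for (std::size_t stride = 1; stride < values.size(); stride *= 2) {
    // Inner loop: every iteration is independent of the others, so on a GPU
    // each one could be handled by its own work-item.
    for (std::size_t i = 0; i + stride < values.size(); i += 2 * stride) {
      values[i] += values[i + stride];
    }
  }
  return values[0];  // the full sum ends up in the first element
}
```

The input length does not have to be a power of two: elements without a partner at a given level are simply carried over to the next one.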

Overview: it's this easy!

// example/main.cpp
try {
  ABIs<int> framework (SHOW_DEBUG);

  //Load a kernel function
  framework.loadKernel("simplekernel.cl");

  //Bind the inputs and outputs to the kernel function arguments
  framework.setInputBuffer(0, std::vector<int> {1, 2, 3, 4, 5});
  framework.setSingleValue(1, 10);

  //Run the kernel and display the results
  framework.runKernel();
  framework.showAllValues();
}
catch (std::exception& e) {
  std::cerr << "Error: " << e.what() << std::endl;
}

// example/simplekernel.cl
__kernel void simplekernel(__global int* array, const int singlevalue)
{
	int i = get_global_id(0);
	array[i] = array[i] * singlevalue;
}
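
For checking results without a GPU, the kernel above can be mirrored by a plain C++ loop. This is a sketch, assuming the kernel is launched with one work-item per array element; `simplekernelReference` is a hypothetical helper and not part of ABIs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of ABIs): applies the same per-element
// update as simplekernel, with the loop index standing in for get_global_id(0).
std::vector<int> simplekernelReference(std::vector<int> array, int singlevalue) {
  for (std::size_t i = 0; i < array.size(); ++i) {
    array[i] = array[i] * singlevalue;
  }
  return array;
}
```

With the inputs from example/main.cpp, `{1, 2, 3, 4, 5}` and `10`, this reference returns `{10, 20, 30, 40, 50}`, which is what the kernel should write back to the buffer under the assumption above.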

Getting started

All platforms:

  • Update your graphics drivers
  • Install the drivers with OpenCL support (NVIDIA CUDA Toolkit or AMD APP SDK)

Then clone the repository, build it and run the test binary:

git clone https://github.com/Gladdy/ABIs.git
cd ABIs
mkdir build && cd build
cmake ..
make
./test

Thanks to: