I've now written enough on Ocius to consider it "done for the summer", so I thought I'd wrap it up in a big blog post explaining its files and the features within. It's been a very educational summer; I've learned a lot and gained a whole lot of experience. C++ is a very important language in high performance fields such as game development, which is kind of sad, but probably not changing any time soon. If you want the most out of the machine you have to get closer to it, and C++ does that for you.

If you want to check out the code, you can clone it as such (if you find anything bad, please let me know):
    git clone git://silven.nu/ocius
    cd ocius/
    mkdir build
    cd build/
    cmake ..
    make
    sudo make install
Ocius in itself has no dependencies. The OpenCL parts require OpenCL; the interop and OpenGL parts require OpenGL. If you wish to use Ocius' OpenCL capabilities without any OpenGL interop, you must #define __OCIUS_NO_GL so that your binary does not need to be linked with libGL. The rest of the functionality has no dependencies; the examples manipulating images, however, depend on PNG++.

First off is the OpenCL wrapper code found in ocius/ocl.hpp. It contains classes to easily create an OpenCL context, device and command queue, as well as some utilities for loading kernel programs from disk. Again, if you can't link against libGL you can #define __OCIUS_NO_GL to prevent it from using functions from there. The core of this file is the Environment class, which manages creation of everything from devices to buffers. Once you have a device and a kernel set up you can easily run it. Any cl::Event objects returned can be used to wait for results, as everything is done asynchronously. For more information check the dumpinfo, programbank and mandelbrot example programs.

Things I plan to do for this file include verbosity settings for debugging and support for exceptions. Some clean up is also needed. Otherwise I think it has all the functionality one would expect from a library like this.

Next is the file ocius/threadpool.hpp. It contains a simple thread pool where you can submit runnable functions for execution. There is much left to do here, such as private work queues with job stealing. But as far as I know, no compiler implements the C++11 thread_local keyword yet, so I will wait with adding such a feature until I know I can do it in a standards compliant way. It also currently does not support waitable tasks and futures; only void returning functions are supported. I will add this later, but for now one can use the method ThreadPool::wait() to block until all tasks are done. This is good enough for most use cases.
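To make the submit-then-wait pattern concrete, here is a minimal sketch of what such a pool could look like. It is not the code in ocius/threadpool.hpp: the names SketchPool, submit and sum_with_pool are assumptions made up for illustration, and only the idea of a blocking wait() comes from the description above. It uses one shared queue guarded by a mutex, with no job stealing and no futures.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool for illustration; NOT the actual Ocius ThreadPool.
class SketchPool {
public:
    explicit SketchPool(std::size_t n_threads) {
        for (std::size_t i = 0; i < n_threads; ++i)
            workers_.emplace_back([this] { run(); });
    }

    ~SketchPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        work_ready_.notify_all();
        for (auto& w : workers_) w.join();  // workers drain the queue first
    }

    // Only void-returning callables are accepted; no futures are handed back.
    void submit(std::function<void()> task) {
        assert(task && "submitted task must not be empty");
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        work_ready_.notify_one();
    }

    // Block until every task submitted so far has finished running.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        all_done_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                work_ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (--pending_ == 0) all_done_.notify_all();
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable work_ready_;
    std::condition_variable all_done_;
    std::queue<std::function<void()>> tasks_;
    std::size_t pending_ = 0;   // tasks queued or currently running
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Hypothetical usage helper: adds 1..n using one task per number.
long sum_with_pool(int n) {
    std::atomic<long> total{0};
    SketchPool pool(4);
    for (int i = 1; i <= n; ++i)
        pool.submit([&total, i] { total += i; });
    pool.wait();  // all additions are finished once this returns
    return total.load();
}
```

Calling sum_with_pool(100) submits one hundred small tasks, blocks in wait() until the pending counter reaches zero, and returns 5050. The counter tracks tasks that are queued or still running, which is why an empty queue alone is not enough to wake the waiter.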
The example program threadpool displays its usage.

Next is the file ocius/benchmark.hpp, which implements tools to benchmark code. You can perform a simple test where the same function is run X number of times and the results are then printed. The results include the total time, the average time and the standard deviation. It uses std::chrono::high_resolution_clock to get the elapsed time. No unnecessary looping time is measured. You can also create a custom benchmark, use it throughout your program to time different parts of it, and compile the results at a time and resolution of your choice. This way you can easily, albeit naively, profile your code to find bottlenecks. I'm rather proud of this particular file and I don't know what more features to add. For usage see the pi and threadpool example programs.

The file ocius/memorypool.hpp implements a Memory Pool of smart pointers pointing to objects on the heap for reuse. A pointer can be retrieved and used for a time, and when the final reference goes out of scope, the pointer is returned to the Memory Pool instead of a delete being performed on the underlying object. This allows for reuse of objects with a short lifespan. The memory pool is not concurrent, and that is something I wish to look into in the future. I don't know how useful this file is, as I've yet to find a good scenario for it. For a short demonstration see the memorypool example program.

Next is the file ocius/ogl.hpp, which has some OpenGL utility functions for creating shader programs and loading png++ image objects as textures. This, together with the ocius/ocl.hpp header file, enables OpenGL/OpenCL interop, such as manipulating texture objects and vertex buffer objects. For a demonstration see the glinterop example program.
This example requires GLEW and GLFW.

The file ocius/util.hpp is a utility file used by both the OpenCL and OpenGL code to do some error checking and to load files as text from disk.

Finally there is the file ocius/concurrency.hpp, which implements an exception safe guard class to prevent the destruction of std::thread objects before it goes out of scope. It also contains a thread safe queue which I am not happy with. It pushes rvalue references and pops std::shared_ptr objects, which causes double destruction of the input. This, however, is required due to concurrency issues when popping. I could make it so it also pushes std::shared_ptr objects, but that would make it cumbersome to use. I'm not happy with the performance or features of this class, and therefore I don't reuse it in any of the other files. A linked blocking queue is a good addition to any concurrent library and I wish to revisit this in the future, as well as implement a lock free version. However, there is already Intel TBB, which I guess does this way better than I could have done anyway.

The concurrency file also contains a naive class for message passing called a Hub. A Hub manages channels, not too unlike those used in boost::MPI. A Channel is then used to send messages to other channels, like a real life mailbox. I wanted to implement a channel like the one found in Golang, however I never figured out how to make it bidirectional without the use of the thread_local keyword. Besides, I really want a lock free queue for a structure like this. The Hub was supposed to be used in conjunction with the ThreadPool, in order to send messages to and from the workers. So in summary, the concurrency file is a failure and I'm not happy with it. For an example of concurrency usage, see the threads example.

While I'm on the topic of concurrency,
I wanted to implement more lock free structures such as a stack, a heap and maybe a quadtree. However, there doesn't seem to be any support for atomic operations on std::shared_ptr yet, so I simply couldn't, as I wanted Ocius to be a 100% standards compliant library. Operations such as parallelizing computation with OpenCL are only something you'd do on a modern machine, and on a modern machine you usually have things like double word hardware CAS operations, so it shouldn't be much of a problem relying on such features.

As a conclusion, here are a few words on the project as a whole. Ocius is a small part of what could be a good modern program. It's one tool, not a be-all and end-all solution. If you want to be a good programmer, just like a good carpenter, you need to know all the tools at your disposal. Other very interesting tools are TBB, OpenMP, OpenACC, or even C++ AMP if you're really into selling your soul. I wrote Ocius as a start for a personal toolbox for my upcoming master's thesis and work beyond that. It's not perfect; there are things I hope people point out to me so I can fix them. There are a lot of things to do. I never meant to save the world, only to take a step in the right direction.

I'm glad I got to be a part of the first IDA Summer of Code initiative. I've learned a lot during this time and I thank everyone who's been involved. Even though this is my last blog post about it, it will not be the last work I do related to it.