Sunday, November 26, 2006

System simulation

I'm (rather predictably) still working on my distributed rendering system.. but as it's a bit tiresome to make some modifications, recompile, restart.. -- more importantly, that it takes time to do so (1), I wrote a simple and nice simulation program that let me try different rendering/clustering strategies easily, swap them, evaluate or compare them, etc. That way I can concentrate on them rather than waiting 5-10 minutes to start visualizing a one gigabyte dataset on the cluster. Even better, the final idea is to integrate the simulation in the real system (it actually already gets timings from the real one and extrapolate the results) in order to have a nice feedback loop: run things, keep simulations in parallel, switch to other strategies if the simulation says it's better, update timings if needed, etc. Here is a screenshot of the current program:

The screenshot shows three simulated clustering strategies; basically we want to render an image using multiple computers. Each computer having one or more rendering agent. We divide the image in tiles and the rendering agents' job is to render them, send them back to the clustering agent, which will recompose the final image to send it back to a visualisation client. It's a fairly straightforward distributed technic. What's more interesting here is that we have three different strategies defining which tiles are sent to which agent. The simplest one is to divide the total number of tiles by the number of agents, and send continuous tiles up to that number for each agent (second bottom figure -- each color represents one agent). Another is to alternate tiles (eg no consecutive tiles are sent). Another one is to break the tiles according the the image complexity (first bottom left figure). The image complexity beeing the time each tile took to be renderered (bottom right figure).

The graphic shows the evolution of these three strategies depending on the number of agents. We can see that the alternate tiles strategy works fairly well usually (because the complexity is more evenly distributed), unless we ends on a multiple of the width (16), meaning the tiles are aligned, and thus the complexity is not properly distributed. Having a simulator can let us choose which strategy is the best before having to really use it.

(1) which break one of my Rules of Programming, ie, keep the REPL (Read-Eval-Print-Loop) cycle as short as possible: as obviously the shortest it is, the more ideas you can try.