Artificial Neural Networks have seen huge performance improvements when run on modern Graphics Processing Units (GPUs) or specialised accelerators. Achieving similar gains with Genetic Programming (GP) is significantly harder: GPUs offer far more data parallelism than function parallelism, so evaluating a diverse population of candidate programs causes instruction divergence, particularly when there is not enough training data to fully utilise the GPU. A popular solution is to interpret the GP candidate programs in parallel on the GPU, but interpretation carries a large performance overhead. We are researching an alternative approach in which the CPU orchestrates the parallel execution on the GPU of sub-trees that are common across the full set of GP candidates; because most sub-trees persist between generations, there is little compilation overhead.
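
To make the idea concrete, the Python sketch below (a simplified illustration, not our implementation) evaluates a toy GP population while caching each distinct sub-tree so it is computed only once over the full training batch; NumPy arrays stand in for the GPU kernels that the CPU would dispatch, and all names are illustrative.

    # Hypothetical sketch: batch-wise evaluation of shared GP sub-trees.
    # NumPy stands in for GPU kernels; names are illustrative only.
    import numpy as np

    def subtree_key(node):
        """Canonical key for a sub-tree, so identical sub-trees share one evaluation."""
        if isinstance(node, str):          # terminal: an input variable such as "x0"
            return node
        op, left, right = node
        return f"({op} {subtree_key(left)} {subtree_key(right)})"

    OPS = {"add": np.add, "sub": np.subtract, "mul": np.multiply}

    def evaluate(node, inputs, cache):
        """Evaluate a sub-tree over the whole training batch, reusing cached results."""
        key = subtree_key(node)
        if key in cache:                   # common sub-tree already computed
            return cache[key]
        if isinstance(node, str):
            result = inputs[node]          # vector of training cases
        else:
            op, left, right = node
            # On the real system this would be a GPU kernel launched by the CPU.
            result = OPS[op](evaluate(left, inputs, cache),
                             evaluate(right, inputs, cache))
        cache[key] = result
        return result

    # Toy population: two candidates sharing the sub-tree (mul x0 x1).
    inputs = {"x0": np.arange(8.0), "x1": np.ones(8)}
    population = [("add", ("mul", "x0", "x1"), "x0"),
                  ("sub", ("mul", "x0", "x1"), "x1")]
    cache = {}                             # shared across the whole population
    outputs = [evaluate(tree, inputs, cache) for tree in population]

Because the cache is keyed on sub-tree structure, a sub-tree that survives crossover and mutation can be reused in later generations, which is why the compilation overhead stays small.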
The open-source TORCS driving simulator has long been a test bed for driving AI, and has more recently been the subject of attempts to train driving AI through Reinforcement Learning. The main limitation is simulation speed: even in text mode TORCS runs slowly, so learning is slow. Training could be accelerated by running many instances of the TORCS software in parallel, but its system requirements are significant, so a large number of virtual machines running TORCS would be costly. Our approach is to extract the parts of the TORCS code needed for training (such as the physics model) and optimise them to run in lock-step with Reinforcement Learning, with the goal of training an AI driver that is able to drive well in the full TORCS simulation.
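
As a rough illustration of what running "in lock-step" means here, the sketch below steps a vectorised placeholder physics model and a toy policy together in the same loop; the dynamics and every name are illustrative stand-ins, since the real system would call the physics routines extracted from TORCS.

    # Hypothetical lock-step training loop; physics_step is a trivial kinematic
    # stand-in for the physics code that would be extracted from TORCS.
    import numpy as np

    N = 1024                               # many cars simulated in parallel

    def physics_step(state, action, dt=0.02):
        """Advance all simulated cars by one tick (placeholder dynamics, not TORCS)."""
        x, v = state
        throttle = np.clip(action, -1.0, 1.0)
        v = v + throttle * dt              # crude longitudinal model
        x = x + v * dt
        return (x, v)

    def policy(state, weights):
        """Toy linear policy mapping observations to a throttle command."""
        x, v = state
        obs = np.stack([x, v], axis=1)
        return np.tanh(obs @ weights)

    state = (np.zeros(N), np.zeros(N))
    weights = np.random.randn(2) * 0.1
    returns = np.zeros(N)

    for step in range(1000):              # simulator and learner advance together
        action = policy(state, weights)
        state = physics_step(state, action)
        returns += state[1]                # example reward: forward speed
        # ...a policy update (e.g. policy gradient) would run here each tick...

Stripping the simulator down to the routines the learner actually needs, and vectorising them, is what makes it feasible to run many such environments far faster than full TORCS instances.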