Jem Alexander speaks to Ivan del Duca, technical director at Milestone, about the potential for neural networks to vastly improve the artificial intelligence of computer-controlled racers in driving games
Science fiction is nothing new when it comes to video games, with titles like Mass Effect, Deus Ex and Destiny among some of the industry’s most popular brands. But one aspect of sci-fi has been sadly lacking from the world of interactive entertainment: neural networks are quickly becoming science fact but have been woefully underserved in the development of video games. Italian developer Milestone is looking to change that, and has invested in an internal research and development team to try to unlock the potential of driverless cars in the racing genre.
Ivan Del Duca, the studio’s technical director, has strong views about the use of neural nets in creating its upcoming racers. “My opinion is that artificial intelligence will be very much discussed in the future. It will be the future of humanity, probably,” he says. “Everyone is moving towards AI. I think that the gaming industry is a bit late on this train. Games are very complex, and so I understand why we are late, but I think it’s time to move on with this kind of technology.
“For racing games I think this is an ideal starting point, since we have AIs that are driving cars in the real world. Autonomous vehicles are already a reality, so it’s obviously possible to do it in a video game.”
Milestone doesn’t expect to see results from this internal team, devoted to neural nets and advanced AI, very quickly. This is a case of investing in the studio’s future.
“We are still experimenting, it’s a project that will last two years,” says Del Duca. “The biggest problem that we are facing is the training cost. It is still very high. We have amazing results, but to train an AI on a specific track, it still takes weeks. It’s just like training a dog or any other living being.”
To train an AI on a specific track still takes weeks. It's just like training a dog
Ivan del Duca, Milestone
“We are trying to optimise the training process to be able to train the network in maybe a day or even less. Currently we are experimenting with solo driving, with just one car on the track. We are using a Deep Deterministic Policy Gradient network. It’s very complex, but basically it’s a network layered like our brain, so it’s trained just like our brain. This agent (the networks are called agents in the AI world) is put on a track, and the only means this agent has to control its actions are steering, pressing the throttle and so on. Simple actions.”
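Milestone hasn’t published its implementation, but the idea Del Duca describes can be sketched in miniature: an actor network, layered like the DDPG actors used in reinforcement learning research, that maps the car’s observed state to continuous steering and throttle values. Every size, state feature and weight here is an illustrative assumption, not the studio’s code.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 8    # e.g. speed, heading, distances to track edges (assumed features)
HIDDEN = 16      # one small hidden layer, purely for illustration
ACTION_DIM = 2   # [steering, throttle]

# Randomly initialised weights for a tiny two-layer actor network
W1 = rng.normal(scale=0.1, size=(STATE_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, ACTION_DIM))

def act(state):
    """Map a state vector to continuous actions in [-1, 1]:
    steering (left/right) and throttle (brake/accelerate)."""
    hidden = np.tanh(state @ W1)   # a layered network, loosely brain-like
    return np.tanh(hidden @ W2)    # tanh bounds both actions to [-1, 1]

state = rng.normal(size=STATE_DIM)   # a fake observation of the track
steering, throttle = act(state)
```

Training then consists of adjusting `W1` and `W2` so that the actions the network picks earn higher rewards over thousands of laps.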
This set of simple rules allows the agent to make basic decisions in order to reach its goal: in this case, finding the fastest route around a track. When it does so, just like a dog performing a trick, it will receive belly rubs and a tasty treat. Or the AI equivalent, at least.
“The reward function gives a reward to the agent when driving in a good way, on the racing line, with good times,” explains Del Duca. “It’s like a game. You give the AI some points: plus points if it does something right and minus points if it does something wrong. At the end of the epoch, the process of training an AI, the rating of the AI is calculated based on the points it has obtained. There is a system, called backpropagation, that corrects and recalibrates the errors, training the weights of every neuron inside the network.”
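The points system Del Duca outlines can be sketched as a simple reward function: plus points for staying near the racing line and posting good times, minus points otherwise, summed into a rating at the end of an epoch. The thresholds and point values below are assumptions for illustration; Milestone’s real reward shaping is not public.

```python
def reward(distance_from_racing_line, lap_time_delta):
    """Plus points for good driving, minus points for bad.
    distance_from_racing_line: metres off the ideal line (assumed unit).
    lap_time_delta: seconds vs a reference time; negative means faster."""
    points = 0.0
    points += 1.0 if distance_from_racing_line < 0.5 else -1.0
    points += 1.0 if lap_time_delta < 0.0 else -0.5
    return points

# One epoch's worth of (distance, time delta) samples, invented for the demo.
episode = [(0.2, -0.1), (1.4, 0.3), (0.1, -0.2)]

# The agent's rating is the sum of its points; backpropagation would then
# nudge every weight in the network in the direction that raises this rating.
rating = sum(reward(d, t) for d, t in episode)
print(rating)  # 2.0 + (-1.5) + 2.0 = 2.5
```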
Of course, just like a puppy on its first day of training, artificial intelligences start out peeing on the carpet and chasing their own tails.
“At the start, the AI is totally stochastic, so it goes everywhere and it tries to understand what’s the right approach for racing on the track,” Del Duca continues. “With a complex system of reward and recalibration, the network learns to drive on the track. We now have some agents that are very good drivers on some tracks, and they also powerslide and take advantage of all the characteristics of the vehicle they drive, but we still have to introduce the opponents. This will be a very hard part. There’s nothing ready to be put in a game right now, but there will be in the next ten months or so.”
WHY NEURAL NETWORKS?
This may all seem like a very long and complicated way to achieve something that driving game developers have been able to achieve for a while now, but Del Duca insists that the benefits will vastly outweigh the time and cost investment. “There are several benefits,” he explains. “Firstly, we are amazed by the results. The normal approach to AI in racing games is a heuristic approach. We have a complex problem, but it’s so complex that we have to simplify it and try to understand what must be solved, what can be approximated and so on.
“The usual method in artificial intelligence is that the actor knows the racing line and tries to follow it. We had some rules: for example, if there’s someone in front of you that is slower than you, you can calculate what to do when you approach it. If the actor will reach a car within two seconds it can try to overtake it, but if there’s a bend in 500m, it shouldn’t overtake because it can be dangerous, and so on. It can be very complex. It’s composed of hundreds and hundreds of rules that aren’t always right.
“This means that we can’t always obtain the results that we want, especially for group behaviour. If the AI is running alone, that’s simple enough, but when there are many bikes in a group, they need to think about different strategies, like blocking the driver behind you or trying to overtake the one in front of you by running a bit outside the track. But it will never be a natural behaviour. It will always be behaviour that follows some rules.”
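The hand-written rules Del Duca describes can be sketched as a single decision function. The two-second and 500m figures come from his example; the function name, parameters and closing-speed logic are assumptions for illustration, and a shipping game would layer hundreds of such rules on top of each other.

```python
def should_overtake(gap_m, closing_speed_ms, distance_to_bend_m):
    """One heuristic rule: overtake a slower car you will reach
    within two seconds, unless a bend is within 500 m."""
    if closing_speed_ms <= 0:
        return False                 # not catching the car ahead at all
    time_to_reach = gap_m / closing_speed_ms
    if time_to_reach > 2.0:
        return False                 # not close enough yet
    if distance_to_bend_m < 500.0:
        return False                 # too dangerous to overtake near a bend
    return True

print(should_overtake(20.0, 15.0, 800.0))  # True: closing fast, straight ahead
print(should_overtake(20.0, 15.0, 300.0))  # False: bend too close
```

It is exactly this kind of brittle, case-by-case logic that a trained network would replace with behaviour learned from laps driven.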
But these rules can always be gamed once the player (or creator) understands how it thinks. One of the beautiful and curious things about neural networks is that behaviour can evolve and suddenly the teacher becomes the student.
“AI can be unpredictable, and this is the main benefit,” Del Duca says. “If we are able to train an artificial intelligence to behave in the best way, it will decide what the best approach to a situation will be without us knowing why it decided this. This is fascinating, and this is our main target right now. We have AI designers that continuously ask for new features: ‘we want to support the slipstream in this way, or this other way; in this game it works, but in this other game it doesn’t work’, and so on.
“In the future AI will take care of everything. It’s a bit like the artificial intelligence that Google is using for translation. It learns autonomously to make translations without someone inputting text or inputting variations. You leave it there and after six months it translates text way better than six months earlier. The forward learning approach is, I think, the way to go.”
Theoretically, this learning could continue when it’s in the hands of the players, meaning that the game’s AI drivers continue to get better even after launch, though this could lead to some kind of racing car Skynet and spell the end of humanity as we know it. Okay, maybe not that extreme, but it’s still not something the industry is quite ready for. “It could continue to learn post-launch,” Del Duca says. “But I don’t think this is something we want, because it would change the gameplay, basically. It would be too unpredictable. Also the learning process is very consuming in terms of CPU, so we have clusters of CPUs that train the AI because it’s thousands and thousands of laps and experiences.”