Playground Games' Chris Makin provides advice on how to make sure that tools do their jobs, based on an ethos of ‘prevention is better than cure’
Don’t let your tools become DoS attacks
Many of us have been guilty of it at one time or another: we knock together some quick tools to query the version management system, perhaps to check the status of files in the repository. Then suddenly multiple queries get fired at the server simultaneously, especially at peak times.
The server is busy processing those requests, so its response rate slows down. Rather than wait, users keep hitting refresh, which compounds the problem by exhausting TCP/IP ports, locking up the network and creating what is in effect a denial-of-service (DoS) attack.
There are several ways to stop this happening:
- Put protection rules in place within the local firewall, with minimum and maximum values, and limit the number of concurrent users.
- Tune the network stack so that the maximum number of ports is available.
- Minimise the ‘alive’ time. If using Perforce Helix, enable lockless reads, which ensure that operations can continue rather than being held until the read-lock is released.
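The same ‘prevention’ idea can also be applied inside the tool itself. The following is a minimal Python sketch, not a description of any Playground or Perforce code: it assumes a hypothetical `query_fn` that performs the real server request, caps the number of queries in flight, and answers repeated refreshes from a short-lived cache. The names `ThrottledClient`, `max_concurrent` and `min_interval` are illustrative assumptions.

```python
import threading
import time


class ThrottledClient:
    """Wraps a query function so a tool cannot flood the server.

    - at most `max_concurrent` queries are in flight at once
    - results are cached for `min_interval` seconds, so a user who keeps
      hitting refresh gets the cached answer instead of a new request
    """

    def __init__(self, query_fn, max_concurrent=2, min_interval=5.0,
                 clock=time.monotonic):
        self._query_fn = query_fn  # hypothetical: whatever talks to the server
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._min_interval = min_interval
        self._clock = clock
        self._cache = {}  # key -> (timestamp, result)
        self._lock = threading.Lock()

    def query(self, key):
        now = self._clock()
        with self._lock:
            cached = self._cache.get(key)
            if cached and now - cached[0] < self._min_interval:
                return cached[1]  # recent enough: do not touch the server
        with self._slots:  # blocks while too many queries are in flight
            result = self._query_fn(key)
        with self._lock:
            self._cache[key] = (self._clock(), result)
        return result
```

A tool would create one `ThrottledClient` and route every status check through `query()`. Sensible values for the two limits depend on the server and team size, so the defaults here are placeholders.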
Badly written and untested tools lead to scripting nightmares
What may be a team’s favourite tool can cause real problems because of the way its scripting has been implemented. For example, an editing tool is designed to speed up the creation and authoring of game environments. The tool gives every file it creates a self-descriptive name with a UID. In testing this works well, and example files have a name and path of around 54 characters.
A small change to the tool prior to implementation goes without testing. It works, right? Some time later, it’s noticed that the tool is now reversing the filename and appending it to itself, producing names of 199 characters instead of the original 54.
This may sound harmless, but it has a big impact on the back-end. Your data usage has now doubled. The more branching and the more integration, the slower the whole process becomes, not to mention the incredibly high RAM and CPU utilisation.
The answer? Ideally, don’t use tools that have bad scripting behaviour. Users of Perforce Helix can use a trigger or P4 broker rule to inspect file and path length for irregularities.
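As a rough illustration of the kind of check such a trigger or broker rule might run, here is a Python sketch that flags paths over a length limit. It is an assumption-laden example: the 120-character limit and the function names are hypothetical, and wiring the script into Perforce (obtaining the list of paths in a change, registering the trigger) is deliberately left out.

```python
MAX_PATH_LENGTH = 120  # assumed limit; choose one that suits your depot layout


def find_overlong_paths(paths, max_length=MAX_PATH_LENGTH):
    """Return (path, length) pairs for every path longer than max_length."""
    return [(p, len(p)) for p in paths if len(p) > max_length]


def exit_code_for(paths, max_length=MAX_PATH_LENGTH):
    """Report offenders and return a non-zero code if any path is too long."""
    offenders = find_overlong_paths(paths, max_length)
    for path, length in offenders:
        print(f"path too long ({length} > {max_length}): {path}")
    return 1 if offenders else 0
```

The calling script would pass the returned code to `sys.exit()`, on the assumption that a non-zero exit status is what tells the server to reject the submission.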
Splitting code and other assets
Artists and coders have different ways of working, and those can clash. For instance, coders tend to check lots of relatively small files into the repository, while artists’ files are much larger and tend to be batched, perhaps for checking in at lunchtime, rather than submitted as they go during the day.
The best solution is to have their files on separate servers. From a disaster recovery point of view, this also makes it easier to reinstate work if one element goes wrong. We have all servers overlaid so coders exist on art servers and vice versa, meaning that everyone still has visibility across the whole project.
Plan ahead and assume you’ve underestimated what’s needed
Estimate how big your ultimate data set is going to be, and provision accordingly. This may sound obvious to some, but it’s not uncommon for game projects to mushroom in size beyond forecasts or plans. It’s cheaper and less disruptive to get it right in the first place.
You can never have enough RAM – it’s also generally a lot cheaper than investing in additional hard disk drives as a way to reduce the workload on your storage area network. We have enough RAM in each server to hold all our databases and two working versions of our daily change data set.
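The sizing rule in the last sentence is simple arithmetic, sketched below in Python. The function name and the figures in the usage line are hypothetical, chosen only to show the calculation, and are not Playground’s actual numbers.

```python
def ram_needed_gb(database_sizes_gb, daily_change_gb, working_copies=2):
    """RAM to hold every database plus N copies of the daily change set."""
    return sum(database_sizes_gb) + working_copies * daily_change_gb


# Hypothetical example: two databases of 40 GB and 25 GB, 30 GB changed daily.
required = ram_needed_gb([40, 25], 30)  # 40 + 25 + 2 * 30 = 125
```

Given the advice to assume you have underestimated, it would be reasonable to add headroom on top of this figure.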
Keep everything on separate drives to manage growth explosions
Again, it’s very easy to hit the capacity ceiling. Separating files out onto different drives makes it easier to deal with individual problems as they arise, rather than bringing everything to a halt. Also, if your company uses Perforce Helix, there’s a useful feature that stops code and assets being checked in once available space falls to a set threshold.
Don’t just version code: version tools, images, builds and everything else in your version control system of choice. Otherwise the number of systems and the amount of validation increase, making it all harder to manage. Keep it simple. With Perforce Helix, you can easily scale horizontally, so it’s a useful way to deal with increased capacity.
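For tools that write to those drives outside the version control server, a similar guard can be sketched in a few lines of Python using the standard library’s `shutil.disk_usage`. This is a generic illustration rather than the Perforce feature mentioned above, and the function names and threshold are assumptions.

```python
import shutil


def is_low_on_space(free_bytes, min_free_bytes):
    """True when free space has dropped below the required minimum."""
    return free_bytes < min_free_bytes


def volume_is_low(path, min_free_bytes):
    """Check the volume that holds `path` against the minimum free space."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return is_low_on_space(usage.free, min_free_bytes)
```

A tool could call `volume_is_low()` before each large write and refuse to continue, with a clear message, when it returns `True`.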
Test tools, not just code and assets
Test tools in as real an environment as possible, otherwise you won’t know their impact until they are in place. Even if the right hardware isn’t available, still test with real data against lesser hardware and extrapolate the results.
Perforce Helix has native replication capabilities that will enable such testing with ease. It might feel like a laborious chore, but testing and identifying problems is much better than having to fix something once it has already been implemented.
It’s unrealistic to expect to prevent every problem, but good housekeeping goes a long way towards stopping tools from attacking the very users they are meant to be supporting.
Chris Makin is IT infrastructure administrator at Forza Horizon developer Playground Games. Among other things, Chris’ role includes ensuring that the teams creating the games, whether coders or artists, have a smooth-running working environment and efficient tools. Prior to joining Playground, Chris held similar roles at Codemasters and SEGA Europe.