Can great facial animation really be created via a simple extension to the VO session?
Full performance capture is an elaborate process. It is costly, complex, and consumes time with abandon. The margin for error can feel wider than the uncanny valley, and the variables to consider are dizzying.
While the practice absolutely serves a purpose, for those looking for great facial animation synced with actors’ delivery of vocal material, there are alternatives. That’s according to audio powerhouse Wave and animation service provider Cubic Motion.
The two recently collaborated – along with CG character specialist 3Lateral – on an ambitious tech demo to showcase what can be done with a low-impact extension to the established VO recording process and a little bit of teamwork.
The ‘Athanasius’ demo showcases the speech and facial expressions of an imposing fictional virtual character, which began life in a recording booth. The actor behind the performance delivered his lines in the normal VO context, albeit wearing a lightweight head-mounted camera system.
With the audio recording and visual performance captured in unison, Cubic Motion was in a position to analyse the video for data that could be solved onto a 3D character. Then, with some aural refinement by Wave and a little tweaking using Cubic Motion’s bank of existing animation data, the final demo was ready.
The quality of the final result arguably rivals the output of far more expensive techniques, and all with minimal disruption to the typical VO recording process, insist those involved.
“For us the process in the actual booth was seamless, and created very little extra work,” says Wave’s head of game audio Anthony Matchett. “And at the end of the session, we could easily deliver the audio straight back to the guys from Cubic. As soon as we had a locked time in the audio, they came back to us for us to do a light amount of sound design just so the audio really matched the video.
“But from a client’s point of view, this whole thing will be really easy for them, and it works really well. It’s honestly very non-disruptive. Setting up the VO booth for the video capture was very simple, and took very little time.”
And the secret to getting the process of extracting high-quality facial animation data from a VO session just right? It’s all about carefully selected collaborative partners and an open-minded approach to cooperation.
“As facial animators we want to create the highest quality work possible, and a number of factors contribute to that when you’re working with another company,” says Steven Caulkin, head of research at Cubic Motion. “Obviously you need to get the basics right; stuff like a seamless pipeline suited to the project, the synchronisation of data needs to be nicely handled, and you need to get somebody that will work well with you on that front and be accommodating.”
More important, though, adds Caulkin, is finding a partner that appreciates the elements of the process that are vital to your own company’s needs, such as setting up the capture equipment in the audio booth and being given a chance to test the equipment well in advance.
“Being able to work together like that means we get better data and a better final animation,” insists Caulkin.
Cubic Motion’s biz dev director Simon Elms is quick to elaborate: “To get the right animation means picking the right audio partner that will understand the approach, and help with that process. Wave have been great for us in that regard.”
Wave and Cubic Motion’s purpose is clear: they want to prove the value of extending the voice-over recording process to embrace facial animation data generation.
There is an opinion prevalent in some sectors of the games industry that processes not dissimilar to the one used for the Athanasius demo mean making creative and technical sacrifices.
Take a look at the demo yourself, and you’ll see that such an assumption is far from correct.