Futuremark Explains Why VR Benchmarking is About More Than Just Numbers

Futuremark have now released the full version of their long-awaited, dedicated virtual reality benchmark, VRMark. After months of research and development, the company has found itself having to redefine its own views on how the difficult subject of VR performance testing should be tackled.

Futuremark are developers of some of the world’s best-known and most widely used performance testing software. In enthusiast PC gaming circles, their visually impressive proprietary synthetic gaming benchmark series, 3DMark, has been the basis for many a GPU fanboy debate over the years, with every new version offering a glimpse of what the forthcoming generation of PC gaming visuals might deliver and what PC hardware fanatics can aspire to achieve.
It was therefore inevitable that once virtual reality reached the consumer phase, the company would take an active part in VR’s renaissance. After all, with immersive gaming came lofty initial hardware requirements and a necessary obsession with low-latency visuals and minimum frame rates of 90 FPS. So surely a new Futuremark product, one focused purely on the needs of VR users, would be a slam dunk for the company. VRMark is the company’s first foray into the world of consumer VR performance testing and recently launched in full via Steam, offering a selection of pure performance and experiential ‘benchmarks’, the latter viewable inside a VR headset.
However, as anyone who has experienced enough virtual reality across different platforms will tell you, putting a number on how well a VR system performs is anything but simple. With dedicated VR headsets come complex proprietary rendering techniques and specialist display technology, a lot of which simply hadn’t been done at a consumer level before. The biggest challenge, however, and the biggest set of variables Futuremark had to account for, was human physiology and the full gamut of possible human responses to a VR system.
Futuremark initially approached the issue from a purely analytical perspective, as you might expect. You may remember that we went hands-on with a very early version of the software last year, which at the time came complete with some pretty expensive additional hardware. Futuremark’s aim at that time was (at least in part) to measure the much-coveted ‘motion to photons’ value: the time it takes for a movement of the user’s head to be reflected in the image reaching the eye. However, if you’ve popped onto Steam to purchase the newly released VRMark, you’ll notice that it does not list ‘USB oscilloscope’ or ‘photo-sensitive sensor’ as requirements. Why is that?
We asked Futuremark’s James Gallagher to enlighten us.
“After many months of testing, we’ve seen that there are more significant factors that affect the user’s experience,” he says. “Simply put, measuring the latency of popular headsets does not provide meaningful insight into the actual VR experience. What’s more, we’ve seen that it can be misleading to infer anything about VR performance based on latency figures alone.” Gallagher continues: “We’ve also found that the concept of ‘VR-ready’ is more subtle than a simple pass or fail. VR headsets use many clever techniques to compensate for latency and missed frames. Techniques like Asynchronous Timewarp, frame reprojection, motion prediction, and image warping are surprisingly effective.”
Gallagher is of course referring to techniques that almost all current consumer VR hardware vendors now employ to help deal with the rigours of hitting those required frame rates and the unpredictable nature of PC (and, in the case of PSVR, console) performance. All these techniques (Oculus has Asynchronous Timewarp and now Spacewarp; Valve’s SteamVR recently introduced Asynchronous Reprojection) work along similar lines to achieve a similar goal: ensuring that the motions you make in VR (say, when you turn your head) match what your eyes see inside the headset. The upshot is minimised judder and stuttering, two effects very likely to induce nausea in VR users.
“With VRMark, you can judge the effectiveness of these techniques for yourself,” says Gallagher. “This lets you judge the quality of the VR experience with your own eyes. You can see for yourself if you notice any latency, stuttering, or dropped frames.” Gallagher also shares something surprising about the company’s research: “In our own tests, most people could not identify the under-performing system, even when the frame rate was consistently below the target. You may find that you can get a comfortable VR experience on relatively inexpensive hardware.”
To describe Futuremark’s VR benchmarking methodology for consumers in more detail, here’s James Gallagher explaining it in his own words.

[Futuremark are] recommending a combination of objective benchmark testing and subjective “see for yourself” testing. We think this is the best way to get the whole picture, especially for systems below the recommended spec for the Rift and the Vive.

The reason is that the concept of “VR-ready” is more subtle than a simple pass or fail.

On the one hand, a literal definition would say that to be truly VR-ready a system must be able to achieve a consistent frame rate of 90 FPS on the headset without dropping a single frame. In this case, every frame you see comes from the game or app. You are getting exactly the experience the developer wanted you to have. You would use VRMark benchmarks to test this case.

On the other hand, when a system is unable to maintain 90 FPS on the headset the VR SDK will try to compensate by using Asynchronous Time Warp or frame reprojection or other techniques. In this case, only some of the frames you see on the headset are the real frames from the game. The others are created by the SDK to fill in the gaps caused by missed frames. Now, if the SDK does such a good job of hiding the dropped frames that you cannot tell the difference between it and the pure 90 FPS experience, then you could perhaps say that this second system is VR-ready as well. You can use VRMark experience mode to test this case.
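
To put the two cases in rough numbers: a 90 Hz headset gives the application about 11.1 ms to render each frame, and any refresh the application misses has to be filled by the SDK. The Python sketch below is a deliberately simplified model of that trade-off, not a description of how any VR runtime actually schedules frames. The function names are ours, and it assumes the application delivers frames at a steady rate; real runtimes may throttle the application differently.

```python
REFRESH_HZ = 90  # headset refresh rate targeted in the article


def frame_budget_ms(refresh_hz: float = REFRESH_HZ) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def synthesized_share(app_fps: float, refresh_hz: float = REFRESH_HZ) -> float:
    """Fraction of displayed frames the SDK would have to fill in.

    Simplifying assumption: the app delivers frames at a steady rate and
    the headset shows at most one new app frame per refresh.
    """
    real_frames = min(app_fps, refresh_hz)
    return (refresh_hz - real_frames) / refresh_hz


print(f"Budget per frame: {frame_budget_ms():.1f} ms")
for fps in (90, 75, 40):  # illustrative on-headset frame rates
    print(f"{fps:>3} FPS -> {synthesized_share(fps):.0%} of frames synthesized")
```

Under this model, an application holding 75 FPS leaves roughly one displayed frame in six to the SDK, while one managing only 40 FPS leaves more than half.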

Here’s an example to illustrate:

System A:
VRMark Orange Room benchmark score: 6500
Average frame rate: 140 FPS

System B:
VRMark Orange Room benchmark score: 5000
Average frame rate: 109 FPS

System C:
VRMark Orange Room benchmark score: 3500
Average frame rate: 75 FPS

System D:
VRMark Orange Room benchmark score: 2000
Average frame rate: 40 FPS

The benchmark results show that system A and system B are both VR-ready for the Rift and the Vive in the pure sense. Both have enough performance to render every frame at 90 FPS when connected to a VR headset. But the difference in scores and average frame rate tells you that system A has more headroom for using higher settings or for running more demanding VR games and apps.
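
The headroom Gallagher describes can be expressed as a simple ratio of the benchmark’s average frame rate to the 90 FPS target. The snippet below is a minimal sketch using the figures from his example; `headroom` is a hypothetical helper of ours, not a metric that VRMark reports.

```python
TARGET_FPS = 90  # headset target frame rate cited in the article


def headroom(avg_fps: float, target_fps: float = TARGET_FPS) -> float:
    """Ratio of measured average frame rate to the target.

    Values above 1.0 indicate spare performance; values below 1.0
    indicate the system falls short of the target.
    """
    return avg_fps / target_fps


# Average frame rates from the Orange Room example above
systems = {"A": 140, "B": 109, "C": 75, "D": 40}

for name, fps in systems.items():
    print(f"System {name}: {headroom(fps):.2f}x the {TARGET_FPS} FPS target")
```

On these figures, system A comes out at roughly 1.56 times the target and system B at roughly 1.21 times, which is the gap Gallagher refers to.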

System C and system D did not achieve the target frame rate. So the question now is whether the VR SDKs can compensate for the missed frames? For that, you would use VRMark Orange Room experience mode with a connected headset.

You might find that you cannot tell the difference between system C and system B when using experience mode. Even though system C is regularly dropping frames, the SDK is able to compensate and hide the effects from the user. The VR experience is as good as a true VR-ready system.

With system D you might find that there are noticeable problems with the VR experience. The SDK is not able to compensate for the low frame rate. You might notice stuttering or other distracting effects.

From this, you would conclude:

System A is VR-ready with room to grow for more demanding experiences.
System B is VR-ready for games designed for the recommended performance requirements of the HTC Vive and Oculus Rift.
System C is technically not VR-ready but is still able to provide a good VR experience thanks to VR software techniques.
System D is not VR-ready and cannot provide a good VR experience.
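
The decision process behind those four conclusions can be sketched in a few lines of Python. This is our own illustration of the two-step method, not code from VRMark: the `Result` type, the `verdict` function and its labels are hypothetical, and the `looks_smooth` values for systems C and D are simply the outcomes Gallagher imagines in his example, not measurements.

```python
from dataclasses import dataclass
from typing import Optional

TARGET_FPS = 90  # headset target frame rate cited in the article


@dataclass
class Result:
    name: str
    avg_fps: float                       # objective: benchmark average frame rate
    looks_smooth: Optional[bool] = None  # subjective: tester's view in experience mode


def verdict(r: Result) -> str:
    """Combine the benchmark result with the subjective check."""
    if r.avg_fps >= TARGET_FPS:
        return "VR-ready"
    if r.looks_smooth is None:
        return "below target: try experience mode"
    if r.looks_smooth:
        return "not VR-ready, but a good experience"
    return "not VR-ready"


results = [
    Result("A", 140),
    Result("B", 109),
    Result("C", 75, looks_smooth=True),   # assumed outcome from the example
    Result("D", 40, looks_smooth=False),  # assumed outcome from the example
]

for r in results:
    print(f"System {r.name}: {verdict(r)}")
```

The subjective check only matters for systems that miss the target, which mirrors the order Gallagher suggests: benchmark first, experience mode second.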

I think many gamers will want to know that the system they are considering will be truly VR-ready in the technical and pure sense. You can only get that insight from a benchmark. You also need a benchmark test that runs on your monitor to see how far beyond 90 FPS a system can go. The VRMark Blue Room benchmark is a more demanding test that is ideal for comparing hardware that outperforms the Rift and Vive recommended spec.

At the other end of the scale, price-conscious gamers might be perfectly happy with a cheaper system that can appear to be VR-ready through technical tricks, for example, the new Oculus Rift minimum spec announced at Oculus Connect in October. These systems can be evaluated with the benchmark (how much will the VR SDK have to compensate) and with experience mode (how well does the SDK compensate).

With all of that laid out, I asked Gallagher why, if Futuremark are now recommending that people adopt a ‘see for yourself’ methodology for VR benchmarking, he believes VRMark is needed at all. In theory, any single VR application or game could be used in the above methodology. Why should people invest in VRMark?

“I think the value of VRMark is that it gives you an easy way to make both these objective and subjective assessments using common content in one app,” he says. “The benchmark tests provide a convenient, easily repeated VR workload. They give you a pure test for VR-readiness. Experience mode gives you a way to judge the quality of the user experience on systems that don’t meet the pure definition.”

The latest VRMark is now on sale via Steam for use with the Oculus Rift, HTC Vive and OSVR-compatible headsets. Current feedback on the title is mixed, with some criticising the lack of more extensive ‘pure’ benchmark functionality. Purely as a showcase for VR, the price (£14.99 / $19.99) seems perhaps a tad steep right now, especially considering that a chunk of that pretty showcase (‘The Orange Room’) is available in the free demo version. That said, VRMark is a sight to behold in VR and, along with the methodology above, there are many who may find the money worthwhile.

We’d love to hear your thoughts on Futuremark’s recommended methodologies, your experiences with VRMark, and how VR benchmarking may evolve over time in the comments below.
The post Futuremark Explains Why VR Benchmarking is About More Than Just Numbers appeared first on Road to VR.
