
Radeon HD 5870 Reviews



NDAs are up, reviews are plentiful, and the Radeon HD 5870 leaves me with fewer good thoughts than I'd like.

Summing the reviews up, for the impatient:

You might have already read about the performance: it tracks AMD's internal testing quite closely, although deflated by about 20%, averaging roughly 40% better than the GTX 285, GTX 275 and Radeon HD 4890.

If you want a great gaming rig without looking at cost, go for two Radeon HD 5870s in CrossFire (it's a lot more interesting than Triple SLI for lots of reasons). If you are more sensible or don't have the money, wait for the Radeon HD 5850 or go for the cheaper Radeon HD 4890 while giving up DX11 support for a while. If any of the new features interest you - and DX11 should - you really have to wait for the Radeon HD 5850, which is expected next week. $129 more for a card that won't deliver 50% more performance isn't a very good deal.

Eyefinity is an interesting technology, exciting even, but the combination of cost and cheap monitor quality is too big an issue IMO. For enthusiasts it's the kind of feature that should move you towards AMD's cards despite what Nvidia currently has on the market and what may come ahead. There's also a real problem when driving three monitors: you can only use two DVI outputs plus the DisplayPort output, which requires either an expensive DP monitor or an active adapter that may cost more than $100.

Let's delve deeper now into the other problems that spoil an otherwise good GPU:

Power consumption and cooling

There's a problem with FurMark/OCCT and AMD cards, mentioned by Ryan Smith at AnandTech, which also affects the RV770:
That problem reared its head a lot for the RV770 in particular, with the rise in popularity of stress testing programs like FurMark and OCCT. Although stress testers on the CPU side are nothing new, FurMark and OCCT heralded a new generation of GPU stress testers that were extremely effective in generating a maximum load. Unfortunately for RV770, the maximum possible load and the TDP are pretty far apart, which becomes a problem since the VRMs used in a card only need to be spec’d to meet the TDP of a card plus some safety room. They don’t need to be able to meet whatever the true maximum load of a card can be, as it should never happen.
"We never thought those 160 5-way shaders would actually be put good use, but we're calling it 800 shaders the same so we can have us some marketing uphand!"
Why is this? AMD believes that the instruction streams generated by OCCT and FurMark are entirely unrealistic.
"Madness, we say!"
They try to hit everything at once, and this is something that they don’t believe a game or even a GPGPU application would ever do.
Marketing, ya know. Like "Prescott", remember?
For this reason these programs are held in low regard by AMD, and in our discussions with them they referred to them as “power viruses”, a term that’s normally associated with malware. We don’t agree with the terminology, but in our testing we can’t disagree with AMD about the realism of their load – we can’t find anything that generates the same kind of loads as OCCT and FurMark.
Regardless of what AMD wants to call these stress testers, there was a real problem when they were run on RV770. The overcurrent situation they created was too much for the VRMs on many cards, and as a failsafe these cards would shut down to protect the VRMs. At a user level shutting down like this isn’t a very helpful failsafe mode. At a hardware level shutting down like this isn’t enough to protect the VRMs in all situations. Ultimately these programs were capable of permanently damaging RV770 cards, and AMD needed to do something about it.
"We gave in to the amount of shaders the marketing team wanted and now we're having to replace dead cards because we ignored the issue. We forgot those nasty overclockers. We just tricked them and now we're pissed at them because they've put our claim to the test. VRMs are expensive, you know?"

AMD's solution to this issue with "too efficient code" is to have the driver detect these programs and cap the card's processing power, thereby reducing power draw and current. On the new architecture AMD has decided to do this in hardware, so the card detects the high current draw and starts throttling itself, regardless of the chip's temperature, the old metric for throttling.
Nevertheless, AMD is right: you probably won't be able to write something that efficient for practical uses (see the bandwidth section below), so it's the right call. The marketing team is just being "markety".
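To make the change concrete, here's a rough sketch of what the new hardware behaviour amounts to, as I read the reviews. This is purely illustrative Python with made-up thresholds and clock states, not anything resembling AMD's actual implementation:

    # Illustrative only: current-based throttling as described in the reviews.
    # The limit and clock states below are invented for the example.
    CURRENT_LIMIT_A = 80.0              # assumed VRM current ceiling
    CLOCK_STATES_MHZ = [850, 600, 400]  # stock clock, then lower throttle states

    def pick_clock(measured_current_a):
        """Step down the core clock whenever measured VRM current exceeds
        the limit - chip temperature never enters the decision."""
        state = 0
        current = measured_current_a
        while current > CURRENT_LIMIT_A and state < len(CLOCK_STATES_MHZ) - 1:
            state += 1
            # assume current scales roughly with clock, for the sake of the sketch
            current *= CLOCK_STATES_MHZ[state] / CLOCK_STATES_MHZ[state - 1]
        return CLOCK_STATES_MHZ[state]

    print(pick_clock(70.0))   # 850MHz - a normal game load stays at stock
    print(pick_clock(95.0))   # 600MHz - a FurMark-style load gets capped

The point being: the cap reacts to current, not heat, so a perfectly cool card running "too efficient" code still gets slowed down.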

Also, it's not just overclockers who care about OCCT/FurMark, but also those who like their cards well built and properly cooled. Let's look at the picture:


Unlike power consumption, load temperatures are all over the place. All of the AMD cards approach 90C, while NVIDIA’s cards are between 92C for an old 8800GT, and a relatively chilly 75C for the GTX 260. As far as the 5870 is concerned, this is solid proof that the half-slot exhaust vent isn’t going to cause any issues with cooling.
No, seriously, the half-slot exhaust isn't going to cause any issues, because you will restrain yourself from using F@H or any stream computing application that actually gets close to those 2.72 TFLOPS, given that:
  • The card will throttle itself so the cheap VRMs won't burn.
  • The card will throttle itself down to lower clocks once the inevitable dust build-up on the heatsink pushes the chip a measly 20ºC higher (a gross underestimate after a few months) and it's now running at over 100ºC.
  • We just wanted to trick you with that pretty number and now we will throttle you, whether you like it or not. Stay away from efficient software!
Had I not seen more than one ATI card with bad cooling (GPU, not VRM) throttle itself down, I would also ignore the problem.
This is what I found when looking at a well-cooled card from a decent manufacturer, running "improper" software:
Temperatures: 46.5ºC idle, 55ºC UT2004, 74ºC running Furmark's stress test.
A lot lower, isn't it? I feel safer this way but maybe I'm just silly.

The "fix" for Furmark doesn't really hurt anyone as it still performs its function admirably: it helps tweak the cooling system to be able to handle every worst case scenario and to not let the card burn when they do show up. You know you're still pushing the VRM's and the GPU to the new (imposed) extreme so it's all good.


GPGPU, misleading marketing and bandwidth:

2.72 TFLOPS is the number AMD is touting for this new GPU. It's a nice number, and with the improvements I talked about yesterday it could very well swing AMD into the GPGPU market.
The problem is that, as you can see, someone from AMD has been very open about theoretical vs. practical TFLOPS when they mention that "AMD believes that the instruction streams generated by OCCT and FurMark are entirely unrealistic" and that such a load may destroy your card. As such, you won't get near that mark.
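For reference, the headline number is just the shader count times two FLOPs per clock (one multiply-add) times the engine clock - a quick back-of-the-envelope check:

    # Where AMD's 2.72 TFLOPS figure comes from: peak, not practical, throughput.
    shaders = 1600           # the 5-wide units counted as individual "shaders"
    flops_per_clock = 2      # one single-precision multiply-add per shader per cycle
    core_clock_hz = 850e6    # 850MHz engine clock

    peak_tflops = shaders * flops_per_clock * core_clock_hz / 1e12
    print(peak_tflops)       # 2.72

Every one of those 1600 lanes has to be doing a multiply-add on every single cycle to hit that figure, which is exactly the kind of load AMD now calls a "power virus".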

From the benchmarks it's also easy to see that the card is bandwidth limited, and the rumored 384-bit bus would have been a requirement to reach those kinds of TFLOPS in practical applications. The Radeon HD 4890 had some bandwidth to spare, but the enlarged global data share cache isn't enough to make up for the doubling of processing power with just a moderate increase in bandwidth, from 124GB/s on the 4890 to 153GB/s on the 5870. Coincidentally, that 23% increase in bandwidth shows up in the benchmarks more than the doubled shader power does, with gains of around 40%, sometimes less. The performance increase is bigger than the bandwidth increase, but that is accounted for by the larger global data share and the slack bandwidth the Radeon HD 4890 enjoyed.
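To put numbers on that: both cards keep a 256-bit bus, so the 5870's only bandwidth advantage is faster GDDR5. Assuming the reference memory clocks of 975MHz for the 4890 and 1200MHz for the 5870, the figures work out like this:

    # Memory bandwidth = bus width (in bytes) x effective data rate per pin.
    def bandwidth_gbs(bus_bits, mem_clock_mhz, data_rate=4):  # GDDR5 transfers 4x per clock
        return bus_bits / 8 * mem_clock_mhz * data_rate / 1000

    hd4890 = bandwidth_gbs(256, 975)    # ~124.8 GB/s
    hd5870 = bandwidth_gbs(256, 1200)   # ~153.6 GB/s
    print(hd4890, hd5870, hd5870 / hd4890 - 1)  # roughly a 23% increase

So the shader count doubled while the feed to those shaders grew by less than a quarter.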
Drivers are still fresh and should improve performance a bit; I just wouldn't expect too much.
The new Radeon must be compared to the 4890 and not the 4870, remember that. The new card has around the same clock, double the shaders, and costs about twice as much. Does it deliver double the performance? No, it delivers 30-50% more, plus added features.
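Put another way, in price/performance terms, and taking the numbers in this post at face value ($379 for the 5870, the 4890 at $169 after rebate as mentioned in the conclusion, roughly 40% higher average performance):

    # Rough value comparison against the HD 4890, using the prices quoted in this post.
    price_4890, price_5870 = 169, 379
    perf_gain = 1.40                              # ~40% better on average

    price_increase = price_5870 / price_4890      # ~2.24x the price
    value_ratio = perf_gain / price_increase      # ~0.62x the performance per dollar
    print(price_increase, value_ratio)

The exact percentages move around from game to game, but the direction doesn't: you pay for the new features, not for proportionally more speed.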

Bandwidth, then, brings us to the issue of chip area efficiency. The cards are too big, with too many resources wasted on a useless surplus of shaders. The die is now bigger, at 334mm² vs 260mm², and consequently the price is also up: $379 for the 5870 and $259 for the 5850, which features 1440 shaders and a lower 725MHz clock. That's a considerable bump from the $299 and $199 prices at which the 4800 series launched, and it will not make such a mess of Nvidia's lineup - if it's launched soon enough, that is.

The previous Radeon HD 4000 cards hit the sweet spots just right in price, chip cost and performance. While the 3870 of the previous generation was an underpowered card, the 2.5x increase in shaders provided enough power to compete with Nvidia and the GT200-based cards. The RV740 was even better, building on that success of die size and compute-to-bandwidth ratio, a balance that was better struck in the HD 4850 than in the 4870.
Today, these new cards are at least as expensive to build as the old G92a, probably more so if TSMC's 40nm process is still a problem, due to the clear shader power overshoot that happened with the "Evergreen" chips.


The rest of the family

The plans for the rest of the architecture, as laid out by AMD:


There's also the upcoming "Juniper" GPU, which Ryan mentions AMD might release with 14 SIMD blocks, or 1120 shaders:
The “new” member of the Evergreen family is Juniper, a part born out of the fact that Cypress was too big. Juniper is the part that’s going to let AMD compete in the <$200 category that the 4850 was launched in. It’s going to be a cut-down version of Cypress, and we know from AMD’s simulation testing that it’s going to be a 14 SIMD part.
That's a very interesting number, and if we see it coupled with GDDR5 and a bus of reasonable width, the card may be interesting for both AMD and us, the customers. The card also isn't that far away; I would expect it around December.
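The shader count follows directly from the SIMD count, since each SIMD block in this architecture packs 16 thread processors of 5 stream cores each, or 80 "shaders" per block:

    # Shader count per the SIMD arithmetic: 16 thread processors x 5 stream cores = 80 per SIMD.
    def shader_count(simds, units_per_simd=16, cores_per_unit=5):
        return simds * units_per_simd * cores_per_unit

    print(shader_count(20))  # Cypress / HD 5870: 1600
    print(shader_count(14))  # the 14-SIMD Juniper Ryan mentions: 1120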
"Hemlock" is expected around the same time and it's the dual 5870. Not much is know right now, other that it might cost $649.

Conclusion

I repeat myself: if you want a great gaming rig without looking at cost, go for two Radeon HD 5870s in CrossFire (it's a lot more interesting than Triple SLI for lots of reasons). If you are more sensible or don't have the money, wait for the Radeon HD 5850 or go for the cheaper Radeon HD 4890 ($169 after rebate) while giving up DX11 support for a while. The 5850 should perform comparably - also with a better price/performance ratio - and Nvidia might also be worth a look in the latter case, as the GTX 275 performs about on par and starts at around $209.
If any of the new features interest you - and DX11 should - you really should wait for the Radeon HD 5850, which is expected next week. $129 more for a card that won't deliver 50% more performance isn't a very good deal.

1 comment:

Anonymous said...

Very well said. I agree mostly. AMD decided to go big on spec numbers and gimmicks. But in the end the underlying cost cutting with the 256-bit bus leaves them with a terribly inefficient device given the die size and 40nm process. In some games it struggles to outperform the GTX 285, which is just an overclocked 16-month-old GPU.
