Graphics Cards

GF104 Updates - More GeForce GTX 460 Details Emerge

It's already the 12th on this side of the Atlantic and the GF104 SM and GPC diagrams have popped up in my inbox - special thanks go to the anonymous reader who left them on the previous post about the chip.


If you've read the previous post, you know I was speculating about an SM with 3x8 CUDA cores and no L2 cache on the GF104. Thankfully I was totally wrong about the L2, which is a good thing to have around when you're pushing for GPU computing everywhere. The number of cores is not 24 per SM but double that at 48, or 16 more per SM than on the GF100 chip.

There were some changes on the front end but the register file size remains the same, so it's a good thing that Nvidia kept the L2 around - the GF100 had fewer registers per core than the GT200 and the GF104 has even fewer. Never mind the other diagrams from Nvidia's Fermi paper stating 4096 registers per SM; I checked and they're wrong.

So far, per SM:
  • Same register file size (bad)
  • Double the SFUs - the complex math and interpolation units (good)
  • Extra 16 CUDA cores per SM (needed a bigger register file)
  • Warp scheduler has double the dispatch units
  • Load/store units stay the same
  • Double the TMUs
  • L2 stays, unknown size (probably the same)
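
To put a number on the register squeeze, here is a rough sketch of registers per CUDA core when the SM register file stays fixed while the core count goes from 32 to 48. The 32768-register figure is an assumption on my part (the commonly quoted 32K 32-bit registers per Fermi SM), not something shown in the leaked diagrams; the ratio between the two chips does not depend on it.

```python
# Registers available per CUDA core for a fixed-size SM register file.
# ASSUMPTION: 32768 32-bit registers per SM (commonly quoted Fermi figure),
# not a value taken from the leaked GF104 diagrams.
ASSUMED_REGISTERS_PER_SM = 32 * 1024


def registers_per_core(cores_per_sm, register_file=ASSUMED_REGISTERS_PER_SM):
    """Average number of registers per core if the file were split evenly."""
    return register_file / cores_per_sm


gf100 = registers_per_core(32)  # 32 cores per SM on GF100
gf104 = registers_per_core(48)  # 48 cores per SM on GF104 (from the post)

print(f"GF100: {gf100:.0f} registers per core")
print(f"GF104: {gf104:.0f} registers per core")
print(f"GF104 has {gf104 / gf100:.0%} of GF100's registers per core")
```

Whatever the real register file size turns out to be, the same file shared by 48 cores instead of 32 leaves each core with two thirds of what it had on the GF100.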





The GPC:
  • Still has 4 SMs, but there are now only two GPCs
  • Half the polymorph engines total, so tessellation performance will suffer
  • 64 TMUs total, 56 on the GTX 460
  • 24 ROPs on 192-bit cards, 32 on 256-bit cards seems likely.
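
The unit counts above hang together if you run the arithmetic. The sketch below assumes two GPCs with four SMs each, and it infers the number of active SMs on the GTX 460 from the TMU count; neither the resulting core totals nor the per-64-bit ROP grouping is stated in the diagrams, so treat them as derived guesses.

```python
# Back-of-the-envelope unit counts from the figures quoted in the post.
CORES_PER_SM = 48      # from the post
TMUS_FULL_CHIP = 64    # from the post
TMUS_GTX_460 = 56      # from the post
GPCS = 2               # ASSUMPTION: two GPCs on the full chip
SMS_PER_GPC = 4        # ASSUMPTION: four SMs per GPC

sms_full_chip = GPCS * SMS_PER_GPC                 # SMs on the full chip
tmus_per_sm = TMUS_FULL_CHIP // sms_full_chip      # TMUs in each SM
sms_gtx_460 = TMUS_GTX_460 // tmus_per_sm          # inferred active SMs

cores_full_chip = sms_full_chip * CORES_PER_SM
cores_gtx_460 = sms_gtx_460 * CORES_PER_SM


def rops_per_64_bits(rops, bus_width_bits):
    """ROPs per 64 bits of memory bus width."""
    return rops / (bus_width_bits / 64)


print(f"Full chip: {sms_full_chip} SMs, {cores_full_chip} CUDA cores")
print(f"GTX 460:   {sms_gtx_460} SMs, {cores_gtx_460} CUDA cores")
print(f"ROPs per 64 bits: {rops_per_64_bits(24, 192):.0f} (192-bit), "
      f"{rops_per_64_bits(32, 256):.0f} (256-bit)")
```

Under these assumptions the GTX 460 would be running with one SM disabled, and both ROP configurations work out to the same number of ROPs per 64 bits of bus width.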

It's pretty late here, so I'll bring more details in the morning. For now, enjoy the pricing on the cards:
