Monday, July 6, 2015

The Myth of the Free ALUs

One of the myths which arised with the new generation of consoles was that arithmetic operations (aka ALUs) were from now on literally free. If long ago we used cube map LUTs to normalize vectors, on GCN it costs nothing. Sure, they do have some cost, but with 6-7 ALU cycles per byte read from VRAM recommended at the top memory bandwidth (which you even almost never get), they should be all hidden beyond the memory fetches latency.

Well, it turns out, they are not. And here are some of the reasons why.

Some ALUs are more expensive than others. 

If an extra mad would unlikely (I won quite significantly one day by optimizing out a redundant matrix multiplication from a tight loop) make a change, divisions or even trigonometry would. Even in an SSAO shader could benefit up to 13% performance increase when arithmetic is optimized.

There are no free lunches now. 

Everything you could have relied hardware to do before (cube map texcoord calculation, attribute interpolation etc.) is now done with ALUs. You can use nointerpolation to avoid, if needed.

The scalar/vector ALUs are unbalanced.

In case your shader utilizes too many SALUs, not only you may become scalar register bound, but also your VALU units might stall while waiting for a scalar op result. An example here is excessive lane swizzling: when I tried to use it as a "cheap shared memory replacement" for blur filter, I got a ridiculous SALU/VALU ratio.

Friday, December 26, 2014

Triangles got to go... or not?

Few months ago I had an argue with a colleague of mine of whether we still will be using triangles in 10 (or 15) years. While I naturally opposed the idea, I feel there is something into it. So why are we using triangles, really?

Saving by calculating in VS and interpolating in PS

In fact, this is less and less is a viable argument. Nowadays characters already feature near-to-one-pixel triangles. We are actually LOOSING in this case: we get 4-fold PS cost and even rasterizer choking on some architectures. Besides, one modern architectures VS parameter cache is a bottleneck, so it is even less expensive to recompute a view position in a PS, than calculate it in VS and interpolate.

To match DCC tools

It is actually quite awkward to model your characters with triangles - a lot of artists use tools like ZBrush where they sculpt with spheres instead.


Once again - it is not really optimal when you triangles become very tiny. Besides, one might use bounding volumes for other representations to clip invisible surface.

I probably see how one can gain benefit for fx/billboards - but that's quite a specific case. Again, overdraw due to unused space on those is a big issue.

All in all, to support triangles we now have a lot of overhead - VS, sometimes tessellation, rasterization - while sometimes they are nearly equivalent to final pixels (and this trend seem to continue).

I am curious to hear your opinions and rationales guys. Will triangles live forever or will be replaced with something else (voxels? splats?) in a decade or two?

Monday, November 10, 2014

Interview Feedback

What surprised me in the game industry when I moved to the West is the absolute absence of subj. I had done several interviews prior I joined Eidos, and in case it was not successful, nobody was giving any clue what was wrong. Even an HR would stop answering any emails.

It's hard for me to justify this. If you are afraid of legal issues - give a call instead of writing an email, same way as salary is usually discussed. It takes 5 minutes of your time - and it is nothing compared to the time you spent organizing and conduction the onsite.

It is especially crucial if you give a take-home test. Firstly, a candidate spends a weekend in implementing and polishing the assignment - so he/she deserves a small bit of your time being spent. Secondly, unlike onsite, he/she has absolutely no clue what he might have done wrong.

Game development companies are constantly complaining on how it is hard to get a qualified candidate, and that they need to pay significant ($15-30k) amount of money for headhunters and/or to relocate/make a visa for a candidate abroad. If you see a person is interested in gamedev/your company - why not explain him/her what are his/her weaknesses, so in 6 months or a year he/she studies and come better prepared next time? It costs virtually nothing.

P. S.
For the sake of justice, there are few companies that do give interview feedback, but the majority don't.

Thursday, November 6, 2014

Tessellation Topology Sucks... But It Doesn't Have To

Initially this all started with a deficiency (Easter egg? Gag?) found in DirectX documentation:
If you ever worked with a hardware tessellation in any GAPI, you would know that you cannot achieve the tessellation on the triangle above (or below right) with ANY combination. Instead, if you tessellate a triangle with a tessfactor = 5 (edge and inside), you would get an abomination like this:

Why is it bad? Obviously, compared to the former topology, the output triangles are no more equilateral, though the input one is. Moreover, even for a pretty uniform geometry you will get a 'spider web' effect at the vertices of input triangles:

Is there any rationale behind this? I think, the main reason to keep the latter tessellation is an ability to generate a meaningful result for an arbitrary inside tessellation factor: any route from inside a triangle from an input vertex to the opposite edge should hop through exactly inside_tess_factor - 1 output vertices.

Is that really needed? I think, not really. All tessellation algorithms I've seen so far do not actually care that much about the inside tess factor. Usually, a max or average edge tess factor is used.

The only problem with the former tessellation could be how to handle different edge tess factors, if adaptive tessellation is used. Well, one could simply eliminate the surplus output edge vertices (and the incident output edges), and split the resultant quads in two triangles each (sorry, no image for that). Yes, the tessellation would be imperfect - but this would be just locally for the transition area.

What are your ideas guys? Does it sound worth trying?

Monday, August 25, 2014

Computer Architecture - How To?

Some time ago I remember asking a more senior team member: how does one learn about how hardware operates on the low level? I was answered that this could be learned only from practice.
While I don't think that this is a completely invalid approach - since practical experience is always more valuable than a raw theory - I was searching for some theoretical basis I could get before diving into practice. I think I have something now what I can recommend.
Firstly, there is this course on hardware architecture, which is really good:
Beware though, it is a grad-level course, so the level is really very demanding. You should know what are caches, associativity, memory hierarchy etc. They recommend the following book:
I really recommend you start reading it from appendix, which serves a sort of a primer. The book is also very up-to-date, even more than the course itself.
The second book recommended ( provides a lot of low-level details on reservation stations/register renaming, very in-depth on a pair of particular architectures, and a comprehensive description of branch prediction algorithms. Despite it has been just republished, it is a bit outdated (e.g., Intel P6 microarchitecture is used as example), so it is really up to you to buy or to pass.
I hope this has been helpful, and as previously, feel free to comment and discuss.

Tuesday, August 19, 2014

SIGGRAPH Talk Accompanying Video

Thanks to our marketing department, I got a permission to post a video we showed during our talk which presents Thief artistic pipeline for reflection creation and setup. It might be somewhat unclear without comments, sorry for that:

Monday, August 18, 2014

SIGGRAPH 2014 Talk 'Reflections in Thief'

Hi guys!
I've decided to start a new dev blog where I'll write about stuff I'm working on as well as book reviews and my random thoughts on graphics in particular and games tech in general.

I don't promise regular updates, but, hopefully, I'll find some time to do them. For the start, I'll upload my SIGGRAPH talk (45.6 MB, .pptx) this year.

Enjoy, and I'll be more than happy to discuss it in the comments.