One thing to add though is that there are ticks and tocks in chip manufacturing.
This means that after a decrease in feature size, the manufacturing process is refined so that the wafers have fewer defects.
It also means that after the die area has shrunk, it usually grows again - not enough to fully compensate for the loss, but somewhat, so the 8000 generation could be about 1.8-1.9 times as efficient as the previous one. But as haploid23 said, there could also be optimizations for gaming without any actual increase in processing power.
The issue is that GPUs have LOTS of stuff on them that we don't need, and optimizations for a particular task can make a huge difference. If, for example, all the unnecessary stuff were left out and the chips used a different architecture instead, we would already have a performance advantage of a factor of 6-20.
This won't happen with GPUs though, because of the difficulty of programming; our task is trivial compared to the effort that goes into graphics engines.
This is one of the negative implications of Moore's law (and its interpretation): as the cost per transistor goes down, the time needed to design and program the chips increases exponentially. The implications of this are already beginning to manifest. So while we could have 32 times the processing power by 2021, the manpower required would be 32 times as high as well. Since there aren't that many hardware developers and programmers, each transistor is used less efficiently.
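To put the 32x figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the common reading of Moore's law as one doubling every two years; that period and the function names are my own assumptions for illustration, not something stated above.

```python
import math

# Assumption: one doubling every 2 years (a common reading of Moore's law).
YEARS_PER_DOUBLING = 2.0

def doublings_needed(factor: float) -> float:
    """Number of doublings required to grow by the given factor."""
    return math.log2(factor)

def years_needed(factor: float, years_per_doubling: float = YEARS_PER_DOUBLING) -> float:
    """Years required to grow by the given factor at the assumed doubling rate."""
    return doublings_needed(factor) * years_per_doubling

if __name__ == "__main__":
    print(doublings_needed(32))  # 32 = 2**5, so 5 doublings
    print(years_needed(32))      # 5 doublings * 2 years each
```

Under that assumption, 32x means five doublings, or roughly ten years. If the design and programming effort really scales one-to-one with the transistor count, the same factor of 32 applies to the manpower.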
The way out of this dilemma is to start the process all over again, meaning using an architecture more suited to our purpose and using increased volume to get it to a smaller feature size.
Preparations are already under way.