By "initial hash", do you mean the pre-image of the final hash?
For SHA-1, the initial hash is defined as: 67452301 efcdab89 98badcfe 10325476 c3d2e1f0. However, this value is used only for the first 512-bit block. After that, the hash obtained after processing the first block becomes the initial hash for the second 512-bit block. It is all about the
Merkle–Damgård construction: you have some initial hash, some data in between, and some final hash at the end, which is passed as input for the next block of data. In this way, SHA-1, SHA-256, or any hash function using this scheme has to be defined only for a single block, and then it can easily be extended to N blocks by splitting the data into fixed-size chunks (512 bits for SHA-1 and SHA-256) during the preparation phase.
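As a rough Python sketch of that chaining only (toy_compress here is a stand-in I made up for illustration, not SHA-1's real compression function; a real sketch of SHA-1's compression appears further below):

import hashlib

def toy_compress(state: bytes, block: bytes) -> bytes:
    # Stand-in compression function: maps (160-bit state, 512-bit block) to a new 160-bit state.
    return hashlib.sha256(state + block).digest()[:20]

def merkle_damgard(iv: bytes, blocks: list) -> bytes:
    state = iv
    for block in blocks:          # the output for one block becomes the "initial hash" of the next
        state = toy_compress(state, block)
    return state                  # the state after the last block is the final hash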
how can we perform the hash function backwards to reach something that is larger than the hash itself?
The size of the message is included only in the last 512-bit block. That means that if you want to perform a single-block attack, then yes, you have to be careful and create a message where the last 64 bits encode the size of the whole message (in bits), preceded by zeroes for padding and a single 1 bit as a separator. However, you can for example create two 512-bit blocks: then you can put anything anywhere in the first block, and the second block will be a constant, in this case always equal to:
80000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000200
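That constant is nothing more than the standard SHA-1 padding for a 512-bit (64-byte) message. A quick way to reproduce it in Python:

# SHA-1 padding for a 64-byte message: a 1 bit (the 0x80 byte) as separator,
# zero bytes until 8 bytes remain in the block, then the message length in bits
# (512 = 0x200) as a 64-bit big-endian integer.
pad_block = b"\x80" + b"\x00" * 55 + (512).to_bytes(8, "big")
assert len(pad_block) == 64
print(pad_block.hex())  # 8000...0200, the constant second block shown above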
Then, your hash function will work in this way:
67452301 efcdab89 98badcfe 10325476 c3d2e1f0 initial hash
........ ........ ........ ........ the first 512-bit block
........ ........ ........ ........
........ ........ ........ ........
........ ........ ........ ........
........ ........ ........ ........ ........ middle hash
80000000 00000000 00000000 00000000 the second 512-bit block
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000200
........ ........ ........ ........ ........ final hash
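The same flow in a minimal Python sketch (the names such as sha1_compress are mine, and b"A" * 64 is just an arbitrary choice of first block; the result is cross-checked against hashlib):

import struct
import hashlib

MASK = 0xFFFFFFFF

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def sha1_compress(state, block):
    # One application of the SHA-1 compression function: message expansion, 80 rounds, feedforward.
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = ((b & c) | (~b & d)) & MASK, 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        a, b, c, d, e = (rotl(a, 5) + f + e + k + w[t]) & MASK, a, rotl(b, 30), c, d
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e)))

IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)  # initial hash
first_block = b"A" * 64                                             # any 512-bit first block
second_block = b"\x80" + b"\x00" * 55 + (512).to_bytes(8, "big")    # the constant second block

middle_hash = sha1_compress(IV, first_block)
final_hash = sha1_compress(middle_hash, second_block)

# The two-block chain above is exactly SHA-1 of the 64-byte message.
assert struct.pack(">5I", *final_hash) == hashlib.sha1(first_block).digest()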
Then, if you reach an identical "middle hash", it will take you to an identical "final hash". That means you only have to focus on processing a single 512-bit block, because if that is ever broken, it will also allow reaching any hash by modifying just a single 512-bit block: you reach an identical hash in the middle, and the rest stays the same, because identical operations are repeated on identical data.
If you analyze the consequences of breaking the hash function, you will notice that replacing a single 512-bit block is enough to perform a lot of attacks. You can, for example, limit SHA-1 or SHA-256 to the first 16 rounds. Then you will notice how easily you can create any hash you want, and how you can start from some empty message and reach some 256-bit message that leads to the identical hash.
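Here is a sketch of what "create any hash you want" can look like for SHA-1 cut down to its first 16 rounds (the names compress16 and block_for_target and the target value are mine, chosen only for this example):

import struct

MASK = 0xFFFFFFFF

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def round_f(b, c, d):
    # The first 16 rounds of SHA-1 all use the same round function and constant.
    return ((b & c) | (~b & d)) & MASK

K0 = 0x5A827999
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

def compress16(state, w):
    # SHA-1 reduced to its first 16 rounds, with the usual feedforward at the end.
    a, b, c, d, e = state
    for t in range(16):
        a, b, c, d, e = (rotl(a, 5) + round_f(b, c, d) + e + K0 + w[t]) & MASK, a, rotl(b, 30), c, d
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e)))

def block_for_target(target):
    # Build a 512-bit block (16 words w[0..15]) whose 16-round compression of IV equals target.
    A16, B16, C16, D16, E16 = (((x - y) & MASK) for x, y in zip(target, IV))  # state before feedforward
    wanted_a = {12: rotl(E16, 2), 13: rotl(D16, 2), 14: rotl(C16, 2), 15: B16, 16: A16}
    w = [0] * 16                      # w[0..10] can be anything at all
    a, b, c, d, e = IV
    for t in range(16):
        if t >= 11:
            # Only the 'a' register depends on w[t], so solve for the w[t] that gives the wanted value.
            w[t] = (wanted_a[t + 1] - rotl(a, 5) - round_f(b, c, d) - e - K0) & MASK
        a, b, c, d, e = (rotl(a, 5) + round_f(b, c, d) + e + K0 + w[t]) & MASK, a, rotl(b, 30), c, d
    return w

target = (0xDEADBEEF, 0x01234567, 0x89ABCDEF, 0x0BADF00D, 0xCAFEBABE)  # any hash you want
w = block_for_target(target)
assert compress16(IV, w) == target
print(" ".join(f"{x:08x}" for x in w))

With all 80 rounds and the message expansion, the last words of the block are no longer free, which is exactly the "spaghetti of dependencies" problem discussed below.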
how can we perform the hash function backwards, knowing that there is no unique origin due to collisions?
You can take the initial hash and the first 512-bit block as your input; then the hash function will produce the middle hash. However, you can also take the middle hash and the first 512-bit block, and by computing things backwards, produce the initial hash. You can go backwards or forwards, because a lot of the operations are reversible: if you know that c = a + b is the default flow, then you can also do a = c - b instead and just execute it backwards. It is all about having enough data.
If you use 512-bit blocks and some hash on one side as your input, you can reach some hash on the other side. This is different from practical attacks, where you don't know the content of the 512-bit blocks and only know the hashes at both ends.
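For a single SHA-1 round this looks as follows in Python (a sketch with names of my choosing; f and k stand for the round function and constant of that round, and the message word w_t is assumed to be known):

MASK = 0xFFFFFFFF

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def round_forward(state, f, k, w_t):
    # One SHA-1 round in the default (forward) direction.
    a, b, c, d, e = state
    return (rotl(a, 5) + f(b, c, d) + e + k + w_t) & MASK, a, rotl(b, 30), c, d

def round_backward(state, f, k, w_t):
    # The same round executed backwards: the addition becomes a subtraction,
    # and the left rotation by 30 is undone by a left rotation by 2.
    a2, b2, c2, d2, e2 = state
    a, b, c, d = b2, rotl(c2, 2), d2, e2
    e = (a2 - rotl(a, 5) - f(b, c, d) - k - w_t) & MASK
    return a, b, c, d, e

# Round-trip check with arbitrary values (f here is the round function of rounds 0-19).
f = lambda b, c, d: ((b & c) | (~b & d)) & MASK
state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
assert round_backward(round_forward(state, f, 0x5A827999, 0x1234), f, 0x5A827999, 0x1234) == state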
Why isn't it provable that going backwards is impossible?
Going backwards is possible. You can reduce SHA-1 or SHA-256 to the first 16 rounds, and then you can go backwards, forwards, or even modify things in the middle, because the w-values can then be set to any values you want. The only reason you cannot do the same with the full hash functions is that by going through all of the rounds you reach a spaghetti of dependencies. Then it is harder to see which bits you have to touch to reach the values you want, but it is still fully deterministic and fully reversible, just messier: you can trace some bit and notice that it is connected with, for example, 80 other bits.
Also, I shared an example where I executed SHA-1 backwards. If you start from 2328fad3 75109b40 3f121481 02e1def6 cfd74e98 as your initial hash and use a block of all zeroes as your 512-bit block, you reach the same value as your middle hash. The only reason this cannot be used in practice is that in SHA-1 the initial hash is fixed as 67452301 efcdab89 98badcfe 10325476 c3d2e1f0. However, if you look at the values in each round, you can clearly see that it is perfectly valid to compute things backwards if you have enough data.
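One way to reproduce that kind of example in Python is to run all 80 rounds backwards from an all-zero pre-feedforward state with an all-zero message block (a sketch with names of my choosing; the printed value can be compared with the one quoted above):

MASK = 0xFFFFFFFF

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def f_and_k(t, b, c, d):
    # Round function and constant for SHA-1 round t.
    if t < 20:
        return ((b & c) | (~b & d)) & MASK, 0x5A827999
    if t < 40:
        return b ^ c ^ d, 0x6ED9EBA1
    if t < 60:
        return (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    return b ^ c ^ d, 0xCA62C1D6

def rounds_forward(state, w):
    a, b, c, d, e = state
    for t in range(80):
        f, k = f_and_k(t, b, c, d)
        a, b, c, d, e = (rotl(a, 5) + f + e + k + w[t]) & MASK, a, rotl(b, 30), c, d
    return a, b, c, d, e

def rounds_backward(state, w):
    a, b, c, d, e = state
    for t in reversed(range(80)):
        # Undo one round: the additions become subtractions, rotl(b, 30) becomes rotl(c, 2).
        a, b, c, d, e = b, rotl(c, 2), d, e, a
        f, k = f_and_k(t, b, c, d)
        e = (e - rotl(a, 5) - f - k - w[t]) & MASK
    return a, b, c, d, e

# An all-zero 512-bit block expands to an all-zero message schedule.
w = [0] * 80

# Run the rounds backwards from an all-zero state: the result is an "initial hash" h
# for which the 80 rounds map h back to zero, so the final feedforward gives h + 0 = h,
# i.e. the middle hash equals the initial hash.
h = rounds_backward((0, 0, 0, 0, 0), w)
print(" ".join(f"{x:08x}" for x in h))   # compare with the value quoted above

out = rounds_forward(h, w)
middle_hash = tuple((x + y) & MASK for x, y in zip(h, out))
assert middle_hash == h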
Why is NP != P required to prove it?
There are two different things. One is computing the 512 bits in the middle while knowing only the hashes at both ends: that is breaking the hash function, and nobody knows how to do that. Knowing the 512 bits in the middle plus the hash on one side is another thing. When I say that going backwards is possible, I mean not only that you can go from the initial hash through the middle hash to the final hash, but also that you can start from the end and, by passing in the data, execute the hash function in the opposite direction: if the forward flow has a left rotation by 5 bits, you do a left rotation by 27 bits instead; if it has addition modulo 2^32, you use subtraction modulo 2^32 instead. And so on.
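In Python, those two inverses are simply (arbitrary test values):

MASK = 0xFFFFFFFF
rotl = lambda x, n: ((x << n) | (x >> (32 - n))) & MASK

x, a, b = 0x12345678, 0x9ABCDEF0, 0x0F0F0F0F
assert rotl(rotl(x, 5), 27) == x                    # left rotation by 27 undoes left rotation by 5
assert (((a + b) & MASK) - b) & MASK == a           # subtraction mod 2^32 undoes addition mod 2^32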