The scrypt code itself has two separate loops that use 4k of memory each, from what I can tell, so more than 8k is good to have. Beyond that, it also uses Salsa20/8 and SHA-256, and I don't know how much memory those use, or whether they ever spill out of the registers at all.
FWIW my Athlon XP sees a bit more than 1/2 the performance per GHz per core (vs. the Phenom II) when hashing scrypt coins, but I don't know how much of that is due to cache size, the instruction set architecture (I'm using amdfam10 for hashing on the Phenom), or something else. Of course that's per core, so the Phenom II can do a whole hell of a lot more than just 2x overall.
In other words: the Athlon hashes at 1.15k per sec at 2.4 GHz, and the Phenom hashes at 3.2k per sec per core at 3.8 GHz. Also, the Phenom did almost exactly 3k per sec when it was stock clocked at 3.4 GHz.
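
If anyone wants to check the "a bit more than 1/2" figure, here's a quick sketch of the arithmetic in Python. The numbers are just the ones quoted above; the function and variable names are made up for illustration, and it assumes hash rate scales linearly with clock speed, which is only an approximation.

```python
# Per-core figures quoted above: (hash rate in thousands per second, clock in GHz).
athlon_xp = (1.15, 2.4)
phenom_oc = (3.2, 3.8)     # Phenom II overclocked
phenom_stock = (3.0, 3.4)  # Phenom II at stock clock ("almost exactly 3k")


def per_ghz(rate_k, clock_ghz):
    """Hash rate per GHz, assuming rate scales linearly with clock."""
    return rate_k / clock_ghz


def relative_efficiency(chip, reference):
    """Per-GHz rate of `chip` as a fraction of the per-GHz rate of `reference`."""
    return per_ghz(*chip) / per_ghz(*reference)


if __name__ == "__main__":
    for label, ref in (("overclocked", phenom_oc), ("stock", phenom_stock)):
        ratio = relative_efficiency(athlon_xp, ref)
        print(f"Athlon XP vs Phenom II ({label}): {ratio:.2f}x per GHz per core")
```

That works out to roughly 0.57x against the overclocked Phenom and 0.54x against the stock one, so "a bit more than 1/2" holds either way.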