Hey greBit,
First off, really good questions; I left out a few of my thoughts from the paper, and you managed to hit a good portion of them! Thanks for taking the time to read my paper!
For offline use, one could use a storage peer and query all messages in a large address range to preserve anonymity, but as you mentioned, that wouldn't be possible for low-bandwidth users. One thing I glossed over in my paper that matters for offline use is that, instead of using special acknowledgement messages like Bitmessage does, a peer can simply send a regular message back, carrying an additional ephemeral key, to let the sender know that the message was received. In other words, acknowledgements are sent in the form of regular messages.
With this in mind, I think the easiest way of approaching offline use would be the following scenario:
1) Alice sends a message to Bob, who is currently offline
2) After waiting a certain amount of time, Alice never receives a message from Bob acknowledging that he received the message
3) Alice then sends a permission packet to Bob to restart the conversation with a new ephemeral key, and then resends the previously missed message. One probably doesn't need to go through the whole permission packet process though, seeing as the one-time/ephemeral address concept I use is really only there to mask bias in downloading messages (i.e., the origin of broadcast for permission and ad packets is anonymized, given the min spanning tree assumption)
4) Alice can continually resend permission packets every so often until Bob is back online
And yes, users need to be online to answer permission packets. No reply means the user is offline. Permission requests should be re-sent at random intervals to preserve anonymity (i.e., the client should not have a constant resend interval hardcoded)
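To make the random resend idea concrete, here is a rough sketch of how a client might pick its wait times. The function name, the base interval, and the jitter fraction are all made up for illustration; nothing here is fixed by the paper.

```python
import random


def resend_delays(base_seconds, attempts, jitter=0.5, rng=None):
    """Return randomized wait times (in seconds) between permission resends.

    Hypothetical helper: base_seconds and jitter are illustrative values,
    not parameters defined by the protocol.
    """
    rng = rng or random.Random()
    low = base_seconds * (1.0 - jitter)
    high = base_seconds * (1.0 + jitter)
    # Each delay is drawn independently, so resends never follow a fixed period.
    return [rng.uniform(low, high) for _ in range(attempts)]


# Example: five resend attempts spaced around 10 minutes each.
delays = resend_delays(600, 5, rng=random.Random(1))
```

A uniform draw is just the simplest choice here; a real client would probably want to think harder about which distribution leaks the least timing information.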
Attackers making lots of connections is a subset of an area I'm still thinking about, namely how best to bootstrap the network and how to handle connecting to new peers. If a peer could guarantee that he is connected to every peer in the network, or even a large subset, he could theoretically start to approximate network topologies and reduce anonymity, depending on how peers are choosing to broadcast ads and permission packets.
To experiment and try to find best practices for maximizing anonymity in proportion to the number of users in the network, my next step is to create a visual simulator to approximate network flow and p2p interactions. This will be used as a proof of concept to estimate the anonymity of users given different topologies, as well as to show how the network can react to and defend against attempted attacks. If this goes well, I'll then start client development.
As far as global packets and scalability go, that is going to come down to finding clever ways of compressing address data, which is on my list of things to think about. Another thought I had regarding this was to allow messages to be tagged as belonging to a certain category, such as e-mail, files, or Twitter-esque service messages, and then let peers decide which categories/services they want to participate in and relay (i.e., think of this as multiplexing different bitmask networks into one client)
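A minimal sketch of the category idea, assuming each message carries a plain string tag and each peer keeps a set of categories it is willing to relay. The class names and category strings are placeholders, not part of any spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    category: str   # hypothetical tag, e.g. "email", "files", "microblog"
    payload: bytes


class Peer:
    def __init__(self, categories):
        # Categories/services this peer has opted in to relaying.
        self.categories = set(categories)

    def should_relay(self, message):
        """Relay only messages whose category this peer participates in."""
        return message.category in self.categories


peer = Peer({"email", "microblog"})
relay_email = peer.should_relay(Message("email", b"hello"))
relay_files = peer.should_relay(Message("files", b"\x00\x01"))
```

Here the peer relays the e-mail message and drops the file message, so low-bandwidth users could opt out of the heavier services.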
For your behavioral contract question, the beauty of ads is that by relaying an ad to another peer and making them aware of the message's presence, you are simply letting them know that you are serving as a contact for obtaining this message; you don't necessarily have the message cached when you go to pass on the ad. That way, you only ever check up on a peer after they have requested a message from you and have received it, meaning you always know which files they should have from you.
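One way to picture this bookkeeping, as a sketch only: a peer records an entry when it actually delivers a message, not when it relays an ad, so the set of things it may later check up on is exactly what it has handed over. The class and method names below are hypothetical.

```python
from collections import defaultdict


class DeliveryLedger:
    """Tracks which messages this peer has delivered to which neighbors.

    Relaying an ad records nothing; only a completed delivery does.
    """

    def __init__(self):
        self._delivered = defaultdict(set)

    def record_delivery(self, peer_id, message_id):
        # Called after the neighbor requested the message and received it.
        self._delivered[peer_id].add(message_id)

    def expected_holdings(self, peer_id):
        """Messages we may legitimately check up on for this neighbor."""
        return set(self._delivered.get(peer_id, set()))


ledger = DeliveryLedger()
ledger.record_delivery("peer-a", "msg-1")
ledger.record_delivery("peer-a", "msg-2")
```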
When a peer disconnects, new contracts/agreements must be made. If a peer disconnects too frequently, the peers he was connected to will blacklist him.
With regard to blacklisting, I had only thought of it as being an individual list per peer when I originally wrote the paper. I'm not sure yet if network-wide knowledge can be built in a trustless manner, as I'm still exploring that. One idea I had was that you could allocate weight/trust to a node every time he successfully transfers a message that belongs to you, but it would need to be done in a way that doesn't create an attack vector for discovering which messages you're interested in.
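As a sketch of the purely local version (one list per peer, no shared knowledge), something like the following could work. The disconnect threshold and the one-point-per-transfer weighting are assumptions picked for illustration, and this does nothing to address the information-leak concern mentioned above.

```python
class LocalReputation:
    """Per-peer trust scores and blacklist, kept entirely locally."""

    def __init__(self, max_disconnects=3):
        # max_disconnects is an illustrative threshold, not a protocol constant.
        self.max_disconnects = max_disconnects
        self.trust = {}
        self.disconnects = {}
        self.blacklist = set()

    def record_transfer(self, peer_id):
        # Add weight each time this neighbor successfully transfers one of our messages.
        self.trust[peer_id] = self.trust.get(peer_id, 0) + 1

    def record_disconnect(self, peer_id):
        count = self.disconnects.get(peer_id, 0) + 1
        self.disconnects[peer_id] = count
        # Blacklist neighbors that drop their contracts too often.
        if count > self.max_disconnects:
            self.blacklist.add(peer_id)

    def is_blacklisted(self, peer_id):
        return peer_id in self.blacklist


rep = LocalReputation(max_disconnects=2)
rep.record_transfer("peer-a")
for _ in range(3):
    rep.record_disconnect("peer-b")
```

After this runs, peer-a has a trust weight of 1 and peer-b is blacklisted for exceeding two disconnects.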
Thanks again, and keep those questions coming