Ok, so I've got an update for you guys. Sorry for the delay, it's been a rough 24hrs.
Basically, everything started with a bad database upgrade for X13 compatibility. Our DB is very tightly tuned for normal memory usage, and while everything worked fine in testing, the change put us just over the memory usage we should have been at. That caused some swapping (disk usage instead of memory), which made performance slow enough to require a rollback and restart. The next 6hrs or so were spent making sure everything was running correctly. Because basically everything uses the DB, and we try to do as much in parallel as possible, it's not as simple as "restart the script". Services have to be restarted in the right order, and getting everything back up and running takes a bit. DB failure is basically the worst type of failure (and this is the first time it's ever happened in our ~6 month lifetime), so it took a bit longer to make sure everything was working properly.
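To illustrate the "right order" point, here is a minimal sketch of ordering a restart by dependencies. The service names and the dependency map are made up for illustration and are not WP's actual setup. It uses Python's standard `graphlib` module (3.9+) so that every service comes up only after the things it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each maps to the set of services it depends on.
DEPENDS_ON = {
    "db": set(),
    "cache": {"db"},
    "share_processor": {"db", "cache"},
    "stratum": {"share_processor"},
    "payouts": {"db", "share_processor"},
}


def restart_order(depends_on):
    """Return services ordered so every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for name in restart_order(DEPENDS_ON):
        print("restart", name)
```

A side benefit of an explicit dependency map is that a component that was never restarted (and is still holding old state) is easier to spot, since every service has to appear in the list.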
By the end of all the DB fixes, with (almost) everything working, it was way past sleep time, and I headed to bed, only to be woken 3 different times by DDoS attacks and an unrelated server failure. DDoS attacks are pretty normal; we get them 3-4x per week. We do some magic, fight them off, and nothing is really visible to the miners outside of a bit higher latency (~50ms) during the attack. Same thing last night, just a few in a row, making the lack of sleep a bit worse.
By the time I got up this morning, it was very clear there were 2 remaining issues.
1) A piece of our share processor didn't get reset, and was still using an old DB cache.
- For the vast majority of users, this had no effect: your account address was already cached, and everything went fine. However, if you were new to WP, or switched addresses in the last ~12hrs, your address wasn't getting updated properly, and the share processor saw it as "invalid". Unfortunately at this point, there isn't much I can do about it. We discard invalid-address shares pretty low down to protect against attacks we've had to deal with in the past. I'm really sorry this happened, and as far as I know, this is the first time we've had something like this happen. Everything should be fixed now, and it has been noted in my disaster recovery logs, so it shouldn't happen again.
2) We mined about 5hrs of a GuerillaCoin fork.
- GuerillaCoin got an update about 48hrs ago that changed the last PoW block from 10k to 6k. I saw the change log (https://github.com/guerillacoin/guerillacoin/commit/23b538d944ac1927d6c8273ea37b38584f36c69e), but didn't see the block number change. I unfortunately went to their forum thread, and read this note describing the 1.0.1 update:
Update - 20.33 UTC 22/06/2014
It has come to our attention that some users that downloaded the Windows wallet from github have been having synching issues. So we'd like to request that all our Windows users make sure they have the right version of the wallet, and if not redownload their wallet (this only applies if you have a version other than v1.0.1.1-1-gb8df644-beta under help->debug window)
So in short
only applies to windows users that downloaded their wallet from github
does not apply to mac users
does not apply to linux users
does not apply to windows users that downloaded their wallet from the sendspace linked at guerillacoin.com
applies if staking before block 6000
I saw the "does not apply to linux users" line and went on my way, which led to us not having the latest release and mining on the fork. We're now updated to the latest release, and the invalid blocks have been processed/orphaned, so balances should be correct. It's just going to end up being a decent hit on the stats for the day.
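For the curious, one simple way to catch a fork like this early is to compare block hashes at the same heights from two independent sources. The sketch below is only an illustration under assumptions: the function name is hypothetical, the hashes are placeholders, and fetching them (from a local daemon and from some independent reference node or explorer) is left out because it depends on the coin.

```python
def first_divergence(local_hashes, reference_hashes):
    """Return the lowest height present on both sides where the block
    hashes differ, or None if they agree everywhere they overlap."""
    shared = sorted(set(local_hashes) & set(reference_hashes))
    for height in shared:
        if local_hashes[height] != reference_hashes[height]:
            return height
    return None


if __name__ == "__main__":
    # Placeholder heights and hashes, for illustration only.
    local = {5999: "aa", 6000: "bb", 6001: "cc"}
    reference = {5999: "aa", 6000: "bb", 6001: "dd"}
    print(first_divergence(local, reference))
```

Any non-None result means the two sources are on different chains from that height on, which is a cue to stop mining that coin and check for a missed release.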
For those of you who have been mining here a while, you know this isn't very common at all (we haven't had any sizable problems in months), and for those of you new here, sorry you had to see that early on (it really doesn't happen often).
tldr: Bad upgrades, DDoS attacks, lack of sleep, terrible patch notes, bad me. Everything fixed, sorry about that.