I benched it earlier and it's about 3% faster, you should see some improvement.
|
|
|
I hope botnet users get caught sooner or later. But if a large corporate user points his power at litecoin, you can see a huge change in difficulty. A friend of mine has enormous CPU power set on some BOINC projects; if he moved that onto coin mining, it would give a shock, given the enormous clusters he has. I sure enough will not give him the idea: 1500 multi-CPU blade server machines would hurt a lot.
I could dump thousands of CPUs onto litecoin via a supercomputing grid... but the loss from losing my job and the likelihood that I'd never be able to get another one in the IT industry would probably negate any potential profit I'd make through LTC.
|
|
|
I'm not sure if this is pump and dump or something else... The wall keeps moving down but the amount put up is getting progressively slower. I think this person might be trying to buy all the LTC possible, there isn't that much LTC out there so to buy all of it and keep it out of circulation is really easy at this point.
|
|
|
The effect should be pretty obvious.
1. Miners will hit it hard before the block size change, trying to hoard as many coins as possible.
2. The price will rise, but only temporarily.
3. Miners, now getting less than a sixth of the SC per watt, will stop mining.
4. The difficulty will fall because no one is stupid enough to mine more for less, and the price with it.
5. People will continue mining at the lower difficulty, effectively stabilizing the price exactly where it was before, if not lower due to lack of confidence or interest.
This is the same thing as if he [Coinhunter/realsolid] were to generate tons of coin just for himself (which he already is), as it increases the real value CH has (compared to USD/BTC) while decreasing the real value miners may obtain. It will effectively cause a difficulty and price crash.
|
|
|
Okay, I've been playing around with the code for a few days now.
- The if/else statements can be avoided, but that has virtually no effect on the speed of the GPU (while the CPU code seems to paradoxically benefit a little). This goes against everything CH said about if/else statements being difficult for the GPU.
- The main problem with reaper's implementation is that it does not allow per-compute unit parallelization. The code is run on the OCL kernel, but not in parallel; ideally the search() function that is executed should be handed 32 or more data sets to work on, execute all data sets in a for loop, and then sync them with a barrier and output to global memory. The current code hands one data set to one compute unit and likely takes a very hard hit in terms of parallelization. With all the coprocessors working on the data set, a speedup of an order of magnitude for GPUs should be possible (several megahashes per second).
Whoever has time to do this and cares, have a look at the OCL examples from nVidia and how they parallelize much more effectively than the current reaper code: http://developer.nvidia.com/opencl-sdk-code-samples
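To make the batching idea concrete, here is a rough host-side sketch in plain C++ (not OpenCL, and not reaper's actual code). It only shows the shape of the pattern: one work unit loops over a batch of data sets, keeps the results locally, and publishes them in a single pass at the end. The names toy_hash and search_batch are made up for illustration, and toy_hash is a placeholder mixer, not the real hash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical placeholder standing in for the real hash function.
inline std::uint32_t toy_hash(std::uint32_t header, std::uint32_t nonce) {
    std::uint32_t h = header ^ (nonce * 0x9e3779b9u);
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    return h;
}

// Hypothetical work unit: processes `batch` consecutive nonces in a loop,
// stores results in a local buffer, then appends them to `out` in one pass.
inline void search_batch(std::uint32_t header, std::uint32_t first_nonce,
                         std::size_t batch, std::vector<std::uint32_t>& out) {
    assert(batch > 0);
    std::vector<std::uint32_t> local(batch);
    for (std::size_t i = 0; i < batch; ++i) {
        local[i] = toy_hash(header, first_nonce + static_cast<std::uint32_t>(i));
    }
    // In a GPU kernel, a barrier would sit here before the write to global memory.
    out.insert(out.end(), local.begin(), local.end());
}
```

Calling search_batch(header, nonce, 32, out) once per work unit, instead of hashing one data set per call, is the change the post is asking for; whether it gives the claimed speedup on a real GPU is untested here.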
|
|
|
Attempted to try your code taco but I get a build error from /usr/lib/ld about -lOpenCL missing.
You need to install the OpenCL library.
|
|
|
Here is the more optimized RSHash.cpp:

#include "RSHash.h"
#include "Blake512.h"
#include "SHA256.h"
#include <stdint.h>
#include <cstdio>
#include <iostream>
using std::cout;
using std::endl;

#define PHI 0x9e3779b9
#define BLOCKHASH_1_PADSIZE (1024*1024*4)

typedef unsigned int uint32;
typedef unsigned long long int uint64;

static uint32 BlockHash_1_Q[4096],BlockHash_1_c,BlockHash_1_i;
unsigned char *BlockHash_1_MemoryPAD8;
uint32 *BlockHash_1_MemoryPAD32;

// Complementary multiply-with-carry generator used to fill the memory pad.
uint32 BlockHash_1_rand(void)
{
    uint32 x, r = 0xfffffffe;
    uint64 t, a = 18782LL;
    BlockHash_1_i = (BlockHash_1_i + 1) & 4095;
    t = a * BlockHash_1_Q[BlockHash_1_i] + BlockHash_1_c;
    BlockHash_1_c = (t >> 32);
    x = (t + BlockHash_1_c)&0xFFFFFFFF;
    // Comma, not &&: "x++ && BlockHash_1_c++" would skip the carry whenever x is 0 before the increment.
    (x < BlockHash_1_c) && ( x++, BlockHash_1_c++ );
    return (BlockHash_1_Q[BlockHash_1_i] = r - x);
}
void BlockHash_Init()
{
    static unsigned char SomeArrogantText1[]="Back when I was born the world was different. As a kid I could run around the streets, build things in the forest, go to the beach and generally live a care free life. Sure I had video games and played them a fair amount but they didn't get in the way of living an adventurous life. The games back then were different too. They didn't require 40 hours of your life to finish. Oh the good old days, will you ever come back?";
    static unsigned char SomeArrogantText2[]="Why do most humans not understand their shortcomings? The funny thing with the human brain is it makes everyone arrogant at their core. Sure some may fight it more than others but in every brain there is something telling them, HEY YOU ARE THE MOST IMPORTANT PERSON IN THE WORLD. THE CENTER OF THE UNIVERSE. But we can't all be that, can we? Well perhaps we can, introducing GODria, take 2 pills of this daily and you can be like RealSolid, lord of the universe.";
    static unsigned char SomeArrogantText3[]="What's up with kids like artforz that think it's good to attack other's work? He spent a year in the bitcoin scene riding on the fact he took some other guys SHA256 opencl code and made a miner out of it. Bravo artforz, meanwhile all the false praise goes to his head and he thinks he actually is a programmer. Real programmers innovate and create new work, they win through being better coders with better ideas. You're not real artforz, and I hear you like furries? What's up with that? You shouldn't go on IRC when you're drunk, people remember the weird stuff.";

    BlockHash_1_MemoryPAD8 = new unsigned char[BLOCKHASH_1_PADSIZE+8]; //need the +8 for memory overwrites
    BlockHash_1_MemoryPAD32 = (uint32*)BlockHash_1_MemoryPAD8;

    BlockHash_1_Q[0] = 0x6970F271;
    BlockHash_1_Q[1] = 0x6970F271 + PHI;
    BlockHash_1_Q[2] = 0x6970F271 + PHI + PHI;
    for (int i = 3; i < 4096; ++i)
        BlockHash_1_Q[i] = BlockHash_1_Q[i - 3] ^ BlockHash_1_Q[i - 2] ^ PHI ^ i;
    BlockHash_1_c=362436;
    BlockHash_1_i=4095;

    int count1=0,count2=0,count3=0;
    for(int x=0;x<(BLOCKHASH_1_PADSIZE/4)+2;++x)
        BlockHash_1_MemoryPAD32[x] = BlockHash_1_rand();
    for(int x=0;x<BLOCKHASH_1_PADSIZE+8;++x)
    {
        switch(BlockHash_1_MemoryPAD8[x]&3)
        {
            case 0: BlockHash_1_MemoryPAD8[x] ^= SomeArrogantText1[count1++]; if(count1>=sizeof(SomeArrogantText1)) count1=0; break;
            case 1: BlockHash_1_MemoryPAD8[x] ^= SomeArrogantText2[count2++]; if(count2>=sizeof(SomeArrogantText2)) count2=0; break;
            case 2: BlockHash_1_MemoryPAD8[x] ^= SomeArrogantText3[count3++]; if(count3>=sizeof(SomeArrogantText3)) count3=0; break;
            case 3: BlockHash_1_MemoryPAD8[x] ^= 0xAA; break;
        }
    }
}

void BlockHash_DeInit()
{
    delete[] BlockHash_1_MemoryPAD8;
}
const uint32 PAD_MASK = BLOCKHASH_1_PADSIZE-1;
typedef unsigned char uchar;

bool BlockHash_1(unsigned char *p512bytes, unsigned char* final_hash)
{
    //0->127 is the block header (128)
    //128->191 is blake(blockheader) (64)
    //192->511 is scratch work area (320)
    unsigned char *work1 = p512bytes;
    unsigned char *work2=work1+128;
    unsigned char *work3=work1+192;

    blake512_hash(work2,work1);

    //setup the 320 scratch with some base values
    work3[0] = work2[15];
    for(int x=1;x<320;++x)
    {
        work3[x-1] ^= work2[x&63];
        (work3[x-1]<0x80) ? work3[x]=work2[(x+work3[x-1])&63] : work3[x]=work1[(x+work3[x-1])&127];
    }

    #define READ_PAD8(offset) BlockHash_1_MemoryPAD8[(offset)&PAD_MASK]
    #define READ_PAD32(offset) (*((uint32*)&BlockHash_1_MemoryPAD8[(offset)&PAD_MASK]))

    uint64 qCount = *((uint64*)&work3[310]);
    int nExtra=READ_PAD8(qCount+work3[300])>>3;
    for(int x=1;x<512+nExtra;++x)
    {
        qCount+= READ_PAD32( qCount );
        qCount&0x87878700 && work3[qCount%320]++;

        qCount-= READ_PAD8( qCount+work3[qCount%160] );
        qCount&0x80000000 ? qCount+= READ_PAD8( qCount&0x8080FFFF ) : qCount+= READ_PAD32( qCount&0x7F60FAFB );

        qCount+= READ_PAD32( qCount+work3[qCount%160] );
        qCount&0xF0000000 && work3[qCount%320]++;

        qCount+= READ_PAD32( *((uint32*)&work3[qCount&0xFF]) );
        work3[x%320]=work2[x&63]^uchar(qCount);

        qCount+= READ_PAD32( (qCount>>32)+work3[x%200] );
        *((uint32*)&work3[qCount%316]) ^= (qCount>>24)&0xFFFFFFFF;
        ((qCount&0x07)==0x03) && x++;
        qCount-= READ_PAD8( (x*x) );
        ((qCount&0x07)==0x01) && x++;
    }

    Sha256(work1, final_hash);
    return true;
}
I eliminated all of the cool if/else statements Coinhunter was, for some reason, so proud of before. It appears to get +3% or so performance for me.
Wait. Isn't he a $150 an hour, unemployed coding genius who spent 20,000 hours on this? I think you are just misunderstanding the code.
The multi-GPU coding appears fucked; I actually seem to get lower hash rates using two GPUs than when I put "device 0" in the config file and use only the first one. If you put "device 1" in the reaper config though, SEGFAULT!
Edit: device 1 works on another motherboard. I'm wondering if maybe something is weird with my operating system on this computer... too tired, going home.
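On the if/else removal: rewriting "if (cond) stmt;" as "cond && stmt;" behaves the same only while the right-hand side is a single expression evaluated for its side effect. Chaining two increments with && is a case worth checking, because a post-increment yields the old value and && stops when that value is 0. A small sketch of the three forms (the function names are made up for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Reference form: both increments happen whenever x < c.
inline void carry_if(std::uint32_t& x, std::uint32_t& c) {
    if (x < c) { x++; c++; }
}

// Chained && form: x++ yields the OLD value of x, so when x is 0
// the expression short-circuits and c++ never runs.
inline void carry_and(std::uint32_t& x, std::uint32_t& c) {
    static_cast<void>((x < c) && (x++ && c++));
}

// Comma form: both increments always run once x < c holds.
inline void carry_comma(std::uint32_t& x, std::uint32_t& c) {
    static_cast<void>((x < c) && (x++, c++));
}
```

With x = 0 and c = 5, carry_if and carry_comma leave c at 6, while carry_and leaves it at 5. For any nonzero x below c, all three agree.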
|
|
|
Bumping again, having a lot of fun here... I optimized Coinhunter's code for mining and now I'm pulling 150 kh/s on a single GTX 570. I'll publish it later when I'm done with more optimizations. The OpenCL multi-device coding is totally fucked by the use of pthread, and multi-GPU setups will take about a 70% performance hit for the second GPU. This cannot be fixed by running separate instances, because there are segfaults if you set it to use only a 2nd or 3rd etc. device.
|
|
|
void BlockHash_Init()
{
    static unsigned char SomeArrogantText1[]="Back when I was born the world was different. As a kid I could run around the streets, build things in the forest, go to the beach and generally live a care free life. Sure I had video games and played them a fair amount but they didn't get in the way of living an adventurous life. The games back then were different too. They didn't require 40 hours of your life to finish. Oh the good old days, will you ever come back?";
    static unsigned char SomeArrogantText2[]="Why do most humans not understand their shortcomings? The funny thing with the human brain is it makes everyone arrogant at their core. Sure some may fight it more than others but in every brain there is something telling them, HEY YOU ARE THE MOST IMPORTANT PERSON IN THE WORLD. THE CENTER OF THE UNIVERSE. But we can't all be that, can we? Well perhaps we can, introducing GODria, take 2 pills of this daily and you can be like RealSolid, lord of the universe.";
    static unsigned char SomeArrogantText3[]="What's up with kids like artforz that think it's good to attack other's work? He spent a year in the bitcoin scene riding on the fact he took some other guys SHA256 opencl code and made a miner out of it. Bravo artforz, meanwhile all the false praise goes to his head and he thinks he actually is a programmer. Real programmers innovate and create new work, they win through being better coders with better ideas. You're not real artforz, and I hear you like furries? What's up with that? You shouldn't go on IRC when you're drunk, people remember the weird stuff.";

    BlockHash_1_MemoryPAD8 = new unsigned char[BLOCKHASH_1_PADSIZE+8]; //need the +8 for memory overwrites
    BlockHash_1_MemoryPAD32 = (uint32*)BlockHash_1_MemoryPAD8;

    BlockHash_1_Q[0] = 0x6970F271;
    BlockHash_1_Q[1] = 0x6970F271 + PHI;
    BlockHash_1_Q[2] = 0x6970F271 + PHI + PHI;
    for (int i = 3; i < 4096; i++)
        BlockHash_1_Q[i] = BlockHash_1_Q[i - 3] ^ BlockHash_1_Q[i - 2] ^ PHI ^ i;
    BlockHash_1_c=362436;
    BlockHash_1_i=4095;

    int count1=0,count2=0,count3=0;
    for(int x=0;x<(BLOCKHASH_1_PADSIZE/4)+2;x++)
        BlockHash_1_MemoryPAD32[x] = BlockHash_1_rand();
    for(int x=0;x<BLOCKHASH_1_PADSIZE+8;x++)
    {
        switch(BlockHash_1_MemoryPAD8[x]&3)
        {
            case 0: BlockHash_1_MemoryPAD8[x] ^= SomeArrogantText1[count1++]; if(count1>=sizeof(SomeArrogantText1)) count1=0; break;
            case 1: BlockHash_1_MemoryPAD8[x] ^= SomeArrogantText2[count2++]; if(count2>=sizeof(SomeArrogantText2)) count2=0; break;
            case 2: BlockHash_1_MemoryPAD8[x] ^= SomeArrogantText3[count3++]; if(count3>=sizeof(SomeArrogantText3)) count3=0; break;
            case 3: BlockHash_1_MemoryPAD8[x] ^= 0xAA; break;
        }
    }
}
Oh noes, furries. FOR SHAME
|
|
|
1. Premine 13 million SC
2. Arbitrarily raise difficulty to make SC more expensive
3. Sell premined coins
4. Massive temporary profits
Would you buy from the Central Bank of Hugo Chavez or the Central Bank of Robert Mugabe? CH thinks you will. This is amazing. Instead of printing his own money (inflation), CH thinks that by deflating his currency he can in effect do the same thing to the market and not have it collapse.
Well, good luck with that.
|
|
|
Okay, this fix is better: it just restarts the miner if it dies. Simply run this script in bash:

until ./run_reaper.sh; do
    echo "Server 'run_reaper.sh' crashed with exit code $?. Respawning.." >&2
    sleep 1
done

where run_reaper.sh is the script that runs reaper.
|
|
|
reaper 0.7's error handling is befuckered at best... to keep it from crashing with libcurl, change:

CURLcode code = curl_easy_perform(curl);
if(code != CURLE_OK)
{
    if (code == CURLE_COULDNT_CONNECT)
    {
        cout << "Could not connect. Server down?" << endl;
    }
    else
    {
        cout << "Error " << code << " submitting work. See http://curl.haxx.se/libcurl/c/libcurl-errors.html for error code explanations." << endl;
    }
}
curl_slist_free_all(headerlist);

so that it loops and sleeps if the data to return is null.

Edit: quick fix

CURLcode code = curl_easy_perform(curl);
while (code != CURLE_OK)
{
    if (code == CURLE_COULDNT_CONNECT)
    {
        cout << "Could not connect. Server down?" << endl;
        sleep(5); // sleep() needs <unistd.h> on POSIX systems
    }
    else
    {
        cout << "Error " << code << " submitting work. See http://curl.haxx.se/libcurl/c/libcurl-errors.html for error code explanations." << endl;
        sleep(5);
    }
    code = curl_easy_perform(curl);
}
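The quick fix above retries forever. If a bounded retry is preferred, the same loop can be wrapped in a small helper, sketched below. The helper retry_with_delay is hypothetical (it is not part of reaper or libcurl); the caller passes in whatever operation should be retried, for example a lambda that performs the request and returns true on success.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: calls `attempt` until it returns true or
// `max_tries` calls have been made, sleeping `delay` between failures.
// Returns the number of calls used on success, or 0 if every try failed.
inline int retry_with_delay(const std::function<bool()>& attempt,
                            int max_tries,
                            std::chrono::milliseconds delay) {
    assert(max_tries > 0);
    for (int tries = 1; tries <= max_tries; ++tries) {
        if (attempt()) {
            return tries;
        }
        if (tries < max_tries) {
            std::this_thread::sleep_for(delay); // no sleep after the final failure
        }
    }
    return 0;
}
```

A return value of 0 lets the caller decide what to do when the server stays down, such as dropping the share or exiting so that a wrapper script can restart the miner.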
|
|
|
market manipulation much?
|
|
|
Anyone not concerned about the >51% power at lc.ozco.in ? CPU Botnet mining @ 2Mhash/S ?
You would need more than 4mh/s to 51% the network right now... ozcoin doesn't have that much power, even.
|
|
|
Is someone trying to 51% LTC? majestik on the dyndns pool is at 2.00 MH/s while "other" on the kicksass graphs is huge too. I think the network is around 6 MH/s right now.
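For a rough sanity check on numbers like these, the share of the network and the gap to a majority are simple arithmetic. A minimal sketch, assuming the posted figures (2 MH/s for one miner, about 6 MH/s for the whole network) are accurate; the function names are made up for illustration:

```cpp
#include <cassert>

// Fraction of the total network hash rate held by one miner or pool.
inline double hashrate_share(double miner_mhs, double network_mhs) {
    assert(network_mhs > 0.0 && miner_mhs >= 0.0 && miner_mhs <= network_mhs);
    return miner_mhs / network_mhs;
}

// Break-even amount of extra hash rate: adding MORE than this gives the
// miner a strict majority, assuming everyone else's hash rate stays fixed.
inline double extra_for_majority(double miner_mhs, double network_mhs) {
    assert(network_mhs > 0.0 && miner_mhs >= 0.0 && miner_mhs <= network_mhs);
    const double others = network_mhs - miner_mhs;
    return others > miner_mhs ? others - miner_mhs : 0.0;
}
```

With those figures, hashrate_share(2.0, 6.0) is about 0.33, and extra_for_majority(2.0, 6.0) is 2.0: the miner would have to more than double, to over 4 MH/s, to outweigh the remaining 4 MH/s.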
|
|
|
In 5 years the standard number of cores on an x86 CPU will probably be 32 and they will run at 6GHz+ on stock clocks... So it's probably irrelevant
No they won't.
1) Number of cores will increase, but it is more like a 50% increase every 2 years (track the move from single core to double to quad to hex core). The number of cores isn't going to increase 8x in the next 5 years.
2) You won't see a 6GHz+ chip (maybe not ever). Power draw increases by the SQUARE of frequency. So if you double the frequency you don't get 2x the power draw, you get 4x the power draw (and 4x the heat). This is the entire reason for multi-core designs. Back in the Pentium III days Intel had a long term timeline ... 10GHz by 2010. We didn't quite make it there. http://www.geek.com/articles/chips/intel-predicts-10ghz-chips-by-2011-20000726/
Have you noticed that the frequency of a fast CPU today isn't much faster than 2 years ago, and not significantly faster than 6 years ago? If hypothetically you could make a 6GHz+ chip and say it consumed 240W, you could get the same computational power by redesigning the chip to be more efficient (more instructions per clock, aka Pentium IV -> Core 2 -> i7) with more cores, and then clock it at ~3GHz and likely end up with ~120W TDP.
Intel was attempting to break the ceiling on speed with the Prescott generation of CPUs by extending the pipeline in the chips -- didn't happen, but AMD is taking the approach again with Bulldozer. AMD wanted 4.5GHz stock clocks on Bulldozer this round but didn't pull it off; however, with the enhanced latencies and huge pipeline, future versions of the chip should clock around there. All of Intel's current line easily clocks above 4GHz, with the mean being 4.5GHz. Further, this doesn't represent actual performance gains, as new processes evolve that give performance gains without large new numbers of transistors. The point of the post was, though: soon CPUs will be so much faster that even if GPUs improved, they would still be behind.
|
|
|
How much of the source have you seen, viper?
The hashing algorithm. Edit: A pre-beta version, but it wasn't all too different from what coinhunter posted in these forums elsewhere. It is from BCX's claims on here and of his GPU miner. SC2 supporters are claiming SHA-3 hashing algorithms -- if this is true it may be very open to attacks by FPGA/GPU.
|
|
|
Good, I'll have an iterative solution to his recursive algorithm soon after.
|
|
|
12 hours later, sc2 is down another 30%.. Gg sc2
|
|
|
I don't see a problem with GPU unfriendliness, since the amount of caching would probably still be too large for future GPUs. It doesn't make much sense to pack much memory onto a GPU ALU, since there isn't any real use for that.
GPUs already have 1 to 2 gigs of memory on the card, and they are very good at streaming data. Yes, scrypt uses random access to limit the benefit of the streaming memory, but that really only adds latency. Even if GPU ALUs don't get enough cache in the future, threading would let you hide the latency by having more processes. Take the extreme case: let's say litecoin is around in 15 or 20 years. Are you really convinced that GPUs won't get cache or threading in that time?
In 5 years the standard number of cores on an x86 CPU will probably be 32 and they will run at 6GHz+ on stock clocks... So it's probably irrelevant
|
|
|
|