Tamis
|
|
August 08, 2013, 08:39:26 AM |
|
8 blocks found during the night, 2 of them orphans — that's 25%... Could this be related to connection count? My current VPSes all have 8 connections, while on DO I could get up to 30 connections per instance. Could this be the reason? Any way to increase the connection count?
|
|
|
|
razibuzouzou
|
|
August 08, 2013, 08:41:14 AM |
|
Hi, did anyone try playing with the nL1CacheElements constant in prime.cpp? Yesterday I tried a set of values and was a bit puzzled by the results. It did not seem to affect the number of blocks found, but it had a strong impact on primesperday, chainspermin and chainsperday. I am using default mining settings (sievesize=1000000, etc.). Taking the chainsperday measured with the default nL1CacheElements=200000 as 1:
nL1CacheElements=10000: chainsperday x6, but primesperday x0.1
nL1CacheElements=20000: chainsperday x2, but primesperday x0.8
nL1CacheElements=65536: chainsperday x0.5
nL1CacheElements=90000: chainsperday x0.6666
nL1CacheElements=100000: chainsperday x1.6
nL1CacheElements=400000: chainsperday x0.5555
I am guessing that the chainsperday metric is somehow affected by the number of loops performed to combine the candidate arrays, but I couldn't understand how. I know that primesperday is not an accurate performance metric, but I am wondering whether chainsperday is a reliable efficiency measurement. Maybe getting the best of both values indicates maximal efficiency, or not...
Quote: "In the modified version of jhPrimeminer that is used for ypool.net I did some auto-tuning of nL1CacheElements by measuring the time it takes to execute the Wave() function. Indeed, the default is not optimal most of the time. I also found that for some settings an even higher nL1CacheElements (1,000,000+) performs better. Based on tests with profiling tools, I concluded that the most time-consuming code is the writing to memory, and that it writes less linearly, to more random positions, so I think the CPU cache plays a smaller role here. Maybe I'm not right; I have never done this kind of tuning before."
Just tried removing the loop which processes blocks of nL1CacheElements. Performance seems a bit affected; chainsperday hit by a 0.82 factor. What do you think? Are chainspermin variations caused by the nL1CacheElements value an indicator of performance improvement/degradation?
|
|
|
|
mikaelh (OP)
|
|
August 08, 2013, 08:42:35 AM |
|
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which distorts the metrics. The tests/h figure is the number of candidates produced by the sieve, which is directly impacted. The chains/day estimate relies on tests/h being accurate, so it will also be heavily impacted.
With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates are always marked as bi-twin candidates; if one is in reality a candidate of another type, it will not be recognized properly.
Summary: don't touch nL1CacheElements unless you know exactly what you're doing.
|
|
|
|
paulthetafy
|
|
August 08, 2013, 08:50:50 AM |
|
Quote from: mikaelh — "Don't touch nL1CacheElements unless you know exactly what you're doing."
mikael, what about using values that ARE a multiple of both 64 and 32, then? Is that safe, and will it yield improved performance if done correctly?
|
|
|
|
mikaelh (OP)
|
|
August 08, 2013, 09:00:44 AM |
|
Quote from: mikaelh — "Don't touch nL1CacheElements unless you know exactly what you're doing."
Quote from: paulthetafy — "what about using values that ARE a multiple of both 64 and 32 then? Is that safe and will it yield improved performance if done correctly?"
If you use multiples of 64 (or 32), then that should work in theory. I haven't really tested that exhaustively, though. I would expect minimal gains from adjusting it in most cases.
|
|
|
|
paulthetafy
|
|
August 08, 2013, 09:03:09 AM |
|
Quote from: mikaelh — "Don't touch nL1CacheElements unless you know exactly what you're doing."
Quote from: mikaelh — "If you use multiples of 64 (or 32), then that should work in theory. I haven't really tested that exhaustively though. I would expect minimal gains from adjusting it in most cases."
Thanks mikael. I tried values ~100000 higher and lower (but multiples of 64/32) and saw decreased performance both ways, so 200000 seems optimal for now. Shame, I was starting to think I could get 3x performance.
|
|
|
|
mumus
|
|
August 08, 2013, 09:50:38 AM |
|
Quote from: mikaelh — "Don't touch nL1CacheElements unless you know exactly what you're doing."
Mikael, thank you for the clarification. Indeed, after checking the code I found the reason why it has to be a multiple of 64/32; it's a shame I didn't check this before. Anyway, I've run some more tests: with the default settings of the primecoin client there was no significant gain for different nL1CacheElements values, but the moment the sieve size and sieve percentage were raised, the nL1CacheElements size had more effect. For example, with sievesize=2000000 and sievepercentage=50, a higher nL1CacheElements (1056000) speeds up the Wave by ~30% (my CPU is an i7). In any case, because 50% is probably not the optimal sievepercentage, this measurement is less relevant.
|
|
|
|
mumus
|
|
August 08, 2013, 11:43:58 AM |
|
Quote from: mikaelh — "Don't touch nL1CacheElements unless you know exactly what you're doing."
BTW: I'm compiling the 64-bit version of jhPrimeminer under Windows with Visual Studio, where the size of 'unsigned long' is 32 bits, not 64 bits like on Linux (GNU). I've tried replacing "unsigned long" with uint64_t (unsigned long long), but it makes the Wave function run ~15-25% slower. I'm thinking it may be worth a try to replace "unsigned long" with uint32 in the Wave() function in the primecoin (HP9) client, where nL1CacheElements is also used. Does what I'm saying make sense?
|
|
|
|
mikaelh (OP)
|
|
August 08, 2013, 04:44:45 PM |
|
Quote from: mikaelh — "Don't touch nL1CacheElements unless you know exactly what you're doing."
Well, it's good to know that long is always 32 bits in Visual Studio; I haven't really worked with that compiler. One unfortunate side effect is being unable to pass 64-bit values directly to GMP/MPIR, because that interface uses unsigned longs.
Quote: "I think 64-bit integers are better for this part of the code because it performs logical operations on bits, so you process twice as many bits per operation with 64-bit integers as with 32-bit ones. I think you may be seeing a performance loss because Visual Studio does not vectorize 64-bit integer operations very well."
Yup, I also suspect that the Visual Studio compiler isn't optimizing properly there, but I haven't looked at the generated code.
Quote: "Back to the nL1CacheElements constant. I am seeing a small (>3%) but clear performance improvement with 256000 (chainsperday increases and primesperday is preserved), tested and confirmed on two different architectures (AMD & Intel). I know it's micro-optimization garbage (sorry!), but mikaelh, does this value sound good to you?"
Well, I did check that most Intel processors have a 32 kB L1 data cache while AMD processors typically have a 64 kB one. 256000 is pretty much pushing the L1 cache to its limits on Intel CPUs. If it also shows good performance in my testing, I will probably incorporate the change. The annoying issue with 256000 is that it doesn't divide 1M (the sieve size) evenly. I have also considered detecting the L1 cache size automatically and adjusting the parameters based on that; the problem is that I don't have a lot of CPUs to test on.
|
|
|
|
Lyddite
|
|
August 08, 2013, 07:00:01 PM |
|
In case you are interested in the validity of the chains-per-day metric: with the following setup running on 9 identical machines, I managed 8 chains over 24 hours. 8/9 = 0.8888 chains per day. I would say close enough for government work in this case.
"blocks" : 105938,
"chainspermin" : 3,
"chainsperday" : 0.92945471,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"difficulty" : 9.59519291,
"errors" : "",
"generate" : true,
"genproclimit" : -1,
"roundsievepercentage" : 70,
"primespersec" : 1707,
"pooledtx" : 0,
"sievepercentage" : 10,
"sievesize" : 1100000,
"testnet" : false
|
|
|
|
monsterer
|
|
August 08, 2013, 07:31:11 PM |
|
hp9 gives me the following link error on centos6 (the previous version compiled OK):
In function 'CKey::Verify(uint256, std::vector<unsigned char, std::allocator<unsigned char> > const&)':
/root/primecoin-0.1.2-hp9/src/key.cpp:376: undefined reference to 'ECDSA_verify'
obj/key.o: In function 'CKey::SetCompressedPubKey(bool)':
/root/primecoin-0.1.2-hp9/src/key.cpp:125: undefined reference to 'EC_KEY_set_conv_form'
|
|
|
|
gigawatt
|
|
August 08, 2013, 07:37:23 PM |
|
Quote from: Lyddite — "with the following setup running on 9 identical machines I managed 8 chains over 24 hours."
Lucky you, then. I have 8 machines running at ~1.0 chains/day plus a few spare machines on the side, and I got 3 blocks in the last 24 hours.
|
|
|
|
torbank
|
|
August 08, 2013, 08:10:45 PM |
|
Quote from: monsterer — "hp9 gives me the following compile error on centos6: undefined reference to 'ECDSA_verify' ... undefined reference to 'EC_KEY_set_conv_form'"
Is it a fresh install of CentOS? If so, you'll need this step:
Step 2b. Compiling OpenSSL (for CentOS users)
This step is only required if you're using CentOS. Red Hat has removed support for elliptic curve cryptography from the OpenSSL it supplies.
Code:
cd
rm -rf openssl-1.0.1e.tar.gz openssl-1.0.1e
wget ftp://ftp.pca.dfn.de/pub/tools/net/openssl/source/openssl-1.0.1e.tar.gz
tar xzvf openssl-1.0.1e.tar.gz
cd openssl-1.0.1e
./config shared --prefix=/usr/local --libdir=lib
make
sudo make install
https://bitcointalk.org/index.php?topic=259022.0
|
|
|
|
Ethera
|
|
August 08, 2013, 08:18:58 PM |
|
Quote from: gigawatt — "I have 8 machines running at ~1.0 chains/day + a few spare machines on the side. I got 3 blocks in the last 24 hours."
"blocks" : 106059,
"chainspermin" : 21,
"chainsperday" : 2.44443307,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"difficulty" : 9.59809953,
Can't say that I find 2 blocks per machine per day... I am lucky to hit 4 blocks per day across all 6 machines (all are identical).
|
|
|
|
Lyddite
|
|
August 08, 2013, 08:25:34 PM |
|
In case you are interested in the validity of the chains per day metric, with the following setup running on 9 identical machines I managed 8 chains over 24 hours.
8/9=0.8888 chains per day.
I would say, close enough for government work in this case.
Quote from: gigawatt — "Lucky you then. I have 8 machines running at ~1.0 chains/day + a few spare machines on the side. I got 3 blocks in the last 24 hours."
Hmmm... hopefully 13 blocks tomorrow... Here's a useful one-liner if you want a sorted list of exactly when your blocks were found, for anyone not comfortable reading times and dates as seconds since 1970... UTC:
primecoind listtransactions | grep blocktime | sed -e "s/.* : //" -e "s/,//" | sort | xargs -n 1 -I '{}' date --date=@'{}'
|
|
|
|
redphlegm
|
|
August 08, 2013, 08:26:02 PM |
|
Quote from: Ethera — "can't say that I find 2 blocks per machine per day... I am lucky to hit 4 blocks per day across all 6 machines (all are identical)"
If you set one up and cloned it, you may be dealing with conflicting / insufficient entropy.
|
|
|
|
gigawatt
|
|
August 08, 2013, 09:09:00 PM |
|
Quote from: Ethera — "I am lucky to hit 4 blocks per day across all 6 machines."
Quote from: redphlegm — "If you set one up and cloned it, you may be dealing with conflicting / insufficient entropy."
Quick and easy fix:
apt-get install haveged -y
update-rc.d haveged defaults
service haveged start
|
|
|
|
paulthetafy
|
|
August 08, 2013, 09:50:52 PM |
|
You guys are confusing finding chains with finding blocks. Having a chains/d of 1 does NOT imply you should find a block a day on average.
|
|
|
|
Trillium
|
|
August 08, 2013, 10:34:21 PM |
|
You guys are confusing finding chains for finding blocks. Having a chains/d of 1 does NOT imply you should find a block a day on average.
Right, as has been pointed out, chainsperday is a count of how many chains should meet the integer difficulty requirement, not considering the fractional difficulty requirement that must also be met. Simple example: If your chains per day was 1.0 and diff was 9.0100 then you will probably find 1 block per day on average. If your chains per day was 1.0 and diff was 9.9985 then you probably won't find a block Please correct me if my understanding is wrong... However, like with any solo mining, you are still subject to variance ("luck"). You could find 4 blocks the first day you mine then find nothing for a week or more... etc
|
|
|
|
Tamis
|
|
August 08, 2013, 11:11:05 PM |
|
I have 50 VPSes running with chainsperday ranging from 0.9 to 1.18, and I find between 15 and 21 blocks per day. They were not cloned, so entropy is not a problem.
|
|
|
|
|