mshome86
Newbie
Offline
Activity: 2
Merit: 0
|
|
April 05, 2015, 11:15:31 AM |
|
Please cudaminer for collecting coppelak or double keccak
|
|
|
|
|
|
|
|
|
|
|
|
|
|
enerbyte
|
|
April 05, 2015, 02:58:23 PM |
|
thanks but it's not what I want.
|
|
|
|
|
Grim
|
|
April 10, 2015, 09:44:04 AM |
|
Using good old cudaminer for nfactor 16 yacoin.
The reason? It's an Asus 750ti 4GB card.
-L 8 -l t64x1 -b 16384 -i 0
These settings give about 0.36 to 0.38 khash/s.
The interesting thing, though, is that a standard 2GB model runs the same settings at the same hash rate.
Setting -L any lower is possible, and VRAM usage goes up, but the hashing is much slower.
Any insight?
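For a rough sense of how -L trades memory for recomputation, here is a minimal sketch of the scratchpad arithmetic. It assumes (this is not stated in the post) that N = 2^(nfactor + 1), r = 1, and that each scratchpad entry is 128 * r bytes; `scratchpad_bytes` and `scratchpads_that_fit` are hypothetical helper names, not cudaminer functions.

```c
#include <stdint.h>

/* Bytes of scratchpad needed for one hash.
 * Assumptions: N = 2^(nfactor + 1) entries of 128 * r bytes each.
 * lookup_gap mirrors the -L option: only every lookup_gap-th entry is
 * stored, and the skipped ones are recomputed when they are needed. */
static uint64_t scratchpad_bytes(unsigned nfactor, unsigned r, unsigned lookup_gap)
{
    uint64_t n = (uint64_t)1 << (nfactor + 1);
    uint64_t gap = lookup_gap ? lookup_gap : 1;   /* treat 0 as "no gap" */
    uint64_t entries = (n + gap - 1) / gap;       /* round up */
    return entries * 128u * r;
}

/* Upper bound on how many scratchpads fit in a given amount of VRAM.
 * Ignores everything else the miner keeps on the card. */
static uint64_t scratchpads_that_fit(uint64_t vram_bytes, unsigned nfactor,
                                     unsigned r, unsigned lookup_gap)
{
    return vram_bytes / scratchpad_bytes(nfactor, r, lookup_gap);
}
```

Under these assumptions, nfactor 16 needs 16 MiB per hash with no lookup gap and 2 MiB per hash at -L 8, so halving -L doubles the memory per hash. This says nothing about how much cudaminer actually allocates for a given -l setting, only how the per-hash footprint scales.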
|
|
|
|
Grim
|
|
April 10, 2015, 04:13:38 PM Last edit: April 10, 2015, 05:00:00 PM by Grim |
|
typedef struct scrypt_aligned_alloc_t {
    uint8_t *mem, *ptr;
} scrypt_aligned_alloc;

#if defined(SCRYPT_TEST_SPEED)
static uint8_t *mem_base = (uint8_t *)0;
static size_t mem_bump = 0;

/* allocations are assumed to be multiples of 64 bytes and total allocations not to exceed ~1.01gb */
static scrypt_aligned_alloc scrypt_alloc(uint64_t size) {
    scrypt_aligned_alloc aa;
    if (!mem_base) {
        mem_base = (uint8_t *)malloc((1024 * 1024 * 1024) + (1024 * 1024) + (SCRYPT_BLOCK_BYTES - 1));
        if (!mem_base)
            scrypt_fatal_error("scrypt: out of memory");
        mem_base = (uint8_t *)(((size_t)mem_base + (SCRYPT_BLOCK_BYTES - 1)) & ~(SCRYPT_BLOCK_BYTES - 1));
    }
    aa.mem = mem_base + mem_bump;
    aa.ptr = aa.mem;
    mem_bump += (size_t)size;
    return aa;
}

static void scrypt_free(scrypt_aligned_alloc *aa) {
    mem_bump = 0;
}
#else
static scrypt_aligned_alloc scrypt_alloc(uint64_t size) {
    static const size_t max_alloc = (size_t)-1;
    scrypt_aligned_alloc aa;
    size += (SCRYPT_BLOCK_BYTES - 1);
    if (size > max_alloc)
        scrypt_fatal_error("scrypt: not enough address space on this CPU to allocate required memory");
    aa.mem = (uint8_t *)malloc((size_t)size);
    aa.ptr = (uint8_t *)(((size_t)aa.mem + (SCRYPT_BLOCK_BYTES - 1)) & ~(SCRYPT_BLOCK_BYTES - 1));
    if (!aa.mem)
        scrypt_fatal_error("scrypt: out of memory");
    return aa;
}

static void scrypt_free(scrypt_aligned_alloc *aa) {
    free(aa->mem);
}
#endif
Why is it limited to 1 GB?
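Reading the quoted code, the ~1.01 GB ceiling appears only in the SCRYPT_TEST_SPEED branch, which mallocs one fixed arena (1 GiB + 1 MiB) once and hands out slices by bumping an offset. The #else branch sizes each malloc from the request, so the cap would only apply to a build that defines SCRYPT_TEST_SPEED. Both branches share the same pointer round-up trick, sketched below on its own. The names `aligned_buf`, `align_up`, `buf_alloc` and `buf_free` are hypothetical, and BLOCK_BYTES = 64 is an assumption taken from the "multiples of 64 bytes" comment.

```c
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_BYTES 64  /* stands in for SCRYPT_BLOCK_BYTES (assumed value) */

typedef struct aligned_buf_t {
    uint8_t *mem;  /* pointer returned by malloc; this is the one to free */
    uint8_t *ptr;  /* first BLOCK_BYTES-aligned address inside mem */
} aligned_buf;

/* Round an address up to the next multiple of BLOCK_BYTES.
 * Only valid because BLOCK_BYTES is a power of two. */
static uintptr_t align_up(uintptr_t addr)
{
    return (addr + (BLOCK_BYTES - 1)) & ~(uintptr_t)(BLOCK_BYTES - 1);
}

/* Over-allocate by BLOCK_BYTES - 1 so an aligned region of `size` bytes
 * always fits, then point ptr at the aligned start. */
static aligned_buf buf_alloc(size_t size)
{
    aligned_buf b;
    b.mem = (uint8_t *)malloc(size + (BLOCK_BYTES - 1));
    b.ptr = b.mem ? (uint8_t *)align_up((uintptr_t)b.mem) : NULL;
    return b;
}

static void buf_free(aligned_buf *b)
{
    free(b->mem);
    b->mem = NULL;
    b->ptr = NULL;
}
```

Unlike the quoted #else branch, this sketch checks the malloc result before deriving the aligned pointer and leaves error handling to the caller.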
|
|
|
|
Nubminer
Member
Offline
Activity: 75
Merit: 10
|
|
April 13, 2015, 03:08:59 AM |
|
I am trying to load cudaminer.sln in Visual Studio Community 2013. Will this work? It fails every time, so I am guessing not, or maybe I missed a key step?
|
|
|
|
cbuchner1 (OP)
|
|
April 13, 2015, 07:55:17 AM |
|
The precompiled dependencies in the OP will only work with VS 2010. You would have to build all the dependencies with VS 2013 if you want to use this newer IDE version.
Christian
|
|
|
|
Nubminer
Member
Offline
Activity: 75
Merit: 10
|
|
April 13, 2015, 12:09:09 PM |
|
If someone else builds the project with their VS 2010 and then shares it, will it work on my PC? If so, is there a source somewhere? I found one link to a downloadable program, but it would not pass my antivirus. I couldn't even open the web page with Chrome, so I disabled its protection, and then the resulting download was infected as well. It seems most stuff is nowadays. Sad.
|
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
April 13, 2015, 12:52:56 PM |
|
There is a "convert" option in VS 2013 to convert a VS 2010 project to VS 2013. It should work more or less on its own (there might be a couple of errors to correct). Regarding the dependencies, I don't think there will be any problem (I still use the ones I downloaded last year).
|
djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
|
|
|
Nubminer
Member
Offline
Activity: 75
Merit: 10
|
|
April 13, 2015, 01:19:16 PM |
|
I am currently running the ccminer that was mentioned in this thread: https://bitcointalk.org/index.php?topic=656841.msg7429706#msg7429706 but I am only getting 500 H/s on a GTX 980, and I think it should be more. It is ccminer cryptonight by tsiv. I have played with -l (launch config: number of threads x number of thread blocks) and can't find a combination that gets me above the 500 mark. I am trying to mine Monero, not Bitcoin.
|
|
|
|
sp_
Legendary
Offline
Activity: 2898
Merit: 1087
Team Black developer
|
|
April 13, 2015, 01:56:48 PM |
|
My modded xmr miner does more than 500 on the 980x. Donate 0.2BTC and I will send it to you. Windows 32bit exe
|
|
|
|
Grim
|
|
April 13, 2015, 02:04:10 PM |
|
Quote: Why do you want to run cudaminer? All the algos in cudaminer are slow and not profitable. ccminer is the best miner for NVIDIA.
Hi sp_, could you look into the scrypt-jane algos and include them in ccminer? It's the last algo in cudaminer that is still profitable to mine.
|
|
|
|
sp_
Legendary
Offline
Activity: 2898
Merit: 1087
Team Black developer
|
|
April 15, 2015, 08:32:32 AM |
|
Quote: I'm pretty sure the output of the last algorithm is used as the input of the next one for X11, precisely so you can't do that.
Yes you can, if each thread is working on a different hash.
Example: 4 threads, 4 hashes
HASH1: x1->x2->x3->
HASH2: x4->x5->x6->
HASH3: x7->x8->x9->
HASH4: x10->x11
Swap the 4 hashes
HASH4: x1->x2->x3->
HASH1: x4->x5->x6->
HASH2: x7->x8->x9->
HASH3: x10->x11
Swap the 4 hashes
HASH3: x1->x2->x3->
HASH4: x4->x5->x6->
HASH1: x7->x8->x9->
HASH2: x10->x11
Swap the 4 hashes
HASH2: x1->x2->x3->
HASH3: x4->x5->x6->
HASH4: x7->x8->x9->
HASH1: x10->x11
Complete
Have you tried this, wolf0?
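The rotation above can be sketched as a small simulation. It is a toy model, not miner code: a "hash" is only a counter recording which stage it expects next, the 3+3+3+2 split follows the example, and the names `job`, `run_segment` and `run_pipeline` are hypothetical. It starts from an empty pipeline rather than the steady state the example shows, and it only checks ordering, not speed.

```c
#define STAGES   11  /* x1..x11 */
#define SEGMENTS 4   /* one fixed group of stages per thread */

/* Index of the first stage in each segment; the last entry marks the end. */
static const int seg_start[SEGMENTS + 1] = { 0, 3, 6, 9, STAGES };

/* A hash in flight: which stage it expects next, and whether every stage
 * so far was applied in chain order. */
typedef struct {
    int next_stage;
    int in_order;
} job;

/* One thread applies its fixed group of stages to whichever job it holds. */
static void run_segment(job *j, int seg)
{
    for (int s = seg_start[seg]; s < seg_start[seg + 1]; s++) {
        if (j->next_stage != s)
            j->in_order = 0;
        j->next_stage++;
    }
}

/* Pushes n_jobs through the pipeline and returns the number of rounds used.
 * In each round every job moves to the next segment ("swap the hashes"),
 * a new job enters segment 0, and all segments run (sequentially here,
 * concurrently in the scheme described). */
static int run_pipeline(job *jobs, int n_jobs)
{
    int slot[SEGMENTS];
    int fed = 0, done = 0, rounds = 0;

    for (int i = 0; i < n_jobs; i++) {
        jobs[i].next_stage = 0;
        jobs[i].in_order = 1;
    }
    for (int s = 0; s < SEGMENTS; s++)
        slot[s] = -1;                       /* -1 means the segment is idle */

    while (done < n_jobs) {
        for (int s = SEGMENTS - 1; s > 0; s--)
            slot[s] = slot[s - 1];
        slot[0] = (fed < n_jobs) ? fed++ : -1;

        for (int s = 0; s < SEGMENTS; s++)
            if (slot[s] >= 0)
                run_segment(&jobs[slot[s]], s);

        if (slot[SEGMENTS - 1] >= 0)
            done++;                         /* that job just ran x10->x11 */
        rounds++;
    }
    return rounds;
}
```

Each job still passes through x1..x11 in order, so the chaining constraint from the quote holds per hash. The threads simply never work on the same hash at the same time.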
|
|
|
|
smolen
|
|
April 15, 2015, 09:07:33 AM |
|
Quote: No, because it doesn't make sense for GPU - it WOULD, however, make TONS of sense for FPGA or ASIC.
It does. Well, just change wording a bit
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
|
smolen
|
|
April 16, 2015, 05:02:46 AM |
|
Quote: but... that requires changes in each kernel (host launch code), and is not 50% faster, only a few percent
On AMD, for Whirlpool it seems the ALUs sit idle for about half of the time, waiting for data from the constant cache and the local data store.
|
|
|
|
sp_
Legendary
Offline
Activity: 2898
Merit: 1087
Team Black developer
|
|
April 16, 2015, 09:32:25 AM Last edit: April 16, 2015, 09:47:18 AM by sp_ |
|
What if you have 4 GPUs in your rig and each thread is executed on a separate GPU? X11 is then reduced to x2+.
Advantages:
- Smaller kernels, better register usage, less memory needed, more cache hits, more parallel threads
- Hybrid mining is possible (run the AES algos on the AMD card and the rest on NVIDIA)
Disadvantages:
- Throughput must be passed from GPU to GPU through PCI-E to memory and back.
- You need 4 GPUs (but the algorithm can be scaled to support x GPUs)
|
|
|
|
sp_
Legendary
Offline
Activity: 2898
Merit: 1087
Team Black developer
|
|
April 16, 2015, 10:42:59 AM |
|
Quote: Terrible idea - implies a device-to-host copy followed by a host-to-device copy, or at least a device-to-device copy, if I'm understanding you. Terribly slow.
You are probably right. Perhaps it would be faster with a CrossFire cable and a 2 GPU setup.
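To put a rough number on the copy traffic, here is a back-of-the-envelope sketch. It assumes each X11 stage hands a 512-bit (64-byte) digest to the next one, and the hash rate is a parameter the caller supplies, not a measured value. `bus_bytes_per_sec` is a hypothetical helper. It counts only bytes moved, while the latency and synchronization of each copy are not modeled and may matter more in practice.

```c
#include <stdint.h>

#define STATE_BYTES 64  /* assumption: 512-bit intermediate digest per hash */

/* Bytes per second crossing the bus when the chain is split over n_gpus.
 * copies_per_hop is 2 for device-to-host followed by host-to-device,
 * or 1 for a direct device-to-device copy. */
static uint64_t bus_bytes_per_sec(uint64_t hashes_per_sec, unsigned n_gpus,
                                  unsigned copies_per_hop)
{
    if (n_gpus < 2)
        return 0;  /* single GPU: intermediate state never leaves the card */
    return hashes_per_sec * STATE_BYTES * (n_gpus - 1) * copies_per_hop;
}
```

For example, a hypothetical 1,000,000 hashes per second across 4 GPUs with staged copies moves 384,000,000 bytes per second in total, while the 2 GPU direct-copy case moves 64,000,000.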
|
|
|
|
|