matrix961
|
|
June 29, 2014, 01:58:53 PM |
|
I was finally able to get it working on Ubuntu 14.04. I just followed the tutorial below to set up cudaminer, then compiled ccminer, and it works great now. Here is the tutorial I found in case anyone else has trouble: http://d3adbra1n.wordpress.com/2014/05/03/cuda-miner-installation-on-a-fresh-ubuntu-14-04-lts/
I'm still getting the fastest hashes using a smaller thread count over many blocks vs. more threads over a lower block count. It's been very stable though, with no crashing at all as it hashes away.
|
|
|
|
tsiv
|
|
June 30, 2014, 07:28:42 AM |
|
I promised a bounty for the Nvidia miner even though all my Nvidia cards are in the cupboard. Post a BTC donation address, tsiv, so I can send my share (0.2BTC as listed). Any chance somebody tried the Gtx580 or Gtx680? I have 2 of each which I might slap on a system if worthwhile.
I added some donation addresses on the project readme on Github for wallets that I have atm, copy&paste:
BTC: 1JHDKp59t1RhHFXsTw2UQpR3F9BBz3R3cs
DRK: XrHp267JNTVdw5P3dsBpqYfgTpWnzoESPQ
JPC: Jb9hFeBgakCXvM5u27rTZoYR9j13JGmuc2
VTC: VwYsZFPb6KMeWuP4voiS9H1kqxcU9kGbsw
XMR: 42uasNqYPnSaG3TwRtTeVbQ4aRY3n9jY6VXX3mfgerWt4ohDQLVaBPv3cYGKDXasTUVuLvhxetcuS16 ynt85czQ48mbSrWX
I'm offering the 150 xmr pledge by Keyboard-Mash in the OP. Overall I'm very satisfied with the program, and would gladly release partial/full bounty .. pending a quick answer for why I had to edit the registry to get the program to operate. Equipoise, I will send you a few xmr outside of the bounty, thanks a lot for providing it! I would like to hear some more from tsiv.
It is actually explained on the project's front page on Github, I believe. The initial release had the entire algorithm stuffed into a single huge CUDA kernel. Having to do the whole slow algorithm in one go had a tendency to take just a bit over 2 seconds per kernel launch, with 2 seconds being the timeout for Windows getting impatient and going "hmmh, I haven't heard from the GPU in 2 seconds. Must've crashed, better reset the driver." The registry tweak works around the problem by increasing the time that Windows allows the GPU to be "unresponsive", aka stuck running a CUDA kernel. This has been addressed in later releases, mainly by splitting the single huge kernel into smaller pieces and making parts of the hash faster. The slowest part is still quite slow, taking roughly 1.4 seconds with launch config 8x60 on a 750 Ti, but it should stay well within the default 2 second window.
There is something more to be done about the -l MxN. About the first number M: "First of all, your thread block size should always be a multiple of 32, because kernels issue instructions in warps (32 threads). For example, if you have a block size of 50 threads, the GPU will still issue commands to 64 threads and you'd just be wasting them." About the second number N: You could find it by gradually increasing it until your card stops working (showing an impossible hash rate such as 3474958.52 H/s). A restart is then needed for maximum performance (but not for testing), because without a restart my hash rate drops 2x compared to the same options before the crash.
The "magical numbers" for the 650M seem to be -l 128x5
I realize the 8x60 or 8x40 make absolutely no sense; they're something I ran into while trying out different values. The reasonable values would be based on the number of SMM/SMX on the GPU, and 32 or 64 threads per block would make a lot of sense. I can't tell exactly why performance takes a dive if you try 64x5, for example; it should be a very good value to start at. Might have something to do with the huge amount of random global memory access in the second major loop of the algo, so that trying to do more work in parallel bottlenecks at the memory access? Good news is that I've since modified 2 of the 3 main loops to use 8 parallel threads per hash as opposed to the original 1 thread per hash. So essentially 8x60 leads to running 64 threads per block for those two loops. Still working on the last loop; it does seem a fair bit harder to make it more parallel.
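The arithmetic behind the -l MxN discussion above can be sketched in a few lines of Python. This is only an illustration, not code from ccminer: the function names are made up for the example, and it assumes M is threads per block and N is the block count, which matches the "using 40 blocks of 8 threads" log line reported for the 8x40 default.

```python
WARP_SIZE = 32  # threads per warp, as stated in the quoted advice


def parse_launch_config(value):
    """Split an 'MxN' string into (threads_per_block, blocks).

    Hypothetical helper for illustration; ccminer's own parsing may differ.
    """
    threads, blocks = value.lower().split("x")
    return int(threads), int(blocks)


def threads_issued(block_size, warp_size=WARP_SIZE):
    """Threads the GPU schedules for one block: always whole warps."""
    warps = -(-block_size // warp_size)  # ceiling division
    return warps * warp_size


def effective_threads_per_block(hashes_per_block, threads_per_hash):
    """Threads per block once each hash is spread over several threads."""
    return hashes_per_block * threads_per_hash


threads, blocks = parse_launch_config("8x60")
print(threads_issued(50))                       # block of 50 is padded to 2 warps
print(effective_threads_per_block(threads, 8))  # 8 hashes x 8 threads per hash
```

With a block size of 50 the padded size is 64, which is the waste described in the quote, and 8x60 with 8 threads per hash gives the 64 threads per block mentioned for the two reworked loops.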
|
|
|
|
equipoise
|
|
June 30, 2014, 07:34:04 AM |
|
^Any windows pre-compiled binaries for the new version I could try?
|
|
|
|
|
equipoise
|
|
June 30, 2014, 11:24:53 AM |
|
Cool, thank you. I checked a few days ago for a release folder, but it seems I missed the '4 releases' button above.
|
|
|
|
Quicken
|
|
June 30, 2014, 02:21:51 PM |
|
I'm trying to use the pre-compiled ccminer-cryptonight_20140630_r2 ccminer on Windows 8.1 with a GTX750ti and seem to be having some problems getting results. I am pointing the miner at minexmr as indicated on their website, http://minexmr.com/, with a batch file as follows:
C:\monero\ccminer-cryptonight_20140630_r2\ccminer.exe -t 1 -d gtx750ti -o stratum+tcp://pool.minexmr.com:7777 -u <address> -p x
At launch, I get a series of results like:
GPU #0: GeForce GTX 750 Ti, using 40 blocks of 8 threads
Pool set diff to 15000
GPU #0: GeForce GTX 750 Ti, 93.81 H/s
Then a popup says the display driver stopped responding and has recovered. After that I see results with crazy high numbers of hashes like this:
GPU #0: GeForce GTX 750 Ti, 163611988.12 H/s
interspersed with 'stratum detected new block', but no accepted results within a half hour check period. I also tried downloading the previous release, but switching to that one makes the cmd.exe pop up and vanish immediately on my system (Windows 8.1, Driver 337.88). The GTX750ti is not attached to a display output. Any help appreciated. Not sure what's going wrong.
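A quick way to spot the post-reset state described above is to scan the miner output for hash rates that cannot be real. The Python sketch below is an assumption-laden illustration: the regular expression is written against the log lines quoted in this post only, and the 10,000 H/s ceiling is an arbitrary threshold chosen for the example, not a value from ccminer.

```python
import re

# Matches lines such as "GPU #0: GeForce GTX 750 Ti, 93.81 H/s".
HASHRATE_LINE = re.compile(r"GPU #(\d+): (.+?), ([\d.]+) H/s")


def parse_hashrate(line):
    """Return (gpu_index, gpu_name, rate) or None if the line has no rate."""
    match = HASHRATE_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), float(match.group(3))


def is_plausible(rate, ceiling=10_000.0):
    """Flag rates above an assumed ceiling as bogus (driver likely reset)."""
    return 0 < rate <= ceiling


log = [
    "GPU #0: GeForce GTX 750 Ti, using 40 blocks of 8 threads",
    "GPU #0: GeForce GTX 750 Ti, 93.81 H/s",
    "GPU #0: GeForce GTX 750 Ti, 163611988.12 H/s",
]
for line in log:
    parsed = parse_hashrate(line)
    if parsed is not None and not is_plausible(parsed[2]):
        print("implausible hash rate, restart the miner:", line)
```

Only the last of the three sample lines is flagged; the launch-config line carries no rate and is skipped.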
|
|
|
|
yellowduck2
|
|
June 30, 2014, 03:20:18 PM |
|
For Windows 8.1 with a 750 Ti, 8x60 crashes the driver. A lower setting of 6x40 works, but it affects performance. Any way to fix that?
|
|
|
|
tsiv
|
|
June 30, 2014, 03:30:36 PM |
|
Pretty sure it's still a TDR issue. The biggest part of the cryptonight core still gets run as a single launch, and it just might take those 2 seconds, so Windows with the default TDR delay considers the GPU stuck and does a driver reset. See https://bitcointalk.org/index.php?topic=656841.msg7529269#msg7529269 for a workaround. I plan on looking at splitting the work down; at a quick glance it looks like it could be run piece by piece. That will probably hurt performance a bit, since the encryption keys have to be saved and reloaded on every kernel launch and the launches themselves have some overhead. My thought was to make it a cmd line option, allowing the user to decide how much (or if) they want to split it up. Maybe add a few microseconds of sleep between the launches, to stop the display freezing for 1+ seconds at a time and make the computer at least semi-usable.
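The splitting idea can be illustrated on the host side with a short Python sketch. Everything here is hypothetical: `launch` stands in for whatever would start one kernel over a range of iterations, and the function names and the optional pause are assumptions made for the example, not ccminer's actual design.

```python
import time


def split_ranges(total_iterations, pieces):
    """Yield (start, end) ranges that cover total_iterations in `pieces` chunks."""
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    base, extra = divmod(total_iterations, pieces)
    start = 0
    for i in range(pieces):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        yield start, start + size
        start += size


def run_in_pieces(total_iterations, pieces, launch, pause_s=0.0):
    """Call launch(start, end) once per chunk, optionally sleeping in between."""
    for start, end in split_ranges(total_iterations, pieces):
        launch(start, end)  # each call should finish well inside the TDR window
        if pause_s > 0:
            time.sleep(pause_s)  # gives the display a chance to update


# Example: record which ranges would be launched for 10 iterations in 3 pieces.
launched = []
run_in_pieces(10, 3, lambda s, e: launched.append((s, e)))
print(launched)
```

The trade-off is the one noted above: more pieces mean shorter launches, but each launch adds overhead and has to reload its state.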
|
|
|
|
Quicken
|
|
June 30, 2014, 04:06:06 PM |
|
I tried the TDR delay reg edit on my rig, and it now seems to be running stably at 45-50 H/s (40*8), 750 Ti, diff 15000. I just got my first accepted share after about 10 mins, so it seems to be working, thanks. Is the hash rate a bit low though?
|
|
|
|
matrix961
|
|
June 30, 2014, 05:30:18 PM |
|
What do you have in your command line for the "-l" parameter? If you don't specify the parameter it should be 8x40 by default. The 750 Ti should be hashing in the 200s, I believe.
|
|
|
|
aminorex
|
|
June 30, 2014, 06:28:01 PM Last edit: June 30, 2014, 06:51:09 PM by aminorex |
|
Is there a pre-existing XMR bounty wallet for the ATI miner?
If not, perhaps HardwarePal could make one, publish the view key? I don't want to hold it, just send to it.
|
Give a man a fish and he eats for a day. Give a man a Poisson distribution and he eats at random times independent of one another, at a constant known rate.
|
|
|
kbm
|
|
June 30, 2014, 07:33:32 PM |
|
I think smooth was collecting, but if HardwarePal is hiring someone directly then that would probably be okay, so long as the view key is published. Maybe we should check with smooth to see if he's collected anything? It doesn't seem like anyone's working on this bounty besides HardwarePal, so my part of the ATI bounty still stands to be claimed by what's being worked on (150 XMR - Keyboard-Mash), and it looks like Tsiv will be claiming the Nvidia miner bounty. Tsiv, can you please provide an XMR address and viewkey here?
|
Thanks
|
|
|
TooDumbForBitcoin
|
|
June 30, 2014, 08:02:45 PM |
|
I'm confused by the title of this thread.
Is the bounty for the miner, or is the bounty for providing a bounty for the miner?
|
|
|
|
HardwarePal
|
|
June 30, 2014, 09:40:41 PM Last edit: June 30, 2014, 10:31:50 PM by HardwarePal |
|
Due to all the talk about Claymore's closed-source 5% GPU miner, I have paid Wolf to release his OpenCL code for the GPU miner on GitHub, so that some of the open-source community can contribute, and Wolf himself as well. The initial idea was to pay him 10BTC to do the project and release a working miner. I have changed plans due to Wolf being tired, plus pool owners wanting to cut Claymore's 5%, which could cause other bigger problems. I paid him a total of 3BTC to release the code. He will be updating on the main Monero thread in a few hours. Link from Wolf0: https://bitcointalk.org/index.php?topic=671784.0
|
|
|
|
HardwarePal
|
|
June 30, 2014, 10:05:07 PM |
|
Tsiv donations and updates
Sent my promised 0.2BTC now (0.20133330 to be exact).
|
|
|
|
equipoise
|
|
July 01, 2014, 08:01:51 AM |
|
It's making Windows think the driver crashed, because the GPU is taking too long without returning a result (the Windows default is 2 seconds). That's why Windows restarts the driver, after which ccminer won't work and shows you an impossible hash rate. I made the timeout 40 seconds on my machine (tried 10 first; it worked for some time, but then it crashed again). This got my laptop's second Nvidia card happily mining at about 22 H/s. Don't do this if you don't have a second video card, because your PC will become completely unusable while mining (if you are CPU mining on the same machine, start the CPU miner before the GPU miner, because otherwise it'll be difficult for you to even start the CPU miner). Here is a link to a .reg file which will set the timeout to 40 seconds: just double-click it and it'll add the setting to the registry (it'll ask you if you are sure). Then restart Windows, and ccminer should work after the restart. If you find it useful, don't forget to tip me.
https://www.dropbox.com/s/ci8b3h7oxtvd6dq/TdrDelaySetTo40.reg
There is something more to be done about the -l MxN. About the first number M: "First of all, your thread block size should always be a multiple of 32, because kernels issue instructions in warps (32 threads). For example, if you have a block size of 50 threads, the GPU will still issue commands to 64 threads and you'd just be wasting them." About the second number N: You could find it by gradually increasing it until your card stops working (showing an impossible hash rate such as 3474958.52 H/s). A restart is then needed for maximum performance (but not for testing), because without a restart my hash rate drops 2x compared to the same options before the crash. The "magical numbers" for the 650M seem to be -l 128x5
XMR tip: 4AyRmUcxzefB5quumzK3HNE4zmCiGc8vhG6fE1oJpGVyVZF7fvDgSpt3MzgLfQ6Q1719xQhmfkM9Z2u NXgDMqYhjJVmc6KX
Thank you!
|
|
|
|
superresistant
|
|
July 01, 2014, 08:04:23 AM |
|
Oh, it's you Africanos, lol? Let's do this miner!
|
|
|
|
Quicken
|
|
July 01, 2014, 09:13:31 AM |
|
No "-l" on the command line. Default 40x8. Tried 80x8 and 40x16, but they were worse. Surely Windows can't be nerfing the GPU so badly?
|
|
|
|
hero18688
|
|
July 01, 2014, 10:35:02 AM |
|
I can't run tsiv's ccminer under Win 8.1. The driver crashes every time I start ccminer. Any solution?
|
|
|
|
superresistant
|
|
July 01, 2014, 10:36:48 AM |
|
The NVIDIA drivers will "time out" and give you a crash message. This is Windows detecting that the driver hasn't responded for a while, so it resets it, and of course your mining quits as well. You can usually get around this via a registry edit, but in some cases even that may not work all the time, so be prepared to fiddle around a bit. The registry hack is easy enough: Run "regedit.exe" from the Start Menu. Navigate to "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers". Right-click on the right panel and create a new 32-bit DWORD value. Name the value "TdrDelay" and assign it anywhere from 10 to 30 (decimal; 0A to 1E hex). Reboot and you should be set.
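For anyone who would rather import the setting than click through regedit, the same change can be written as a .reg file. The Python sketch below only builds the text of such a file; the helper name `tdr_delay_reg` is made up for this example, the key path and value name are the ones given in the steps above, and the file still has to be imported by hand and followed by a reboot.

```python
KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_reg(seconds):
    """Return .reg file text that sets TdrDelay to `seconds` (a 32-bit DWORD)."""
    if not 1 <= seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must fit in a positive 32-bit DWORD")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{KEY_PATH}]\n"
        f'"TdrDelay"=dword:{seconds:08x}\n'  # .reg files store DWORDs in hex
    )


print(tdr_delay_reg(10))  # 10 decimal is written as 0000000a
print(tdr_delay_reg(30))  # 30 decimal is written as 0000001e
```

Save the output as a file ending in .reg and double-click it to import; as mentioned earlier in the thread, a long timeout can make a single-GPU machine unresponsive while mining.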
|
|
|
|
|