Okay, I've thought about it some more and I think the best way forward is to do away with the --lookup-gap and --thread-concurrency options, have the user input the shader count of their device, and let cgminer choose. This will always set lookup gap to 2 and set the thread concurrency to the highest value that is a multiple of the shader count, which would be much easier for users to configure. That then leaves you with just the choice of GPU threads and intensity. Otherwise there is far too much to configure, and most people won't be able to figure out how to do it. I plan to include a readme with the shader counts of the various hardware, like in that earlier post. What do people think of this?
Eg for a 6850 and 5770: cgminer --shaders 960,800
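A minimal sketch of what the proposed --shaders behaviour might look like (my own mockup, not actual cgminer code): lookup gap is fixed at 2, and thread concurrency becomes the highest multiple of the shader count at or below the device's maximum. The device maximum of 8192 is an assumption, taken from the value cgminer auto-selected later in this thread.

```python
# Hypothetical sketch of the proposed --shaders option (not cgminer code).
# Lookup gap is always 2; thread concurrency is the highest multiple of
# the shader count that fits under an assumed per-device maximum.

def settings_from_shaders(arg, device_max=8192):
    """arg is the comma-separated value, e.g. "960,800" for a 6850 + 5770."""
    settings = []
    for shaders in (int(s) for s in arg.split(",")):
        tc = device_max - device_max % shaders  # round down to a multiple
        settings.append({"lookup_gap": 2, "thread_concurrency": tc})
    return settings

# "cgminer --shaders 960,800" would then pick tc 7680 for the 6850 and
# 8000 for the 5770, assuming both devices max out at 8192.
```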
|
|
|
There might not be enough info in opencl to do this from cgminer. What does cgminer.exe -n report on these cards?
[2012-07-23 17:43:40] CL Platform 0 vendor: Advanced Micro Devices, Inc.
[2012-07-23 17:43:40] CL Platform 0 name: AMD Accelerated Parallel Processing
[2012-07-23 17:43:40] CL Platform 0 version: OpenCL 1.2 AMD-APP (938.1)
[2012-07-23 17:43:40] Platform 0 devices: 2
[2012-07-23 17:43:40]  0  Barts
[2012-07-23 17:43:40]  1  Juniper
[2012-07-23 17:43:40] GPU 0 AMD Radeon HD 6800 Series hardware monitoring enabled
[2012-07-23 17:43:40] GPU 1 ATI Radeon HD 5700 Series hardware monitoring enabled
[2012-07-23 17:43:40] 2 GPU devices max detected
Why? What do you mean? I'm mining right now. I mean so that cgminer could detect the best settings automatically. See, it says 6800, not which 68x0, and the same with 5700, not which 57x0. It matters.
|
|
|
My new highs: 6850: 262kh/s cgminer --scrypt --worksize 128 --gpu-engine 900 --gpu-memclock 500 --lookup-gap 2 --thread-concurrency 7680 -g 1 --intensity 17
5770: 207kh/s cgminer --scrypt --worksize 256 --gpu-engine 885 --gpu-memclock 1235 --lookup-gap 2 --thread-concurrency 8000 -g 1 --intensity 18
|
|
|
Enable Windows Error Reporting (WER). If the 0xC0000005 exception is in cgminer, re-compile it and create a map file for it. Post the WER report and your map file. The WER report should tell you if the exception is in one of the DLLs or the cgminer exe. The offset should tell you where it crashed, unless the stack is blown...
Here's the windows error log.

Log Name: Application
Source: Application Error
Date: 7/23/2012 11:43:21 AM
Event ID: 1000
Task Category: (100)
Level: Error
Keywords: Classic
User: N/A
Description:
Faulting application name: cgminer.exe, version: 0.0.0.0, time stamp: 0x4ff6bd65
Faulting module name: libpdcurses.dll, version: 0.0.0.0, time stamp: 0x4f460f95
Exception code: 0xc0000005
Fault offset: 0x00009ccc
Faulting process id: 0xb34
Faulting application start time: 0x01cd68e9dec30d48
Faulting application path: C:\Users\Steve\Desktop\cgminer\cgminer.exe
Faulting module path: C:\Users\Steve\Desktop\cgminer\libpdcurses.dll
Report Id: 210d0efe-d4dd-11e1-84bd-bcaec5308e15
cgminer is unable to display everything it wants to display. libpdcurses.dll is catching an exception, so I'd run cgminer in a quiet mode. People with many FPGAs and a small window size were reporting a similar error in the past. I don't remember if it was the same offset; search this forum.

Quiet mode didn't work. Real-quiet mode did, thanks for the help. Any idea why this is happening? I'd like to be able to see the standard cgminer output. I've tried increasing the prompt defaults gradually to ridiculous sizes (extending off the display at 1920x1080 resolution) with no success. Maybe I'm still misunderstanding the cause of the crash.

Darn, but yes, the display library pdcurses is the issue. You can try the old-fashioned -T mode, which is just plain text. At least you'll have some output.
|
|
|
Am I misunderstanding? I thought it was supposed to select the best values. It selected 8192 for both of them.
Yeah, I'll stick with my settings that are supposed to produce invalids but aren't. My 6850 runs at 6144 and I can take it up to 15+ in intensity. My 5770 runs at 3584 and I push intensity 18 on it without any invalids.
No, it won't pick the best. Read again: it will pick the largest. Then find the multiple of your shaders less than or equal to that, and YOU set that manually. That's why I put that list of shader counts below... The 6850 is 960 and the 5770 is 800, therefore you should try a tc of 7680 on the 6850 and 8000 on the 5770.

OH. I read it wrong, thank you for clarifying. And those actually pull better numbers. Now I can give my friend (and anyone else) better numbers to try too. Awesome.

Great. I guess I can put a database into the software to look up the best values after querying the device... maybe soon.
|
|
|
Ok, I've been having a somewhat weird problem with cgminer for a while now. I've tried searching but I couldn't find any relevant info. I'm running Windows 7 x64. When I start cgminer it crashes immediately. There's no error message shown, even with the -D command line option; I get a "cgminer has stopped working" dialog from Windows. It doesn't seem to be related to my devices: it happens if I use one or two of my 7970s, or no GPUs and just the two Cairnsmore1s I have connected. I have tried multiple versions of cgminer (2.5.0, 2.3.4 (Enterpoint modified)), and the issue occurs on both of them. I've downloaded a fresh cgminer and the issue occurs with it as well. I have installed multiple AMD driver versions (12.3, 12.4, 12.6, 12.7) and the issue occurs with all of them. To my knowledge nothing has changed since cgminer was working properly.
I can get cgminer to run by restarting windows (I have cgminer included in the startup folder to run automatically) and not touching anything. It'll run properly until I try to move the cgminer window or open a browser or media app. Then it crashes in exactly the same way, and crashes with each subsequent restart.
Now the strangest part is that this happened to me about a month ago out of the blue like now. I tried everything I could think of to fix it, and got frustrated and walked away for a while. I came back (maybe 12 hours later) and ran cgminer and it worked flawlessly up until last night.
I used the debugging option in Visual Studio and was able to get this; no clue if it'll be of help at all:
Unhandled exception at 0x62209ccc in cgminer.exe: 0xC0000005: Access violation reading location 0x6223efac.
If anyone could help I'd appreciate it, I can't think of anything else I can try to get this working again (other than just waiting it out, for whatever reason that worked before). Thanks guys.
Tried increasing the size of your dos prompt window before starting it?
|
|
|
So I implore you to check the share rate generation and pretty much ignore the reported hashrate when comparing notes. Remember that cgminer AND raper use virtually identical kernels so should hash at virtually identical rates.

Finally, the word is out. Now I don't need to post that long post I was telling about on BTC-e.com chat. Thanks developer; hopefully Graet will understand that he just didn't tweak his Reaper enough...

Back on-topic, thanks for the development. Although my results are lower with cgminer than with Reaper, I like the program. The reason it is running slower: I have two 6950s, one with 1536 shaders and one with 1408 shaders. I set Reaper to use 4*1536 = 6144 concurrency, and when I do this with cgminer and add some intensity (say 18, just like Reaper), the 1536-shader card will produce a good hashrate (the same as Reaper), but the 1408-shader card will throttle itself somewhere between 80 and 100% load. It's worse with a higher intensity. Is it possible to separate the concurrency per GPU? Can I simply enter 6144,5632? Maybe that will work around the problem. Reaper shows 100% load on both cards with a combined hash rate of 873.7KH/s and 1.2% stales on a slight overclock (30+MHz core on the 1536-shader card and 50+MHz on the 1408-shader card). This weird load thing was my only concern. P.s. it's not the temperature... it would throttle all the way back to 0% if it were overheating.

Note that you WILL get better hashrates from using cgminer because of massively improved behaviour with respect to getting work, submitting shares and managing longpolls. So no, I don't believe the results are equivalent overall. Check the help...

--lookup-gap <arg>          Set GPU lookup gap for scrypt mining, comma separated
--thread-concurrency <arg>  Set GPU thread concurrency for scrypt mining, comma separated
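Based on the comma-separated option format quoted from the help above, the two-6950 setup described in the post could be tried like this (a sketch only; the pool arguments and intensity are placeholders, and the concurrency values 6144,5632 are the ones the poster proposed):

```
cgminer --scrypt -o <pool-url> -u <user> -p <pass> \
    --thread-concurrency 6144,5632 \
    --lookup-gap 2,2 \
    --intensity 18,18
```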
|
|
|
Okay, so I've had an extensive discussion with mtrlt about the code, I've done a lot of debugging, and I've learnt a lot.

First of all, the values you can safely plug into linux are NOT compatible with what you can plug into windows. There are different restrictions on the allocatable memory dependent on the driver/OS combination. Therefore you cannot compare results from the two.

Second, there IS a MEANINGFUL upper limit to aggression, or in this case, intensity. It is where the power of 2 is greater than the concurrent threads. Eg concurrent threads of 8192 has an upper limit of 13 intensity because 2^13 is 8192. You CAN go over this value, but you are absolutely guaranteed to start producing invalid results. How many invalid results you get for the potential rise in hashrate is highly hardware dependent.

The previous release code did no boundary checking or any testing of the device. I have now updated the git tree to test just how much memory it can allocate, and it will now AUTOMATICALLY TUNE to the maximum values that are likely to work. I suggest you start it in debug mode with -D to see what it reports as the concurrent threads, and then find the value that is the largest multiple of the number of shaders in the device. Eg a 6950 has 134217728 max memory, which works out to concurrency 2048, but it only has 1408 shaders, so setting concurrent_threads to 1408 will likely make it faster.

Changing lookup_gap has 2 effects. The larger it is, the higher you can go with thread_concurrency. However, speed is also dependent on architecture design, and virtually all GPUs are fastest at a gap of 2. If you choose a custom gap without choosing a thread concurrency, cgminer will choose the concurrency for you. If you don't choose a gap, it will select 2 for you.

About GPU threads: you should run as many as you can start without cgminer crashing or failing. They do NOT correlate with shaders, compute units, ram or anything else as any meaningful multiple or anything like that.
Now finally, and you can believe me or not on this, but raper sends work to the GPU WITHOUT CHECKING if it was accepted, gets the return buffer WITHOUT CHECKING if it actually did any work, and then adds the number of hashes it would have expected the GPU to do with that work sent to it. This means that when you start with lots of threads, some of them may not even be doing anything. Or if you've set some borderline invalid values, it will appear to be working fine and report back a big hashrate, but generate fewer valid shares. So I implore you to check the share rate generation and pretty much ignore the reported hashrate when comparing notes. Remember that cgminer AND raper use virtually identical kernels so should hash at virtually identical rates.

Summary: start cgminer without setting worksize, vectors, lookup gap or thread concurrency, but in debug mode with -D -T (I made this example up, not sure what it really is):

[2012-07-23 21:07:18] GPU 0: selecting lookup gap of 2
[2012-07-23 21:07:18] GPU 0: selecting thread concurrency of 2048

Then if you're on a 5770, you can google that it has 800 processing elements, so pick the highest multiple of that while staying under the thread concurrency, so 1600. The nearest power of 2 is 2048, so an intensity of 11. Give that a go and let's see what happens. I expect different results on windows and linux.

Use this table as a guide for what multiples to make concurrent threads.

GPU    Processing Elements
7750   512
7770   640
7850   1024
7870   1280
7950   1792
7970   2048
6850   960
6870   1120
6950   1408
6970   1536
6990   (6970x2)
6570   480
6670   480
6790   800
6450   160
5670   400
5750   720
5770   800
5830   1120
5850   1440
5870   1600
5970   (5870x2)
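The tuning recipe above can be sketched as a few lines of arithmetic (my own illustration, not cgminer's actual code). The only assumption beyond the post is that an scrypt (N=1024, r=1) thread needs a 128KB scratchpad, which the lookup gap divides; that reproduces the 6950 numbers given (134217728 bytes at gap 2 gives concurrency 2048) and the 2^13 = 8192 intensity limit.

```python
# Sketch of the tuning recipe described above (illustration only, not
# cgminer source).  Assumption: scrypt needs a 128KB scratchpad per
# concurrent thread, reduced by the lookup gap.

SCRATCHPAD_BYTES = 128 * 1024  # per thread at lookup gap 1

def auto_concurrency(max_alloc_bytes, lookup_gap=2):
    """Largest thread concurrency that fits in the allocatable buffer."""
    return max_alloc_bytes // (SCRATCHPAD_BYTES // lookup_gap)

def shader_multiple(concurrency, shaders):
    """Round down to the largest multiple of the shader count."""
    return concurrency - concurrency % shaders

def max_safe_intensity(concurrency):
    """Largest I with 2**I <= concurrency; going past it forces invalids."""
    i = 0
    while 2 ** (i + 1) <= concurrency:
        i += 1
    return i

# The 6950 example from the post: 134217728 bytes max memory, gap 2.
tc = auto_concurrency(134217728)      # 2048
tuned = shader_multiple(tc, 1408)     # 1408, since the 6950 has 1408 shaders
limit = max_safe_intensity(8192)      # 13, because 2^13 == 8192
```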
|
|
|
...and the signal to noise ratio drops to the usual levels
|
|
|
Hello, I am trying to run cgminer with 1 BFL single. I compiled cgminer with --enable-bitforce and libudev support.

alan@alan-Vostro-1500:~/bc/cgminer$ lsusb
...
Bus 002 Device 002: ID 0403:6014 Future Technology Devices International, Ltd FT232H Single HS USB-UART/FIFO IC
...
alan@alan-Vostro-1500:~/bc/cgminer$ ls /dev/serial/by-id/
usb-Butterfly_Labs_Inc._BitFORCE_SHA256-if00-port0
alan@alan-Vostro-1500:~/bc/cgminer$ ls /dev/ttyUSB*
/dev/ttyUSB0
alan@alan-Vostro-1500:~/bc/cgminer$ ./cgminer -S 0
All devices disabled, cannot mine!
alan@alan-Vostro-1500:~/bc/cgminer$ ./cgminer -S /dev/ttyUSB0
All devices disabled, cannot mine!
alan@alan-Vostro-1500:~/bc/cgminer$
Any ideas?

As per the README, have you tried: sudo modprobe ftdi_sio vendor=0x0403 product=0x6014
|
|
|
I've noticed in Reaper that if you tell it to run 5 threads, it says it's starting 10! Don't know if that has anything to do with it?
There's definitely not enough resources at 10240 gap 2 to run 5 actual threads on a 7970, and since the values don't line up with raper, who knows what it's doing. So long as we get good performance... I've lost interest in looking at the raper code itself since I got the kernel working well.
|
|
|
Yeah, I'm not really sure why I can use 10240 on raper and not on this, since they're structurally virtually the same kernels.
|
|
|
I don't get an increase in rejects with 4 threads, and I let it choose the default worksize for the 7970, which is 64, not 256.
|
|
|
Yes, this definitely needs SDK 2.6+.
EDIT: and make sure you delete any .bins already generated with older SDKs if you upgrade.
|
|
|
Found a sweet spot with my 7970s: memory 1375, engine 1135. Increasing the engine beyond that slows it down. There is definitely a relationship between engine clock, memory clock, scrypt settings and even motherboard speed. Assuming higher is better is not necessarily going to be true.
|
|
|
I found some REALLY interesting results with my 6850 and my 5770!
6850: https://i.imgur.com/fZgEk.png
5770: https://i.imgur.com/ti0Cr.png
So, the memory clock on the 6850 had no effect on my hashrate (at least at those high settings)! I set it down to 500 and the hashrate didn't change. Now I wasn't paying attention to the share count, but it seemed to be sending as much as it should. Overclocking the memory of a 6850 reduces hashrate. Likes worksize 128. The 5770 isn't like that: higher memory doesn't change hashrate, but a lower memory clock lowers it. Two threads pulled an extra few kh/s. A concurrency somewhere in the 3000s is what you're looking for on the 5770. Likes worksize 256. I'll do more testing later, but I'm happy with my results. This is amazing, great work ckolivas. And it seems stable already.

Thanks. I'm pretty sure we'll need to create a database of suitable values; every card seems to want something completely different. My 7970s really need all 1375MHz of memory or the hashrate drops off. Ironically they seem to do the opposite with the engine clock: there's a ceiling to how high the hashrate gets, and then turning the engine up further doesn't speed things up any more.
|
|
|
ckolivas, could you also set a user-definable parameter to adjust the difficulty needed for submitting a share?
Instead of reading the difficulty off the pool? I guess so... but not right now.

Oh, sorry, I didn't realize that each pool sends its difficulty requirement to the miner. It's just that when I was mining with P2Pool (Bitcoin), it would send difficulty-1 shares to my local P2Pool node even though the P2Pool share difficulty is much higher.

Yes, cgminer supports higher-difficulty shares, which is why I've been trying to get BTC pools to start supporting it with all this faster hardware coming around. Ironically, with LTC being much easier to mine difficulty-1 shares on, the LTC pools needed to support higher-difficulty shares first. P2Pool, however, doesn't actually ask cgminer for higher-difficulty shares; it asks for difficulty-1 shares and then internally decides if each is a "true share" based on the target difficulty it meets. It works either way, but cgminer uses less CPU, so it makes much more sense to let the mining software do the testing. In the scrypt version of cgminer, it actually tests the target difficulty on the GPU for anything less than 4,294,967,295, whereas difficulty-1 shares on litecoin are only 65,535.
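The share-target idea behind those two figures can be illustrated in a few lines (a simplified sketch of my own, not cgminer's code): treat the target as a bound on the top 32 bits of the hash, with 65,535 as the litecoin difficulty-1 target and 2^32 - 1 as the ceiling below which the GPU-side test applies. The `target_for_difficulty` helper is a hypothetical name for illustration.

```python
# Simplified illustration of share targets (not cgminer source): a share
# meets a difficulty when the top 32 bits of its hash are at or below
# the target.  Litecoin difficulty-1 target is 65,535 = 2**16 - 1; the
# GPU-side test covers any target below 4,294,967,295 = 2**32 - 1.

LTC_DIFF1_TARGET = 65_535
GPU_TEST_CEILING = 4_294_967_295

def meets_target(hash_top32, target=LTC_DIFF1_TARGET):
    """True if the hash's top 32 bits clear the share target."""
    return hash_top32 <= target

def target_for_difficulty(difficulty):
    """Hypothetical helper: higher difficulty just shrinks the target."""
    return LTC_DIFF1_TARGET // difficulty
```

For example, a pool asking for difficulty-16 shares would accept only hashes whose top 32 bits are at most 65535 // 16 = 4095, i.e. one in sixteen of the difficulty-1 shares.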
|
|
|
With the new version, I have set -I to 16 and I get 210 with almost 0 stales; Reaper could get 210 MAX with 10% stales. This is soo goood!
Yup, the increase in intensity certainly helped. Now at 338 at 16 and 340 at 17. At 18, kh/s jumps around a lot.

Time is averaged over 5 seconds, and at those high intensities the threads potentially won't report in for over 5 seconds, so you'll get serious aliasing artefacts.
|
|
|
|