May 01, 2017, 08:55:11 AM (Last edit: May 01, 2017, 09:12:00 AM by kobazik)
Hi,
I'm running Ubuntu 16.04 with Claymore 9.3 and the amdgpu-pro 17.10 drivers. The system has six XFX RX 480 8GB cards with a low-power BIOS (about 29 MH/s each). Five of the six cards are mining fine, but GPU0 is only doing about 4 MH/s.
ETH: 05/01/17-10:47:41 - New job from eu1.ethermine.org:4444
ETH - Total Speed: 147.562 Mh/s, Total Shares: 18, Rejected: 0, Time: 00:05
ETH: GPU0 4.168 Mh/s, GPU1 28.694 Mh/s, GPU2 28.676 Mh/s, GPU3 28.688 Mh/s, GPU4 28.681 Mh/s, GPU5 28.656 Mh/s
GPU0 t=43C fan=56%, GPU1 t=60C fan=56%, GPU2 t=56C fan=56%, GPU3 t=57C fan=56%, GPU4 t=59C fan=56%, GPU5 t=54C fan=56%
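For anyone who wants to spot this kind of outlier automatically, here is a minimal sketch that parses the per-GPU hashrate line from the log above and flags cards running well below the rest. The function name `slow_gpus` and the 50% threshold are my own assumptions, not part of Claymore, and the regex only targets the `GPUn x.xxx Mh/s` format shown here.

```python
import re
from statistics import median

# Matches entries such as "GPU0 4.168 Mh/s" in the miner's stats line.
# It does not match "Total Speed: ..." or the "GPU0 t=43C" temperature line.
RATE_RE = re.compile(r"GPU(\d+)\s+([\d.]+)\s+Mh/s")

def slow_gpus(stats_line, threshold=0.5):
    """Return {gpu_index: rate} for GPUs below threshold * median rate.

    Hypothetical helper: the threshold of 0.5 is an arbitrary choice.
    """
    rates = {int(idx): float(rate) for idx, rate in RATE_RE.findall(stats_line)}
    if not rates:
        return {}
    typical = median(rates.values())
    return {idx: rate for idx, rate in rates.items() if rate < threshold * typical}

line = ("ETH: GPU0 4.168 Mh/s, GPU1 28.694 Mh/s, GPU2 28.676 Mh/s, "
        "GPU3 28.688 Mh/s, GPU4 28.681 Mh/s, GPU5 28.656 Mh/s")
print(slow_gpus(line))  # {0: 4.168}
```

The median is used rather than the mean so that one very slow card does not drag down the reference value.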
I previously ran the same hardware under Windows, and all six cards were mining at ~29 MH/s.
# lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
=================== ROCm System Management Interface ===================
============================================================================
GPU[0] : GPU ID: 0x0402
GPU[1] : GPU ID: 0x67df
GPU[2] : GPU ID: 0x67df
GPU[3] : GPU ID: 0x67df
GPU[4] : GPU ID: 0x67df
GPU[5] : GPU ID: 0x67df
GPU[6] : GPU ID: 0x67df
============================================================================
=================== End of ROCm SMI Log ===================
For some reason rocm-smi cannot detect the clocks on the problematic card (GPU[1] in rocm-smi, since GPU[0] there appears to be the Intel integrated GPU):
# ./rocm-smi -d 1 -a
=================== ROCm System Management Interface ===================
============================================================================
GPU[1] : GPU ID: 0x67df
============================================================================
============================================================================
GPU[1] : Temperature: 44.0c
============================================================================
============================================================================
GPU[1] : Unable to determine current clocks. Check dmesg or GPU temperature
============================================================================
GPU[1] : Fan Level: 145 (56.86)%
============================================================================
============================================================================
GPU[1] : Current PowerPlay Level: auto
============================================================================
============================================================================
GPU[1] : Current OverDrive value: 0%
============================================================================
============================================================================
GPU[1] : Compute Power Profile not supported
============================================================================
============================================================================
GPU[1] : Supported GPU clock frequencies on GPU1
GPU[1] : 0: 300Mhz
GPU[1] : 1: 515Mhz
GPU[1] : 2: 775Mhz
GPU[1] : 3: 915Mhz
GPU[1] : 4: 940Mhz
GPU[1] : 5: 975Mhz
GPU[1] : 6: 1000Mhz
GPU[1] : 7: 1095Mhz
GPU[1] :
GPU[1] : Supported GPU Memory clock frequencies on GPU1
GPU[1] : 0: 300Mhz
GPU[1] : 1: 2080Mhz
GPU[1] :
============================================================================
=================== End of ROCm SMI Log ===================
The other cards, e.g. GPU2, are fine:
# ./rocm-smi -d 2 -a
=================== ROCm System Management Interface ===================
============================================================================
GPU[2] : GPU ID: 0x67df
============================================================================
============================================================================
GPU[2] : Temperature: 59.0c
============================================================================
============================================================================
GPU[2] : GPU Clock Level: 7 (1095Mhz)
GPU[2] : GPU Memory Clock Level: 1 (2080Mhz)
============================================================================
============================================================================
GPU[2] : Fan Level: 145 (56.86)%
============================================================================
============================================================================
GPU[2] : Current PowerPlay Level: auto
============================================================================
============================================================================
GPU[2] : Current OverDrive value: 0%
============================================================================
============================================================================
GPU[2] : Compute Power Profile not supported
============================================================================
============================================================================
GPU[2] : Supported GPU clock frequencies on GPU2
GPU[2] : 0: 300Mhz
GPU[2] : 1: 515Mhz
GPU[2] : 2: 775Mhz
GPU[2] : 3: 915Mhz
GPU[2] : 4: 940Mhz
GPU[2] : 5: 975Mhz
GPU[2] : 6: 1000Mhz
GPU[2] : 7: 1095Mhz *
GPU[2] :
GPU[2] : Supported GPU Memory clock frequencies on GPU2
GPU[2] : 0: 300Mhz
GPU[2] : 1: 2080Mhz *
GPU[2] :
============================================================================
=================== End of ROCm SMI Log ===================
I can also see amdgpu powerplay and voltage errors in dmesg:
[ 9.135564] amdgpu: [powerplay] amdgpu: powerplay sw initialized
[ 9.135998] amdgpu 0000:06:00.0: fence driver on ring 0 use gpu addr 0x0000000200000008, cpu addr 0xffff8802baefd008
[ 9.136556] amdgpu 0000:06:00.0: fence driver on ring 1 use gpu addr 0x0000000200000018, cpu addr 0xffff8802baefd018
[ 9.137110] amdgpu 0000:06:00.0: fence driver on ring 2 use gpu addr 0x0000000200000028, cpu addr 0xffff8802baefd028
[ 9.137597] amdgpu 0000:06:00.0: fence driver on ring 3 use gpu addr 0x0000000200000038, cpu addr 0xffff8802baefd038
[ 9.138125] amdgpu 0000:06:00.0: fence driver on ring 4 use gpu addr 0x0000000200000048, cpu addr 0xffff8802baefd048
[ 9.138558] amdgpu 0000:06:00.0: fence driver on ring 5 use gpu addr 0x0000000200000058, cpu addr 0xffff8802baefd058
[ 9.138979] amdgpu 0000:06:00.0: fence driver on ring 6 use gpu addr 0x0000000200000068, cpu addr 0xffff8802baefd068
[ 9.139446] amdgpu 0000:06:00.0: fence driver on ring 7 use gpu addr 0x0000000200000078, cpu addr 0xffff8802baefd078
[ 9.139960] amdgpu 0000:06:00.0: fence driver on ring 8 use gpu addr 0x0000000200000088, cpu addr 0xffff8802baefd088
[ 9.140408] amdgpu 0000:06:00.0: fence driver on ring 9 use gpu addr 0x0000000200000098, cpu addr 0xffff8802baefd098
[ 9.140826] amdgpu 0000:06:00.0: fence driver on ring 10 use gpu addr 0x00000002000000a8, cpu addr 0xffff8802baefd0a8
[ 9.141243] amdgpu 0000:06:00.0: fence driver on ring 11 use gpu addr 0x00000002000000b8, cpu addr 0xffff8802baefd0b8
[ 9.145606] amdgpu 0000:06:00.0: fence driver on ring 12 use gpu addr 0x0000000001165420, cpu addr 0xffffc9000c25a420
[ 9.146956] amdgpu 0000:06:00.0: fence driver on ring 13 use gpu addr 0x00000002000000d8, cpu addr 0xffff8802baefd0d8
[ 9.147444] amdgpu 0000:06:00.0: fence driver on ring 14 use gpu addr 0x00000002000000e8, cpu addr 0xffff8802baefd0e8
[ 9.187800] amdgpu: [powerplay] [AVFS] Something is broken. See log!
[ 9.190613] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.191538] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.192014] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.192491] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.192965] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.193427] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.193875] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.194307] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 9.219785] [drm] amdgpu: freesync_module init done ffff8802bae10c60.
[ 9.452036] amdgpu 0000:06:00.0: fb6: amdgpudrmfb frame buffer device
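The powerplay error lines above do not name a PCI device themselves, so it is not obvious which card they belong to. As a rough sketch, the script below counts them and attributes each one to the most recent `amdgpu 0000:xx:xx.x:` line seen before it. That attribution is only a heuristic I am assuming here (the kernel log does not state it), and `count_powerplay_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# PCI address in lines such as "amdgpu 0000:06:00.0: fence driver on ring 0 ..."
PCI_RE = re.compile(r"amdgpu (\d{4}:[0-9a-f]{2}:[0-9a-f]{2}\.\d):")
# The two powerplay error messages that appear in the dmesg output above.
ERR_RE = re.compile(
    r"\[powerplay\].*(Something is broken|Can't find requested voltage id)"
)

def count_powerplay_errors(lines):
    """Count powerplay errors per PCI device.

    ASSUMPTION: an error belongs to the device named in the closest
    preceding amdgpu line. This is a guess, not something dmesg guarantees.
    """
    last_dev = None
    counts = Counter()
    for line in lines:
        match = PCI_RE.search(line)
        if match:
            last_dev = match.group(1)
        elif ERR_RE.search(line):
            counts[last_dev or "unknown"] += 1
    return counts

sample = [
    "[ 9.147444] amdgpu 0000:06:00.0: fence driver on ring 14 use gpu addr 0x00000002000000e8",
    "[ 9.187800] amdgpu: [powerplay] [AVFS] Something is broken. See log!",
    "[ 9.190613] amdgpu: [powerplay] Can't find requested voltage id in vdd_dep_on_sclk table!",
    "[ 9.219785] [drm] amdgpu: freesync_module init done ffff8802bae10c60.",
]
print(dict(count_powerplay_errors(sample)))  # {'0000:06:00.0': 2}
```

To run it on a live system, something like `dmesg | python3 script.py` with `sys.stdin` passed in as `lines` should work, then compare the address against the `lspci` listing.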