Psolver (OP)
Newbie
Offline
Activity: 6
Merit: 0
|
|
May 08, 2021, 08:46:19 PM |
|
Hi all, I'm new both to mining and to this forum...
I stumbled upon an Antminer T17 the other day that was throwing an error, so the price was right and I could not resist buying it. I'm quite used to troubleshooting things in my daily work, but since I'm new and inexperienced with mining, I need some expert help to get by right now.

I will try to explain what I've done.

First startup (also see part of the kernel log below):
All 3 boards in the T17 connected as "standard":
- 1 card mines
- 2 card does not mine
Restarting with only card "1" connected to socket 0 on the control card:
- Card "1" mines
Restarting with only card "2" connected to socket 0 on the control card:
- Card "2" mines

The kernel log for "First startup" gives me this info:
2021-05-08 20:26:18 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.034316
2021-05-08 20:26:20 power_api.c:86:get_average_voltage: chain[1], voltage is: 16.513857
2021-05-08 20:26:23 power_api.c:86:get_average_voltage: chain[2], voltage is: 15.558662
2021-05-08 20:26:23 power_api.c:97:get_average_voltage: aveage voltage is: 16.368945
2021-05-08 20:26:23 power_api.c:182:set_iic_power_by_voltage: now set voltage to : 17.000000
2021-05-08 20:26:23 uart.c:80:set_baud: set fpga_baud = 115200, fpga_divider = 26
2021-05-08 20:26:33 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 0
2021-05-08 20:26:43 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 1
2021-05-08 20:26:52 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 2
2021-05-08 20:26:52 driver-btm-api.c:1069:check_asic_number: Chain 0 only find 0 asic, will power off hash board 0
2021-05-08 20:27:04 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 0
2021-05-08 20:27:14 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 1
2021-05-08 20:27:24 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 2
2021-05-08 20:27:24 driver-btm-api.c:1069:check_asic_number: Chain 1 only find 12 asic, will power off hash board 1
2021-05-08 20:27:36 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 0
2021-05-08 20:27:45 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 1
2021-05-08 20:27:55 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 2
2021-05-08 20:27:55 driver-btm-api.c:1069:check_asic_number: Chain 2 only find 0 asic, will power off hash board 2

So my question, based on that, is:
Could this be an ECC/PSU problem, since the voltage on one of the cards is 15.558662?
|
|
|
|
mikeywith
Legendary
Offline
Activity: 2380
Merit: 6579
be constructive or S.T.F.U
|
|
May 09, 2021, 05:52:29 AM |
|
All 3 boards in the T17 connected as "standard": - 1 card mines - 2 card does not mine
Restarting with only card "1" connected to socket 0 on the control card: - Card "1" mines
Restarting with only card "2" connected to socket 0 on the control card: - Card "2" mines
I find this part hard to understand. It could be your explanation, or I am just getting old, but my best guess is that every hashboard works fine as long as you run only 1 hashboard at a time! If that is correct, then your issue is most likely a bad PSU or low AC input voltage.
|
|
|
|
wndsnb
|
|
May 09, 2021, 01:06:08 PM |
|
From the log you posted, it looks like none of the boards are working when they are all connected. For a hashboard to work, all 30 ASICs need to be found.
2021-05-08 20:26:52 driver-btm-api.c:1069:check_asic_number: Chain 0 only find 0 asic, will power off hash board 0
2021-05-08 20:27:24 driver-btm-api.c:1069:check_asic_number: Chain 1 only find 12 asic, will power off hash board 1
2021-05-08 20:27:55 driver-btm-api.c:1069:check_asic_number: Chain 2 only find 0 asic, will power off hash board 2
The "get_average_voltage" messages show each hashboard's measurement of the PSU voltage. There is only one main supply voltage, so the three boards are measuring the same supply. It could be that the PSU is shutting down and the 3 measurements show the voltage dropping after the supply shut down; notice from the timestamps that there are a few seconds between each reading.
2021-05-08 20:26:18 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.034316
2021-05-08 20:26:20 power_api.c:86:get_average_voltage: chain[1], voltage is: 16.513857
2021-05-08 20:26:23 power_api.c:86:get_average_voltage: chain[2], voltage is: 15.558662
2021-05-08 20:26:23 power_api.c:97:get_average_voltage: aveage voltage is: 16.368945
2021-05-08 20:26:23 power_api.c:182:set_iic_power_by_voltage: now set voltage to : 17.000000
From the little info you've given, I'd guess the most probable problem is a bad PSU or input voltage, as mikeywith said. That said, I'm not sure how one board would work when the log you posted shows the PSU turning off before the boards start hashing. Before they start hashing, all 3 boards together use much less power than a single board uses while actually hashing.
We might be able to get a better idea of what the issue is if you post some more logs and screenshots of the status screen:
- Log of 1 board working when all 3 boards are connected
- Logs of each board working when connected individually
Also, what AC voltage are you powering these with? You should verify it with a voltmeter.
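To make the timing argument above concrete, here is a minimal Python sketch that pulls the per-chain readings out of the posted log and shows how far the voltage fell between the first reading and each later one. The regular expression and the function names are my own illustration and assume the log line format shown in this thread; they are not part of any Bitmain tool.

```python
import re
from datetime import datetime

# Log lines copied from the first post in this thread.
LOG = """\
2021-05-08 20:26:18 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.034316
2021-05-08 20:26:20 power_api.c:86:get_average_voltage: chain[1], voltage is: 16.513857
2021-05-08 20:26:23 power_api.c:86:get_average_voltage: chain[2], voltage is: 15.558662
2021-05-08 20:26:23 power_api.c:97:get_average_voltage: aveage voltage is: 16.368945
2021-05-08 20:26:23 power_api.c:182:set_iic_power_by_voltage: now set voltage to : 17.000000
"""

# Assumed line format: "<date> <time> <file:line:func>: chain[N], voltage is: V"
VOLTAGE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+get_average_voltage: "
    r"chain\[(\d+)\], voltage is: ([\d.]+)"
)


def voltage_readings(log_text):
    """Return (timestamp, chain, volts) for every per-chain voltage line."""
    readings = []
    for line in log_text.splitlines():
        match = VOLTAGE_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            readings.append((stamp, int(match.group(2)), float(match.group(3))))
    return readings


def droop(readings):
    """Seconds elapsed and volts lost relative to the first reading."""
    first_time, _, first_volts = readings[0]
    return [
        ((stamp - first_time).total_seconds(), chain, first_volts - volts)
        for stamp, chain, volts in readings
    ]


for seconds, chain, drop in droop(voltage_readings(LOG)):
    print(f"chain {chain}: +{seconds:.0f} s, {drop:.3f} V below the first reading")
```

With the three readings above, the supply is about 1.48 V lower five seconds after the first measurement, which fits a single rail that is sagging over time rather than three boards seeing different voltages.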
|
Have some dead Bitmain 17 series hashboards or full miners? I'll buy them ... send me a PM with what you have and I'll make you an offer!
|
|
|
Psolver (OP)
Newbie
Offline
Activity: 6
Merit: 0
|
|
May 13, 2021, 06:18:26 AM |
|
OK, so I decided to take the machine apart and clean it, and man, there was some stuff in there that probably shouldn't be... I guess they don't put mosquitoes and bees in there at the factory? Anyway, I blew off the loose parts with compressed air, cleaned them with electrical cleaning spray, blew them off again, put everything back together, and waited a day before starting it up.

Now, whenever I run the machine and troubleshoot it by moving the hashboards to different slots on the control board, moving the data cables to different positions, etc., I end up with the same results:
- Find 0 ASICs on chain 0
- Find 12 ASICs on chain 1
- Find 0 ASICs on chain 2

So for the boards on chain 0 and 2 I will try following this video, https://www.youtube.com/watch?v=5bdRJFGLuc0, with the help of some additional info in the video's comments.

Kernel log at the moment with all 3 boards connected (shortened because of the character limit):
Booting Linux on physical CPU 0x0 Linux version 4.6.0-xilinx-gff8137b-dirty (lzq@armdev2) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-23) ) #25 SMP PREEMPT Fri Nov 23 15:30:52 CST 2018 CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache Machine model: Xilinx Zynq cma: Reserved 16 MiB at 0x0e000000 Memory policy: Data cache writealloc On node 0 totalpages: 61440 free_area_init_node: node 0, pgdat c0b39280, node_mem_map cde10000 Normal zone: 480 pages used for memmap Normal zone: 0 pages reserved Normal zone: 61440 pages, LIFO batch:15 percpu: Embedded 12 pages/cpu @cddf1000 s19776 r8192 d21184 u49152 pcpu-alloc: s19776 r8192 d21184 u49152 alloc=12*4096 pcpu-alloc: [0] 0 [0] 1 Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 60960 Kernel command line: mem=240M console=ttyPS0,115200 ramdisk_size=33554432 root=/dev/ram rw earlyprintk PID hash table entries: 1024 (order: 0, 4096 bytes) Dentry cache hash table entries: 32768 (order: 5, 131072 bytes) Inode-cache hash table entries: 16384 (order: 4, 65536 bytes) Memory: 203752K/245760K available (6345K kernel code, 231K rwdata, 1896K rodata, 1024K init, 223K bss, 25624K reserved, 16384K cma-reserved, 0K highmem) Virtual kernel memory layout: vector : 0xffff0000 - 0xffff1000 ( 4 kB) fixmap : 0xffc00000 - 0xfff00000 (3072 kB) vmalloc : 0xcf800000 - 0xff800000 ( 768 MB) lowmem : 0xc0000000 - 0xcf000000 ( 240 MB) pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB) modules : 0xbf000000 - 0xbfe00000 ( 14 MB) .text : 0xc0008000 - 0xc090c424 (9234 kB) .init : 0xc0a00000 - 0xc0b00000 (1024 kB) .data : 0xc0b00000 - 0xc0b39fe0 ( 232 kB) .bss : 0xc0b39fe0 - 0xc0b71c28 ( 224 kB) Preemptible hierarchical RCU implementation. Build-time adjustment of leaf fanout to 32. RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2. 
RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=2 NR_IRQS:16 nr_irqs:16 16 efuse mapped to cf800000 ps7-slcr mapped to cf802000 L2C: platform modifies aux control register: 0x72360000 -> 0x72760000 L2C: DT/platform modifies aux control register: 0x72360000 -> 0x72760000 L2C-310 erratum 769419 enabled L2C-310 enabling early BRESP for Cortex-A9 L2C-310 full line of zeros enabled for Cortex-A9 L2C-310 ID prefetch enabled, offset 1 lines L2C-310 dynamic clock gating enabled, standby mode enabled L2C-310 cache controller enabled, 8 ways, 512 kB L2C-310: CACHE_ID 0x410000c8, AUX_CTRL 0x76760001 zynq_clock_init: clkc starts at cf802100 Zynq clock init sched_clock: 64 bits at 333MHz, resolution 3ns, wraps every 4398046511103ns clocksource: arm_global_timer: mask: 0xffffffffffffffff max_cycles: 0x4ce07af025, max_idle_ns: 440795209040 ns Switching to timer-based delay loop, resolution 3ns clocksource: ttc_clocksource: mask: 0xffff max_cycles: 0xffff, max_idle_ns: 537538477 ns ps7-ttc #0 at cf80a000, irq=18 Console: colour dummy device 80x30 Calibrating delay loop (skipped), value calculated using timer frequency.. 666.66 BogoMIPS (lpj=3333333) pid_max: default: 32768 minimum: 301 Mount-cache hash table entries: 1024 (order: 0, 4096 bytes) Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes) CPU: Testing write buffer coherency: ok CPU0: thread -1, cpu 0, socket 0, mpidr 80000000 Setting up static identity map for 0x100000 - 0x100058 CPU1: failed to boot: -1 Brought up 1 CPUs SMP: Total of 1 processors activated (666.66 BogoMIPS). CPU: All CPU(s) started in SVC mode. 
devtmpfs: initialized VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4 clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns pinctrl core: initialized pinctrl subsystem NET: Registered protocol family 16 DMA: preallocated 256 KiB pool for atomic coherent allocations cpuidle: using governor menu hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers. hw-breakpoint: maximum watchpoint size is 4 bytes. zynq-ocm f800c000.ps7-ocmc: ZYNQ OCM pool: 256 KiB @ 0xcf880000 vgaarb: loaded SCSI subsystem initialized usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb media: Linux media interface: v0.10 Linux video capture interface: v2.00 pps_core: LinuxPPS API ver. 1 registered pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it> PTP clock support registered EDAC MC: Ver: 3.0.0 Advanced Linux Sound Architecture Driver Initialized. clocksource: Switched to clocksource arm_global_timer NET: Registered protocol family 2 TCP established hash table entries: 2048 (order: 1, 8192 bytes) TCP bind hash table entries: 2048 (order: 2, 16384 bytes) TCP: Hash tables configured (established 2048 bind 2048) UDP hash table entries: 256 (order: 1, 8192 bytes) UDP-Lite hash table entries: 256 (order: 1, 8192 bytes) NET: Registered protocol family 1 RPC: Registered named UNIX socket transport module. RPC: Registered udp transport module. RPC: Registered tcp transport module. RPC: Registered tcp NFSv4.1 backchannel transport module. PCI: CLS 0 bytes, default 64 Trying to unpack rootfs image as initramfs... 
rootfs image is not initramfs (no cpio magic); looks like an initrd Freeing initrd memory: 12584K (cceb7000 - cdb01000) hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available futex hash table entries: 512 (order: 3, 32768 bytes) workingset: timestamp_bits=28 max_order=16 bucket_order=0 jffs2: version 2.2. (NAND) (SUMMARY) © 2001-2006 Red Hat, Inc. io scheduler noop registered io scheduler deadline registered io scheduler cfq registered (default) dma-pl330 f8003000.ps7-dma: Loaded driver for PL330 DMAC-241330 dma-pl330 f8003000.ps7-dma: DBUFF-128x8bytes Num_Chans-8 Num_Peri-4 Num_Events-16 e0000000.serial: ttyPS0 at MMIO 0xe0000000 (irq = 158, base_baud = 6249999) is a xuartps console [ttyPS0] enabled xdevcfg f8007000.ps7-dev-cfg: ioremap 0xf8007000 to cf86e000 [drm] Initialized drm 1.1.0 20060810 brd: module loaded loop: module loaded CAN device driver interface gpiod_set_value: invalid GPIO libphy: MACB_mii_bus: probed macb e000b000.ethernet eth0: Cadence GEM rev 0x00020118 at 0xe000b000 irq 31 (00:0a:35:00:00:00) Generic PHY e000b000.etherne:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=e000b000.etherne:00, irq=-1) e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k e1000e: Copyright(c) 1999 - 2015 Intel Corporation. 
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver ehci-pci: EHCI PCI platform driver usbcore: registered new interface driver usb-storage mousedev: PS/2 mouse device common for all mice i2c /dev entries driver Xilinx Zynq CpuIdle Driver started sdhci: Secure Digital Host Controller Interface driver sdhci: Copyright(c) Pierre Ossman sdhci-pltfm: SDHCI platform and OF driver helper mmc0: SDHCI controller on e0100000.ps7-sdio [e0100000.ps7-sdio] using ADMA ledtrig-cpu: registered to indicate activity on CPUs usbcore: registered new interface driver usbhid usbhid: USB HID core driver nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda nand: Micron MT29F2G08ABAGAWP nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128 nand: WARNING: pl35x-nand: the ECC used on your system is too weak compared to the one required by the NAND chip Bad block table found at page 131008, version 0x01 Bad block table found at page 130944, version 0x01 6 ofpart partitions found on MTD device pl35x-nand Creating 6 MTD partitions on "pl35x-nand": 0x000000000000-0x000002800000 : "BOOT.bin-env-dts-kernel" 0x000002800000-0x000004800000 : "ramfs" 0x000004800000-0x000005000000 : "configs" 0x000005000000-0x000006000000 : "reserve" 0x000006000000-0x000008000000 : "ramfs-bak" 0x000008000000-0x000010000000 : "reserve1" NET: Registered protocol family 10 sit: IPv6 over IPv4 tunneling driver NET: Registered protocol family 17 can: controller area network core (rev 20120528 abi 9) NET: Registered protocol family 29 can: raw protocol (rev 20120528) can: broadcast manager protocol (rev 20120528 t) can: netlink gateway (rev 20130117) max_hops=1 zynq_pm_ioremap: no compatible node found for 'xlnx,zynq-ddrc-a05' zynq_pm_late_init: Unable to map DDRC IO memory. Registering SWP/SWPB emulation handler hctosys: unable to open rtc device (rtc0) ALSA device list: No soundcards found. 
RAMDISK: gzip image found at block 0 EXT4-fs (ram0): couldn't mount as ext3 due to feature incompatibilities EXT4-fs (ram0): mounted filesystem without journal. Opts: (null) VFS: Mounted root (ext4 filesystem) on device 1:0. devtmpfs: mounted Freeing unused kernel memory: 1024K (c0a00000 - c0b00000) EXT4-fs (ram0): re-mounted. Opts: block_validity,delalloc,barrier,user_xattr random: dd urandom read with 0 bits of entropy available ubi0: attaching mtd2 ubi0: scanning is finished ubi0: attached mtd2 (name "configs", size 8 MiB) ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048 ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096 ubi0: good PEBs: 64, bad PEBs: 0, corrupted PEBs: 0 ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128 ubi0: max/mean erase counter: 3/1, WL threshold: 4096, image sequence number: 1433474905 ubi0: available PEBs: 0, total reserved PEBs: 64, PEBs reserved for bad PEB handling: 40 ubi0: background thread "ubi_bgt0d" started, PID 708 UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 711 UBIFS (ubi0:0): recovery needed UBIFS (ubi0:0): recovery completed UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "configs" UBIFS (ubi0:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes UBIFS (ubi0:0): FS size: 1396736 bytes (1 MiB, 11 LEBs), journal size 888833 bytes (0 MiB, 5 LEBs) UBIFS (ubi0:0): reserved for root: 65970 bytes (64 KiB) UBIFS (ubi0:0): media format: w4/r0 (latest is w4/r0), UUID 0935B86A-4714-4EB9-9FD3-96EC09C62EF1, small LPT model ubi1: attaching mtd5 ubi1: scanning is finished ubi1: attached mtd5 (name "reserve1", size 128 MiB) ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes ubi1: min./max. 
I/O unit sizes: 2048/2048, sub-page size 2048 ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096 ubi1: good PEBs: 1020, bad PEBs: 4, corrupted PEBs: 0 ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128 ubi1: max/mean erase counter: 1/0, WL threshold: 4096, image sequence number: 1861296417 ubi1: available PEBs: 0, total reserved PEBs: 1020, PEBs reserved for bad PEB handling: 36 ubi1: background thread "ubi_bgt1d" started, PID 720 UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 723 UBIFS (ubi1:0): recovery needed UBIFS (ubi1:0): recovery completed UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "reserve1" UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes UBIFS (ubi1:0): FS size: 123039744 bytes (117 MiB, 969 LEBs), journal size 6221824 bytes (5 MiB, 49 LEBs) UBIFS (ubi1:0): reserved for root: 4952683 bytes (4836 KiB) UBIFS (ubi1:0): media format: w4/r0 (latest is w4/r0), UUID 57BDEC5B-CB8F-4CAA-8972-67DEDDA4EA08, small LPT model IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready macb e000b000.ethernet eth0: unable to generate target frequency: 25000000 Hz macb e000b000.ethernet eth0: link up (100/Full) IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready In axi fpga driver! request_mem_region OK! AXI fpga dev virtual address is 0xcfb38000 *base_vir_addr = 0xab013 In fpga mem driver! request_mem_region OK! fpga mem virtual address is 0xd2000000 random: nonblocking pool is initialized
------------>
2021-05-13 05:51:42 driver-btm-api.c:663:init_freq_mode: This is scan-user version
2021-05-13 05:51:42 driver-btm-api.c:2028:bitmain_soc_init: opt_multi_version = 1
2021-05-13 05:51:42 driver-btm-api.c:2029:bitmain_soc_init: opt_bitmain_ab = 1
2021-05-13 05:51:42 driver-btm-api.c:2030:bitmain_soc_init: opt_bitmain_work_mode = 254
2021-05-13 05:51:42 driver-btm-api.c:2031:bitmain_soc_init: Miner compile time: Thu Apr 23 16:29:07 CST 2020 type: Antminer T17
2021-05-13 05:51:42 driver-btm-api.c:2032:bitmain_soc_init: commit version: 1c5be6f 2020-04-20 16:18:14, build by: lol 2020-04-23 16:35:04
2021-05-13 05:51:42 driver-btm-api.c:1844:show_sn: no SN got, please write SN to /nvdata/sn
2021-05-13 05:51:42 driver-btm-api.c:1167:miner_device_init: Detect 256MB control board of XILINX
2021-05-13 05:51:42 driver-btm-api.c:1115:init_fan_parameter: fan_eft : 0 fan_pwm : 0
2021-05-13 05:51:42 thread.c:885:create_read_nonce_reg_thread: create thread
2021-05-13 05:51:48 driver-btm-api.c:1099:init_miner_version: miner ID : 806cf5864e104814
2021-05-13 05:51:48 driver-btm-api.c:1105:init_miner_version: FPGA Version = 0xB013
2021-05-13 05:51:50 eeprom.c:431:check_pattern_test_level: L1 board
2021-05-13 05:51:52 eeprom.c:431:check_pattern_test_level: L1 board
2021-05-13 05:51:54 eeprom.c:431:check_pattern_test_level: L1 board
2021-05-13 05:51:54 driver-btm-api.c:737:get_product_id: product_id[0] = 1
2021-05-13 05:51:54 driver-btm-api.c:737:get_product_id: product_id[1] = 1
2021-05-13 05:51:54 driver-btm-api.c:737:get_product_id: product_id[2] = 1
2021-05-13 05:51:54 driver-btm-api.c:1666:get_ccdly_opt: ccdly_opt[0] = 1
2021-05-13 05:51:54 driver-btm-api.c:1666:get_ccdly_opt: ccdly_opt[1] = 1
2021-05-13 05:51:54 driver-btm-api.c:1666:get_ccdly_opt: ccdly_opt[2] = 1
2021-05-13 05:51:54 driver-btm-api.c:1919:bitmain_board_init: g_ccdly_opt = 1
2021-05-13 05:51:54 driver-btm-api.c:676:_set_project_type: project:2
2021-05-13 05:51:54 driver-btm-api.c:706:_set_project_type: Project type: Antminer T17
2021-05-13 05:51:54 driver-btm-api.c:717:dump_pcb_bom_version: Chain [0] PCB Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:718:dump_pcb_bom_version: Chain [0] BOM Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:717:dump_pcb_bom_version: Chain [1] PCB Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:718:dump_pcb_bom_version: Chain [1] BOM Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:717:dump_pcb_bom_version: Chain [2] PCB Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:718:dump_pcb_bom_version: Chain [2] BOM Version: 0x0100
2021-05-13 05:51:55 driver-btm-api.c:1939:bitmain_board_init: Fan check passed.
2021-05-13 05:51:57 board.c:36:jump_and_app_check_restore_pic: chain[0] PIC jump to app
2021-05-13 05:52:00 board.c:40:jump_and_app_check_restore_pic: Check chain[0] PIC fw version=0xb9
2021-05-13 05:52:02 board.c:36:jump_and_app_check_restore_pic: chain[1] PIC jump to app
2021-05-13 05:52:05 board.c:40:jump_and_app_check_restore_pic: Check chain[1] PIC fw version=0xb9
2021-05-13 05:52:07 board.c:36:jump_and_app_check_restore_pic: chain[2] PIC jump to app
2021-05-13 05:52:10 board.c:40:jump_and_app_check_restore_pic: Check chain[2] PIC fw version=0xb9
2021-05-13 05:52:10 thread.c:880:create_pic_heart_beat_thread: create thread
2021-05-13 05:52:10 power_api.c:55:power_init: power init ...
2021-05-13 05:52:10 driver-btm-api.c:1949:bitmain_board_init: Enter 30s sleep to make sure power release finish.
2021-05-13 05:52:43 power_api.c:232:set_iic_power_to_highest_voltage: setting to voltage: 17.00 ...
2021-05-13 05:52:48 power_api.c:124:check_voltage_multi: retry time: 0
2021-05-13 05:52:50 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.132285
2021-05-13 05:52:52 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.175146
2021-05-13 05:52:53 power_api.c:86:get_average_voltage: chain[2], voltage is: 17.138408
2021-05-13 05:52:53 power_api.c:97:get_average_voltage: aveage voltage is: 17.148613
2021-05-13 05:52:53 power_api.c:182:set_iic_power_by_voltage: now set voltage to : 17.000000
2021-05-13 05:52:54 uart.c:80:set_baud: set fpga_baud = 115200, fpga_divider = 26
2021-05-13 05:53:05 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 0
2021-05-13 05:53:14 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 1
2021-05-13 05:53:25 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 2
2021-05-13 05:53:25 driver-btm-api.c:1069:check_asic_number: Chain 0 only find 0 asic, will power off hash board 0
2021-05-13 05:53:37 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 0
2021-05-13 05:53:47 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 1
2021-05-13 05:53:57 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 2
2021-05-13 05:53:57 driver-btm-api.c:1069:check_asic_number: Chain 1 only find 12 asic, will power off hash board 1
2021-05-13 05:54:08 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 0
2021-05-13 05:54:18 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 1
2021-05-13 05:54:28 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 2
2021-05-13 05:54:28 driver-btm-api.c:1069:check_asic_number: Chain 2 only find 0 asic, will power off hash board 2
2021-05-13 05:54:29 driver-btm-api.c:205:set_miner_status: STATUS_INIT
2021-05-13 05:54:34 driver-btm-api.c:205:set_miner_status: STATUS_OKAY
2021-05-13 05:54:36 driver-btm-c5_socketb.c:1049:main: poweroff hash board and enter sleep mode ...
2021-05-13 05:54:39 driver-btm-api.c:1325:dhash_chip_send_job: Version num 4
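A log this long is easier to read if you filter it down to the lines that matter. The Python sketch below is only an illustration: it assumes the "only find N asic" wording seen in this log, uses the figure of 30 ASICs per T17 hashboard given earlier in the thread, and the names `FAIL_RE` and `failed_chains` are made up for the example.

```python
import re

# Per wndsnb's earlier reply: a T17 hashboard must report all 30 ASICs to run.
EXPECTED_ASICS = 30

# The three summary lines from the kernel log above.
LOG = """\
2021-05-13 05:53:25 driver-btm-api.c:1069:check_asic_number: Chain 0 only find 0 asic, will power off hash board 0
2021-05-13 05:53:57 driver-btm-api.c:1069:check_asic_number: Chain 1 only find 12 asic, will power off hash board 1
2021-05-13 05:54:28 driver-btm-api.c:1069:check_asic_number: Chain 2 only find 0 asic, will power off hash board 2
"""

# Assumed wording of the failure line, taken from this log only.
FAIL_RE = re.compile(r"check_asic_number: Chain (\d+) only find (\d+) asic")


def failed_chains(log_text):
    """Map chain number -> ASICs found, for every chain that was powered off."""
    return {int(chain): int(found) for chain, found in FAIL_RE.findall(log_text)}


for chain, found in sorted(failed_chains(LOG).items()):
    missing = EXPECTED_ASICS - found
    print(f"chain {chain}: found {found}/{EXPECTED_ASICS} ASICs ({missing} missing)")
```

Pasting a full kernel log into `LOG` should give the same one-line-per-chain summary, as long as the firmware uses the same message wording.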
|
|
|
|
Psolver (OP)
Newbie
Offline
Activity: 6
Merit: 0
|
|
May 13, 2021, 01:03:34 PM |
|
Some more updates:
Electricity: I measured 232 V at the outlets that power the T17.
Board 2: No action taken yet.
Board 1: I removed the aluminium heatsinks on chips 12, 29 and 30 (counting from the incoming power side). There were 2 solder balls, on two different chips, which I removed.
Board 0: On this one I've removed all the heatsinks and measured resistance to ground in some places. I've found one place on the board that looks a bit suspicious. When measuring from the red areas (to ground) in the picture below, I get the following readings:
1: 6.7 kOhm
2: 6.87 kOhm
3: 6.7 kOhm
Some of the other measuring points at the "1, 2, 3" places differ in the same way when measured to ground; position 2 does not follow the pattern. Could it be that the specific chip marked in black is faulty?
https://imgur.com/a/G1zo0Zu
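For comparing readings like these, a small Python sketch can express each one as a percentage deviation from the median. The 1% tolerance is purely an assumption for illustration; it is not a Bitmain specification, and a difference of this size may be within normal component and meter tolerance.

```python
from statistics import median

# Resistance to ground at the three marked points, in kOhm (from the post above).
readings_kohm = {1: 6.7, 2: 6.87, 3: 6.7}

# Hypothetical threshold for flagging a point; choose your own.
TOLERANCE_PCT = 1.0


def outliers(readings, tolerance_pct):
    """Return {point: deviation in %} for points further than tolerance from the median."""
    reference = median(readings.values())
    flagged = {}
    for point, value in readings.items():
        deviation_pct = (value - reference) / reference * 100
        if abs(deviation_pct) > tolerance_pct:
            flagged[point] = round(deviation_pct, 2)
    return flagged


print(outliers(readings_kohm, TOLERANCE_PCT))
```

Point 2 comes out roughly 2.5% above the other two, which on its own only says the reading is different, not that the chip is faulty.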
|
|
|
|
Psolver (OP)
Newbie
Offline
Activity: 6
Merit: 0
|
|
May 14, 2021, 05:47:29 AM |
|
Now I believe something is going in the right direction!
Additional update: I upgraded the OS to try to get some more info, and this is what it looks like right now with all 3 boards connected. It has just started, but can anyone see if it looks "ok" (although 2 of the boards need to be checked for problems, solder balls, etc.)?
https://imgur.com/a/kOAdwwl
Having started from a totally non-functional T17, I hope this is better...
|
|
|
|
mikeywith
Legendary
Offline
Activity: 2380
Merit: 6579
be constructive or S.T.F.U
|
|
May 14, 2021, 07:49:02 AM |
|
It seems like you swapped the hashboards and are now referencing them the wrong way. Can we stick to calling them chain 0, 1, 2 according to the kernel log, where 0 in the log = 1 on the miner status page, 1 = 2, and 2 = 3, just to avoid confusion?
According to the last image, hashboard 2 (3 on the status page) looks perfect, finding all 30 ASICs. Board 0 (1 on the miner status page) shows 21 ASICs, which suggests that there is an issue with either the 21st or the 22nd chip.
For now I think you should focus on chain 0, which is showing 21 ASICs; it should be easier to fix. Have you seen the repair manual for the T17? You can find it here: https://www.zeusbtc.com/NewsDetails.asp?ID=187. There is a PDF you can download from there; it's in Chinese, but should be "easy" to understand.
I would also wait for wndsnb to respond; he has the most knowledge in fixing these hashboards.
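The numbering convention and the rule of thumb in this post are simple enough to write down as a Python sketch. The function names are hypothetical, and the "last chip found or the next one" rule is just the heuristic stated above for a partial count, not a guaranteed diagnosis.

```python
TOTAL_ASICS = 30  # chips per T17 hashboard, per this thread


def status_page_number(log_chain):
    """Kernel log chains are 0-based; the miner status page counts from 1."""
    return log_chain + 1


def suspect_positions(found, total=TOTAL_ASICS):
    """Chip positions to inspect first when only `found` ASICs respond.

    A partial count of N points at chip N or chip N+1. A count of 0 or a
    full count gives no position hint, so an empty tuple is returned.
    """
    if 0 < found < total:
        return (found, found + 1)
    return ()


print(status_page_number(0))   # chain 0 in the log is board 1 on the status page
print(suspect_positions(21))   # inspect chips 21 and 22 first
```

The same rule applied to the earlier count of 12 would point at chips 12 and 13.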
|
|
|
|
Psolver (OP)
Newbie
Offline
Activity: 6
Merit: 0
|
|
May 14, 2021, 02:35:06 PM |
|
It seems like you swapped the hashboards and now you are referencing them in the wrong way, can we stick to calling them chain 0,1,2 according to the kernel log where 0 in the log = 1 in the miner status page, 1 = 2 and 2=3 , just to avoid confusion.
Absolutely, I will try my best, but as I have little knowledge about the chains and boards I have a (probably simple to answer) question: Are the outlets on the control board ALWAYS the same chain number, or do they change depending on how the boards are connected or how many boards are connected?
According to the last image hashboard 2 ( 3 on the status page) looks perfect finding all 30 Asics, board 0 ( 1 on the miner status page) shows 21 Asics, which suggests that there is an issue either with the 21st asic or the 22nd chip. ... For now I think you should focus on chain 0 which is showing 21 asics, it should be easier to fix, have you seen the repair manual for T17? please find here https://www.zeusbtc.com/NewsDetails.asp?ID=187 , there is a PDF which you can download from there, it's in Chinese, but should be "easy" to understand.
Now here's a strange thing that happened when I started working on the board for chain 1 (21 ASICs found):
1. I removed the data cable to verify that chain 1 with "21 asics found" was in that specific slot of the miner.
2. I removed the board and looked at chips 20-21-22 (from both directions) to see if I could find anything suspicious.
3. I soldered the cooling flanges back on and put the board back in the same slot.
4. I started up the miner and chain 1 now found all 30 ASICs, BUT... on chain 3 it only found 3 ASICs on the board.
I get a feeling that something other than the boards themselves is causing this, but I'm just guessing.
My next step is to try to measure a board to see that all chips are OK. Does anyone have info about what currents I need to apply to the board in order to get some measurements, or can this be found in the previous link?
|
|
|
|
mikeywith
Legendary
Offline
Activity: 2380
Merit: 6579
be constructive or S.T.F.U
|
|
May 15, 2021, 08:19:21 AM |
|
Are the outlets on the control-board ALWAYS the same chain no., or do they alter depending on how the boards / how many boards that are connected?
It has been a while since I played with this gear, but IIRC they will always display the same chain. It would be best if you just troubleshoot them 1 by 1 to avoid confusion.
4. Started up the miner and chain 1 now found all the 30 asics BUT... chain 3 it only found 3 asics on board
I get a feeling that something else than the boards themselves are causing this, but I'm just guessing.
My next step is to try and measure a board to see that all chips are ok.
There is a tiny chance that the PSU is faulty and causes this, but it's not very likely. It's very common for a hashboard to show 30 ASICs and then, after a reboot, show 3 or 0 ASICs, so this could very well be just a coincidence. This is why it's ALWAYS best to test 1 board at a time. Since chain 1 (previously showing 21 ASICs) is now showing 30 ASICs, let it run alone for 12-24 hours. If it sticks and nothing goes wrong, you can be fairly sure that it has been fixed; then put it aside and start the same process all over again with chain 3. Testing a board for a few minutes can be misleading and will give you a lot of false results.
Does anyone have info about what currents I need to apply to the board in order to get some measurments, or can this be found in the previous link?
Do you mean voltage? If that's what you mean, then it is 21 V DC.
|
|
|
|
Psolver (OP)
Newbie
Offline
Activity: 6
Merit: 0
|
|
May 16, 2021, 06:55:07 AM |
|
Thanks for all the help so far!
I will start focusing on one board at a time from now on.
If I understand correctly, I can apply 21 V to the board's metal clamps (with +/- in the correct places, of course), and then measure the test points on the board?
|
|
|
|
mikeywith
Legendary
Offline
Activity: 2380
Merit: 6579
be constructive or S.T.F.U
|
|
May 16, 2021, 08:09:30 AM |
|
If I understand correctly, I can apply 21v to the boards metal clamps (with +/- on the correct place offcourse), and then measure the testpoints on the board?
I don't think that is possible without a tester/fixture tool. My understanding is that current won't flow in the hashboard without a control board, and the normal control board will only supply the hashboard with power for a very short period of time; it stops once the ASIC count fails, so you have a very short window to test. I am not sure if that is doable, but I guess it is if you time exactly when the current flows, based on the kernel log or x seconds from powering on. The main feature of the fixture tools, regardless of brand, is that they keep power on the hashboard for as long as you want, so you can measure the voltage around the chips. I think it would be best to reach out to wndsnb to confirm the above.
|
|
|
|
wndsnb
|
|
May 16, 2021, 12:13:58 PM |
|
If I understand correctly, I can apply 21v to the boards metal clamps (with +/- on the correct place offcourse), and then measure the testpoints on the board?
The board won't do anything unless the correct commands are sent from a control board. The control board needs to command the PIC microcontroller to turn on power to the chips on the board, and then it needs to send a command to the chips before you will see any signals at the test points.
It is possible to just use the control board from the miner and make up some cables (they need to be 4 AWG to 6 AWG) to run from the PSU to the hashboard, so you can run it on a bench and access the test points. It is very slow, though, because it takes so long to boot up, and then you only get 3 chances to measure anything while it is checking the ASIC count. After the 3rd try it just shuts down the board and you need to start again.
The normal Bitmain-style test jig (which is just an S17+ control board with special firmware) does the same thing, except it boots a bit faster and then runs a test pattern to verify the operation and performance of each chip if all chips are found. For troubleshooting a board that isn't finding all the chips, it is still very slow, maybe a couple of minutes per round of 3 ASIC counts.
The tester from Asic.Repair (https://tester.asic.repair/en) is far superior to the standard test jig for troubleshooting boards that aren't finding all chips. It allows you to run the ASIC count test continuously, about once a second, with no boot-up time. I have both and rarely use the standard test fixture any more.
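If you do try measuring with the stock control board, the kernel log tells you roughly when the 3 ASIC count attempts happen. The Python sketch below groups the attempt timestamps by chain and prints the gaps between them. It assumes the line format from the log posted earlier in this thread, the helper names are invented for the example, and the timestamps mark when each count result was logged, which is not necessarily the exact moment the board is powered.

```python
import re
from collections import defaultdict
from datetime import datetime

# ASIC count attempts for chains 0 and 1, copied from the log earlier in the thread.
LOG = """\
2021-05-13 05:53:05 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 0
2021-05-13 05:53:14 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 1
2021-05-13 05:53:25 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 2
2021-05-13 05:53:37 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 0
2021-05-13 05:53:47 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 1
2021-05-13 05:53:57 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 2
"""

# Assumed line format, based only on the log in this thread.
ATTEMPT_RE = re.compile(
    r"^(\S+ \S+) \S+check_asic_number_with_power_on: Chain\[(\d+)\]: find \d+ asic"
)


def attempts_by_chain(log_text):
    """Map chain number -> list of timestamps at which an ASIC count was logged."""
    attempts = defaultdict(list)
    for line in log_text.splitlines():
        match = ATTEMPT_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            attempts[int(match.group(2))].append(stamp)
    return dict(attempts)


def gaps_seconds(timestamps):
    """Seconds between consecutive attempts."""
    return [(b - a).total_seconds() for a, b in zip(timestamps, timestamps[1:])]


for chain, stamps in sorted(attempts_by_chain(LOG).items()):
    print(f"chain {chain}: {len(stamps)} attempts, gaps {gaps_seconds(stamps)} s")
```

In this log the attempts are about 10 seconds apart, so each chain offers a window of roughly 20 to 30 seconds per boot before the board is powered off.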
|
|
|
|
Dwarf miner group
Newbie
Offline
Activity: 1
Merit: 0
|
|
August 30, 2022, 04:42:17 PM |
|
This is a problem with the hashboard, and to fix it, you need a tester to repeat the signals so you can find the failure. You need to be familiar with electronic science.
|
|
|
|
MinerMEDIC
Member
Offline
Activity: 166
Merit: 82
EET/NASA intern 2013 Bitmain/MicroBT/IPC cert
|
|
August 30, 2022, 07:28:30 PM |
|
This is the problem of the hashboard, and to fix it, you need a tester to repeat the signals for you to find the failure. You need to be familiar with electronic science
Before you shot this poor guy down with a less than helpful answer, you should've double-checked your translation. There's no such thing as electronic science. It's like saying food pizza or seated car. AND to correct you, you don't need a test jig to trace signals. When the log fills up and says zero ASICs, the control board is sending the relevant signals until it shuts down the hashboard. The only thing a test jig does is let you conveniently retry with a button push and give you the option to run a stress test. Sorry for the rant, gmaxwell et al., but sheesh, he's received so much helpful advice before this.
|
---Hi, I'm Juergen "Jay" & I TEACH and REPAIR ASIC HASHBOARDS-- Purdue AS EET -- MinerMEDIC is NOW FREELANCE in Chicago!
|
|
|
Rodemi
Jr. Member
Offline
Activity: 46
Merit: 14
|
|
September 07, 2022, 07:41:43 AM |
|
|
|
|
|
|