Bitcoin Forum
February 05, 2026, 04:31:52 AM *
Author Topic: ckpool top-of-hour (:00) drops  (Read 20 times)
Z3r0XG (OP)
Newbie
*
Offline Offline

Activity: 6
Merit: 2


View Profile
February 04, 2026, 06:49:55 PM
Last edit: February 04, 2026, 07:13:56 PM by Z3r0XG
Merited by nc50lc (1)
 #1

I'm a little stumped on this one. On my own fork of ckpool I see a steady heartbeat of drops near :00 on the hour, every hour. I thought this was something on my server, or a glitch in ckstats, but I crawled through my logs and these are legit drops, with the reason usually recorded as a client-side disconnect. My own devices on the pool seem stable - a mix of ASICs and esp32 miners. The esp32s go offline sometimes but that's just their nature.

Then I noticed a similar pattern in solo.ckpool.org ckstats, and also in solo.nerdminers.org, which is running its own fork - the same steady heartbeat of drops at :00.

I saw that both sites provide /pool/pool.status so I started scraping those changes over time, then compared all three to see if the behavior persisted. All three show a decent delta in drops happening around the :00 timestamp, then almost immediate reconnection by those clients. My pool has about 500 clients, nerdminers.org has many more, solo.ckpool is one of the most popular pools out there - similar behavior.
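
For anyone who wants to repeat the comparison, here is a minimal sketch of the bucketing step in Python. It assumes the pool.status snapshots have already been saved as time-ordered (unix timestamp, parsed JSON) pairs; that sample layout and the function name `mean_delta_by_minute` are made up for illustration, and the only field it relies on is the Disconnected counter.

```python
from collections import defaultdict
from datetime import datetime, timezone


def mean_delta_by_minute(samples, field="Disconnected"):
    """Average absolute change in `field`, grouped by minute of the hour.

    `samples` is a time-ordered list of (unix_timestamp, status_dict) pairs,
    one per scraped pool.status snapshot. This layout is an assumption made
    for the example, not something ckpool itself provides.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    # Compare each snapshot with the one before it.
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        minute = datetime.fromtimestamp(ts, tz=timezone.utc).minute
        totals[minute] += abs(cur[field] - prev[field])
        counts[minute] += 1
    return {m: totals[m] / counts[m] for m in sorted(totals)}
```

If the drops really are tied to the wall clock, the buckets for minute 0 (and possibly 1) should stand well above the rest no matter when each pool was started.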

I have a dropidle setting set to 300s in my configuration, not sure what it's set to elsewhere. Even if it were set to "1 hour" I would expect this to fire 1 hour after server start, not :00. I'm just a hack when it comes to code but I looked through mine and the original code and there doesn't seem to be anything that is based on events being triggered by the real clock (outside of an unused file rotation function?). Everything seems to be based on elapsed time, which means, again, I would expect if the heartbeats existed at all, they would be offset by server start for each pool - but they're not. ckpool services ASICs, nerdminers.org I believe is only esp32 miners, mine is a mix of both (maybe it's common to the esp32 miners AND controllers of some of the popular ASIC devices?).
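
To make the elapsed-time argument concrete, here is a small Python sketch (not ckpool code; `interval_fire_minutes` is a hypothetical helper). It shows where in the hour a timer lands if it is driven purely by elapsed time since server start.

```python
def interval_fire_minutes(start_ts, period_s, count):
    """Minute of the hour for the first `count` firings of an elapsed-time timer.

    `start_ts` is the server start time as a unix timestamp and `period_s`
    is the timer period in seconds. Illustration only, not taken from ckpool.
    """
    return [((start_ts + period_s * n) % 3600) // 60 for n in range(1, count + 1)]


# A server started 17 minutes past the hour with an hourly timer fires at :17.
print(interval_fire_minutes(17 * 60, 3600, 3))  # [17, 17, 17]
# A different server started at :42 fires at :42, not at :00.
print(interval_fire_minutes(42 * 60, 3600, 3))  # [42, 42, 42]
```

Three pools started at unrelated times would each show their own offset under this model, which is why a shared spike at :00 points to something keyed to the real clock.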

I looked through the NerdminerV2 firmware repo and did not see anything there that would trigger at :00 either. Nothing obvious in the stratum comms that I captured - the only thing that makes sense at the moment is that I missed something in the ckpool code that can explain it (a hardcoded top-of-the-hour something or other).

Heck, I know there's mixed feelings about AI but I had it run through my data as well:

Quote
Pattern Confirmed – Across three independent pool servers (heliospool.com, nerdminers.org, ckpool.org), minute‑to‑minute deltas in Users, Workers, and Disconnected show consistent spikes at wall‑clock minute :00 (and :01). The magnitude is 2–6× above the average delta, indicating a synchronized drop event.

Disconnection Characteristics – The ckpool‑process‑ckpool‑lhr.log shows actual disconnections at 06:00:01–06:00:48 with error 104 (“Connection reset by peer”). Affected clients were idle 8–48 seconds, far below the pool’s dropidle threshold (300 s), ruling out pool‑initiated idle‑client cleanup.

Stratum Communication – The stratum log (port_3333_all.log) shows normal mining.submit and result:true messages at 06:00:00. Message frequency (170–300 per second) does not drop at the :00 mark, eliminating a pool‑side communication stall as the cause.

Pool‑Internal Timers – No ckpool code performs actions specifically at minute‑0 of each hour. The only wall‑clock‑hour‑sensitive functions (rotating_log, rotating_filename) are defined but never called. The statsupdate thread decays hourly hash‑rate averages continuously, not as a discrete reset.

Client‑Side Firmware – NerdMinerV2 firmware lacks hourly wall‑clock timers; its stat‑saving intervals are relative to miner start time, not wall‑clock :00. The pool‑inactivity detection (60‑second timeout) would cause random disconnects, not synchronized hourly drops.

User‑Agent Distribution – Disconnected clients are predominantly NMMiner (closed‑source) and Nerd Miner (open‑source), indicating the issue spans multiple client implementations but is not exclusive to a single firmware. (<-- this is slightly meaningless as these are the predominant miners on my pool)

TCP Reset Mechanism – The pool’s nolinger_socket sets SO_LINGER with a zero timeout, causing a TCP RST when the pool closes a socket. This results in “Connection reset by peer” on the client side. However, the pool only closes sockets for idle clients, zombies, or errors—none of which match the observed idle times.
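
The zero-linger behavior described in that last point can be reproduced on loopback with a few lines of Python. This is only a sketch of the socket option itself, not ckpool's nolinger_socket; the helper names are hypothetical, and the `struct.pack("ii", ...)` layout assumes Linux, where struct linger is two ints.

```python
import socket
import struct


def close_with_rst(sock):
    """Close `sock` with SO_LINGER enabled and a zero timeout (l_onoff=1, l_linger=0)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def demo():
    """Report what the client sees when the server end is closed with zero linger."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    conn, _ = server.accept()
    close_with_rst(conn)
    try:
        client.settimeout(5)
        data = client.recv(1)
        return "clean close" if data == b"" else "data"
    except ConnectionResetError:
        return "reset by peer"
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```

On Linux the client side gets ECONNRESET (error 104) rather than a clean end-of-stream, so a reset by itself does not say which side initiated the close.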

Has anyone else noticed this behavior on their pool? Is there something obvious I missed that could explain this?
Z3r0XG (OP)
February 04, 2026, 11:17:44 PM
 #2

Out of curiosity I ran a tcpdump on the server around the :00 timeframe with 10 drops seen.

TCP RST packets: 15 total, with 5 concentrated within the first minute of the hour.
ICMP errors: Three “Destination unreachable (Host unreachable)” packets from one client.
No surge in TCP retransmissions.

I have no evidence of this being the cause, but the pools overlap in clientele in one specific way: esp32 devices, either as miners (mine and nerdminers.org) or as controllers for ASICs (mine and ckpool.org). Funnily enough, I asked Kano if he saw anything similar on his pool and he said he did not; I assume this is because he blocks everything with the word "Nerd" in it (not sure if Bitaxe is also blocked). Even though the Nerdminer firmware did not show anything bound to a real clock as far as timing is concerned, I believe the network stack sits outside this - maybe these devices have some bug in a once-an-hour DHCP renewal, which could be bound to :00.