datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 03, 2025, 10:23:07 AM Last edit: February 16, 2025, 02:29:40 AM by datrus |
|
I’m experimenting with a proof-of-work design that leverages tensor operations. The goal is to create a PoW algorithm where mining hardware could also be effective for AI/ML tasks. In theory, this could allow mining equipment to be repurposed for AI workloads when not mining, and might help decentralize AI compute resources. I’m particularly interested in feedback on:
- The technical feasibility of designing a PoW that benefits from tensor operations
- Potential challenges in aligning performance between mining and AI/ML tasks
- Any ideas on how to further ensure that hardware used for mining has real utility outside of cryptocurrency mining
I’ve published some early-stage code and documentation on GitHub here: https://github.com/tenscoin/tenscoin and https://github.com/nf-dj/robocoin. I’d really appreciate any constructive feedback or thoughts on the approach. Thanks in advance!
|
|
|
|
ABCbits
Legendary
Offline
Activity: 3332
Merit: 8987
|
 |
February 04, 2025, 09:45:51 AM Merited by vapourminer (1) |
|
Some thoughts and comments:
1. I saw this page when I tried visiting the link you shared.
2. Have you done any research on Proof of Useful Work?
3. Since there is already hardware dedicated to AI/ML tasks, the people who can mine the coin profitably are the creators and owners of that hardware.
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 04, 2025, 10:18:15 AM Last edit: February 16, 2025, 02:29:54 AM by datrus |
|
Sorry, GitHub blocked the account just now. I guess they think I'm trying to pump a scam coin or something like that. (To be clear, this is an experimental repo; coins are not for sale, etc.) I re-uploaded it to this different account: https://github.com/nf-dj/robocoin
Yes, I researched proof of useful work before. What I'm doing here is a bit different and simpler, imo: just trying to make a proof of work based on the ILP problem (meaning A.x is hard to invert when x is binary). I'm trying to design the PoW such that miners optimized for it are also optimized for the type of computation needed for AI workloads (rounds of matmuls + nonlinearity, with big weight matrices). So mining doesn't produce useful work directly in itself. It's just that if someone optimizes hardware and infrastructure to mine this PoW (designs custom chips, memory, etc.), it aligns with the same type of computation needed for AI (because deep learning is based on the same kind of deep rounds of matmuls). That means the same hardware/infra can be reused, though not at the same time as mining, because the PoW uses random weight matrices seeded from the block header. Imo that's still a benefit compared to all the effort and energy going into SHA-256 mining, and it should be relatively simple to implement, but any feedback is welcome.
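To make the "rounds of matmuls + nonlinearity over a binary input" idea concrete, here is a toy sketch in Python. Everything here is illustrative: the names (`ternary_matrix`, `round_fn`), the SHA-256-based seeding, and the tiny dimension are my assumptions, not the repo's actual scheme (which reportedly uses ChaCha20-seeded 256x256 matrices).

```python
import hashlib
import random

N = 8  # toy dimension; the experimental repo reportedly uses 256

def ternary_matrix(seed: bytes, n: int):
    """Derive a deterministic n x n ternary matrix from a seed.

    Hypothetical derivation for illustration only; the actual repo
    seeds its matrices from the block header via ChaCha20."""
    rng = random.Random(hashlib.sha256(seed).digest())
    return [[rng.choice((-1, 0, 1)) for _ in range(n)] for _ in range(n)]

def round_fn(A, x):
    """One forward round: ternary matmul followed by a 0/1 threshold.

    Easy to evaluate forward; recovering a binary x from the output
    is an integer-programming-style search (the claimed hardness)."""
    return [1 if sum(a * b for a, b in zip(row, x)) > 0 else 0 for row in A]

A = ternary_matrix(b"block-header", N)
x = [1, 0, 1, 1, 0, 0, 1, 0]  # binary input (nonce-derived in the real scheme)
print(round_fn(A, x))
```

A real instance would chain many such rounds with bias terms, but the forward/inverse asymmetry is already visible at one round.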
|
|
|
|
gmaxwell
Moderator
Legendary
Online
Activity: 4480
Merit: 9524
|
 |
February 05, 2025, 01:34:55 AM |
|
There are many important properties of POW that need to be maintained that make it extremely hard to use some other 'useful' work for it. POW generally needs to be progress free, optimization free, approximation free, trivial to generate problem instances, and trivial to validate. Virtually all 'useful' functions violate one or more of these criteria.
But much more significantly making proof of work have value outside of the system undermines its purpose for consensus.
The idea behind POW consensus in Bitcoin and similar systems is that if you expend energy to create a block and that block doesn't end up in the eventual consensus chain (because you were mining off a fork or mining consensus-invalid blocks, e.g. attacking), then the energy (and the cost of that energy) is wasted.
If the POW is independently valuable work then an attacker can still gain that value while producing blocks that won't end up in the ultimate chain. So the security provided by mining is only proportional to the cost of the energy *above* whatever side-effect value the mining has.
Perhaps you could say that some side effect value would be neutral if the mining was otherwise just as good, but inevitably other important properties (like optimization freeness and/or approximation freeness... or just validation cost) get compromised making it a loss.
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 05, 2025, 02:19:41 AM Last edit: February 12, 2025, 11:15:09 AM by datrus |
|
"The idea behind POW consensus in Bitcoin and similar systems is that if you expend energy to create a block and that block doesn't end up in the eventual consensus chain (because you were mining off a fork or making a consensus invalid blocks-- e.g. attacking) then the energy (and the cost of that energy) is wasted."
=> I think I understand this important point. However, with the PoW I'm experimenting with, miners don't produce useful work while mining. Rather, the PoW implies that miners optimized for mining (chips, etc.) are also optimized for deep learning workloads. For any given period of time, they have to choose between mining and doing useful (AI) work; the same energy is not used for both. Imo this might still be useful, because it incentivizes the development of chips, memory, etc. for mining that can also have some other use (for example, when a chip gets replaced by a newer generation and is no longer profitable for mining). So if miners decide to mine using this PoW, the energy during that period will still be "wasted" and so doesn't hinder consensus, afaik (let me know if I'm wrong). The distinction is that this PoW could potentially help the development and dissemination of chips that can also be used for AI. (The energy saved is not the energy spent mining, but the external energy spent optimizing miners, such as custom hardware and infrastructure, plus allowing reuse once mining is no longer profitable.)
Not related to the previous point, but technically the PoW is implemented using multiple rounds of matmuls (using ternary weights for simplicity, the same kind of computation used in 1.58-bit LLMs, for example) and a noise bias derived from the nonce; security relies on the ILP problem.
|
|
|
|
ABCbits
Legendary
Offline
Activity: 3332
Merit: 8987
|
 |
February 05, 2025, 10:13:59 AM Merited by vapourminer (1) |
|
Quote from datrus: "sorry, github blocked the account just now. guess they think trying to pump a scam coin or smth like that. (to be clear, this is an experimental repo, coins are not for sale etc) I reuploaded to this different account: https://github.com/nf-dj/tenscoin"
GitHub is being weird here, when obviously fake/shady repos about Bitcoin wallets are rarely removed.
Quote from datrus: "So mining doesn't produce useful work directly in itself. It's just that if someone optimizes the hw and infra to mine this pow (designs custom chip, memory etc), it aligns with the same type of computation needed for AI (bc deep learning etc is based on same type of deep rounds of matmults). Meaning the same hw/infra can be reused (but not at the same time as mining, bc the pow uses random weight matrices seeded from the block header). Imo still a benefit compared to all the effort and energy going into sha256 mining, and should be relatively simple to implement, but any feedback welcome."
I think I get your point now. The price of an old mining ASIC (which is less efficient) is much lower than its initial price, but I do see people willing to buy/rent older generations of CPUs/GPUs. In practice, though, hardware that can be re-purposed for AI/ML alone isn't enough. People want good software support, which is the reason Nvidia GPUs are extremely popular even though hardware dedicated to AI/ML exists.
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 05, 2025, 01:05:24 PM Last edit: February 06, 2025, 12:33:40 PM by datrus |
|
Yes, I agree that mining hardware optimized for this PoW won't completely replace Nvidia GPUs, which are much more versatile. Also, the PoW is kept deliberately very simple (conceptually): just deep rounds of ternary matmuls, the same kind of computation used for inference of 1.58-bit LLMs. (Another reason to use ternary weights is that the matmul accumulator fits within the fp16/bf16 mantissa: the weight matrices in the PoW are only 256x256 and bf16 has a 7-bit mantissa, so it's convenient to run on existing hardware, most of which doesn't support ternary weights natively.) The activation layers also differ from typical AI models (AI usually uses multiple types like ReLU/GELU/softmax; here the PoW uses just a ReLU layer). The intention is to make the type of computation similar enough that optimized miners would be AI-capable and vice versa. (For example, non-GPU hardware like Groq would also be very efficient at mining this PoW, due to the type of computation it's good at: heavy matmuls, high-bandwidth data streams, and lots of on-chip memory.)
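The accumulator claim above can be checked numerically. With ternary weights in {-1, 0, 1} and binary activations in {0, 1}, a 256-wide dot product lies in [-256, 256], and every integer in that range is exactly representable in bfloat16 (8 significand bits including the implicit one). A minimal sketch, emulating bf16 by truncating the low 16 bits of the float32 representation (truncation rather than round-to-nearest, which is the conservative direction for this check):

```python
import struct

def to_bf16(x: float) -> float:
    """Round-trip a float through a bfloat16 emulation: keep only the
    high 16 bits of its IEEE-754 float32 encoding."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    return struct.unpack('<f', struct.pack('<I', bits & 0xFFFF0000))[0]

# Every possible accumulator value of a 256-wide ternary-x-binary dot
# product survives bf16 exactly, so accumulation is lossless.
for v in range(-256, 257):
    assert to_bf16(float(v)) == float(v)
print("all accumulator values in [-256, 256] are exact in bf16")
```

Note that 257 would already lose precision (it truncates to 256.0), which is why the 256x256 matrix size matters for this trick.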
|
|
|
|
BlackHatCoiner
Legendary
Offline
Activity: 1792
Merit: 8677
|
 |
February 08, 2025, 04:21:57 PM |
|
If what you're looking to accomplish is to create an incentive for developing better chips for AI, then that incentive already exists. As we speak, big tech companies like Nvidia employ top engineers to improve exactly this kind of efficiency. The incentive does not have to come from a process like Bitcoin mining.
"Miners" in the AI era are people with GPUs that can rent them. The "currency" they mine is compute tokens.
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 09, 2025, 06:07:29 AM Last edit: February 12, 2025, 11:15:35 AM by datrus |
|
I agree there are already plenty of incentives to produce more efficient and powerful AI hardware. With this project, I'm experimenting with designing a PoW that "forces" miners to also be capable of the same kind of "useful" computation, while keeping the PoW as simple as possible and not specific to a particular type of GPU, so that future hardware good at matmul rounds will be even better at mining (for example, future photonic or analog chips specialized for this type of AI compute). In other words, I'm trying to make the PoW specific to the type of computation needed for AI, but not tied to the specific hardware that is currently best at it.
Afaik Litecoin previously attempted to prevent ASICs by using scrypt for its PoW and being memory-bound, to try to keep mining more decentralized, but ASICs were still developed for it; they serve no purpose other than mining, and this contributes to centralization. Here, I'm trying to make it so that miners optimized for this PoW are necessarily also capable of useful "AI" computation, whether they use a custom ASIC developed for the PoW or not. (In fact, if better ASICs are made for it, all the better; their utility should favor their wider dissemination.)
To achieve this, the PoW computes rounds of (ternary) matmuls, with biases and non-linearities adjusted so that the computation is both hard to reverse and useful for AI. By making it necessary for miners to also be capable of useful work (though not at the same time as mining), maybe this can also help prevent centralization of mining power (like LTC aimed to achieve).
|
|
|
|
NotATether
Legendary
Offline
Activity: 2058
Merit: 8801
Search? Try talksearch.io
|
 |
February 09, 2025, 03:31:27 PM |
|
Mining is not ML-centric
Mining in most cryptocurrencies involves performing some hard computation many times over a given period to try to find a solution.
It is not trying to approximate the solution to an input problem using neurons or anything like that.
So it is fundamentally out of scope for a crypto mining project to attempt to utilize AI.
The results are evenly spread out, so using probability theory is futile.
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 09, 2025, 04:36:13 PM Last edit: February 10, 2025, 06:08:40 AM by datrus |
|
I agree that utilizing AI to mine the PoW of existing cryptocurrencies doesn't make any sense. (For example, SHA-256 as used in BTC is designed to be computationally irreducible, so AI can't find a nonce faster than a miner that just tries nonces at random.)
But what I'm experimenting with is not using AI to mine existing PoWs. It's designing a new PoW so that the forward pass (the hash, equivalent to SHA-256 in BTC) is similar to a forward pass (inference) in a deep neural network. Afaik it's not obvious that this is technically infeasible (let me know why if that's wrong). The point of making the forward pass similar to neural-net inference is that miners optimized for this PoW are also optimized for DNN inference. Such a pass can be deterministic, not approximate or probabilistic. Mining is then finding inputs to the neural net such that the output satisfies some target difficulty. Afaik it should be possible to design the layers of the net so that it satisfies the same properties as a hash (evenly spread out, etc.). In fact, many existing hashes and ciphers are also based on rounds of diffusion (which can be linear and permutations) and confusion (for example, S-boxes). What I'm experimenting with is implementing these layers of diffusion and confusion using neural network primitives (so that miners are forced to be efficient at running those primitives).
At the moment, the model weights are generated randomly from the block header (64 matrices of 256x256 weights). Miners then search for nonces such that the output of the network satisfies the target (for example, by building a PyTorch model initialized with random weights derived from the block header, then doing batched inference in the mining loop until the right output is found). The non-linearities (activation layers) between rounds are chosen so that rounds are not easily reversible, outputs are evenly spread out, etc. (the same properties as a one-way hash).
So the PoW doesn't involve any approximations or probabilistic solutions; I'm just experimenting with a new PoW such that computing the forward pass (the hash) involves the same kind of operations as deep learning inference (so that optimized miners are also useful for that task). The hash used in this PoW won't be any better than SHA-256. (In fact, it won't be a good general-purpose hash, because it uses more compute in the forward pass; there's no reason to use neural network primitives if you just want a general-purpose hash.) But it should have similar properties, and the difference is that it requires miners to be good at work that looks like deep learning inference (matmuls, accessing weights, etc.).
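The mining loop described above (derive weights once per header, then search nonces until the output meets a leading-zeros target) can be sketched as follows. This is a toy: `toy_tens_hash` stands in for the real tensor hash (I use SHA-256 so the sketch is runnable), and the difficulty and function names are my assumptions.

```python
import hashlib

DIFFICULTY_BITS = 8  # toy target; the genesis block reportedly used 24

def toy_tens_hash(header: bytes, nonce: int) -> bytes:
    """Stand-in for the tensor hash. The real forward pass would be
    rounds of ternary matmuls with weights seeded from `header`."""
    return hashlib.sha256(header + nonce.to_bytes(4, 'little')).digest()

def leading_zero_bits(h: bytes) -> int:
    """Count leading zero bits of a 256-bit digest."""
    value = int.from_bytes(h, 'big')
    return 256 - value.bit_length()

def mine(header: bytes, max_tries: int = 1 << 16):
    # In the real scheme the weight matrices are derived once per
    # header, outside this loop; each try is then a single forward
    # (inference) pass, exactly like one Hashcash attempt.
    for nonce in range(max_tries):
        if leading_zero_bits(toy_tens_hash(header, nonce)) >= DIFFICULTY_BITS:
            return nonce
    return None

print(mine(b"header"))
```

Structurally this is the same search as Bitcoin's; only the per-try function changes.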
|
|
|
|
zeuner
Member

Offline
Activity: 240
Merit: 22
|
 |
February 19, 2025, 01:12:41 AM Merited by vapourminer (2) |
|
Quote from datrus: "The technical feasibility of designing a PoW that benefits from tensor operations [...] Potential challenges in aligning performance between mining and AI/ML tasks"
How do you plan to achieve the characteristic of being expensive/upscalable to compute (mine), yet cheap and quick to validate? Also, how do you avoid having vast amounts of data in the blockchain? A rather natural design in this context might be based on training model parameters on some input data towards a sufficiently low loss, so block validators only need to run inference and check. But for alignment with real AI/ML workloads, the input and model parameter data would be huge.
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 23, 2025, 06:52:24 AM |
|
Validation corresponds to a single inference pass in neural network terms, which is fast. Mining is much harder and involves millions or more inference passes (depending on difficulty). For example, the genesis block was mined using about 16M inference passes (24 leading-zeros difficulty), which took about 5 minutes at 4 TOPS (I used CoreML and the Apple Neural Engine of my MacBook M1 to mine with this script: https://github.com/nf-dj/robocoin/blob/main/test_pow/pow_coreml.py). The reason is that an inference pass can't be easily reversed (the mining problem corresponds to an ILP, integer linear programming, problem). More technical details about the PoW are explained on this page: https://github.com/nf-dj/robocoin/blob/main/tens_pow.md. This PoW doesn't imply there is more data on the blockchain (there is no more data on chain than for BTC), because the random matrix weights used for mining are derived from the block header using ChaCha20. (So only the block header is on chain, like for BTC, not the full matrix weights used for mining.) Let me know if you need more clarification; I'm especially interested in validating the security of the PoW. Thanks.
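The ~16M figure above matches the standard expected cost of a leading-zeros target: each inference pass behaves like a uniform 256-bit draw, so k leading zero bits succeed with probability 2^-k and take about 2^k passes on average. A one-line check:

```python
# Expected number of inference passes for a k-leading-zero-bits target.
difficulty_bits = 24
expected_passes = 2 ** difficulty_bits
print(expected_passes)  # 16777216, i.e. the ~16M passes reported
```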
|
|
|
|
tromp
Legendary
Offline
Activity: 1010
Merit: 1139
|
 |
February 23, 2025, 07:55:09 AM Merited by vapourminer (1) |
|
Quote from zeuner: "How did you plan to achieve the characteristic of being expensive/upscalable to compute (mine), and cheap and quick to validate?"
Quote from datrus: "Validation corresponds to a single inference pass in neural network terms. (which is fast) Mining is much harder and involves millions or more of inference passes (depending on difficulty)."
In other words, this is not a new proof-of-work algorithm. It's still the Hashcash PoW used in Bitcoin: it computes a deterministic value for any given block header using some hash function and compares it against a target value that's inversely proportional to the difficulty. The difference is only in the hash function used inside Hashcash; instead of Bitcoin's SHA256d, this proposal uses a tensor-based (i.e. matrix-multiplication-based) hash function. It's not an asymmetric PoW like the memory-hard Cuckoo Cycle or Equihash, where verification differs from (and is WAY faster than) a solution attempt.
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 23, 2025, 11:02:48 AM Last edit: February 23, 2025, 01:23:15 PM by datrus |
|
What is new is that it forces optimized miners to be good at neural net inference. (Some reasons to do so are explained here: https://github.com/nf-dj/robocoin; I'm not aware of other PoWs attempting this.) To do so, the hashing function is replaced by rounds of ternary matmuls, each round being NP-hard to reverse (https://github.com/nf-dj/robocoin/blob/main/tens_pow.md; verification is polynomial, a solution attempt is exponential). In the future, imagine robots with powerful NPUs mining during their sleep, haha (more "smartness" TOPS == more mining power). (They would have no chance mining SHA256d or memory-hard PoWs against ASICs or even small FPGAs. With this PoW, optimizing a miner == optimizing NN inference, which is arguably a "useful" type of compute, compared to the "useless" compute of optimizing SHA rounds or random large memory accesses.)
|
|
|
|
tromp
Legendary
Offline
Activity: 1010
Merit: 1139
|
 |
February 23, 2025, 06:15:40 PM |
|
verification is polynomial, solution attempt is exponential
Wrong; a solution attempt is exactly one iteration of this loop in GenerateBlock [1]:

    while (max_tries > 0 && block.nNonce < std::numeric_limits<uint32_t>::max() &&
           !CheckProofOfWork(block.GetPoWHashPrecomputed(ctx), block.nBits, chainman.GetConsensus()) &&
           !chainman.m_interrupt) {
        ++block.nNonce;
        --max_tries;
    }

which computes a single tens_hash, just as verification does. The only thing potentially exponential is doing it max_tries times.
[1] https://github.com/nf-dj/robocoin/blob/main/src/rpc/mining.cpp#L146-L151
|
|
|
|
datrus (OP)
Newbie
Offline
Activity: 9
Merit: 3
|
 |
February 24, 2025, 02:23:27 AM Last edit: February 24, 2025, 02:48:12 AM by datrus |
|
No. Here is an explanation of that part of the code. This code is from original Bitcoin (https://github.com/bitcoin/bitcoin/blob/master/src/rpc/mining.cpp):

    while (max_tries > 0 && block.nNonce < std::numeric_limits<uint32_t>::max() &&
           !CheckProofOfWork(block.GetHash(), block.nBits, chainman.GetConsensus()) &&
           !chainman.m_interrupt) {
        ++block.nNonce;
        --max_tries;
    }

What it does is iterate on the nonce to find one that satisfies the PoW; this takes exponential time (relative to the difficulty bits), since it's brute-forcing the nonce space. This is the code in my fork (robocoin) that replaces the PoW (https://github.com/nf-dj/robocoin/blame/main/src/rpc/mining.cpp#L146-L151):

    uint256 seed = block.GetHash();
    TensHashContext* ctx = tens_hash_init(seed.begin());
    if (!ctx) {
        return false;
    }
    while (max_tries > 0 && block.nNonce < std::numeric_limits<uint32_t>::max() &&
           !CheckProofOfWork(block.GetPoWHashPrecomputed(ctx), block.nBits, chainman.GetConsensus()) &&
           !chainman.m_interrupt) {
        ++block.nNonce;
        --max_tries;
    }

The only difference in this part of the code is the call to tens_hash_init before the loop, which is needed to initialize the random ternary matrices used during the PoW. (It would not be efficient to allocate and initialize them inside the PoW loop; the matrices are seeded from the block header hash and remain constant during the mining loop.) This is also exponential, the same as in BTC, because it brute-forces nonces in the same way. The difference is that the forward pass is neural network inference (rounds of ternary matmuls) instead of rounds of SHA. The CPU implementation of the hash used in the PoW loop is here: https://github.com/nf-dj/robocoin/blob/main/src/crypto/tens_pow/tens_hash.cpp

Also note that this part just makes the bitcoind/robocoind node able to mine (so you can mine with bitcoin-cli, etc.), but this miner implementation is not efficient because it only uses the CPU. I also implemented more optimized miners using the GPU and NPU (they keep the matrix/neural net weights in NPU/GPU memory, use inference batching, etc.):
https://github.com/nf-dj/robocoin/blob/main/test_pow/pow_coreml.py
https://github.com/nf-dj/robocoin/blob/main/test_pow/pow_pytorch.py
These use CoreML and PyTorch respectively to mine much faster (since the bottleneck of the PoW is neural net inference) and connect to the node using RPC. The CoreML version is needed to make use of the ANE (Apple Neural Engine) on Macs, since PyTorch only supports GPU/Metal.

In Bitcoin mining, the bottleneck is forward passes of SHA-256 rounds; in my experimental fork, I'm trying to make minimal changes so that the bottleneck is forward passes of neural net inference (because this is arguably more "useful" work for miners to be optimized at). I hope this clarifies why the mining loop in GenerateBlock has exponential runtime relative to the target difficulty, just as in the BTC case.
|
|
|
|
|