After much reading, still not clear on how a miner knows what transactions it should attempt to confirm. I know this:
* Transactions are broadcast.
* Miners queue them in a list of "pending" transactions.
So far, you've got an accurate understanding.
* At some point, a miner will take that list of pending transactions, create a hash, and then attempt to solve the hashing puzzle as proof-of-work.
Not exactly. You're close, but you're misunderstanding some of the finer details.
The miner (or pool) chooses which transactions it wants to confirm from that list. It picks an order for them and computes a "merkle root", a hash that commits to the chosen transactions and their order. That merkle root, along with some other important information (such as the hash of the previous block and a timestamp), is used to build a header for the block. The miner then hashes the header (using a double SHA256 hash) and checks whether the resulting hash, read as a number, is lower than the current target (the threshold that the current difficulty corresponds to). If it is not, the miner changes something in the header (typically a value called a "nonce") and hashes it again. This repeats until the miner either receives a valid block from the network or finds a hash that is lower than the target. When either of those two things happens, the miner starts over with a new block.
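To make those steps concrete, here is a minimal Python sketch of them: build a merkle root, assemble a header, and double-SHA256 it with successive nonces. The function names, the toy transaction IDs, and the very easy target are assumptions made for illustration; this is not how any particular mining software is written. The comparison uses `<=` because, as far as I know, a hash exactly equal to the target is also accepted, which makes no practical difference.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA256 applied twice, as used for block headers and merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids: list[bytes]) -> bytes:
    """Hash txids pairwise up to a single root; an odd entry is paired with itself."""
    if not txids:
        raise ValueError("a block needs at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def build_header(version: int, prev_hash: bytes, root: bytes,
                 timestamp: int, bits: int, nonce: int) -> bytes:
    """80-byte header: version, previous block hash, merkle root, time, bits, nonce."""
    return (struct.pack("<I", version) + prev_hash + root
            + struct.pack("<III", timestamp, bits, nonce))


def mine(version, prev_hash, root, timestamp, bits, target, max_nonce=2**32):
    """Try nonces in order; return (nonce, hash) on success, None if none work."""
    for nonce in range(max_nonce):
        digest = double_sha256(
            build_header(version, prev_hash, root, timestamp, bits, nonce))
        # The hash is interpreted as a little-endian 256-bit number.
        if int.from_bytes(digest, "little") <= target:
            return nonce, digest
    return None


if __name__ == "__main__":
    # Toy inputs (assumptions): fake txids and a target about 1 in 256 hashes meets.
    txids = [double_sha256(name) for name in (b"tx-a", b"tx-b", b"tx-c")]
    root = merkle_root(txids)
    # `bits` is only carried in the header here; the toy target is passed separately.
    found = mine(1, bytes(32), root, 1700000000, 0x1d00ffff, target=2**248)
    print("nonce found:", found[0] if found else None)
```

Reordering the transactions or swapping one out changes the merkle root, which changes the header, so every earlier hash attempt tells you nothing about the new header.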
The nonce is a 32-bit field, so it has 4294967296 (2^32) possible values. If the miner has tried all of them without success, it will typically either change the timestamp in the header or change something in the chosen transaction list and re-calculate the merkle root.
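A rough Python sketch of the first option (bumping the timestamp once the nonce range is used up) might look like the following. The function name, the `max_rolls` safety limit, and the tiny `nonce_space` used in the example are assumptions for illustration only. Changing the transaction list instead would mean recomputing the merkle root inside `prefix`.

```python
import hashlib
import struct

NONCE_SPACE = 2**32  # 4294967296 values fit in the 4-byte nonce field


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def search_with_time_rolling(prefix: bytes, bits: int, target: int,
                             start_time: int, nonce_space: int = NONCE_SPACE,
                             max_rolls: int = 1000):
    """Scan the nonce range; when it is exhausted, bump the timestamp and rescan.

    `prefix` stands for the first 68 header bytes (version, previous block
    hash, merkle root). Returns (timestamp, nonce) or None if max_rolls
    timestamps were tried without success.
    """
    for roll in range(max_rolls):
        timestamp = start_time + roll  # a new timestamp gives a fresh set of hashes
        for nonce in range(nonce_space):
            header = prefix + struct.pack("<III", timestamp, bits, nonce)
            if int.from_bytes(double_sha256(header), "little") <= target:
                return timestamp, nonce
    return None


if __name__ == "__main__":
    # Shrink the nonce space to 4 so that exhaustion actually happens in a demo.
    print(search_with_time_rolling(bytes(68), 0x1d00ffff, 2**248,
                                   start_time=1700000000, nonce_space=4))
```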
* Once solved, the miner would broadcast what they've done
Correct.
and other miners validate their work and add the block of transactions to the block chain.
Not just other miners. All full nodes on the network will validate the work and the entire contents of the block before they accept it or relay it to any other nodes or miners.
My questions about this are as follows:
1. Since the hashing puzzle is based on a hash of the transactions that are proposed to be included in the next block, the miner must determine when to start that work. The moment they start, they will, of course, be working from a block of N transactions. How do they know when to start? I mean, the number of transactions in each block is not fixed, right?
They start a new block as soon as they either solve the current block that they are working on, or as soon as they receive a valid block relayed from a peer. While they are working on the block they can, at any time they like, add more transactions and/or remove transactions from the list that they are attempting to confirm and then re-compute the merkle root.
2. While a miner is working on solving the hashing puzzle for the N transactions in the block they are working on, new (pending) transactions may arrive. So now the miner has N+1 transactions. Do they now just start over with a N+1 transaction block or does the miner ignore new transactions until (a) they solve the puzzle or (b) are notified that a new block has been added to the block chain?
Whichever they prefer. If the new transactions pay a higher fee per kilobyte than some of the transactions that they are currently working on, then they might want to remove some of the low fee transactions to make room for the higher paying transactions. There are no protocol rules about which valid transactions are included or when to change the list. Are you asking what most miners (or pools) choose to do?
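As one illustration of such a policy, here is a small Python sketch of a greedy, fee-rate-first selection. The `PendingTx` type, the function name, and the size limit are hypothetical, and the sketch ignores real-world complications such as transactions that depend on other unconfirmed transactions. It is one possible policy, not a rule of the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTx:
    """Hypothetical view of a pending transaction."""
    txid: str
    size_bytes: int
    fee_satoshis: int

    @property
    def fee_rate(self) -> float:
        """Fee paid per byte; a higher rate is more attractive to the miner."""
        return self.fee_satoshis / self.size_bytes


def choose_transactions(pending, max_block_bytes):
    """Greedy policy: take the highest fee-rate transactions that still fit."""
    chosen, used = [], 0
    for tx in sorted(pending, key=lambda t: t.fee_rate, reverse=True):
        if used + tx.size_bytes <= max_block_bytes:
            chosen.append(tx)
            used += tx.size_bytes
    return chosen


if __name__ == "__main__":
    pool = [PendingTx("a", 500, 500), PendingTx("b", 500, 5000),
            PendingTx("c", 600, 1200)]
    print([tx.txid for tx in choose_transactions(pool, 1000)])
    # A better-paying transaction arrives, so the miner re-runs the selection
    # and then recomputes the merkle root for the new list.
    pool.append(PendingTx("d", 400, 3200))
    print([tx.txid for tx in choose_transactions(pool, 1000)])
```

In this toy run the low-fee transaction "a" is chosen at first and then dropped once "d" arrives, which is the "remove some low fee transactions to make room" case described above.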
3. If new blocks added cause miners to start over, and if the number of transactions in a block is somewhat arbitrary, it seems like a miner would constantly be starting over as new transactions & blocks are broadcast, preventing them from ever getting any work done. I guess somebody, somewhere, is getting work done but just trying to view this from a miner's perspective.
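Starting over costs them nothing, because there is never any progress made on the work being done. Each hash attempt is independent of every attempt before it.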
As an analogy:
Think about the proof-of-work like rolling 10 six-sided dice. I tell you that the current difficulty is to roll 7 sixes in a single roll. You pick up all 10 dice and toss them. If there are 7 (or more) sixes, you've completed the proof-of-work and get to "win the prize". If there are not 7 (or more) sixes, then you pick up all the dice and throw them again. You repeat this process until either you roll 7 sixes in a single roll or you see that one of your competitors successfully rolls 7 (or more) sixes.
There is no "progress" being made. You could succeed on the first try, or it could take all day.
Once someone "wins" the "game" everyone starts over again.
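To put numbers on the dice game, this short Python sketch computes the chance of 7 or more sixes in one throw of 10 fair dice, and the average number of throws needed. The function names are my own, and fair, independent dice are assumed.

```python
from fractions import Fraction
from math import comb


def p_at_least(successes: int, dice: int, p_single=Fraction(1, 6)) -> Fraction:
    """Probability of at least `successes` sixes in one throw of `dice` dice."""
    return sum(comb(dice, k) * p_single**k * (1 - p_single)**(dice - k)
               for k in range(successes, dice + 1))


def p_win_within(throws: int, p_win: Fraction) -> Fraction:
    """Chance of at least one winning throw among `throws` independent throws."""
    return 1 - (1 - p_win)**throws


if __name__ == "__main__":
    p_win = p_at_least(7, 10)
    print("chance per throw:", p_win, "=", float(p_win))
    print("average throws needed:", float(1 / p_win))
    print("chance of a win within 1000 throws:", float(p_win_within(1000, p_win)))
```

Each throw wins with probability 337/1259712, or about 0.027%, so it takes roughly 3,738 throws on average. After 3,000 failed throws, the chance that the next one wins is still exactly 337/1259712. That is what "no progress" means here.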
In bitcoin, the hashing is like the rolling of the dice. There is no way to know what the result will be until after it has been computed. If the miner doesn't "win", then they just do it again (with a different input), and again, and again until either they win or someone else does. The result is so unpredictable that you can treat it as random for the purposes of a proof-of-work. "Winning" just means that you get to broadcast your block and that others will add it to their blockchain. Then everyone starts the "game" of calculating hashes again (with a new block).