The problem is that checking every new post for plagiarism is not practical, because the cost of checking one more post grows with every post already written. For example, if the forum has 100 posts, checking a new post against all of them costs 100 units. Once the forum has 1,000 posts, checking a single new post against all of them costs 1,000 units. Every post added to the forum raises the cost of checking the next one by one unit. This is obviously not sustainable.
Thanks for chiming in. Discussing these things is always interesting. You are talking about the time complexity of such a search and match algorithm.
Right. As the number of posts increases, so does the amount of time it takes to check one additional post.
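To put rough numbers on that, here is a minimal sketch of the cost model from the example above. It assumes one "unit" means one comparison between the new post and one existing post; the function names are made up for illustration.

```python
def comparisons_for_new_post(existing_count: int) -> int:
    """Naive check: the new post is compared once against every existing post."""
    return existing_count


def total_comparisons(n_posts: int) -> int:
    """Total comparisons if each post is checked on arrival against all earlier posts."""
    return sum(comparisons_for_new_post(k) for k in range(n_posts))


print(comparisons_for_new_post(100))   # 100
print(comparisons_for_new_post(1000))  # 1000
print(total_comparisons(1000))         # 499500
```

The cost of a single check grows linearly with the number of posts, so the total cost of checking every post as it arrives grows roughly with the square of the number of posts.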
You'd first need a master data set containing every possible 6-word snippet of text from all the existing posts (assuming someone is copying only from existing Bitcoin posts). This would then have to be compared with the set of snippets formed from every new post. While this could be done, I believe the space and memory requirements would be pretty huge.
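A rough sketch of the snippet idea described above might look like the following. Everything here is an assumption for illustration: the function names, the tokenizing rule (lowercase, letters, digits and apostrophes only), and the two sample posts. It is not how any forum actually does this.

```python
from __future__ import annotations

import re
from collections import defaultdict

SNIPPET_WORDS = 6  # snippet length taken from the description above


def snippets(text: str, n: int = SNIPPET_WORDS) -> set[tuple[str, ...]]:
    """Return every run of n consecutive words in the text (empty if the text is shorter)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(posts: dict[int, str]) -> dict[tuple[str, ...], set[int]]:
    """Master data: map each snippet to the ids of the existing posts containing it."""
    index: dict[tuple[str, ...], set[int]] = defaultdict(set)
    for post_id, text in posts.items():
        for snippet in snippets(text):
            index[snippet].add(post_id)
    return index


def matching_posts(new_text: str, index: dict[tuple[str, ...], set[int]]) -> dict[int, int]:
    """Count how many snippets of the new post appear in each existing post."""
    hits: dict[int, int] = defaultdict(int)
    for snippet in snippets(new_text):
        for post_id in index.get(snippet, ()):
            hits[post_id] += 1
    return dict(hits)


# Hypothetical sample data.
existing = {
    1: "Bitcoin is a peer to peer electronic cash system that needs no trusted third party",
    2: "I think the price will go up next week",
}
index = build_index(existing)
new_post = "As everyone knows, bitcoin is a peer to peer electronic cash system, and I like it."
print(matching_posts(new_post, index))  # {1: 4}
```

With a lookup table like this, checking a new post costs about one lookup per snippet in that post rather than one comparison per existing post. The price is storage: a post of w words adds up to w - 5 snippets, so the table holds roughly as many entries as there are words on the whole forum, which is where the space concern comes from.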
You are describing one way in which all current posts could be checked for plagiarism (at least plagiarism by copying other users' posts).
What you describe is missing two things. Existing posts would not be checked for plagiarism, and if a post is written in the future and is subsequently plagiarized, the setup you describe would not catch it.