..what happened to OP? Did he give up on his billion-dollar idea?
No, I am still here. A while ago this whole board crashed when some kind of hack happened, and I was devastated that my whole thread was lost. I couldn't find my thread at all; the whole site was down. Recently, while talking to some others about this wonderful but lost thread where I got schooled by all of you for trying to imagine something awesome, I did an internet search and by coincidence this very thread showed up in the search results! I couldn't believe it when I opened the thread and found this website and my thread again. Now I'm copying out all of this text and saving it for future reference (well, except for the many, many flames I got; who needs to remember that!?). Your explanations were invaluable in helping me understand.
However, some of you are just too ardent in believing that this is impossible and cannot be done, and I still believe that encoding can, if done correctly, result in a significantly smaller filesize than the original file. And I've devised a method of showing it might work. Essentially, it works like this:
Layer 0:
Input any file and convert its contents, whatever the format, into binary. Now grab the binary data in chunks of 4 bits.
Layers 1 & 2:
http://imageshack.com/a/img46/6823/b4g0.jpg
Convert using the table above. Initially, yes, this does double the size of the file (since we'll be taking ONE 8-bit character and turning it into TWO 8-bit characters), but it must be done to begin the process of formatting into a recognizable structure. Each layer has its own engine and rules that must be followed.
For Layer 1, all the binary is converted into ASCII characters (inside a text document): the letters small "a" to large "H", that is, aAbBcCdDeEfFgGhH. Once ALL of the binary data has been converted into these a-to-H letters (small and large), it is saved into a text file.
Layer 2 (which is actually a group of sub-layers) then converts all the Layer 1 "text" data out of Layer 1 form into Layer 2 form, so all the data is in an entirely new form, small "i" to large "Z", going iIjJkK .. to .. zZ. It's too hard to explain it all right here, so I'll move on for now. The process is rather complex and lengthy, but it must be done before going on to Layer 3.
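To make Layers 0 and 1 concrete, here is a minimal Python sketch of the 4-bit-chunk-to-letter step. It only assumes what is written above: sixteen letters, aAbBcCdDeEfFgGhH, one per 4-bit chunk. Which chunk maps to which letter is defined in the image table, so the order used here (0 becomes "a", 1 becomes "A", and so on) is an assumption, and the names `LAYER1`, `to_layer1` and `from_layer1` are made up for illustration.

```python
# Layer 1 alphabet as listed in the post: 16 letters, one per 4-bit value.
# ASSUMPTION: value 0 -> 'a', 1 -> 'A', ..., 15 -> 'H'. The real assignment
# lives in the image table and may differ.
LAYER1 = "aAbBcCdDeEfFgGhH"


def to_layer1(data: bytes) -> str:
    """Turn every byte into two Layer 1 letters (upper 4 bits first)."""
    letters = []
    for byte in data:
        letters.append(LAYER1[byte >> 4])    # upper 4 bits
        letters.append(LAYER1[byte & 0x0F])  # lower 4 bits
    return "".join(letters)


def from_layer1(text: str) -> bytes:
    """Reverse of to_layer1: every two letters become one byte again."""
    values = [LAYER1.index(ch) for ch in text]
    return bytes((hi << 4) | lo for hi, lo in zip(values[0::2], values[1::2]))


sample = b"Hi"
encoded = to_layer1(sample)
print(encoded)                        # the Layer 1 text for the two input bytes
print(len(sample), len(encoded))      # the letter form is twice as long
print(from_layer1(encoded) == sample)
```

Every input byte produces exactly two letters, which is the doubling in size mentioned above.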
Layer 3: Another grouping of layers dedicated to changing all the i-Z data from Layer 2 into the numbers and symbols used in ASCII.
http://imageshack.com/a/img189/6539/9j3f.jpg
This complex system slowly and painstakingly changes all Layer 2 data into Layer 3 data, so that all the data is now symbols and numbers.
Now at this point, you have a 3-point system for changing all the data interchangeably back and forth, but what does that do? How can that encode (and shrink) data?
Here is how: Imagine a crossword puzzle 20 spaces wide, with the previous data sets sorted into the cells one after another. The 21st item drops down and back to the extreme left cell, so it sits under the first item, and every 20 items after that start a new row in the same way. Now, the software looks for patterns according to the Layer Tables I've linked above. They are not complete tables; as I've said, the tables are multi-layered to account for all of the variables in the data sets. Let's say you had this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
E g A D f D B g H a c A C d c F d E H A
D B b C F g d C d a F a a C F c c B c h
f b a D A H g h h C b H d c C d G F f E
3 rows of 20 columns. Look at Row 1, Column 10, and then look directly below it at Row 2, Column 10. Both are small "a"s. They can be encoded by exchanging Row 1, Column 10 with a reference from Table 2 above. See Table 2, where it says "aa" and below it there is an "i"? So then this happens:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
E g A D f D B g H i c A C d c F d E H A
D B b C F g d C d F a a C F c c B c h f
b a D A H g h h C b H d c C d G F f E
The 2nd-row "a" below the new "i" is deleted, and the entire sequence following it is shifted left one space through the rest of the table.
When all of the data that can be encoded is removed, you begin to see the compression (or encoding, whatever you would call it).
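The single replacement described above can be sketched in Python as follows. This is only an illustration under assumptions: the one pair taken from the post is "aa" becoming "i"; the other entries in `PAIR_SYMBOL` are placeholders standing in for the image table, and `to_rows` and `replace_pair_at` are made-up names. The comparison is between a cell and the cell directly below it, 20 positions later in the stream.

```python
WIDTH = 20  # columns in the "crossword" grid, as in the post

# ASSUMPTION: only "aa" -> "i" comes from the post. The remaining pairs are
# placeholders for whatever the image table actually defines.
PAIR_SYMBOL = dict(zip("aAbBcCdDeEfFgGhH", "iIjJkKlLmMnNoOpP"))


def to_rows(stream, width=WIDTH):
    """Lay the flat stream out as rows of `width` cells."""
    return ["".join(stream[i:i + width]) for i in range(0, len(stream), width)]


def replace_pair_at(stream, pos, width=WIDTH):
    """Merge stream[pos] with the cell directly below it, if they match."""
    below = pos + width
    if (below < len(stream)
            and stream[pos] == stream[below]
            and stream[pos] in PAIR_SYMBOL):
        stream[pos] = PAIR_SYMBOL[stream[pos]]  # e.g. "a" becomes "i"
        del stream[below]  # everything after it shifts left by one cell
        return True
    return False


rows = [
    "E g A D f D B g H a c A C d c F d E H A",
    "D B b C F g d C d a F a a C F c c B c h",
    "f b a D A H g h h C b H d c C d G F f E",
]
stream = [cell for row in rows for cell in row.split()]

replace_pair_at(stream, 9)  # Row 1, Column 10 is zero-based index 9
for row in to_rows(stream):
    print(" ".join(row))
print(len(stream))  # one cell shorter than the 60 we started with
```

Only the cells after the deleted "a" move; everything before it, including the start of Row 2, stays where it was.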
The whole thing is based on the ASCII character table, which could be used to reference itself. When the program is running in reverse mode (extracting the data), it comes to the "i" in Row 1, Column 10 and reads the "i" as "aa": one "a" in the space where the "i" is and one "a" in Row 2, Column 10, directly under it. First it shifts the rest of the table forward by one to make a space for the "a" to be added back in there.
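Reverse mode can be sketched the same way. The Python below is an illustration, not the finished engine: `decode` and `BASE_LETTER` are made-up names, and every pair except "i" standing for "aa" is a placeholder for the image table. It also assumes the encoder worked through the stream from left to right, which is why the decoder walks from right to left and undoes the latest merge first.

```python
WIDTH = 20  # columns in the grid, as in the post

# ASSUMPTION: only "i" -> "aa" comes from the post; the rest are placeholders.
PAIR_SYMBOL = dict(zip("aAbBcCdDeEfFgGhH", "iIjJkKlLmMnNoOpP"))
BASE_LETTER = {symbol: letter for letter, symbol in PAIR_SYMBOL.items()}


def decode(encoded, width=WIDTH):
    """Expand every pair symbol back into two letters, one above the other."""
    stream = list(encoded)
    # Walk backwards: inserts only land to the right of the current cell,
    # so cells that are still waiting to be checked never move.
    for pos in range(len(stream) - 1, -1, -1):
        letter = BASE_LETTER.get(stream[pos])
        if letter is not None:
            stream[pos] = letter                # the upper cell of the pair
            stream.insert(pos + width, letter)  # reopen the cell directly below
    return "".join(stream)


original_rows = [
    "E g A D f D B g H a c A C d c F d E H A",
    "D B b C F g d C d a F a a C F c c B c h",
    "f b a D A H g h h C b H d c C d G F f E",
]
original = "".join("".join(row.split()) for row in original_rows)

# Recreate the encoded example by hand: merge the two "a"s in Column 10.
cells = list(original)
cells[9] = "i"   # Row 1, Column 10 becomes the pair symbol
del cells[29]    # the "a" directly below it is removed
encoded = "".join(cells)

decoded = decode(encoded)
print(len(encoded), len(decoded))
print(decoded == original)
```

Inserting the letter at position + 20 is the "shift forward by one to make a space" step: everything from that cell onward moves one place to the right.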
This is a really tiny explanation, but it should be enough for the majority of you to see if this logic is feasible. If you are changing two pieces of data into one based on how they spatially align in a crossword puzzle, then you can see how using spatial alignment lets you throw out one piece of data and yet still be able to recover it.
It's like having a bunch of people, with random normal names, stand in a line. And then you say if there are two "Jims" standing in the crowd, we can take them both out and replace them with a "Jimovious", a name no one else will have in that group. Now we have shrunk the number of bodies in the crowd by 1. But if asked to reassemble the original crowd, we just say throw Jimovious out and replace him with 2 Jims. The engine knows where to put them in the crossword puzzle, because it would not have encoded them in the first place unless they had been aligned one directly above the other. So the engine knows to read two lines at a time, comparing IF Row1Caret=Row2Caret then DO REPLACER(). Something like that. Do you get the idea?
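The "IF Row1Caret=Row2Caret then DO REPLACER()" loop could look roughly like this in Python. It is a sketch under assumptions: `encode_pass` is a made-up name, the comparison is case-sensitive, the scan goes left to right over the flat stream, and every pair except "aa" becoming "i" is a placeholder for the image table.

```python
WIDTH = 20  # columns in the grid, as in the post

# ASSUMPTION: only "aa" -> "i" comes from the post; the rest are placeholders.
PAIR_SYMBOL = dict(zip("aAbBcCdDeEfFgGhH", "iIjJkKlLmMnNoOpP"))


def encode_pass(text, width=WIDTH):
    """One left-to-right pass that merges every matching vertical pair."""
    stream = list(text)
    merges = 0
    pos = 0
    while pos + width < len(stream):
        upper, lower = stream[pos], stream[pos + width]
        if upper == lower and upper in PAIR_SYMBOL:  # IF Row1Caret = Row2Caret
            stream[pos] = PAIR_SYMBOL[upper]         # DO REPLACER()
            del stream[pos + width]                  # later cells shift left
            merges += 1
        pos += 1
    return "".join(stream), merges


rows = [
    "E g A D f D B g H a c A C d c F d E H A",
    "D B b C F g d C d a F a a C F c c B c h",
    "f b a D A H g h h C b H d c C d G F f E",
]
text = "".join("".join(row.split()) for row in rows)

encoded, merges = encode_pass(text)
print(merges, len(text), len(encoded))
```

Each merge removes exactly one character, so the output is always `len(text) - merges` characters long. Measured against the original file, that saving comes off the Layer 1 text, which is already twice the size of the input.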
Please friends ... let me know your thoughts.