Bitcoin Forum
December 15, 2024, 10:41:41 AM *
Pages: « 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 [17]  All
Author Topic: Crypto Compression Concept Worth Big Money - I Did It!  (Read 13900 times)
B(asic)Miner (OP)
Newbie
*
Offline Offline

Activity: 28
Merit: 0


View Profile
December 11, 2013, 11:32:22 AM
 #321


And you have inadvertently undermined your own point.
"Jimovious" takes more space than "Jim" twice.



No, you just got confused as to the label.  Two "aa" become "i"   This is a 50% reduction.  I purposely made the Jim name larger as Jimovious to point this fact out and you fell into my trap.  But the point here is that in Layer 2, we will ONLY have labels that refer to combinations of 2 which came from Table 1, so they are unique like Jimovious, it wasn't about the size of the name I assigned it, it was about it being unique.

In Layer 2, "i" only means "aa" when extracted out.  And it tells the software where exactly the "aa"s go due to the Crossword Grid alignment.

And yes, you can keep sending the data back into itself recursively (which I know seems impossible) as long as you carefully make sure that there are absolutely NO mistakes in the rulesets that refer the 2 pieces down to the 1 unique piece in that Layer.  It's about the spatial arrangement.  The way space-time exists in 3-D space physically, and in the time-space 4D timespace as slices.   Space-time has physical space separated by time.  But Time-space has time locations separated by space.  So what we are doing here is attempting to put the data into space.  It all still exists, it's just spread out in time and only appears to be shrunken down.

The point is that you must have 1 unique character that replaces 2 non-unique characters.  I'm still working on making sure I can do that 100%.  The interesting thing is that if you look at Table 1 above, referring to converting all binary, there are only 16 possible combinations of 4-bit sequences.  So all binary data can be converted into ASCII first, then arranged into a Crossword Grid in certain chunk sizes, then crunched Layer by Layer to practically nothing through incursion (or is it called recursion?)
B(asic)Miner (OP)
Newbie
*
Offline Offline

Activity: 28
Merit: 0


View Profile
December 11, 2013, 11:38:44 AM
 #322

OP just let it go mate, move on.

I'll tell you what, I will decide when to quit.  And you can decide when to quit telling me to quit.  And we will see who quits first.  Deal?
B(asic)Miner (OP)
Newbie
*
Offline Offline

Activity: 28
Merit: 0


View Profile
December 11, 2013, 12:37:46 PM
 #323

Hello Everyone,

The Achievement:
I have made an amazing discovery and spent a year trying to talk everyone I know into helping me with my solution for compressing 99.8% of all data out of a file, leaving only a basic crypto key containing the thread of how to re-create the entire file from scratch.
-snip-

Patent / Open Source or it didnt happen.

What you describe sounds like a very primitive compression method with a bit of "the engine just knows" magic.

Here is the whole thing boiled down as succinctly as I can make it.   There's no magic here.

Import data, convert to binary.  Save.  Re-import binary, convert to Layer 1, letters a-H (aAbBcC... etc) according to the Table above.  The file will double in size.  Save.  Re-import and begin Layer 2.  Encoding begins here.  Encode the 1st chunk of 1024K using Crossword Grid alignment, 20 spaces per row.  Fill the Array with data from Layer 1.  Now search the data from the current row, 2 rows at a time simultaneously, 20 spaces apart (essentially on top of each other in the crossword puzzle grid).  Let's call that the CWG from now on.  When a match is found, replace the topmost row's match with a unique identifier that means the same thing to the 2nd Layering engine as the match found.  For example, "aa" is replaced by "i".  Now delete the bottommost match, leaving an empty space (the compression/encoding).  Now shift the whole array to the right, displacing that empty space, which now becomes a zero at Row 1 Column 1.  Now continue forward.  Find all matches until the last line (the last 20 cells) is reached.  At the last line, no more compression can be done because there isn't a line under it to compare to.  What we are left with here is essentially a boiled down key.  Nothing more can be done with this chunk.  That key is saved inside the file we are building as such:

1004(0)_GbcDeEafFBAAcbeEBDfga_6(0)

Where the first block above is the topmost last line to the halfway point of the line and the 2ndmost line is the bottommost line to where it ends halfway (20 spaces total combined) and the final part is a message to software that 1004 zeros precede those two keys.  Now the engine can get rid of the 0's, leaving that small chunk to retrieve all the data later.

It would do so like this .... 

0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
GbcDeEafFBAcbeEBDfga000000

Now the engine counts how many zeros are in the block before the first actual piece of data to know how many iterations it had done to reach that final sequence.  It counts the zeros, here we see 120 zeros total.  The engine is told the key goes on the last line plus the number of zeros (empty spaces) that were left over. Now having figured out how many iterations to start from backwards, it begins comparing data back out, starting by re-ordering the entire sequence to the left.  So the key would actually look like this:

0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000GbcDeE
afFBAcbeEBDfga000000

Which we can easily see the last line because the G and the a are not overtop of each other, meaning this is indeed the true last line.  Depending on how good the compression is, the last line can occur anywhere in the block, as such :

0000000000000000000
0000000000000000000
0000000GbcDeEafFBAc
beEBDfga00000000000
0000000000000000000
0000000000000000000
0000000000000000000

As long as the block is totally intact and none of the pieces overlap, it is complete.  The reason we need to know how many empty spaces were left at the end is so we can separate how many iterations occurred from how many empty spaces were left, since not all the zeros here mean iterations.

I hope you can see this as clearly as I see it in my head.  It's efficient and would totally work.  I hope you will be able to see that by studying this.
Bitcoin Oz
Hero Member
*****
Offline Offline

Activity: 686
Merit: 500


Wat


View Profile WWW
December 11, 2013, 01:17:17 PM
 #324

So you invented the TARDIS from Doctor Who? lol

BurtW
Legendary
*
Offline Offline

Activity: 2646
Merit: 1138

All paid signature campaigns should be banned.


View Profile WWW
December 11, 2013, 01:26:23 PM
 #325

It looks like you are still having a lot of fun thinking about this.  Good for you.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
murraypaul
Sr. Member
****
Offline Offline

Activity: 476
Merit: 250


View Profile
December 11, 2013, 01:43:57 PM
 #326

Quote from: B(asic)Miner

And you have inadvertently undermined your own point.
"Jimovious" takes more space than "Jim" twice.



No, you just got confused as to the label.  Two "aa" become "i"   This is a 50% reduction.  I purposely made the Jim name larger as Jimovious to point this fact out and you fell into my trap.  But the point here is that in Layer 2, we will ONLY have labels that refer to combinations of 2 which came from Table 1, so they are unique like Jimovious, it wasn't about the size of the name I assigned it, it was about it being unique.
If you have X possible input characters then you have X^2 possible combinations.
So you have reduced the number of characters, but increased the storage required for each one.
There is no overall saving.

You need to accept the basic problem:
X distinct input files must generate X distinct output files, otherwise you cannot retrieve the original files.
Do you agree or disagree with this?
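
To put numbers on the point about X^2 combinations, here is a small Python sketch (an illustration, not part of the original post). It assumes the 16-letter Layer 1 alphabet described earlier in the thread and a hypothetical message of 100,000 Layer 1 symbols; the helper name `bits_per_symbol` is made up for this example.

```python
import math


def bits_per_symbol(alphabet_size: int) -> int:
    """Minimum whole bits needed to tell apart `alphabet_size` distinct symbols."""
    return math.ceil(math.log2(alphabet_size))


layer1_alphabet = 16                    # the 16 letters a..H, one per 4-bit group
pair_alphabet = layer1_alphabet ** 2    # every ordered pair needs its own unique label

n_symbols = 100_000                     # hypothetical Layer 1 message length
before = n_symbols * bits_per_symbol(layer1_alphabet)
after = (n_symbols // 2) * bits_per_symbol(pair_alphabet)

print(pair_alphabet)    # 256
print(before, after)    # 400000 400000
```

Half as many symbols, each needing twice as many bits: the total is unchanged.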

BTC: 16TgAGdiTSsTWSsBDphebNJCFr1NT78xFW
SRC: scefi1XMhq91n3oF5FrE3HqddVvvCZP9KB
kanus1113
Sr. Member
****
Offline Offline

Activity: 452
Merit: 250


View Profile
December 11, 2013, 01:58:53 PM
 #327

I've thought about similar methods of compression... In the end, the issue isn't whether you can compress data down to almost nothing, it's how much time it takes to decompress the data. Compressing data down is only part of the problem.
B(asic)Miner (OP)
Newbie
*
Offline Offline

Activity: 28
Merit: 0


View Profile
December 11, 2013, 02:12:32 PM
 #328

Quote from: B(asic)Miner

And you have inadvertently undermined your own point.
"Jimovious" takes more space than "Jim" twice.



No, you just got confused as to the label.  Two "aa" become "i"   This is a 50% reduction.  I purposely made the Jim name larger as Jimovious to point this fact out and you fell into my trap.  But the point here is that in Layer 2, we will ONLY have labels that refer to combinations of 2 which came from Table 1, so they are unique like Jimovious, it wasn't about the size of the name I assigned it, it was about it being unique.
If you have X possible input characters then you have X^2 possible combinations.
So you have reduced the number of characters, but increased the storage required for each one.
There is no overall saving.

You need to accept the basic problem:
X distinct input files must generate X distinct output files, otherwise you cannot retrieve the original files.
Do you agree or disagree with this?



I don't believe that I am able, at this time, to accept your basic problem (premise).  I would be like the Wright Brothers agreeing with people who said flight was altogether an impossible thing meant for birds and if God had wanted Man to fly, he would have given him wings.  

If I were to accept your basic premise, I would have to say that one word cannot possibly mean anything more than just one thing.  Therefore, the word "bow" cannot mean to bend over at the waist, when it clearly means a piece of ceremonial cloth tied at the neck.  And there goes the other half of the "____ and arrows" equation.  One input can in fact stand in place of many outputs, it's all based on relative meaning.  Space.  Whether that space is in our imaginations or in the real world.  We can clearly simultaneously understand that "bow" means bow and arrow, bow tie, and take a bow.  One input many outputs.  There are millions of guys named "Smith"  ... by your logic, there cannot be more than one Smith, since 1 input = 1 possible output.

By using tables, one can generate meanings for given sets of data.  I can say in my Table that "aa" = "i"  ... now let's say there are 50,000 matches in a 100,000-character piece of text.  That implies 100,000 total pieces of data that can be compared, being reduced from 2 pieces to 1 piece.  I save the data out using software based on this system, and it's now half the size it once was.   The filesize is now 50,000 characters in bytes (whatever that amounts to) but was 100,000 characters in bytes.  How can you tell me that it was not cut in half by my method using logic, when it's now clearly half the size it was (hypothetically)!?

You say we cannot take 2 bits and represent them with only 1 bit.  Perhaps that's not true.  As things stand, a bit is merely a single register either on or off.  But some hardcore engineers have demonstrated that a gate can float halfway between open and closed, creating a third state, a fuzzy state.  So what if we designed a voltage subroutine into the programming of the compression/encoding scheme that was able to read the voltage level of a bit?  Now we see a bit can float more to the left or more to the right.  Something like this:   0    |  /  1   being 30/70    or    0  \ |   1  being 70/30.   So now we can say if the 2 bits are 00, they are 100% right.  But if the bits are 01, they are 30/70% in favor of the right.   And if they are 10, they are 70/30% in favor of the left.  And if they are 11, they are 100% left.  Thus, that one bit now has 4 states to represent 2 bits in place of 1 bit.

It's still one bit, it still only holds one bit of info.  But we found a way to read that bit differently (bow-arrow, bow tie, take a bow, bow of the ship) despite the bit holding no extra space.  We used another method to change what the bit could do.  We created a reference where no reference previously existed.  That's all I'm trying to do with this new theory.


kanus1113
Sr. Member
****
Offline Offline

Activity: 452
Merit: 250


View Profile
December 11, 2013, 02:37:49 PM
 #329

I also came up with a solution for the decompression time, so here is my solution and the next problem.

This may not be the best explanation, but I'll try. Imagine a user downloads the tiny compressed file that represents 10 GB of data. Now imagine writing that 10 GB of data to a hard drive hundreds of times in order to get the true 10 GB of data. During that time you're using the full resources of your computer or device, rendering it somewhat useless.

So of course I couldn't get that far and not come up with a solution. So what if in the new world we had devices that intercept the information and decompress 1 layer at a time, with each layer of decompression streaming its output to the next layer of decompression? Now we are no longer compressing files, but internet data as it enters your home, making it possible to download the 10 GB file as if it's much smaller.

Well, the problem here is... The tech has already been invented and utilized.
murraypaul
Sr. Member
****
Offline Offline

Activity: 476
Merit: 250


View Profile
December 11, 2013, 03:09:36 PM
Last edit: December 11, 2013, 04:25:56 PM by murraypaul
 #330

You need to accept the basic problem:
X distinct input files must generate X distinct output files, otherwise you cannot retrieve the original files.
Do you agree or disagree with this?

If I were to accept your basic premise, I would have to say that one word cannot possibly mean anything more than just one thing.  Therefore, the word "bow" cannot mean to bend over at the waist, when it clearly means a piece of ceremonial cloth tied at the neck.  And there goes the other half of the "____ and arrows" equation.  One input can in fact stand in place of many outputs, its all based on relative meaning.  Space.  Whether that space is in our imaginations or in the real world.  We can clearly simultaneously understand that "bow" means bow and arrow, bow tie, and take a bow.  One input many outputs.  There are millions of guys named "Smith"  ... by your logic, there cannot be more than one Smith, since 1 input = 1 possible output.  

You are almost there.
What you have shown is that "Smith" is not a unique compression output for a person whose surname is Smith, because many many people would also be compressed to the same name. So Person->Surname is a lossy compression method, because it is impossible to retrieve the original person from just the surname.
Similarly, natural language is a lossy compression of the objects or ideas it represents. Given just the word "bow", I cannot know if you meant bow and arrow or take a bow.
Lossy compression schemes can reduce sizes for all input files, but in doing so they lose the ability to exactly reconstruct the input file from the output file.
MP3/AAC for music and XVID/MP4 for video are common examples of lossy compression schemes.

What you have claimed to be able to do is create a lossless compression scheme that can compress [transform to less than their original size] all input files. That is simply not possible.

There are 256 possible different one byte input files.
If there were less than 256 different output files, then two input files would be mapped to the same output file, and it would not be possible when decompressing that output file to know which of the two input files was meant. Information would have been lost, so this would be a lossy compression scheme.
So our 256 input files must map to 256 different output files. Each file must be at least one byte long, so the total size of all files must be at least 256 bytes, so no space has been saved.

There are 256x256 possible different two byte input files.
If one of them was to map to a single byte output file, it would have to be the same as one of the output files created by compressing one of our single byte input files, which would mean that we could not differentiate between those two files when decompressing, so this would be a lossy compression scheme again.
So there must be 256x256 output files, each of which is at least two bytes long. So no space has been saved.

By induction, the same proof shows that for any input file size, the total set of all files of that size or smaller must map to a set of files of at least the same total size. Hence the average compressed size of any one file must be at least as large as the size of the file itself.
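
The counting behind this argument can be checked directly. The sketch below is an illustration only (the function names are made up for it). It compares the number of files of exactly n bytes with the number of all strictly shorter files, counting the empty file as well, which is slightly more generous than the argument above.

```python
def count_files_of_length(n_bytes: int) -> int:
    """Number of distinct files that are exactly n_bytes long."""
    return 256 ** n_bytes


def count_files_shorter_than(n_bytes: int) -> int:
    """Number of distinct files with length 0 .. n_bytes - 1 (empty file included)."""
    return sum(256 ** k for k in range(n_bytes))


for n in (1, 2, 3):
    inputs = count_files_of_length(n)
    shorter = count_files_shorter_than(n)
    # There are always more n-byte inputs than shorter outputs to map them to,
    # so a lossless scheme cannot shrink every n-byte file.
    print(n, inputs, shorter, inputs > shorter)
```

For n = 2 there are 65,536 inputs but only 257 shorter files, so at least 65,279 two-byte files cannot get smaller under any lossless scheme.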

BTC: 16TgAGdiTSsTWSsBDphebNJCFr1NT78xFW
SRC: scefi1XMhq91n3oF5FrE3HqddVvvCZP9KB
Dabs
Legendary
*
Offline Offline

Activity: 3416
Merit: 1912


The Concierge of Crypto


View Profile
December 11, 2013, 03:21:47 PM
 #331

I thought about this, or very similar to this, several years back. I could never get it to work after throwing everything at it. So I put it aside and worked on solving Fermat's last theorem. When I was close to a solution, someone beat me and published it.

I went back to my ultimate compression algorithm, and the best I could come up with that actually worked was fractal related and only on specific kinds of data.

I don't want to discourage you, but so far, your table method does not make sense to me, or I can't see how it can work, at least on normal data.

Try it on the blockchain and see if you can losslessly compress 14 gigabytes to something significantly smaller. Give it away for free. You'll get rich somehow.

BurtW
Legendary
*
Offline Offline

Activity: 2646
Merit: 1138

All paid signature campaigns should be banned.


View Profile WWW
December 11, 2013, 04:16:44 PM
 #332

He will be enriched by the experience!

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
shorena
Copper Member
Legendary
*
Offline Offline

Activity: 1498
Merit: 1540


No I dont escrow anymore.


View Profile
December 11, 2013, 09:38:25 PM
Last edit: December 11, 2013, 10:05:06 PM by shorena
 #333

Hello Everyone,

The Achievement:
I have made an amazing discovery and spent a year trying to talk everyone I know into helping me with my solution for compressing 99.8% of all data out of a file, leaving only a basic crypto key containing the thread of how to re-create the entire file from scratch.
-snip-

Patent / Open Source or it didnt happen.

What you describe sounds like a very primitive compression method with a bit of "the engine just knows" magic.

Here is the whole thing boiled down as succinctly as I can make it.   There's no magic here.

Import data, convert to binary.  Save.  Re-import binary, convert to Layer 1, letters a-H (aAbBcC... etc) according to the Table above.  The file will double in size.
> sure, double the size, nothing complicated or magical yet, but this probably has to be done, so let's go on.

 Save.  Re-import and begin Layer 2.  Encoding begins here.  Encode the 1st chunk of 1024K using Crossword Grid alignment, 20 spaces per row.

> 1024 KByte of data would be 1024*10^3 bytes, unless you mean kibibytes (1024^2 bytes)
> So with 20 spaces per row (why 20?) we have a 51200*20 2-dim array
> don't know what this is for, but I can follow without a problem

Fill Array with data from Layer 1.

> left -> right? top -> low? the other way around? does it matter?

Now search data from current row, 2 rows at a time simultaneously,

> simultaneously? so this only works on parallel computers? well, sad story, there's nothing to sell to the masses then, but let's assume it's not needed
> and can be done alternating

20 spaces apart (essentially on top of each other in the crossword puzzle grid.

> 20 spaces apart? so we have an array like this
> 0123456789abcdefhijk
> 0123456789abcdefhijk
> 0123456789abcdefhijk
> ...
> 0 and k are 19 spaces apart, so you talk about 0 in line 0 and 0 in line 1?

Let's call that the CWG from now on.  

> m'kay, why? you never use CWG again?

When a match is found, replace topmost row's match with unique identifier that means the same thing to the 2nd Layering engine as the match found.  For example "aa" is replaced by "i"  

> how do you code "i"? you have 37 (with "space") different things in layer 2, so you need at least 6 bits to code "i"
> if you have a fixed table in your algorithm. so instead of the original 4 bits we made it to 6, with just a little stop at 8 bits.
> also, as you state in your picture, you lose data.
> aA (0000 0001) cannot be coded, and neither can 220 of the 256 possible binary codes
> ~ 86% data loss in this step. you might save some with shifting, but you'd have to note at least the number of shifts
> given a fixed direction, so aA >>7 = ab, but that would cost you another 3 bits of encoding and I doubt
> that you can shift all possible combinations with the code in layer 2
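
The figures in the comments above can be reproduced with a few lines of Python. This is only a check of the arithmetic: the values 37, 220 and 256 are taken from the comments themselves, "1024K" is read as 1024*10^3 bytes as in the comment, and the label count of 36 is an assumption implied by 256 - 220.

```python
import math

chunk_bytes = 1024 * 10**3       # "1024K" read as 1024 * 10^3 bytes
row_width = 20                   # cells per row in the grid
rows = chunk_bytes // row_width  # rows in the 2-dim array

layer2_symbols = 37              # symbol count quoted in the comment (with "space")
bits_needed = math.ceil(math.log2(layer2_symbols))

byte_values = 256                # all possible 8-bit values (two Layer 1 letters)
labels = 36                      # assumption: implied by "220 of 256" being uncodable
uncoded = byte_values - labels
uncoded_share = uncoded / byte_values

print(rows)                      # 51200
print(bits_needed)               # 6
print(uncoded, uncoded_share)    # 220 0.859375
```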

Now delete the bottommost match, leaving an empty space (the compression/encoding).  Now shift the whole array to the right, displacing that empty space, which now becomes a zero at Row 1 Column 1.  Now continue forward.

> with a chance of losing information ~86% of the time.

 Find all matches until the last line is (the last 20 cells are) reached.  

> what if there is no match? what if 20 places from my "i" is "t"? what makes you think that all your binary input has the same data
> periodically?

At the last line, no more compression can be done because there isn't a line under it to compare to.  What we are left with here is essentially a boiled down key.  Nothing more can be done with this chunk.  That key is saved inside the file we are building as such:

1004(0)_GbcDeEafFBAAcbeEBDfga_6(0)

Where the first block above is the topmost last line to the halfway point of the line and the 2ndmost line is the bottommost line to where it ends halfway (20 spaces total combined) and the final part is a message to software that 1004 zeros precede those two keys.

> 1004+20 = 1024 Byte? where did the other 1.022.976 Byte go? we had 1024 KByte when we started, so there should be
> 1.024.000-20 = 1.023.980 leading zeros and 20 magic signs

 Now the engine can get rid of the 0's, leaving that small chunk to retrieve all the data later.

It would do so like this ....  

0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
GbcDeEafFBAcbeEBDfga000000

> ... hooold on a second here, let's test what we get when we do this.
> input: William Shakespeare (26 April 1564 (baptised) – 23 April 1616)[nb 1] was an English poet and playwright, widely regarded as the
> greatest writer in the English language and the world's pre-eminent dramatist.
> the first sentence on Wikipedia regarding Shakespeare. Put it in a file, save it as UTF-8, and load that with our favorite
> hex editor to view the binary data.

EFBBBF57696C6C69616D205368616B657370656172652028323620417072696C203135363420286 2617074697365642920E2809320323320417072696C2031363136295B6E6220315D207761732061 6E20456E676C69736820706F657420616E6420706C61797772696768742C20776964656C7920726 5676172646564206173207468652067726561746573742077726974657220696E2074686520456E 676C697368206C616E677561676520616E642074686520776F726C642773207072652D656D696E6 56E74206472616D61746973742E

> Well, it's hex, no biggy, so we need at least 20 things, let's take a little more for fun

EFBBBF57696C6C69616D205368616B657370656172652028323620417072696C203135363420286 2617074697365642920E2809320323320417072696C2031363136295B6E6220315D207761732061 6E

> we convert this binary data with your Layer 1 table and get

hHFFFHCDdEdgdgdEdAdGbaCBdedAdFdCDBDadCdADbdCbabeBbBdbacADaDbdEdgbaBABCBdBcbabed bdADaDcdEDBdCdcbEbahbeaEBbaBbBBbacADaDbdEdgbaBABdBABdbECFdhdbbaBACGbaDDdADBbadA dh
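
For anyone following along, the conversion shown here is consistent with a fixed 16-entry table that maps the hex digits 0-F, in order, to the letters aAbBcCdDeEfFgGhH. That table is inferred from the hex and letter strings in this post; the original "Table 1" is not reproduced on this page, so treat the mapping and the names `to_layer1` / `from_layer1` as assumptions. A minimal Python sketch:

```python
# Assumed Layer 1 table, inferred from the example strings in this post:
# hex digit 0..F -> letter a, A, b, B, ..., h, H (one letter per 4 bits).
HEX_DIGITS = "0123456789ABCDEF"
LAYER1_LETTERS = "aAbBcCdDeEfFgGhH"

TO_LAYER1 = str.maketrans(HEX_DIGITS, LAYER1_LETTERS)
FROM_LAYER1 = str.maketrans(LAYER1_LETTERS, HEX_DIGITS)


def to_layer1(hex_text: str) -> str:
    """Map a hex dump to Layer 1 letters, ignoring any whitespace in the dump."""
    compact = "".join(hex_text.split()).upper()
    return compact.translate(TO_LAYER1)


def from_layer1(letters: str) -> str:
    """Invert the mapping: Layer 1 letters back to upper-case hex digits."""
    return "".join(letters.split()).translate(FROM_LAYER1)


sample_hex = "EFBBBF57696C6C69616D"    # first 10 bytes of the dump above
sample_letters = to_layer1(sample_hex)

print(sample_letters)                               # hHFFFHCDdEdgdgdEdAdG
print(from_layer1(sample_letters) == sample_hex)    # True
print(len(bytes.fromhex(sample_hex)), len(sample_letters))   # 10 20
```

Storing one letter per 4 bits as an 8-bit character is where the doubling in size comes from. The step is reversible, but it saves nothing by itself.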

> let's assume they are doubled and just take 10 in a row each

hHFFFHCDdE
dgdgdEdAdG
baCBdedAdF
dCDBDadCdA
DbdCbabeBb
BdbacADaDb
dEdgbaBABC
BdBcbabedb
dADaDcdEDB
!
dCdcbEbahb
eaEBbaBbBB
bacADaDbdE
dgbaBABdBA
BdbECFdhdb
baBACGbaDD
dADBbadAdh

> and we have 1 match, I can't even get to layer 2 with this. maybe you can enlighten us by continuing this example.
> back to your example

Now the engine counts how many zeros are in the block before the first actual piece of data to know how many iterations it had done to reach that final sequence.  It counts the zeros, here we see 120 zeros total.  

> 120 zeros? didn't we just have 1004 zeros before the break??

The engine is told the key goes on the last line plus the number of zeros (empty spaces) that were left over. Now having figured out how many iterations to start from backwards, it begins comparing data back out, starting by re-ordering the entire sequence to the left.  So the key would actually look like this:

0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000000000
0000000000000GbcDeE
afFBAcbeEBDfga000000

Which we can easily see the last line because the G and the a are not overtop of each other, meaning this is indeed the true last line.  Depending on how good the compression is, the last line can occur anywhere in the block, as such :

0000000000000000000
0000000000000000000
0000000GbcDeEafFBAc
beEBDfga00000000000
0000000000000000000
0000000000000000000
0000000000000000000

As long as the block is totally intact and none of the pieces overlap, it is complete.  The reason we need to know how many empty spaces were left at the end is so we can separate how many iterations occurred from how many empty spaces were left, since not all the zeros here mean iterations.

I hope you can see this as clearly as I see it in my head.  It's efficient and would totally work.  I hope you will be able to see that by studying this.

> I hope you can clearly see now that all you have shown is that you can make x bytes of data as big as 1.5*x with some
> strange tables. Where has Layer 3 gone in this explanation?
> Why can't you post a simple example? Take the binary data of the Shakespeare sentence I provided.



comments inside the quote marked with >

Edit:
Fun read for those who want to know more about our mystery man:
http://encode.ru/threads/1789-100-Lossless-Compression-Theory-Please-Join-Me
He also looked for help from people who know the basics of encoding.


Edit2:

http://ticc.uvt.nl/~pspronck/sloot.html

By now I have to thank you for posting this. Quite a lot of fun reading thanks to it.

Im not really here, its just your imagination.
kaito
Full Member
***
Offline Offline

Activity: 168
Merit: 100



View Profile
December 11, 2013, 11:02:39 PM
 #334

Not sure if this is serious or a troll, but you sure are persistent.

http://en.wikipedia.org/wiki/Pigeonhole_principle
http://en.wikipedia.org/wiki/Lossless_data_compression