Data compression Question?

  • tec0
    Diamond Member

    • Jun 2009
    • 4624

    #1

    In the last few months I have noticed that data compression is taking leaps towards what I can only describe as nearly impossible compression ratios: *.ISO files with an actual size of over 14 GB compressed into 4 GB files, and sometimes 2.4 GB. That is absolutely astonishing, but nobody seems to know how it is done.

    I have tested every open-source program I could find, plus a few trial versions that weren’t restricted in functionality, and the best I get is a 7 GB *.ISO file compressed into 6.25 GB, which is hardly worth mentioning.

    This kind of compression would make backups over ADSL lines faster, and your backup medium could hold so much more that it would be really cost-effective.

    Does anybody have a clue how it is done?
    peace is a state of mind
    Disclaimer: everything written by me can be considered as fictional.
  • AndyD
    Diamond Member

    • Jan 2010
    • 4946

    #2
    An ISO image is just a true copy (image) of a disk. ISO isn't a compressed format in itself; if the ISO image follows the ISO 9660 standard, it should be about the same size as the original disk.

    If you want to compress the contents of an ISO image or an original disk, you would be better off extracting the files to your hard drive and using a compression tool like WinRAR. The resulting compressed file could then be repackaged as an ISO, but it would no longer be mountable to emulate the original disk, so that would be pointless. If the file were then sent to a recipient, they would need to reverse the process to recreate a true disk image.

    • AndyD
      Diamond Member

      • Jan 2010
      • 4946

      #3
      Compression ratios depend on the type of files being compressed as well as their contents. For example, if you have a 10 GB text file that contains nothing but asterisks, it might compress to a tiny fraction of that size. On the other hand, if you have a container file whose contents are already compressed, such as an 'avi' or 'mkv' file, then zipping or RARing it will produce an archive that is almost identical in size to the original.
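
A rough way to see this for yourself is the sketch below. It uses Python's standard `zlib` module as a stand-in for ZIP-style compression, a 1 MB buffer instead of 10 GB, and `os.urandom` bytes as an assumed stand-in for already-compressed video data, so treat the numbers as illustrative only.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)

size = 1_000_000  # 1 MB stand-in for the much larger file in the example

asterisks = b"*" * size         # highly repetitive "text file"
random_like = os.urandom(size)  # assumed stand-in for already-compressed data

print(f"asterisks:   {compression_ratio(asterisks):.4f}")
print(f"random-like: {compression_ratio(random_like):.4f}")
```

The repetitive buffer shrinks to a tiny fraction of its size, while the random-looking one stays at roughly its original size.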

      What kind of files are you trying to compress?

      • tec0
        Diamond Member

        • Jun 2009
        • 4624

        #4
        I know it is not really supposed to be possible to compress *.ISO files this much, yet on my system there is a seemingly normal zip file that, once unzipped, builds into a *.ISO file. It started out as a 4.3 GB file and ended up as a 14 GB ISO. It seems impossible by definition, but I would gladly make a short video clip and let you be the judge. Please recommend a free desktop recorder that works with Win7 and I will gladly record it and show you how near-impossible this really is!

        I took a few backup disks myself and captured them in *.ISO format, hoping I could do the same. No luck...
        peace is a state of mind
        Disclaimer: everything written by me can be considered as fictional.


        • Sparks
          Gold Member

          • Dec 2009
          • 909

          #5
          What I have come across is WinZip/WinRAR files compressed repeatedly (zipped, then re-zipped, then zipped again), which are then converted into an image. The image is then also compressed. I have "Magic Disc" version 2.7 build 106.
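
For what it's worth, repeated zipping on its own is unlikely to explain a big saving. The sketch below is only an illustration: it uses Python's `zlib` in place of WinZip/WinRAR, and the word list and seed are made up for the test data.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical vocabulary, used only to build some text-like test data.
words = [b"backup", b"image", b"sector", b"folder",
         b"video", b"audio", b"setup", b"driver"]
text = b" ".join(rng.choice(words) for _ in range(100_000))

once = zlib.compress(text, 9)     # first pass: large saving
twice = zlib.compress(once, 9)    # second pass: input already looks random
thrice = zlib.compress(twice, 9)  # third pass: same story

for label, blob in [("original", text), ("zipped once", once),
                    ("zipped twice", twice), ("zipped thrice", thrice)]:
    print(f"{label:>14}: {len(blob):>8} bytes")
```

Only the first pass makes a real difference; the later passes leave the size about where it was.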


          • irneb
            Gold Member

            • Apr 2007
            • 625

            #6
            It sounds illogical. Maybe the growth to 14 GB was actually just blank space, which, once cleared from the ISO, lets the file size reflect the true amount of data on the image. If you mount it, what does the properties dialog tell you about the drive? What is the total size of the files and folders? Compare that to the size of the ISO file. Is the ISO image in something like NTFS? That could have internal compression already.
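
A small sketch of the blank-space idea, using Python's `zlib`. The 30/70 split between "real" data and zero padding is an assumption picked to loosely mirror the 4.3 GB versus 14 GB figures, and `os.urandom` stands in for data that will not compress any further.

```python
import os
import zlib

payload = os.urandom(300_000)  # assumed incompressible "real" data
padding = bytes(700_000)       # blank (zero-filled) space in the image
image = payload + padding      # pretend disk image, 1 MB in total

packed = zlib.compress(image, 9)

print(f"image size:  {len(image)} bytes")
print(f"packed size: {len(packed)} bytes")
print(f"ratio:       {len(packed) / len(image):.2f}")
```

The padding almost disappears, so the archive ends up close to the size of the real data alone, even though that data did not compress at all.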

            As for video capture, try CamStudio. I like M$'s Windows Media Encoder a lot, but I don't think they've updated it to work on their newer OSes. Otherwise there seem to be quite a lot of other possibilities through a Google search.

            BTW, you stated you've tried a great many compression programs, and ZIP and RAR were mentioned. Has anyone tried 7-Zip? I've found it to work faster than WinRAR / WinZip and to create smaller files than either.
            Gold is the money of kings; silver is the money of gentlemen; barter is the money of peasants; but debt is the money of slaves. - Norm Franz
            And central banks are the slave clearing houses


            • tec0
              Diamond Member

              • Jun 2009
              • 4624

              #7
              If anything, I have given up trying to replicate the compression. If I compress the same file using what is available, like 7-Zip, I get a slap in the face. It doesn’t compare. All I know is that it is an auto-executable and completely stand-alone. It has no identification and is definitely custom. Whoever this person is, they are nothing short of brilliant.
              peace is a state of mind
              Disclaimer: everything written by me can be considered as fictional.


              • irneb
                Gold Member

                • Apr 2007
                • 625

                #8
                Originally posted by tec0
                If anything, I have given up trying to replicate the compression. If I compress the same file using what is available, like 7-Zip, I get a slap in the face. It doesn’t compare. All I know is that it is an auto-executable and completely stand-alone. It has no identification and is definitely custom. Whoever this person is, they are nothing short of brilliant.
                That actually tells me it's not a (normal) SFX archive. What's probably happening is that it's a custom program which generates at least some portion of the ISO according to some internal function. The generated portion then looks like random data to other compressors, so it doesn't compress well with more generalized compression algorithms.

                As an example, it's possible to obtain the exact same sequence of data from a single seed value for a random function. That random data would compress extremely little, as a generalized algorithm wouldn't pick up that it had been calculated from a single value. So theoretically you can make a program just a few kB in size which creates a file several GB in size that compresses extremely badly, if at all.
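
A minimal sketch of that idea in Python. The `generate` helper and the seed value are hypothetical, and this is not a claim about how the mystery program actually works.

```python
import random
import zlib

def generate(seed: int, size: int) -> bytes:
    """Deterministically expand a small seed into `size` pseudo-random bytes."""
    rng = random.Random(seed)
    return rng.getrandbits(size * 8).to_bytes(size, "little")

size = 1_000_000  # 1 MB here; a real generator could just keep going

first = generate(42, size)
second = generate(42, size)  # same seed, so the same bytes come out again

print("reproducible:", first == second)
print(f"zlib ratio:   {len(zlib.compress(first, 9)) / len(first):.4f}")
```

The "compressed form" of that megabyte is effectively just the seed plus these few lines of code, yet a general-purpose compressor cannot shrink the output at all.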
                Gold is the money of kings; silver is the money of gentlemen; barter is the money of peasants; but debt is the money of slaves. - Norm Franz
                And central banks are the slave clearing houses


                • AndyD
                  Diamond Member

                  • Jan 2010
                  • 4946

                  #9
                  Just a thought, but if the original source disk from which the ISO file was created was a retail CD or DVD (movie or music), then there are certain forms of anti-piracy protection that play around with the file allocation table on the disk in order to confuse a computer, which invariably relies on it to read the disk. A 'dumb' optical appliance such as a DVD player doesn't read the file allocation table to reference and access the files on the disk. Similarly, there could be 'dummy' files (as used by SonyBMG on PlayStation game disks) that are nothing more than a space allocation on the disk, but these could cause an ISO image to be unnecessarily large.

                  • tec0
                    Diamond Member

                    • Jun 2009
                    • 4624

                    #10
                    Fact is, I have tested a few types and found that anything with an *.EXE will not compress as well as a *.DOCX or *.TXT file. MP3s and AVIs can be made smaller if you play with the quality and so on, but other than that they will not “compress” at all. Renaming an *.MP3 or *.AVI to *.TXT doesn’t help; I tested it and saw no improvement.

                    On average, a commercially available *.ISO compressor will slim an *.ISO down by 100 MB or so. That said, I found some *.ISO files that slim down by about 2 GB, so again it is a question of what files are being compressed.
                    peace is a state of mind
                    Disclaimer: everything written by me can be considered as fictional.


                    • AndyD
                      Diamond Member

                      • Jan 2010
                      • 4946

                      #11
                      Originally posted by tec0
                      ...AVIs can be made smaller if you play with the quality and so on, but other than that they will not “compress” at all. Renaming an *.MP3 or *.AVI to *.TXT doesn’t help; I tested it and saw no improvement.
                      AVI is a container whose contents (audio and video streams) are already compressed. If you alter the quality of an AVI file, you are recoding the video with a codec (such as Xvid) at a lower bitrate. This isn't compression, it's just sacrificing quality for file size... it's a tradeoff. Changing the file extension doesn't help; many programs look at the file header, which would not be changed.


                      Originally posted by tec0
                      On average, a commercially available *.ISO compressor will slim an *.ISO down by 100 MB or so. That said, I found some *.ISO files that slim down by about 2 GB, so again it is a question of what files are being compressed.
                      It would be possible for an ISO 'compression' program to search the contents of an ISO for duplicate files. When duplicates are found, it could delete one of them and replace it with a 'marker' that points to the other, identical file. The problem is that the result would no longer conform to the ISO image standard, so the ISO would need to be rebuilt by the same program at the other end after it was sent to a recipient.
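
The duplicate-file idea could look roughly like the sketch below. It works on an in-memory dictionary of file names and contents rather than a real ISO, and the `dedupe` and `rebuild` helpers are hypothetical names, so it only shows the principle.

```python
import hashlib
from typing import Dict, Tuple

def dedupe(files: Dict[str, bytes]) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Store each distinct content once; point every file name at its hash."""
    store: Dict[str, bytes] = {}  # content hash -> content (kept once)
    index: Dict[str, str] = {}    # file name -> content hash (the 'marker')
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)
        index[name] = digest
    return store, index

def rebuild(store: Dict[str, bytes], index: Dict[str, str]) -> Dict[str, bytes]:
    """Reverse the process at the receiving end."""
    return {name: store[digest] for name, digest in index.items()}

files = {
    "disc1/setup.exe": b"X" * 1000,
    "disc2/setup.exe": b"X" * 1000,  # identical copy of the file above
    "readme.txt": b"hello",
}
store, index = dedupe(files)
print("files:", len(files), "stored contents:", len(store))
print("round trip ok:", rebuild(store, index) == files)
```

As with the ISO case, the deduplicated form is only useful to a recipient who has the matching rebuild step.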

                      • tec0
                        Diamond Member

                        • Jun 2009
                        • 4624

                        #12
                        Lossless is basically the main goal, as it will decompress to the original (if not corrupted). That said, I can’t help thinking that maybe a combination of lossless and “lossy” is at work here. When you think about it, one realises that audio and video data that is “lossy compressed” will be much smaller, and audio and video will survive better than an executable or data file.
                        peace is a state of mind
                        Disclaimer: everything written by me can be considered as fictional.


                        • AndyD
                          Diamond Member

                          • Jan 2010
                          • 4946

                          #13
                          As far as I'm aware, compression is lossless by definition: you must be able to decompress back to a true copy of the original. If it's lossy it's not compression, it's encoding or recoding, and the result can never be returned to the original.

                          When you think about it, one realises that audio and video data that is “lossy compressed” will be much smaller, and audio and video will survive better than an executable or data file.
                          I don't understand what you're saying here.

                          • tec0
                            Diamond Member

                            • Jun 2009
                            • 4624

                            #14
                            Sorry AndyD, I was thinking of the “lossy methods” as they are used for compressing sound and video files. Basically it comes down to tampering with quality, along with the knowledge that a video file doesn’t corrupt easily (I am sure you have seen an AVI that had a glitch but was still playable). A “program” or “data file”, on the other hand, will not survive a glitch and will corrupt: “data string errors”, that kind of thing.

                            Example: let's say your backup consists of both data and video. Then it may well be possible that the data was compressed using "lossless" methods and the video was compressed using “Lossy methods”. Well, it is just a theory really.
                            peace is a state of mind
                            Disclaimer: everything written by me can be considered as fictional.


                            • AndyD
                              Diamond Member

                              • Jan 2010
                              • 4946

                              #15
                              An AVI with minor corruption might still play, depending on the compensation techniques built into the media player you're using. There would still be a chunk missing, or some video degradation. A program that's corrupted won't execute correctly, so an error message (or bluescreen) would be the result. A data file that's corrupted will cause an error message in the program that tries to open it.

                              You wouldn't 'compress' video files to make smaller file sizes; you would run the video file through a codec such as Xvid, which will encode it at the cost of quality. With the right front end you can set the framerate, resolution, etc., so it's a controlled trade-off of file size against quality. With audio you would start off with a lossless .wav file and either use something like FLAC, which compresses with a lossless output, or LAME, which gives you a smaller but lossy output. Again, these codecs would be controlled by a front-end application that allows you to set your preferences. FLAC can be seen as compression; LAME is encoding, because the lossy output can never be converted back to the original lossless wave files.
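
The difference can be sketched in a few lines of Python. This is not how FLAC or LAME work internally: `zlib` stands in for the lossless path, and simply dropping the low 4 bits of each sample is an assumed, very crude stand-in for a lossy encoder.

```python
import math
import zlib

# Made-up 8-bit "audio": one sine wave sampled 2000 times.
samples = bytes(int(127.5 + 127.5 * math.sin(i / 20)) for i in range(2000))

# Lossless path: the round trip returns exactly the original bytes.
restored = zlib.decompress(zlib.compress(samples, 9))

# Lossy path: throw away the low 4 bits of every sample.
quantized = bytes(s & 0xF0 for s in samples)

print("lossless round trip identical:", restored == samples)
print("lossy output identical:       ", quantized == samples)

# Different originals collapse to the same value, so the discarded
# detail cannot be recovered afterwards.
print("0x41 and 0x4F both become:", hex(0x41 & 0xF0), hex(0x4F & 0xF0))
```

The lossless path gives back every byte, while the lossy path keeps the general shape of the wave and discards detail for good.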

                              and the video was compressed using “Lossy methods”.
                              This wouldn't be compressing, it would be encoding.