What is File Compression?

You may have worked with a compressed folder before. You likely remember its icon (a folder with a zipper), or that the file you downloaded ended with the .zip file extension. Every time you download a compressed folder, you have to go through the annoying step of clicking “extract” to get your files. What’s the point? Can’t the files you want just be downloaded as they are?

The answer to that question is yes, but it is rarely practical. Websites can host and send uncompressed “original” files and folders, but these are often much larger than their compressed versions. To save time and bandwidth, downloads and uploads of large or multiple files are usually done using file compression.

But how does this work? How is it possible that a file can just become smaller?

File compression works by reducing redundancy. This is best explained with an analogy:

Let’s suppose you wanted to write a comprehensive list of instructions on how to assemble an IKEA dining table and chair set. When writing the instructions to install the legs onto a chair, you probably would not write out the same steps separately for each leg. Rather, you’d write them once, and then indicate to repeat them three more times, once for each remaining leg. Similarly, when writing instructions for assembling the chairs, you would write them only once, and then indicate to repeat the process for each remaining chair.

This is the essence of a compressed file. Repeated consecutive data is condensed into a single copy of that data with an indication of how many times the data is repeated (a technique known as run-length encoding). Other patterns in data can also be picked up by compression software and condensed into more compact data.
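The idea of storing a single copy plus a repeat count can be sketched in a few lines of Python. This is only a toy illustration, not what a .zip tool actually does internally; the function names `rle_encode` and `rle_decode` and the sample string are made up for this example.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical consecutive characters into (char, count)."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original text by repeating each character `count` times."""
    return "".join(char * count for char, count in pairs)


original = "AAAABBBCCDAA"
encoded = rle_encode(original)
restored = rle_decode(encoded)

print(encoded)   # [('A', 4), ('B', 3), ('C', 2), ('D', 1), ('A', 2)]
print(restored)  # AAAABBBCCDAA
```

Notice that a run of length one (the lone `D`) takes up more room after encoding than before. Simple schemes like this only pay off when the data really does contain long repeats.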

The above form of compression is called “lossless” compression because when decompressed, the output file is exactly the same as the original. However, for certain applications, not all data needs to be perfectly retained. In the world of audio and video downloads, “lossy” compression leads to even smaller files that, when presented to most consumers, are practically indiscernible from the original media. In this form of compression, not only is redundant data eliminated, but data that encodes details that the consumer will likely not notice is also eliminated.
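To see the “lossless” property in action, here is a minimal sketch using Python’s built-in `zlib` module. The sample text is invented for the example and is deliberately repetitive, so treat the size reduction as illustrative rather than typical.

```python
import zlib

# Deliberately repetitive data, like the chair-leg instructions in the analogy.
original = b"attach leg to chair; " * 24

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The second line printed is the important one: after decompression, every byte matches the original. A lossy format cannot make that promise.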

Exactly what data encodes “details that the consumer will likely not notice” is determined by the type of file compression used. For example, in audio files like MP3 files, one way in which compression is achieved is by ignoring very quiet sounds that play at the same time as louder sounds. Another way compression is achieved is by eliminating very high and very low frequency sounds that humans usually do not perceive.
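The “throw away the very high frequencies” idea can be sketched with a toy example. To be clear, this is not how an MP3 encoder works; real encoders are far more sophisticated. The sketch below assumes a tiny 64-sample signal and uses a naive Fourier transform written from scratch, and the function names and the cutoff value are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform: time samples -> frequency bins."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform: frequency bins -> real-valued time samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def drop_high_frequencies(samples, highest_kept_bin):
    """Zero out every frequency bin above the cutoff, then rebuild the signal."""
    spectrum = dft(samples)
    n = len(spectrum)
    for k in range(n):
        # Bins above n/2 mirror the lower ones, so fold them back before comparing.
        if min(k, n - k) > highest_kept_bin:
            spectrum[k] = 0
    return inverse_dft(spectrum)


N = 64
low_tone = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]          # the "music"
high_tone = [0.1 * math.sin(2 * math.pi * 30 * t / N) for t in range(N)]  # faint, very high pitch
signal = [a + b for a, b in zip(low_tone, high_tone)]

filtered = drop_high_frequencies(signal, highest_kept_bin=10)

# After filtering, only the low tone is left (up to floating-point rounding).
largest_error = max(abs(f - l) for f, l in zip(filtered, low_tone))
print(largest_error)
```

The filtered signal is not the same as the original, since the high tone is gone for good, but it needs fewer numbers to describe. That trade is what makes the compression “lossy”.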

With a lossy audio compression method like MP3, a song can be compressed to a file roughly 11 times smaller than the original. In contrast, a lossless audio compression method like FLAC (which, again, only eliminates redundant data) can typically compress a song to only about half its original size.
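Here is some back-of-the-envelope arithmetic behind a figure like “11 times smaller”. The sketch assumes the original is CD-quality audio (44,100 samples per second, 16 bits per sample, two channels) and that the MP3 is encoded at 128 kilobits per second. Both are assumptions for this example, as is the four-minute song length; the FLAC line simply applies the “about half” figure from the text.

```python
SAMPLE_RATE_HZ = 44_100    # assumed: CD-quality sample rate
BITS_PER_SAMPLE = 16       # assumed: CD-quality bit depth
CHANNELS = 2               # assumed: stereo
MP3_BITRATE_BPS = 128_000  # assumed: a common MP3 bitrate

# Bits needed for one second of uncompressed audio.
cd_bitrate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
ratio = cd_bitrate_bps / MP3_BITRATE_BPS


def size_megabytes(bitrate_bps, seconds):
    """Convert a constant bitrate and a duration into megabytes (10^6 bytes)."""
    return bitrate_bps * seconds / 8 / 1_000_000


song_seconds = 240  # a hypothetical four-minute song
uncompressed_mb = size_megabytes(cd_bitrate_bps, song_seconds)
mp3_mb = size_megabytes(MP3_BITRATE_BPS, song_seconds)
flac_mb = uncompressed_mb / 2  # "about half", per the text above

print(f"uncompressed: {uncompressed_mb:.1f} MB")
print(f"FLAC (approx.): {flac_mb:.1f} MB")
print(f"MP3 at 128 kbps: {mp3_mb:.1f} MB, about {ratio:.0f}x smaller")
```

Under these assumptions the uncompressed song is about 42 MB, the FLAC version about 21 MB, and the MP3 under 4 MB. A different bitrate would give a different ratio.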

From the outside, file compression seems like magic. At the detailed level, file compression is highly technical and difficult to understand. However, on the surface, file compression is comprehensible: it just gets rid of things you don’t really need.
