Cleanup

This page is getting a bit long, and I think that some of the lists could use cleaning up. Zack3rdbb 04:50, 22 December 2006 (UTC)

I've again broken the algorithms into a different list from the implementations. I did it a few years back when I wrote part of the LZ and Huffman pages, but it was reverted back then by someone who seemed unable to tell the difference between an algorithm and an implementation. I also agree about moving the algorithms into the lossy/lossless pages. If my current change sticks I will do that too. Jrlinton

Where should a page referencing lossless compression link? Lossless data compression is more on-topic, but has less information (e.g. about practical algorithms) than data compression. IMHO data compression should convey the basic concept, discuss the difference between lossy and lossless, and leave it at that. In particular, the algorithm list and discussion should move to the appropriate subtopic. --Robbe

How are we defining "obsolete" here? --Alan D

There's a really detailed public domain article on data compression at http://www.vectorsite.net/ttdcmp1.html#m6 It would be great if someone could expand our articles using this material (unfortunately, I don't really have a clue about this subject!) Enchanter

---

I agree that this page should be about data compression, not about hundreds of different implementations of the half dozen basic algorithms. The discussion about MPEG, DEFLATE, PKZIP, etc. should not be on this page, since they are implementations of one or more of the basic algorithms (LZ77, DCT, etc.).

>> I find it disturbing that a contributor to this page claims to have no knowledge of the subject!

Lossy compression on CD-ROM?

"In particular, compression of images and sounds can take advantage of limitations of the human sensory system to compress data in ways that are lossy, but nearly indistinguishable from the original by the human eye. This is used on CD-ROM and DVD, as well as in digital cameras."

This is highly suspect.
DVD, yes: the video is encoded in a lossy format. Digital cameras, yes: pictures are recorded in lossy formats. CD-ROMs, however, merely store the information on them; in fact, not only is the information not reduced by lossy means, but redundancy in the form of error-correcting codes is added to safeguard the information that's there. Even if "audio CDs" are meant rather than CD-ROMs, it's still a stretch; sure, the audio is quantized before it's recorded onto the master, but no subsequent compression, lossless or lossy, takes place after quantization. -- Antaeus Feldspar 17:32, 23 Sep 2004 (UTC)

I agree. It would be better to say that lossy compression is used mainly for images, video and audio. --Matt Mahoney 01:17, 25 Jan 2005 (UTC)

circular links

Self-extraction points to its disambiguation page, which points to uncompression, which returns to this article, which has no information about uncompression. Perhaps someone "in the know" could do a bit on uncompression (decompression?). --suburbancow 02:14, 26 January 2006 (UTC)

failure to compress

I find this text from the article problematic: "indeed, any compression algorithm will necessarily fail to compress at least 99.609 375 % (255/256) of all files of a given length." Clearly it cannot be true that the algorithms fail to compress 99.6% of the time. I think the intent was to say that any lossless algorithm will fail to compress some files of any given length, but the numbers don't make sense. Also, the percentage of files that would fail would definitely be a function of the length of the files. Kryzx 15:41, 29 March 2006 (UTC)
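
For reference, here is a minimal sketch of the counting (pigeonhole) argument that a figure like the one quoted above usually rests on. It assumes files are whole numbers of bytes and that "compress" means "produce an output at least `saved_bytes` shorter"; the function name `max_fraction_compressible` is made up for illustration and does not come from the article.

```python
from fractions import Fraction


def max_fraction_compressible(n_bytes: int, saved_bytes: int = 1) -> Fraction:
    """Upper bound on the fraction of n-byte files that any lossless coder
    can map to an output at least `saved_bytes` shorter.

    A lossless coder must give distinct inputs distinct outputs, so it
    cannot shrink more inputs than there are shorter outputs available.
    """
    if not 1 <= saved_bytes <= n_bytes:
        raise ValueError("saved_bytes must be between 1 and n_bytes")
    total_inputs = 256 ** n_bytes
    # Count every possible output of length 0 .. n_bytes - saved_bytes.
    shorter_outputs = sum(256 ** k for k in range(n_bytes - saved_bytes + 1))
    return Fraction(shorter_outputs, total_inputs)


if __name__ == "__main__":
    for n in (1, 2, 4, 16):
        bound = max_fraction_compressible(n)
        print(n, float(bound))
```

Under these assumptions the bound for saving one byte is 1/256 for one-byte files and stays just under 1/255 for longer ones, so it does depend on the file length, but only weakly. Counting only outputs exactly one byte shorter gives 1/256 at every length, which is possibly where the article's 255/256 came from.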
Source vs Entropy Coding

I have it on reliable authority that Source Coding and Entropy Coding are different sides of the topic of data compression. Source coding is lossy, entropy coding is lossless - at least, if I answered that way in the exam I'm about to sit, I would get the marks allocated to the question. My lecturers describe a two-phase compression scheme such as JPEG as involving a source coding phase first, effectively reducing the number of bits required to store the data by finding another representation (such as a DCT, which represents the majority of the information in a few terms, leaving the rest nearly zero). The second phase is to take this alternative representation and encode it using an entropy coding scheme (i.e. Huffman and/or RLE). The redundancy introduced by the first phase renders the data more susceptible to efficient entropy coding. They're fairly specific about the difference (to the point of broad hints about the exam paper) - why is this not made clearer in the text?

Another point regarding source coding is that the DCT is COMPLETELY reversible without loss of data. The loss only occurs when the DCT-transformed image cell is quantised - this is normally done by point-wise division of the transformed cell by a quantisation matrix, followed by rounding the values to integers. This quantisation and subsequent rounding is where the loss occurs. Because the majority of the information is contained in a few terms, the reverse process can "fairly faithfully" reconstruct the original "shape" of the intensity information represented within the cell. The amount of quantisation applied dictates the quality of the decompressed image data.

Entropy coding tries to reduce the overall "energy" represented by a data stream by finding lower-energy representations, e.g. Huffman codes require the most common symbols to be given the shortest codes.
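
The claim above, that the transform itself is reversible and the loss comes from quantisation and rounding, can be checked with a small sketch. It uses a one-dimensional orthonormal DCT-II on eight made-up sample values and a uniform step size of 16; JPEG itself works on 8x8 blocks with a full quantisation matrix, so this is only an illustration, and all names here are hypothetical.

```python
import math


def dct(x):
    """Orthonormal 1-D DCT-II of a list of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(scale * total)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def quantize(coeffs, steps):
    """Point-wise division by the step sizes, then rounding to integers."""
    return [round(c / s) for c, s in zip(coeffs, steps)]


def dequantize(levels, steps):
    """Point-wise multiplication back by the step sizes."""
    return [q * s for q, s in zip(levels, steps)]


if __name__ == "__main__":
    samples = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up intensity values
    steps = [16] * len(samples)                 # assumed uniform step size

    coeffs = dct(samples)
    exact = idct(coeffs)                        # no quantisation
    lossy = idct(dequantize(quantize(coeffs, steps), steps))

    print("error without quantisation:", max(abs(a - b) for a, b in zip(samples, exact)))
    print("error with quantisation:   ", max(abs(a - b) for a, b in zip(samples, lossy)))
```

Without the quantisation step the round trip reproduces the samples up to floating-point error; with it, the reconstruction only approximates them, and a larger step size gives a coarser approximation.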
Ian (3rd Yr Undergrad - Computer Science)

Predictive coding

I see that predictive coding is presented under lossy encoding, but surely it's possible to use prediction for lossless encoding too. It seems to me that there are two possibilities: a. it is possible to do perfect predictive coding, and this will be lossless, or b. do a fairly accurate predictive coding, then subtract the prediction from the source, and encode the residual. If the residual is encoded without loss then the overall encoding is also lossless. I am fairly sure that for some applications predictive coding schemes which are completely lossless can be constructed. The article could deal with this. David Martland 09:27, 25 August 2006 (UTC)

WHAT NO IMAGES?

FAIL! Trigger hurt 15:33, 6 September 2006 (UTC)

compression and encryption

The article currently claims:

"Similarly, compressed data can only be understood if the decoding method is known by the receiver. Some compression algorithms exploit this property in order to encrypt data during the compression process."

I'm deleting the second of those sentences, on the following basis. It is true that some data compression programs also allow people to produce "password protected archives" (with the notable exception of gzip, as mentioned in the gzip FAQ). But my understanding is that they do not "exploit this property" -- instead, those programs first compress the data, then use an unrelated encryption algorithm that would encrypt plain text just as well as compressed text. If there actually exists some compression+encryption algorithm that "exploits this property", please tell me about it or link to it. --68.0.120.35 01:01, 13 December 2006 (UTC)

Source coding merged here

I've redirected "source coding" to here. The term was already bolded as something the article laid claim to.
Data compression generally is most of source coding, but perhaps a paragraph might now be worthwhile on other aspects of source coding, e.g. the way source coding/data compression may make surprising symbols or runs of data more apparent, which may be interesting from a detection point of view. Jheald 21:45, 5 March 2007 (UTC)

LZMA in sitx?

I find the assertion that sitx uses LZMA to be extremely dubious. I can't find any other source on the web that supports this claim, and the Stuffit page seems to indicate that Stuffit X uses a variety of other methods. Anybody know where this idea came from? -Gsnixon 11:29, 6 March 2007 (UTC)

Citations

I was looking through this article for some basic beginner compression information, and although all of it matches up with what I already knew or at least makes sense... it appears to be drastically in need of some citations. I hesitate to tag it as one of my first Wikipedia actions, so I figured I would post first to see what others thought. Jaidan 16:48, 16 March 2007 (UTC)

Comparative

Copied from the French personal discussion page of the author of the comparative.

Interesting compression format/software comparison. Thanks for sharing your results. However, it is missing something important that I didn't find anywhere... What version of each program did you use for the tests? Specifically, what version of 7z did you test? Thanks again.
Oops. The article did mention v.4.43... I thought I looked everywhere.. Anyways, thanks for the info. One more question though: Did you save the test data and do you plan to keep updating the results as newer versions come out? For example, I know the new 7z 4.45 runs much faster on my dual core machine (but I have not tested if it compresses any better). I would be very interested in checking back here to see the progress of the different archivers. Well thanks again for a well done comparison :)
I have removed this section in its entirety as it fails to differentiate between compression algorithms and compression software (some of which use a variety of different algorithms), and because most of the data sets contain files which are already compressed. A better comparison would plot individual compression algorithms against uncompressed datasets of particular types (e.g., text, executables, raw bitmaps, raw audio). —Psychonaut 22:07, 12 July 2007 (UTC)
So I can't agree with "Globally, the three best methods tested are rk, uha and 7z." because the table says that the strongest compression is rk (41,699,633), followed by rar (43,593,482) and 7z (44,217,482). Can someone change the text? —Preceding unsigned comment added by 87.160.124.182 (talk) 13:31, August 30, 2007 (UTC)

I think the AVI column should be removed - AVI is a container file for any number of different compression systems. Contents could have been anything from uncompressed video to some form of MPEG or DV. —Preceding unsigned comment added by 82.5.204.14 (talk) 12:14, 9 September 2007 (UTC)

Comparative: copyright?

"P.Table(P.T.): PAQ8 (kgb archiver is Windows GUI of old PAQ7) is much better than this all, but for the copyright of the table it can't be copied."

If this is true, then the table should be removed. Text in Wikipedia is released under the GFDL. It's also not the best table in the world, as it includes formats like "AVI" which are highly codec-dependent. ⇌Elektron 17:10, 27 August 2007 (UTC)
Independent comparison of different methods of data compression (Results & Softwares, in French. Airelle, 2007). Numbers in parentheses are the rank of the method of compression for the category of file specified above.
(Missing table here) P.Table(P.T.): PAQ8 (kgb archiver is Windows GUI of old PAQ7) is much better than this all, but for the copyright of the table it can't be copied. Globally, the three best methods tested are rk, rar and 7z. WinRK and WinRar are commercial software, 7-zip is free, open source (LGPL licence) and can be used with Linux. —Preceding unsigned comment added by Elektron (talk • contribs) 16:00, 4 September 2007 (UTC)
Comparative (again)

This has too many issues. One of them is a private test case (mentioned before). Another is that it doesn't list both archiver and archive format; it just displays the archive format. The parameters used for compression are also unknown. These are quite important issues, because different archivers can archive to the same format but do so differently, and different parameters tend to lead to different ratios as well. This difference, while possibly small, is actually important. Additionally, as mentioned before, what's up with the copyright thing? And while I agree paq8o* probably compresses far better than everything else, unless we have solid evidence we should not publish these things. The copyright issue probably arose because someone wanted to include GFDL-incompatible info in the table, which, unless it counts as fair use, should not even be in Wikipedia. Someone really should fix this. And, in case you didn't know, rar, a command-line version of WinRAR, is also available as shareware for Linux, and WinRAR works under Wine. 7-Zip is not a Linux native, p7zip is. I'll fix this. --Yugsdrawkcabeht (talk) 15:41, 29 December 2007 (UTC)

Comparative (final)

I am removing the comparative section, because it clearly falls under WP:NOR and, as stated above, is kinda a bunch of crappy data anyways, i.e. .ZIP does not = WinZip, .AVIs are container files . . . if anyone has any strong objections it can always be restored, I guess. Though I will argue against any such action. --SelfStudyBuddyTALK-- 05:30, 15 June 2008 (UTC)
Transparent compression in file systems and archivers

It would be nice to have something on the above topic, with links to appropriate articles and categories. ((Category:Compression file systems)), Transparency (computing), Transparency (data compression), Data compression and ZIP (file format) come to mind. bkil (talk) 13:24, 5 August 2008 (UTC)