Trailing garbage in gzip file decompression ended

So I think a strong argument could be made for the ability to extract all the compressed data from a gzip file even if there is garbage appended. The question is, how would this support be added? Perhaps the mechanism chosen could also be integrated with a fix for the related issue. Some libraries and utilities seem to solve this problem incorrectly. Maybe a specific exception type to catch for an invalid header, and a better method to read the remaining buffer when handling it?

In Python 3, when the gzip module reaches the end of the compressed data, it starts from scratch, expecting a new gzip header; if it doesn't find one, it raises an exception. This is wrong. The module should not raise an exception if there's no gzip header the second time around; it should simply end the file. It should only raise an exception if there's no header the first time. It should set another flag to tell the two cases apart.
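For what it's worth, the behaviour described above (fail only if the first header is missing, otherwise stop quietly) can be sketched with the lower-level `zlib` module. This is only a rough sketch, not how the gzip module itself works: the function name `decompress_ignoring_trailing_garbage` is made up for illustration, and it assumes the whole file fits in memory.

```python
import gzip
import zlib
from typing import Tuple


def decompress_ignoring_trailing_garbage(raw: bytes) -> Tuple[bytes, bytes]:
    """Decompress every gzip member in raw; return (data, trailing_bytes).

    Hypothetical helper: raises only if the FIRST member is invalid.
    """
    out = []
    remaining = raw
    first = True
    while remaining:
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
        try:
            chunk = d.decompress(remaining)
        except zlib.error:
            if first:
                raise  # no valid gzip member at the start: a real error
            break      # junk after at least one good member: stop quietly
        if not d.eof:
            if first:
                raise EOFError("first gzip member is truncated")
            break      # incomplete later member is treated as trailing data
        out.append(chunk)
        remaining = d.unused_data  # bytes that follow this member
        first = False
    return b"".join(out), remaining


payload = gzip.compress(b"hello ") + gzip.compress(b"world") + b"\x00\x00garbage"
data, tail = decompress_ignoring_trailing_garbage(payload)
print(data, tail)
```

The second return value hands back whatever was left after the last complete member, so the caller can decide whether to inspect it or throw it away.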

There are other bugs in this module. For example, it seeks unnecessarily, causing it to fail on nonseekable streams, such as network sockets. This gives me very little confidence in this module: a developer who doesn't know that gzip needs to function without seeking is badly unqualified to implement it for the Python standard library.
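To illustrate the point about seeking: decompressing gzip data needs nothing more than forward reads. Here is a minimal sketch, again built on `zlib` rather than the gzip module. `iter_gunzip` and `ForwardOnly` are made-up names, and the sketch assumes a single gzip member and a stream object that only offers `read()` (for a socket, something like `sock.makefile("rb")` would provide that).

```python
import gzip
import io
import zlib


def iter_gunzip(stream, chunk_size=64 * 1024):
    """Yield decompressed chunks from a read-only stream (one gzip member)."""
    d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)  # gzip framing
    while not d.eof:
        block = stream.read(chunk_size)  # forward reads only, never seek()
        if not block:
            raise EOFError("stream ended before the gzip trailer")
        data = d.decompress(block)
        if data:
            yield data
    # Anything after the member ends up in d.unused_data and is ignored here.


class ForwardOnly:
    """Stand-in for a non-seekable source such as a socket file object."""

    def __init__(self, payload: bytes):
        self._buf = io.BytesIO(payload)

    def read(self, n: int) -> bytes:
        return self._buf.read(n)


source = ForwardOnly(gzip.compress(b"x" * 100000) + b"junk")
result = b"".join(iter_gunzip(source, chunk_size=16))
print(len(result))
```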

I had a similar problem in the past. I wrote a new module that works better with streams. You can try that out and see if it works for you.

I had exactly this problem, but none of these answers resolved my issue. So, here is what I did to solve the problem: I couldn't make it work with the above-mentioned techniques.

How can I work with Gzip files which contain extra data? Asked 10 years, 11 months ago.

Please look at your file using zless and see what the crap at the end is; that will probably help you understand it. If you need help with your xdebug, please ask. It's working for everybody else. I always want to hear about problems people are having.

If you're using WSL2 and having trouble, then ddev v1. Anyway, let's try to figure out what's wrong with your file. If you want to PM it to me I can also take a look at it. When using zless to inspect the.

If I extract using 7-Zip and go to the end of the. I guess also CC mdempsky who is the secondary owner on a bunch of other compress packages. The grammar in RFC does not allow for arbitrary junk at the end, since a gzip file must be a series of valid members section 2. It explicitly says that the "members simply appear one after another in the file, with no additional information before, between, or after them."

If you would like the ability to read a valid gzip member and specially handle non-gzip data after the member, then you can use the gzip package's Multistream feature to read the file one member at a time. Regardless of what other popular software does, that behavior is clearly non-compliant with the specification.

The bar for a new format to be added is very high. It would have to be a very popular format, and someone needs to be willing to own it for the long term. The formats I can potentially see being added are Brotli and Zstandard, but I think their popularity still has a ways to go before they would be considered for being added.

Even then, there's still the question of who's going to maintain it. An extra question, unrelated to this issue: I'm looking for LZ4 or Zstandard algorithms.

Is it possible to become a contributor to those compress packages? Any new format to be added needs to be approved as a Go proposal.


