Moved specification files to web page.
diff --git a/docs/aes_coding_tips.txt b/docs/aes_coding_tips.txt
deleted file mode 100644
index 052864a..0000000
--- a/docs/aes_coding_tips.txt
+++ /dev/null
@@ -1,253 +0,0 @@
-AES Coding Tips for Developers
-
-NOTE: WinZip® users do not need to read or understand the information
-contained on this page. It is intended for developers of Zip file utilities.
-
-This document contains information that may be helpful to developers and other
-interested parties who wish to support the AE-1 and AE-2 AES encryption formats
-in their own Zip file utilities. WinZip Computing makes no warranties regarding
-the information provided in this document. In particular, WinZip Computing does
-not represent or warrant that the information provided here is free from errors
-or is suitable for any particular use, or that the file formats described here
-will be supported in future versions of WinZip. You should test and validate
-all code and techniques in accordance with good programming practice.
-
-This information supplements the basic encryption specification document found
-here.
-
-This document assumes that you are using Dr. Brian Gladman's AES encryption
-package. Dr. Gladman has generously made public a sample application that
-demonstrates the use of his encryption/decryption and other routines, and the
-code samples shown below are derived from this sample application. Dr.
-Gladman's AES library and the sample application are available from the AES
-project page on Dr. Gladman's web site.
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-Generating a salt value
-
-Please read the discussion of salt values in the encryption specification.
-
-Dr. Gladman has provided a pseudo-random number generator in the files PRNG.C
-and PRNG.H. You may find this suitable for generating salt values. These files
-are included in the sample package available through the AES project page on
-Dr. Gladman's web site.
-
-Here are guidelines for using Dr. Gladman's generator. Note that the generator
-is used rather like an I/O stream: it is opened (initialized), used, and
-finally closed. To obtain the best results, it is recommended that you
-initialize the generator when your application starts and close it when your
-application closes. (If you are coding in C++, you may wish to wrap these
-actions in a C++ class that initializes the generator on construction and
-closes it on destruction.)
-
- 1. You will need to provide an entropy function in your code for
- initialization of the generator. The entropy function need not be
- particularly sophisticated for this use. Here is one possibility for such a
- function, based primarily upon the Windows performance counter:
-
-    #include <windows.h>
-
-    int entropy_fun(
-        unsigned char buf[],
-        unsigned int len)
-    {
-        unsigned __int64 pentium_tsc[1];
-        unsigned int i;
-        static unsigned int num = 0;
-        // this sample code returns the following sequence of entropy information
-        //  - the current 8-byte Windows performance counter value
-        //  - an 8-byte representation of the current date/time
-        //  - an 8-byte value built from the current process ID and thread ID
-        //  - all subsequent calls return the then-current 8-byte performance
-        //    counter value
-        switch (num)
-        {
-        case 1:
-            ++num;
-            // use a value that is unlikely to repeat across system reboots
-            GetSystemTimeAsFileTime((FILETIME *)pentium_tsc);
-            break;
-        case 2:
-            ++num;
-            {
-                // use a value that distinguishes between different instances of
-                // this code that might be running on different processors at the
-                // same time
-                unsigned __int32 processtest = GetCurrentProcessId();
-                unsigned __int32 threadtest = GetCurrentThreadId();
-                pentium_tsc[0] = processtest;
-                pentium_tsc[0] = (pentium_tsc[0] << 32) + threadtest;
-            }
-            break;
-        case 0:
-            ++num;
-            // fall through to default case
-        default:
-            // use a rapidly-changing value
-            // Note: check QueryPerformanceFrequency() first to
-            // ensure that QueryPerformanceCounter() will work.
-            QueryPerformanceCounter((LARGE_INTEGER *)pentium_tsc);
-            break;
-        }
-        // copy at most 8 bytes of the collected value into the caller's buffer
-        for (i = 0; i < 8 && i < len; ++i)
-            buf[i] = ((unsigned char *)pentium_tsc)[i];
-        return i;
-    }
-
- Note: the required prototype for the entropy function is defined in PRNG.H.
-
- 2. Initialize the generator by calling prng_init(), providing the addresses of
- your entropy function and of an instance of a prng_ctx structure (defined
- in PRNG.H). The prng_ctx variable maintains a context for the generator and
- is used as a parameter for the other generator functions. Therefore, the
- variable's state must be maintained until the generator is closed.
-
-    prng_ctx ctx;
-    prng_init(entropy_fun, &ctx);
-
- You only need to do this once per application session (as long as you keep
- the "stream" open).
-
- 3. To obtain a sequence of random bytes of arbitrary size, use prng_rand().
- This code obtains 16 random bytes, suitable for use as a salt value for
- 256-bit AES encryption:
-
-    unsigned char buffer[16];
-    prng_rand(buffer, sizeof(buffer), &ctx);
-
- Note that the ctx parameter is the same prng_ctx variable that was used in
- the initialization call.
-
- 4. When you are done with the generator (this would normally be when your
- application closes), close it by calling prng_end:
-
-    prng_end(&ctx);
-
- Again, the ctx parameter is the same prng_ctx variable that was used in the
- initialization call.
-
-Encryption and decryption
-
-The actual encryption and decryption of data are handled quite similarly, and
-again are rather stream-like: a stream is "opened", data is passed to it for
-encryption or decryption, and then it is closed. The password verifier is
-returned when the stream is opened, and the authentication code is returned
-when the stream is closed.
-
-Here is the basic technique:
-
- 1. Initialize the "stream" for encryption or decryption and obtain the
- password verification value.
-
- There is no difference in the initialization, regardless of whether you are
- encrypting or decrypting:
-
-    fcrypt_ctx zctx;              // the encryption context
-    int rc = fcrypt_init(
-        KeySize,                  // extra data value indicating key size
-        pszPassword,              // the password
-        strlen(pszPassword),      // number of bytes in password
-        achSALT,                  // the salt
-        achPswdVerifier,          // on return contains password verifier
-        &zctx);                   // encryption context
-
- The return value is 0 if the initialization was successful; non-zero values
- indicate errors. Note that passwords are null-terminated ANSI strings;
- embedded nulls must not be used. (To avoid incompatibilities between the
- various character sets in use, especially in different versions of Windows,
- users should be encouraged to use passwords containing only the "standard"
- characters in the range 32-127.)
-
- The function returns the password verification value in achPswdVerifier,
- which must be a 2-byte buffer. If you are encrypting, store this value in
- the Zip file as indicated by the encryption specification. If you are
- decrypting, compare this returned value to the value stored in the Zip
- file. If they are different, then either the password provided by your user
- was incorrect or the encrypted file has been altered in some way since it
- was encrypted. (Note that if they match, there is still a 1 in 65,536
- chance that an incorrect password was provided.)
-
- The initialized encryption context (zctx) is used as a parameter to the
- encryption/decryption functions. Therefore, its state must be maintained
- until the "stream" is closed.
-
- 2. Encrypt or decrypt the data.
-
- To encrypt:
-
-    fcrypt_encrypt(
-        pchData,      // pointer to the data to encrypt
-        cb,           // how many bytes to encrypt
-        &zctx);       // encryption context
-
- To decrypt:
-
-    fcrypt_decrypt(
-        pchData,      // pointer to the data to decrypt
-        cb,           // how many bytes to decrypt
-        &zctx);       // decryption context
-
- You may need to call the encrypt or decrypt function multiple times,
- passing in successive chunks of data in the buffer. For AE-1 and AE-2
- compatibility, the buffer size must be a multiple of 16 bytes except for
- the last buffer, which may be smaller. For efficiency, a larger buffer size
- such as 32,768 bytes would generally be used.
-
- Note: to encrypt zero-length files, simply skip this step. You will still
- obtain and use the password verifier (step 1) and authentication code (step
- 3).
-
- 3. Close the "stream" and obtain the authentication code.
-
- When encryption/decryption is complete, close the "stream" as follows:
-
-    int rc = fcrypt_end(
-        achMAC,       // on return contains the authentication code
-        &zctx);       // encryption context
-
- The return value is the size of the authentication code, which will always
- be 10 for AE-1 and AE-2. The authentication code itself is returned in your
- buffer at achMAC, which is an array of char, sized to hold at least 10
- characters. If you are encrypting, store this value in the Zip file as
- indicated by the encryption specification; if you are decrypting, compare
- this value to the value stored in the Zip file. If the values are
- different, either the password is incorrect or the encrypted data has been
- altered subsequent to storage.
-
- Note that decryption can fail even if the encrypted data is unaltered and
- the password verifier was correct in step 1. The password verifier is
- useful as a quick way to detect most incorrect passwords, but it is not
- perfect and on rare occasions (1 out of 65,536) it will fail to detect an
- incorrect password. It is therefore important for you to check the
- authentication code on completion even though the password verifier was
- correct.
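
The comparison of the authentication code in step 3 can be sketched as follows. This is only an illustration, not part of Dr. Gladman's library: the helper name mac_equal and the constant AES_MAC_LENGTH are hypothetical, and the choice to compare without an early exit (so that timing does not reveal where the first mismatch occurs) is a common hardening practice rather than something the text above requires.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the AE-1/AE-2 authentication code, per the text above. */
#define AES_MAC_LENGTH 10

/*
 * Compare the authentication code computed by fcrypt_end() with the one
 * stored in the Zip file. Every byte is examined, so the running time does
 * not depend on the position of the first difference.
 * Returns 1 if the codes are identical, 0 otherwise.
 * (Hypothetical helper; not part of Dr. Gladman's library.)
 */
int mac_equal(const unsigned char *computed,
              const unsigned char *stored,
              size_t len)
{
    unsigned char diff = 0;
    size_t i;

    assert(computed != NULL && stored != NULL);
    for (i = 0; i < len; ++i)
        diff |= (unsigned char)(computed[i] ^ stored[i]);
    return diff == 0;
}
```

A caller would pass the achMAC buffer filled in by fcrypt_end() and the 10 bytes read from the end of the stored file data, and treat a return value of 0 as a failed extraction.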
-
-Notes
-
- • Dr. Gladman's AES code depends on the byte order (little-endian or
- big-endian) used by the computing platform the code will run on. This is
- determined by a C preprocessor constant called PLATFORM_BYTE_ORDER, which
- is defined in the file AESOPT.H. You should be sure that
- PLATFORM_BYTE_ORDER gets the proper value for your platform; if it does
- not, you will need to define it yourself to the correct value. When using
- the Microsoft compiler on Intel platforms it does get the proper value,
- which on these platforms is AES_LITTLE_ENDIAN. We have, however, had a
- report that it does not default properly when Borland C++ Builder is used,
- and that manual assignment is necessary. For additional information on this
- topic, refer to the comments within AESOPT.H.
-
-Change history
-
-Changes made in document version 1.04, July, 2008:
-
- A. Sample Entropy Function
-
- The sample entropy function was changed to include information near the
- very beginning of the entropy stream that's unique to the day and to the
- process and thread.
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-Document version: 1.04
-Last modified: July 21, 2008
-
-Copyright(C) 2003-2016 WinZip International LLC.
-All Rights Reserved
diff --git a/docs/aes_info.txt b/docs/aes_info.txt
deleted file mode 100644
index c7995d5..0000000
--- a/docs/aes_info.txt
+++ /dev/null
@@ -1,607 +0,0 @@
-AES Encryption Information:
-Encryption Specification AE-1 and AE-2
-
-Document version: 1.04
-Last modified: January 30, 2009
-
-NOTE: WinZip® users do not need to read or understand the information
-contained on this page. It is intended for developers of Zip file utilities.
-
-Changes since the original version of this document are summarized in the
-Change History section below.
-
-This document describes the file format that WinZip uses to create
-AES-encrypted Zip files. The AE-1 encryption specification was introduced in
-WinZip 9.0 Beta 1, released in May 2003. The AE-2 encryption specification, a
-minor variant of the original AE-1 specification differing only in how the CRC
-is handled, was introduced in WinZip 9.0 Beta 3, released in January, 2004.
-Note that as of WinZip 11, WinZip itself encrypts most files using the AE-1
-format and encrypts others using the AE-2 format.
-
-From time to time we may update the information provided here, for example to
-document any changes to the file formats, or to add additional notes or
-implementation tips. If you would like to receive e-mail announcements of any
-substantive changes we make to this document, you can sign up below for our
-Developer Information mailing list.
-
-Without compromising the basic Zip file format, WinZip Computing has extended
-the format specification to support AES encryption, and this document fully
-describes the format extension. Additionally, we are providing information
-about a no-cost third-party source for the actual AES encryption code--the same
-code that is used by WinZip. We believe that use of the free encryption code
-and of this specification will make it easy for all developers to add
-compatible advanced encryption to their Zip file utilities.
-
-This document is not a tutorial on encryption or Zip file structure. While we
-have attempted to provide the necessary details of the current WinZip AES
-encryption format, developers and other interested third parties will need to
-have or obtain an understanding of basic encryption concepts, Zip file format,
-etc.
-
-Developers should also review the AES Coding Tips page.
-
-WinZip Computing makes no warranties regarding the information provided in this
-document. In particular, WinZip Computing does not represent or warrant that
-the information provided here is free from errors or is suitable for any
-particular use, or that the file formats described here will be supported in
-future versions of WinZip. You should test and validate all code and techniques
-in accordance with good programming practice.
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-Contents
-
-   I. Encryption services
-  II. Zip file format
-      A. Base format reference
-      B. Compression method and encryption flag
-      C. CRC value
-      D. AES extra data field
- III. Encrypted file storage format
-      A. File format
-      B. Salt value
-      C. Password verification value
-      D. Encrypted file data
-      E. Authentication code
-  IV. Changes in WinZip 11
-   V. Notes
-      A. Non-files and zero-length files
-      B. "Mixed" Zip files
-      C. Key generation
-  VI. FAQs
- VII. Change history
-VIII. Mailing list signup
-
-
-I. Encryption services
-
-To perform AES encryption and decryption, WinZip uses AES functions written by
-Dr. Brian Gladman. The source code for these functions is available in C/C++
-and Pentium family assembler for anyone to use under an open source BSD or GPL
-license from the AES project page on Dr. Gladman's web site. The AES Coding
-Tips page also has some information on the use of these functions. WinZip
-Computing thanks Dr. Gladman for making his AES functions available to anyone
-under liberal license terms.
-
-Dr. Gladman's encryption functions are portable to a number of operating
-systems and can be statically linked into your applications, so there are no
-operating system version or library dependencies. In particular, the functions
-do not require Microsoft's Cryptography API.
-
-General information on the AES standard and the encryption algorithm (also
-known as Rijndael) is readily available on the Internet. A good place to start
-is http://www.nist.gov/public_affairs/releases/g00-176.htm.
-
-II. Zip file format
-
- A. Base format reference
-
- AES-encrypted files are stored within the guidelines of the standard Zip
- file format using only a new "extra data" field, a new compression method
- code, and a value in the CRC field dependent on the encryption version,
- AE-1 or AE-2. The basic Zip file format is otherwise unchanged.
-
- WinZip sets the version needed to extract and version made by fields in the
- local and central headers to the same values it would use if the file were
- not encrypted.
-
- The basic Zip file format specification used by WinZip is available via FTP
- from the Info-ZIP group at
- ftp://ftp.info-zip.org/pub/infozip/doc/appnote-iz-latest.zip.
-
- B. Compression method and encryption flag
-
- As for any encrypted file, bit 0 of the "general purpose bit flags" field
- must be set to 1 in each AES-encrypted file's local header and central
- directory entry.
-
- Additionally, the presence of an AES-encrypted file in a Zip file is
- indicated by a new compression method code (decimal 99) in the file's local
- header and central directory entry, used along with the AES extra data
- field described below. There is no change in either the version made by or
- version needed to extract codes.
-
- The code for the actual compression method is stored in the AES extra data
- field (see below).
-
- C. CRC value
-
- For files encrypted using the AE-2 method, the standard Zip CRC value is
- not used, and a 0 must be stored in this field. Corruption of encrypted
- data within a Zip file is instead detected via the authentication code
- field.
-
- Files encrypted using the AE-1 method do include the standard Zip CRC
- value. This, along with the fact that the vendor version stored in the AES
- extra data field is 0x0001 for AE-1 and 0x0002 for AE-2, is the only
- difference between the AE-1 and AE-2 formats.
-
- NOTE: Zip utilities that support the AE-2 format are required to be able to
- read files that were created in the AE-1 format, and during decryption/
- extraction of files in AE-1 format should verify that the file's CRC
- matches the value stored in the CRC field.
-
- D. AES extra data field
- 1. A file encrypted with AES encryption will have a special "extra data"
- field associated with it. This extra data field is stored in both the
- local header and central directory entry for the file.
-
- Note: see the Zip file format document referenced above for general
- information on the format and use of extra data fields.
-
- 2. The extra data header ID for AES encryption is 0x9901. The fields are
- all stored in Intel low-byte/high-byte order. The extra data field
- currently has a length of 11: seven data bytes plus two bytes for the
- header ID and two bytes for the data size. Therefore, the extra data
- overhead for each file in the archive is 22 bytes (11 bytes in the
- central header plus 11 bytes in the local header).
- 3. The format of the data in the AES extra data field is as follows. See
- the notes below for additional information.
-
-     Offset   Size (bytes)   Content
-
-     0        2              Extra field header ID (0x9901)
-     2        2              Data size (currently 7, but subject to
-                             possible increase in the future)
-     4        2              Integer version number specific to the
-                             zip vendor
-     6        2              2-character vendor ID
-     8        1              Integer mode value indicating AES
-                             encryption strength
-     9        2              The actual compression method used to
-                             compress the file
-
- 4. Notes
- ☆ Data size: this value is currently 7, but because it is possible
- that this specification will be modified in the future to store
- additional data in this extra field, vendors should not assume that
- it will always remain 7.
- ☆ Vendor ID: the vendor ID field should always be set to the two
- ASCII characters "AE".
- ☆ Vendor version: the vendor version for AE-1 is 0x0001. The vendor
- version for AE-2 is 0x0002.
-
- Zip utilities that support AE-2 must also be able to process files
- that are encrypted in AE-1 format. The handling of the CRC value is
- the only difference between the AE-1 and AE-2 formats.
-
- ☆ Encryption strength: the mode values (encryption strength) for AE-1
- and AE-2 are:
-
-     Value   Strength
-
-     0x01    128-bit encryption key
-     0x02    192-bit encryption key
-     0x03    256-bit encryption key
-
- The encryption specification supports only 128-, 192-, and 256-bit
- encryption keys. No other key lengths are permitted.
-
- (Note: the current version of WinZip does not support encrypting
- files using 192-bit keys. This specification, however, does provide
- for the use of 192-bit keys, and WinZip is able to decrypt such
- files.)
-
- ☆ Compression method: the compression method is the one that would
- otherwise have been stored in the local and central headers for the
- file. For example, if the file is imploded, this field will contain
- the compression code 6. This is needed because a compression method
- of 99 is used to indicate the presence of an AES-encrypted file
- (see above).
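
The field layout described above can be read with a few lines of C. The sketch below is an illustration under stated assumptions: the type aes_extra_field and the functions read_le16 and parse_aes_extra_field are hypothetical names, and the decision to accept a data size larger than 7 follows the note that the size may grow in the future. The buffer is assumed to start at the extra field's header ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES_EXTRA_HEADER_ID 0x9901u

/* Decoded contents of the AES extra data field (hypothetical type). */
typedef struct {
    uint16_t vendor_version;  /* 0x0001 = AE-1, 0x0002 = AE-2 */
    char     vendor_id[2];    /* expected to be "AE" */
    uint8_t  mode;            /* 1 = 128-bit, 2 = 192-bit, 3 = 256-bit */
    uint16_t method;          /* the actual compression method */
} aes_extra_field;

/* Read a 16-bit value stored in Intel low-byte/high-byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}

/*
 * Parse an AES extra data field starting at buf (header ID included).
 * Returns 0 on success, -1 if the bytes do not form a valid field.
 */
int parse_aes_extra_field(const unsigned char *buf, size_t len,
                          aes_extra_field *out)
{
    uint16_t data_size;

    assert(buf != NULL && out != NULL);
    if (len < 11 || read_le16(buf) != AES_EXTRA_HEADER_ID)
        return -1;
    data_size = read_le16(buf + 2);
    /* Currently 7, but larger values are tolerated for future versions. */
    if (data_size < 7 || (size_t)data_size + 4 > len)
        return -1;
    out->vendor_version = read_le16(buf + 4);
    out->vendor_id[0] = (char)buf[6];
    out->vendor_id[1] = (char)buf[7];
    out->mode = buf[8];
    out->method = read_le16(buf + 9);
    if (out->vendor_id[0] != 'A' || out->vendor_id[1] != 'E')
        return -1;
    if (out->mode < 1 || out->mode > 3)
        return -1;
    return 0;
}
```

Checking that vendor_version is 0x0001 or 0x0002 is left to the caller in this sketch, since that value also decides how the CRC field is handled.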
-
-III. Encrypted file storage format
-
- A. File format
-
- Additional overhead data required for decryption is stored with the
- encrypted file itself (i.e., not in the headers). The actual format of the
- stored file is as follows; additional information about these fields is
- below. All fields are byte-aligned.
-
-     Size (bytes)   Content
-
-     Variable       Salt value
-     2              Password verification value
-     Variable       Encrypted file data
-     10             Authentication code
-
- Note that the value in the "compressed size" fields of the local file
- header and the central directory entry is the total size of all the items
- listed above. In other words, it is the total size of the salt value,
- password verification value, encrypted data, and authentication code.
-
- B. Salt value
-
- The "salt" or "salt value" is a random or pseudo-random sequence of bytes
- that is combined with the encryption password to create encryption and
- authentication keys. The salt is generated by the encrypting application
- and is stored unencrypted with the file data. The addition of salt values
- to passwords provides a number of security benefits and makes dictionary
- attacks based on precomputed keys much more difficult.
-
- Good cryptographic practice requires that a different salt value be used
- for each of multiple files encrypted with the same password. If two files
- are encrypted with the same password and salt, they can leak information
- about each other. For example, it is possible to determine whether two
- files encrypted with the same password and salt are identical, and an
- attacker who somehow already knows the contents of one of two files
- encrypted with the same password and salt can determine some or all of the
- contents of the other file. Therefore, you should make every effort to use
- a unique salt value for each file.
-
- The size of the salt value depends on the length of the encryption key, as
- follows:
-
-     Key size   Salt size
-
-     128 bits   8 bytes
-     192 bits   12 bytes
-     256 bits   16 bytes
-
- C. Password verification value
-
- This two-byte value is produced as part of the process that derives the
- encryption and decryption keys from the password. When encrypting, a
- verification value is derived from the encryption password and stored with
- the encrypted file. Before decrypting, a verification value can be derived
- from the decryption password and compared to the value stored with the
- file, serving as a quick check that will detect most, but not all,
- incorrect passwords. There is a 1 in 65,536 chance that an incorrect
- password will yield a matching verification value; therefore, a matching
- verification value cannot be absolutely relied on to indicate a correct
- password.
-
- Information on how to obtain the password verification value from Dr.
- Gladman's encryption library can be found on the coding tips page.
-
- This value is stored unencrypted.
-
- D. Encrypted file data
-
- Encryption is applied only to the content of files. It is performed after
- compression, and not to any other associated data. The file data is
- encrypted byte-for-byte using the AES encryption algorithm operating in
- "CTR" mode, which means that the lengths of the compressed data and the
- compressed, encrypted data are the same.
-
- It is important for implementors to note that, although the data is
- encrypted byte-for-byte, it is presented to the encryption and decryption
- functions in blocks. The block size used for encryption and decryption must
- be the same. To be compatible with the encryption specification, this block
- size must be 16 bytes (although the last block may be smaller).
-
- E. Authentication code
-
- Authentication provides a high quality check that the contents of an
- encrypted file have not been inadvertently changed or deliberately tampered
- with since they were first encrypted. In effect, this is a super-CRC check
- on the data in the file after compression and encryption. (Additionally,
- authentication is essential when using CTR mode encryption because this
- mode is vulnerable to several trivial attacks in its absence.)
-
- The authentication code is derived from the output of the encryption
- process. Dr. Gladman's AES code provides this service, and information
- about how to obtain it is in the coding tips.
-
- The authentication code is stored unencrypted. It is byte-aligned and
- immediately follows the last byte of encrypted data.
-
- For more discussion about authentication, see the authentication code FAQ
- below.
-
-IV. Changes in WinZip 11
-
-Beginning with WinZip 11, WinZip makes a change in its use of the AE-1 and AE-2
-file formats. The file formats themselves have not changed, and AES-encrypted
-files created by WinZip 11 are completely compatible with version 1.02 of the
-WinZip AES encryption specification, which was published in January 2004.
-
-WinZip 9.0 and WinZip 10.0 stored all AES-encrypted files using the AE-2 file
-format, which does not store the encrypted file's CRC. WinZip 11 instead uses
-the AE-1 file format, which does store the CRC, for most files. This provides
-an extra integrity check against the possibility of hardware or software errors
-that occur during the actual process of file compression/encryption or
-decryption/decompression. For more information on this point, see the
-discussion of the CRC below.
-
-Because for some very small files the CRC can be used to determine the exact
-contents of a file, regardless of the encryption method used, WinZip 11
-continues to use the AE-2 file format, with no CRC stored, for files with an
-uncompressed size of less than 20 bytes. WinZip 11 also uses the AE-2 file
-format for files compressed in BZIP2 format, because the BZIP2 format contains
-its own integrity checks equivalent to those provided by the Zip format's CRC.
-
-Other vendors who support WinZip's AES encryption specification may want to
-consider making a similar change to their own implementations of the
-specification, to get the benefit of the extra integrity check that it
-provides.
-
-Note that the January 2004 version of the WinZip AE-2 specification, version
-1.02, already required that all utilities implementing the AE-2 format also be
-able to process files in AE-1 format, and that on decryption/extraction of
-those files they check that the CRC is correct.
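
The selection rule that this section attributes to WinZip 11 can be sketched in C as follows. This is an illustration only: the function names and the AE_VERSION_* constants are hypothetical, and the value 12 is the BZIP2 compression method code from the Zip format's method list. Other vendors are free to choose a different policy.

```c
#include <assert.h>
#include <stdint.h>

#define AE_VERSION_1     1u   /* AE-1: CRC is stored */
#define AE_VERSION_2     2u   /* AE-2: CRC field is 0 */
#define ZIP_METHOD_BZIP2 12u  /* Zip compression method code for BZIP2 */

/*
 * Pick the vendor version for a file being encrypted, following the rule
 * described above: AE-2 for files under 20 bytes and for BZIP2-compressed
 * files, AE-1 for everything else.
 */
unsigned choose_vendor_version(uint64_t uncompressed_size, unsigned method)
{
    if (uncompressed_size < 20 || method == ZIP_METHOD_BZIP2)
        return AE_VERSION_2;
    return AE_VERSION_1;
}

/*
 * On extraction, report whether the CRC field must be compared with the
 * CRC of the decrypted, decompressed data (1) or ignored (0).
 */
int must_verify_crc(unsigned vendor_version)
{
    assert(vendor_version == AE_VERSION_1 || vendor_version == AE_VERSION_2);
    return vendor_version == AE_VERSION_1;
}
```

The authentication code is checked in both cases; the CRC check for AE-1 files is an additional step performed after decryption and decompression.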
-
-V. Notes
-
- A. Non-files and zero-length files
-
- To reduce Zip file size, it is recommended that non-file entries such as
- folder/directory entries not be encrypted. This, however, is only a
- recommendation; it is permissible to encrypt or not encrypt these entries,
- as you prefer.
-
- On the other hand, it is recommended that you do encrypt zero-length files.
- The presence of both encrypted and unencrypted files in a Zip file may
- trigger user warnings in some Zip file utilities, so the user experience
- may be improved if all files (including zero-length files) are encrypted.
-
- If zero-length files are encrypted, the encrypted data portion of the file
- storage (see above) will be empty, but the remainder of the encryption
- overhead data must be present, both in the file storage area and in the
- local and central headers.
-
- B. "Mixed" Zip files
-
- There is no requirement that all files in a Zip file be encrypted or that
- all files that are encrypted use the same encryption method or the same
- password.
-
- A Zip file can contain any combination of unencrypted files and files
- encrypted with any of the four currently defined encryption methods (Zip
- 2.0, AES-128, AES-192, AES-256). Encrypted files may use the same password
- or different passwords.
-
- C. Key generation
-
- Key derivation, as used by AE-1 and AE-2 and as implemented in Dr.
- Gladman's library, is done according to the PBKDF2 algorithm, which is
- described in RFC 2898. An iteration count of 1000 is used. An appropriate
- number of bits from the resulting hash value are used to compose three
- output values: an encryption key, an authentication key, and a password
- verification value. The first n bits (n being the AES key size) become the
- encryption key, the next m bits become the authentication key, and the last
- 16 bits (two bytes) become the password verification value.
-
- As part of the process outlined in RFC 2898 a pseudo-random function must
- be called; AE-1 and AE-2 use the HMAC-SHA1 function, since it is a
- well-respected algorithm that has been in wide use for this purpose for
- several years.
-
- Note that, when used in connection with 192- or 256-bit AES encryption, the
- fact that HMAC-SHA1 produces a 160-bit result means that, regardless of the
- password that you specify, the search space for the encryption key is
- unlikely to reach the theoretical 192- or 256-bit maximum, and cannot be
- guaranteed to exceed 160 bits. This is discussed in section B.1.1 of the
- RFC 2898 specification.
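
The way the PBKDF2 output is divided can be sketched as a small layout calculation. The text above does not state the length of the authentication key; the sketch assumes it equals the encryption key length, which is how Dr. Gladman's file encryption code is understood to split the derived bytes, and this assumption should be checked against the library source. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offsets and lengths within the PBKDF2 output (hypothetical type). */
typedef struct {
    size_t enc_key_offset,  enc_key_len;   /* AES encryption key */
    size_t auth_key_offset, auth_key_len;  /* HMAC-SHA1 authentication key */
    size_t verifier_offset, verifier_len;  /* password verification value */
    size_t total_len;                      /* bytes to request from PBKDF2 */
} derived_key_layout;

/*
 * Fill in the layout for mode 1, 2 or 3 (128-, 192-, 256-bit keys).
 * ASSUMPTION: the authentication key has the same length as the
 * encryption key.
 * Returns 0 on success, -1 for an invalid mode.
 */
int derived_key_layout_for_mode(unsigned mode, derived_key_layout *out)
{
    size_t key_len;

    assert(out != NULL);
    if (mode < 1 || mode > 3)
        return -1;
    key_len = 8u * (mode + 1u);            /* 16, 24 or 32 bytes */
    out->enc_key_offset  = 0;
    out->enc_key_len     = key_len;
    out->auth_key_offset = key_len;
    out->auth_key_len    = key_len;
    out->verifier_offset = 2 * key_len;
    out->verifier_len    = 2;
    out->total_len       = 2 * key_len + 2;
    return 0;
}
```

Under this assumption, 256-bit encryption requests 66 bytes from PBKDF2 (with 1000 iterations of HMAC-SHA1), and the last two of those bytes are the password verification value.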
-
-VI. FAQs
-
- • Why is the compression method field used to indicate AES encryption?
-
- As opposed to using new version made by and version needed to extract
- values to signal AES encryption for a file, the new compression method is
- more likely to be handled gracefully by older versions of existing Zip file
- utilities. Zip file utilities typically do not attempt to extract files
- compressed with unknown methods, presumably notifying the user with an
- appropriate message.
-
- • How can I guarantee that the salt value is unique?
-
- In principle, the value of the salt should be different whenever the same
- password is used more than once, for the reasons described above, but this
- is difficult to guarantee.
-
- In practice, the number of bytes in the salt (as specified by AE-1 and
- AE-2) is such that using a pseudo-random value will ensure that the
- probability of duplicated salt values is very low and can be safely
- ignored.
-
- There is one exception to this: With the 8-byte salt values used with
- WinZip's 128-bit encryption it is likely that, if approximately 4 billion
- files are encrypted with the same password, two of the files will have the
- same salt, so it is advisable to stay well below this limit. Because of
- this, when using the same password to encrypt very large numbers of files
- in WinZip's AES encryption format (that is, files totalling in the
- millions, for example 2000 Zip files, each containing 1000 encrypted
- files), we recommend the use of 192-bit or 256-bit AES keys, with their 12-
- and 16-byte salt values, rather than 128-bit AES keys, with their 8-byte
- salt values.
-
- Although salt values do not need to be truly random, it is important that
- they be generated in a way that the probability of duplicated salt values
- is not significantly higher than that which would be expected if truly
- random values were being used.
-
- One technique for generating salt values is presented in the coding tips
- page.
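
The figures in this answer follow from the usual birthday estimate. As a rough sketch, assuming salts are uniformly random, the expected number of pairs of files sharing a salt is n(n-1)/2 divided by the number of possible salt values. The function name below is hypothetical.

```c
#include <assert.h>

/*
 * Expected number of colliding pairs among n uniformly random salts of
 * salt_bytes bytes each: n * (n - 1) / 2 divided by 2^(8 * salt_bytes).
 * Values well below 1 mean that a duplicate salt is unlikely.
 */
double expected_salt_collisions(double n, unsigned salt_bytes)
{
    double space = 1.0;   /* number of possible salt values */
    unsigned i;

    assert(n >= 0.0);
    for (i = 0; i < 8u * salt_bytes; ++i)
        space *= 2.0;
    return n * (n - 1.0) / 2.0 / space;
}
```

With 8-byte salts, about 4.29 billion files (2^32) give an expected value of roughly 0.5, which is where a duplicate becomes plausible. The same number of files with 16-byte salts gives a value far below 1.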
-
- • Why is there an authentication code?
-
- The purpose of the authentication code is to ensure that, once a file's
- data has been compressed and encrypted, any accidental corruption of the
- encrypted data, and any deliberate attempts to modify the encrypted data by
- an attacker who does not know the password, can be detected.
-
- The current consensus in the cryptographic community is that associating a
- message authentication code (or MAC) with encrypted data has strong
- security value because it makes a number of attacks more difficult to
- engineer. For AES CTR mode encryption in particular, a MAC is especially
- important because a number of trivial attacks are possible in its absence.
- The MAC used with WinZip's AES encryption is based on HMAC-SHA1-80, a
- mature and widely respected authentication algorithm.
-
- The MAC is calculated after the file data has been compressed and
- encrypted. This order of calculation is referred to as Encrypt-then-MAC,
- and is preferred by many cryptographers to the alternative order of
- MAC-then-Encrypt because Encrypt-then-MAC is immune to some known attacks
- on MAC-then-Encrypt.
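-
- The Encrypt-then-MAC order can be sketched with Python's standard hmac
- module. This is a minimal illustration, not WinZip's implementation: the
- function names are hypothetical, the derivation of mac_key from the
- password and the AES CTR step are omitted, and the truncation assumes
- that the 80-bit code is the leftmost 10 bytes of the HMAC-SHA1 output.
-
```python
import hashlib
import hmac

MAC_LENGTH = 10  # 80 bits out of the 20-byte HMAC-SHA1 result


def authentication_code(mac_key: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA1 over the already-encrypted data, truncated to 80 bits."""
    return hmac.new(mac_key, ciphertext, hashlib.sha1).digest()[:MAC_LENGTH]


def verify_before_decrypt(mac_key: bytes, ciphertext: bytes, stored: bytes) -> bool:
    """Compare in constant time; decrypt only if this returns True."""
    return hmac.compare_digest(authentication_code(mac_key, ciphertext), stored)
```
-
- Because the code covers the ciphertext, a reader can reject a modified
- file without running the decryption or decompression code on it at all.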
-
- • What is the role of the CRC in WinZip 11?
-
- Within the Zip format, the primary use of the CRC value is to detect
- accidental corruption of data that has been stored in the Zip file. With
- files encrypted according to the Zip 2.0 encryption specification, it also
- functions to some extent as a method of detecting deliberate attempts to
- modify the encrypted data, but not one that can be considered
- cryptographically strong. The CRC is not needed for these purposes with the
- WinZip AES encryption specification, where the HMAC-SHA1-based
- authentication code instead serves these roles.
-
- The CRC has a drawback in that for very small files, such as files with
- four or fewer bytes, the CRC can be used, independent of the encryption
- algorithm, to determine the unencrypted contents of the file. And, in
- general, it is preferable to store as little information as possible about
- the encrypted file in the unencrypted Zip headers.
-
- The CRC does serve one purpose that the authentication code does not. The
- CRC is computed based on the original uncompressed, unencrypted contents of
- the file, and it is checked after the file has been decrypted and
- decompressed. In contrast, the authentication code used with WinZip AES
- encryption is computed after compression/encryption and it is checked
- before decryption/decompression. In the very rare event of a hardware or
- software error that corrupts data during compression and encryption, or
- during decryption and decompression, the CRC will catch the error, but the
- authentication code will not.
-
- WinZip 9.0 and WinZip 10.0 used AE-2 for all files that they created, and
- did not store the CRC. As of WinZip 11, WinZip instead uses AE-1 for most
- files, storing the CRC as an additional integrity check against hardware or
- software errors occurring during the actual compression/encryption or
- decryption/decompression processes. WinZip 11 will continue to use AE-2,
- with no CRC, for very small files of less than 20 bytes. It will also use
- AE-2 for files compressed in BZIP2 format, because this format has internal
- integrity checks equivalent to a CRC check built in.
-
- Note that the AES-encrypted files created by WinZip 11 are fully compatible
- with January 2004's version 1.0.2 of the WinZip AES encryption
- specification, in which both the AE-1 and AE-2 variants of the file format
- were already defined.
-
-VII. Change history
-
-Changes made in document version 1.04, January, 2009: Minor clarification
-regarding the algorithm used to generate the authentication code.
-
-Changes made in document version 1.03, November, 2006: Minor editorial and
-clarifying changes have been made throughout the document. The following
-substantive technical changes have been made:
-
- A. WinZip 11 Usage of AE-1 and AE-2
-
- WinZip's AES encryption specification defines two formats, known as AE-1
- and AE-2, which differ in whether the CRC of the encrypted file is stored
- in the Zip headers. While the file formats themselves remain unchanged,
- WinZip's usage of them is changing. Beginning with WinZip 11, WinZip uses
- the AE-1 format, which includes the CRC of the encrypted file, for many
- encrypted files, in order to provide an additional integrity check against
- hardware or software errors occurring during the compression/encryption or
- decryption/decompression processes. Note that AES-encrypted files created
- by WinZip 11 are completely compatible with the previous version of the
- WinZip encryption specification, January 2004's version 1.0.2.
-
- B. The discussion of salt values mentions a limitation that applies to the
- uniqueness of salt values when very large numbers of files are encrypted
- with 128-bit encryption.
- C. Older versions of this specification suggested that other vendors might
- want to use their own vendor IDs to create their own unique encryption
- formats. We no longer suggest that vendor-specific alternative encryption
- methods be created in this way.
-
-Changes made in document version 1.02, January, 2004: The introductory text at
-the start of the document has been rewritten, and minor editorial and
-clarifying changes have been made throughout the document. Two substantive
-technical changes have been made:
-
- A. AE-2 Specification
-
- Standard Zip files store the CRC of each file's unencrypted data. This
- value is used to help detect damage or other alterations to Zip files.
- However, storing the CRC value has a drawback in that, for a very small
- file, such as a file of four or fewer bytes, the CRC value can be used,
- independent of the encryption algorithm, to help determine the unencrypted
- contents of the file.
-
- Because of this, files encrypted with the new AE-2 method store a 0 in the
- CRC field of the Zip header, and use the authentication code instead of the
- CRC value to verify that encrypted data within the Zip file has not been
- corrupted.
-
- The only differences between the AE-1 and AE-2 methods are the storage in
- AE-2 of 0 instead of the CRC in the Zip file header, and the use in the AES
- extra data field of 0x0002 for AE-2 instead of 0x0001 for AE-1 as the
- vendor version.
-
- Zip utilities that support the AE-2 format are required to be able to read
- files that were created in the AE-1 format, and during decryption/
- extraction of files in AE-1 format should verify that the file's CRC
- matches the value stored in the CRC field.
-
- B. Key Generation and HMAC-SHA1
-
- The description of the key generation mechanism has been updated to point
- out a limitation arising from its use of HMAC-SHA1 as the pseudo-random
- function: When used in connection with 192- or 256-bit AES encryption, the
- fact that HMAC-SHA1 produces a 160-bit result means that, regardless of the
- password that you specify, the search space for the encryption key is
- unlikely to reach the theoretical 192- or 256-bit maximum, and cannot be
- guaranteed to exceed 160 bits. This is discussed in section B.1.1 of the
- RFC2898 specification.
-
-VIII. Developer Information Mailing List Signup
-
-We plan to use this mailing list to notify subscribers of any substantive
-changes made to the Developer Information pages on the WinZip web site.
-
-
-
- If you enter your e-mail address above, you will receive a message
- asking you to confirm your wish to be added to the mailing list. If you
- don't reply to the confirmation message, you will not be added to the
- list.
-
- By subscribing to this complimentary mailing list service, you
- acknowledge and agree that WinZip Computing makes no representations
- regarding the completeness or accuracy of the information provided
- through the service, and that this service may be discontinued, in whole
- or in part, with respect to any or all subscribers at any time.
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-Document version: 1.04
-Last modified: January 30, 2009
-
-Copyright(C) 2003-2015 WinZip International LLC.
-All Rights Reserved
diff --git a/docs/appnote.iz b/docs/appnote.iz
deleted file mode 100644
index eed5663..0000000
--- a/docs/appnote.iz
+++ /dev/null
@@ -1,3686 +0,0 @@
-[Info-ZIP note, 20040528: this file is based on PKWARE's appnote.txt of
- 15 February 1996, taking into account PKWARE's revised appnote.txt
- version 6.2.0 of 26 April 2004. It has been unofficially corrected
- and extended by Info-ZIP without explicit permission by PKWARE.
- Although Info-ZIP believes the information to be accurate and complete,
- it is provided under a disclaimer similar to the PKWARE disclaimer below,
- differing only in the substitution of "Info-ZIP" for "PKWARE". In other
- words, use this information at your own risk, but we think it's correct.
-
- Specification info from PKWARE that was obviously wrong has been corrected
- silently (e.g. missing structure fields, wrong numbers).
- As of PKZIPW 2.50, two new incompatibilities have been introduced by PKWARE;
- they are noted below. Note that the "NTFS tag" conflict is currently not
- real; PKZIPW 2.50 actually tags NTFS files as having come from a FAT
- file system, too.]
-
-File: APPNOTE.TXT - .ZIP File Format Specification
-Version: 6.2.0 - NOTIFICATION OF CHANGE
-Revised: 04/26/2004 [2004-05-28 Info-ZIP]
-Copyright (c) 1989 - 2004 PKWARE Inc., All Rights Reserved.
-
-I. Purpose
-----------
-
-This specification is intended to define a cross-platform,
-interoperable file format. Since its first publication
-in 1989, PKWARE has remained committed to ensuring the
-interoperability of the .ZIP file format through this
-specification. We trust that all .ZIP compatible vendors
-and application developers that have adopted this format
-will share and support this commitment.
-
-
-II. Disclaimer
---------------
-
-Although PKWARE will attempt to supply current and accurate
-information relating to its file formats, algorithms, and the
-subject programs, the possibility of error or omission can not
-be eliminated. PKWARE therefore expressly disclaims any warranty
-that the information contained in the associated materials relating
-to the subject programs and/or the format of the files created or
-accessed by the subject programs and/or the algorithms used by
-the subject programs, or any other matter, is current, correct or
-accurate as delivered. Any risk of damage due to any possible
-inaccurate information is assumed by the user of the information.
-Furthermore, the information relating to the subject programs
-and/or the file formats created or accessed by the subject
-programs and/or the algorithms used by the subject programs is
-subject to change without notice.
-
-If the version of this file is marked as a NOTIFICATION OF CHANGE,
-the content defines an Early Feature Specification (EFS) change
-to the .ZIP file format that may be subject to modification prior
-to publication of the Final Feature Specification (FFS). This
-document may also contain information on Planned Feature
-Specifications (PFS) defining recognized future extensions.
-
-
-III. Change Log
----------------
-
-Version       Change Description                      Date
--------       ------------------                      ----------
-5.2           -Single Password Symmetric Encryption   06/02/2003
-               storage
-
-6.1.0         -Smart Card compatibility               01/20/2004
-              -Documentation on certificate storage
-
-6.2.0         -Introduction of Central Directory      04/26/2004
-               Encryption for encrypting metadata
-              -Added OS/X to Version Made By values
-
-
-IV. General Format of a .ZIP file
----------------------------------
-
- Files are stored in arbitrary order. Large .ZIP files can span multiple
- diskette media or be split into user-defined segment sizes. [The
- minimum user-defined segment size for a split .ZIP file is 64K.
- (removed by PKWare 2003-06-01)]
-
- Overall .ZIP file format:
-
- [local file header 1]
- [file data 1]
- [data descriptor 1]
- .
- .
- .
- [local file header n]
- [file data n]
- [data descriptor n]
- [archive decryption header] (EFS)
- [archive extra data record] (EFS)
- [central directory]
- [zip64 end of central directory record]
- [zip64 end of central directory locator]
- [end of central directory record]
-
-
- A. Local file header:
-
- local file header signature     4 bytes  (0x04034b50)
- version needed to extract       2 bytes
- general purpose bit flag        2 bytes
- compression method              2 bytes
- last mod file time              2 bytes
- last mod file date              2 bytes
- crc-32                          4 bytes
- compressed size                 4 bytes
- uncompressed size               4 bytes
- file name length                2 bytes
- extra field length              2 bytes
-
- file name                       (variable size)
- extra field                     (variable size)
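-
- The fixed part of the local file header is 30 bytes and can be unpacked in
- one call. The sketch below uses Python's struct module; the function and
- dictionary key names are illustrative choices, and it assumes the
- little-endian byte order described later for extra fields applies to all
- header fields.
-
```python
import struct

LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")  # 30 fixed bytes, little-endian
LOCAL_SIGNATURE = 0x04034B50


def parse_local_header(buf: bytes, offset: int = 0) -> dict:
    """Decode one local file header that starts at buf[offset]."""
    (sig, version_needed, flags, method, mod_time, mod_date,
     crc32, comp_size, uncomp_size, name_len, extra_len) = \
        LOCAL_HEADER.unpack_from(buf, offset)
    if sig != LOCAL_SIGNATURE:
        raise ValueError("not a local file header")
    name_start = offset + LOCAL_HEADER.size
    extra_start = name_start + name_len
    return {
        "version_needed": version_needed,
        "flags": flags,
        "method": method,
        "mod_time": mod_time,
        "mod_date": mod_date,
        "crc32": crc32,
        "compressed_size": comp_size,
        "uncompressed_size": uncomp_size,
        "file_name": buf[name_start:extra_start],
        "extra_field": buf[extra_start:extra_start + extra_len],
        "data_offset": extra_start + extra_len,  # where the file data begins
    }
```
-
- The sketch does not handle the cases where the sizes and CRC are zero or
- masked in the local header (general purpose bits 3 and 13); a real reader
- would take those values from the central directory instead.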
-
-
- B. File data
-
- Immediately following the local header for a file
- is the compressed or stored data for the file.
- The series of [local file header][file data][data
- descriptor] repeats for each file in the .ZIP archive.
-
-
- C. Data descriptor:
-
- [Info-ZIP discrepancy:
- The Info-ZIP zip program starts the data descriptor with a 4-byte
- PK-style signature. Despite the specification, none of the PKWARE
- programs supports the data descriptor. PKZIP 4.0 -fix function
- (and PKZIPFIX 2.04) ignores the data descriptor info even when bit 3
- of the general purpose bit flag is set.
- data descriptor signature       4 bytes  (0x08074b50)
- ]
- crc-32                          4 bytes
- compressed size                 4 bytes
- uncompressed size               4 bytes
-
- This descriptor exists only if bit 3 of the general
- purpose bit flag is set (see below). It is byte aligned
- and immediately follows the last byte of compressed data.
- This descriptor is used only when it was not possible to
- seek in the output .ZIP file, e.g., when the output .ZIP file
- was standard output or a non seekable device. For Zip64 format
- archives, the compressed and uncompressed sizes are 8 bytes each.
-
-
- D. Archive decryption header: (EFS)
-
- The Archive Decryption Header is introduced in version 6.2
- of the ZIP format specification. This record exists in support
- of the Central Directory Encryption Feature implemented as part of
- the Strong Encryption Specification as described in this document.
- When the Central Directory Structure is encrypted, this decryption
- header will precede the encrypted data segment. The encrypted
- data segment will consist of the Archive extra data record (if
- present) and the encrypted Central Directory Structure data.
- The format of this data record is identical to the Decryption
- header record preceding compressed file data. If the central
- directory structure is encrypted, the location of the start of
- this data record is determined using the Start of Central Directory
- field in the Zip64 End of Central Directory record. Refer to the
- section on the Strong Encryption Specification for information
- on the fields used in the Archive Decryption Header record.
-
-
- E. Archive extra data record: (EFS)
-
- archive extra data signature    4 bytes  (0x08064b50)
- extra field length              4 bytes
- extra field data                (variable size)
-
- The Archive Extra Data Record is introduced in version 6.2
- of the ZIP format specification. This record exists in support
- of the Central Directory Encryption Feature implemented as part of
- the Strong Encryption Specification as described in this document.
- When present, this record immediately precedes the central
- directory data structure. The size of this data record will be
- included in the Size of the Central Directory field in the
- End of Central Directory record. If the central directory structure
- is compressed, but not encrypted, the location of the start of
- this data record is determined using the Start of Central Directory
- field in the Zip64 End of Central Directory record.
-
-
- F. Central directory structure:
-
- [file header 1]
- .
- .
- .
- [file header n]
- [digital signature]
-
- File header:
-
- central file header signature     4 bytes  (0x02014b50)
- version made by                   2 bytes
- version needed to extract         2 bytes
- general purpose bit flag          2 bytes
- compression method                2 bytes
- last mod file time                2 bytes
- last mod file date                2 bytes
- crc-32                            4 bytes
- compressed size                   4 bytes
- uncompressed size                 4 bytes
- file name length                  2 bytes
- extra field length                2 bytes
- file comment length               2 bytes
- disk number start                 2 bytes
- internal file attributes          2 bytes
- external file attributes          4 bytes
- relative offset of local header   4 bytes
-
- file name                         (variable size)
- extra field                       (variable size)
- file comment                      (variable size)
-
- Digital signature:
-
- header signature                4 bytes  (0x05054b50)
- size of data                    2 bytes
- signature data                  (variable size)
-
- With the introduction of the Central Directory Encryption
- feature in version 6.2 of this specification, the Central
- Directory Structure may be stored both compressed and encrypted.
- Although not required, it is assumed when encrypting the
- Central Directory Structure, that it will be compressed
- for greater storage efficiency. Information on the
- Central Directory Encryption feature can be found in the section
- describing the Strong Encryption Specification. The Digital
- Signature record will be neither compressed nor encrypted.
-
-
- G. Zip64 end of central directory record
-
- zip64 end of central dir
-   signature                         4 bytes  (0x06064b50)
- size of zip64 end of central
-   directory record                  8 bytes
- version made by                     2 bytes
- version needed to extract           2 bytes
- number of this disk                 4 bytes
- number of the disk with the
-   start of the central directory    4 bytes
- total number of entries in the
-   central directory on this disk    8 bytes
- total number of entries in the
-   central directory                 8 bytes
- size of the central directory       8 bytes
- offset of start of central
-   directory with respect to
-   the starting disk number          8 bytes
- zip64 extensible data sector        (variable size)
-
- The above record structure defines Version 1 of the
- Zip64 end of central directory record. Version 1 was
- implemented in versions of this specification preceding
- 6.2 in support of the ZIP64(tm) large file feature. The
- introduction of the Central Directory Encryption feature
- implemented in version 6.2 as part of the Strong Encryption
- Specification defines Version 2 of this record structure.
- Refer to the section describing the Strong Encryption
- Specification for details on the version 2 format for
- this record.
-
-
- H. Zip64 end of central directory locator
-
- zip64 end of central dir locator
-   signature                         4 bytes  (0x07064b50)
- number of the disk with the
-   start of the zip64 end of
-   central directory                 4 bytes
- relative offset of the zip64
-   end of central directory record   8 bytes
- total number of disks               4 bytes
-
-
- I. End of central directory record:
-
- end of central dir signature        4 bytes  (0x06054b50)
- number of this disk                 2 bytes
- number of the disk with the
-   start of the central directory    2 bytes
- total number of entries in the
-   central directory on this disk    2 bytes
- total number of entries in
-   the central directory             2 bytes
- size of the central directory       4 bytes
- offset of start of central
-   directory with respect to
-   the starting disk number          4 bytes
- .ZIP file comment length            2 bytes
- .ZIP file comment                   (variable size)
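-
- Because this record sits at the end of the archive and is followed only by
- a comment of at most 65,535 bytes, a reader can locate it by searching
- backwards for the signature. The Python sketch below shows one way to do
- that for a single-disk, non-zip64 archive held in memory; the function and
- key names are hypothetical, and the backward scan is a common reader
- technique rather than something this specification prescribes.
-
```python
import struct

EOCD = struct.Struct("<IHHHHIIH")  # 22 fixed bytes, little-endian
EOCD_SIGNATURE = b"PK\x05\x06"     # 0x06054b50 stored low byte first


def find_end_of_central_directory(data: bytes) -> dict:
    """Locate and decode the end of central directory record."""
    lowest = max(0, len(data) - EOCD.size - 0xFFFF)
    end = len(data)
    while True:
        pos = data.rfind(EOCD_SIGNATURE, lowest, end)
        if pos == -1:
            raise ValueError("end of central directory record not found")
        if pos + EOCD.size <= len(data):
            (_, disk, cd_disk, entries_here, entries_total,
             cd_size, cd_offset, comment_len) = EOCD.unpack_from(data, pos)
            # Accept the candidate only if its comment reaches the end of the file
            if pos + EOCD.size + comment_len == len(data):
                return {
                    "record_offset": pos,
                    "total_entries": entries_total,
                    "central_directory_size": cd_size,
                    "central_directory_offset": cd_offset,
                    "comment": data[pos + EOCD.size:],
                }
        end = pos + len(EOCD_SIGNATURE) - 1  # continue with earlier candidates
```
-
- Checking that the comment length accounts for the remaining bytes guards
- against a signature that merely happens to appear inside the comment.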
-
-
- J. Explanation of fields:
-
- version made by (2 bytes)
-
- [PKWARE describes "OS made by" now (since 1998) as follows:
- The upper byte indicates the compatibility of the file
- attribute information. If the external file attributes
- are compatible with MS-DOS and can be read by PKZIP for
- DOS version 2.04g then this value will be zero. If these
- attributes are not compatible, then this value will
- identify the host system on which the attributes are
- compatible.]
- The upper byte indicates the host system (OS) for the
- file. Software can use this information to determine
- the line record format for text files etc. The current
- mappings are:
-
- 0 - FAT file system (DOS, OS/2, NT) + PKWARE 2.50+ VFAT, NTFS
- 1 - Amiga
- 2 - OpenVMS
- 3 - Unix
- 4 - VM/CMS
- 5 - Atari ST
- 6 - HPFS file system (OS/2, NT 3.x)
- 7 - Macintosh
- 8 - Z-System
- 9 - CP/M
- ---------------------------------------------------------------------
-   PKWARE assignment                | Info-ZIP assignment
- -----------------------------------|---------------------------------
- 10 - Windows NTFS                  | TOPS-20
-      (since PKZIPW 2.50, but       |  (assigned Oct-1992,
-      not used by any PKWARE prog)  |  no longer used)
- 11 - MVS                           | NTFS file system (WinNT)
-                                    |  (actively used by Info-ZIP's
-                                    |  Zip for NT since Sep-1993)
- 12 - VSE                           | SMS/QDOS
- ---------------------------------------------------------------------
- 13 - Acorn RISC OS
- 14 - VFAT file system (Win95, NT) [Info-ZIP reservation, unused]
- 15 - MVS [PKWARE describes this assignment as "alternate MVS"]
- 16 - BeOS (BeBox or PowerMac)
- 17 - Tandem
- 18 - OS/400 (IBM) | THEOS
- 19 - OS/X (Darwin)
- 20 thru 29 - unused
- 30 - AtheOS/Syllable
- 31 thru 255 - unused
-
- The lower byte indicates the ZIP specification version
- (the version of this document) supported by the software
- used to encode the file. The value/10 indicates the major
- version number, and the value mod 10 is the minor version
- number.
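-
- Splitting the two bytes of "version made by" is a matter of a shift, a
- mask, and the division described above. A small Python sketch follows; the
- function name is hypothetical and the host table lists only a few of the
- mappings given above.
-
```python
HOST_SYSTEMS = {  # partial copy of the mapping table, for illustration
    0: "FAT file system",
    3: "Unix",
    7: "Macintosh",
    19: "OS/X (Darwin)",
}


def decode_version_made_by(value: int) -> tuple:
    """Return (host system, major version, minor version)."""
    host = (value >> 8) & 0xFF   # upper byte: host system
    spec = value & 0xFF          # lower byte: specification version
    return HOST_SYSTEMS.get(host, f"host {host}"), spec // 10, spec % 10


print(decode_version_made_by(0x0314))
```
-
- For example, the value 0x0314 decodes to Unix with specification
- version 2.0.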
-
- version needed to extract (2 bytes)
-
- The minimum supported ZIP specification version needed to
- extract the file, mapped as above. This value is based on
- the specific format features a ZIP program must support to
- be able to extract the file. If multiple features are
- applied to a file, the minimum version should be set to the
- feature having the highest value. New features or feature
- changes affecting the published format specification will be
- implemented using higher version numbers than the last
- published value to avoid conflict.
-
- Current minimum feature versions are as defined below:
-
- 1.0 - Default value
- 1.1 - File is a volume label
- 2.0 - File is a folder (directory)
- 2.0 - File is compressed using Deflate compression
- 2.0 - File is encrypted using traditional PKWARE encryption
- 2.1 - File is compressed using Deflate64(tm)
- 2.5 - File is compressed using PKWARE DCL Implode
- 2.7 - File is a patch data set
- 4.5 - File uses ZIP64 format extensions
- 4.6 - File is compressed using BZIP2 compression*
- 5.0 - File is encrypted using DES
- 5.0 - File is encrypted using 3DES
- 5.0 - File is encrypted using original RC2 encryption
- 5.0 - File is encrypted using RC4 encryption
- 5.1 - File is encrypted using AES encryption
- 5.1 - File is encrypted using corrected RC2 encryption**
- 5.2 - File is encrypted using corrected RC2-64 encryption**
- 6.1 - File is encrypted using non-OAEP key wrapping***
- 6.2 - Central directory encryption
-
-
- * Early 7.x (pre-7.2) versions of PKZIP incorrectly set the
- version needed to extract for BZIP2 compression to be 50
- when it should have been 46.
-
- ** Refer to the section on Strong Encryption Specification
- for additional information regarding RC2 corrections.
-
- *** Certificate encryption using non-OAEP key wrapping is the
- intended mode of operation for all versions beginning with 6.1.
- Support for OAEP key wrapping should only be used for
- backward compatibility when sending ZIP files to be opened by
- versions of PKZIP older than 6.1 (5.0 or 6.0).
-
- When using ZIP64 extensions, the corresponding value in the
- Zip64 end of central directory record should also be set.
- This field currently supports only the value 45 to indicate
- ZIP64 extensions are present.
-
- general purpose bit flag: (2 bytes)
-
- Bit 0: If set, indicates that the file is encrypted.
-
- (For Method 6 - Imploding)
- Bit 1: If the compression method used was type 6,
- Imploding, then this bit, if set, indicates
- an 8K sliding dictionary was used. If clear,
- then a 4K sliding dictionary was used.
- Bit 2: If the compression method used was type 6,
- Imploding, then this bit, if set, indicates
- 3 Shannon-Fano trees were used to encode the
- sliding dictionary output. If clear, then 2
- Shannon-Fano trees were used.
-
- (For Methods 8 and 9 - Deflating)
- Bit 2  Bit 1
-   0      0    Normal (-en) compression option was used.
-   0      1    Maximum (-exx/-ex) compression option was used.
-   1      0    Fast (-ef) compression option was used.
-   1      1    Super Fast (-es) compression option was used.
-
- Note: Bits 1 and 2 are undefined if the compression
- method is any other.
-
- Bit 3: If this bit is set, the fields crc-32, compressed
- size and uncompressed size are set to zero in the
- local header. The correct values are put in the
- data descriptor immediately following the compressed
- data. (Note: PKZIP version 2.04g for DOS only
- recognizes this bit for method 8 compression, newer
- versions of PKZIP recognize this bit for any
- compression method.)
- [Info-ZIP note: This bit was introduced by PKZIP 2.04 for
- DOS. In general, this feature can only be reliably used
- together with compression methods that allow intrinsic
- detection of the "end-of-compressed-data" condition. From
- the set of compression methods described in this Zip archive
- specification, only "deflate" and "bzip2" fulfill this
- requirement.
- Especially, the method STORED does not work!
- The Info-ZIP tools recognize this bit regardless of the
- compression method; but, they rely on correctly set
- "compressed size" information in the central directory entry.]
-
- Bit 4: Reserved for use with method 8, for enhanced
- deflating.
-
- Bit 5: If this bit is set, this indicates that the file is
- compressed patched data. (Note: Requires PKZIP
- version 2.70 or greater)
-
- Bit 6: Strong encryption. If this bit is set, you should
- set the version needed to extract value to at least
- 50 and you must also set bit 0. If AES encryption
- is used, the version needed to extract value must
- be at least 51.
-
- Bit 7: Currently unused.
-
- Bit 8: Currently unused.
-
- Bit 9: Currently unused.
-
- Bit 10: Currently unused.
-
- Bit 11: Currently unused.
-
- Bit 12: Reserved by PKWARE for enhanced compression.
-
- Bit 13: Used when encrypting the Central Directory to indicate
- selected data values in the Local Header are masked to
- hide their actual values. See the section describing
- the Strong Encryption Specification for details.
-
- Bit 14: Reserved by PKWARE.
-
- Bit 15: Reserved by PKWARE.
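-
- The bit assignments above translate directly into mask tests. The Python
- sketch below decodes the documented bits; the function name and the
- dictionary keys are illustrative, and bits described as unused or reserved
- are ignored.
-
```python
DEFLATE_OPTIONS = ("normal", "maximum", "fast", "super fast")


def describe_flags(flags: int, method: int) -> dict:
    """Interpret the general purpose bit flag for a given compression method."""
    info = {
        "encrypted": bool(flags & 0x0001),            # bit 0
        "data_descriptor": bool(flags & 0x0008),      # bit 3
        "patched_data": bool(flags & 0x0020),         # bit 5
        "strong_encryption": bool(flags & 0x0040),    # bit 6
        "local_header_masked": bool(flags & 0x2000),  # bit 13
    }
    if method == 6:  # Imploding: bits 1 and 2 describe the dictionary and trees
        info["sliding_dictionary"] = 8192 if flags & 0x0002 else 4096
        info["shannon_fano_trees"] = 3 if flags & 0x0004 else 2
    elif method in (8, 9):  # Deflating: bits 2 and 1 form a two-bit option code
        info["deflate_option"] = DEFLATE_OPTIONS[(flags >> 1) & 0x3]
    return info
```
-
- For any other compression method, bits 1 and 2 are left uninterpreted,
- matching the note that they are undefined in that case.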
-
- compression method: (2 bytes)
-
- (see accompanying documentation for algorithm
- descriptions)
-
- 0 - The file is stored (no compression)
- 1 - The file is Shrunk
- 2 - The file is Reduced with compression factor 1
- 3 - The file is Reduced with compression factor 2
- 4 - The file is Reduced with compression factor 3
- 5 - The file is Reduced with compression factor 4
- 6 - The file is Imploded
- 7 - Reserved for Tokenizing compression algorithm
- 8 - The file is Deflated
- 9 - Enhanced Deflating using Deflate64(tm)
- 10 - PKWARE Data Compression Library Imploding
- 11 - Reserved by PKWARE
- 12 - File is compressed using BZIP2 algorithm
-
- date and time fields: (2 bytes each)
-
- The date and time are encoded in standard MS-DOS format.
- If input came from standard input, the date and time are
- those at which compression was started for this data.
- If encrypting the central directory and general purpose bit
- flag 13 is set indicating masking, the value stored in the
- Local Header will be zero.
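-
- The packing of the two 16-bit values is not spelled out in this document.
- The Python sketch below assumes the conventional MS-DOS layout (date:
- 7 bits of years since 1980, 4 bits of month, 5 bits of day; time: 5 bits
- of hours, 6 bits of minutes, 5 bits of seconds divided by two); the
- function name is hypothetical.
-
```python
import datetime


def decode_dos_datetime(dos_date: int, dos_time: int) -> datetime.datetime:
    """Convert MS-DOS packed date and time fields to a datetime."""
    year = ((dos_date >> 9) & 0x7F) + 1980
    month = (dos_date >> 5) & 0x0F
    day = dos_date & 0x1F
    hour = (dos_time >> 11) & 0x1F
    minute = (dos_time >> 5) & 0x3F
    second = (dos_time & 0x1F) * 2   # stored with two-second resolution
    return datetime.datetime(year, month, day, hour, minute, second)
```
-
- A masked local header stores zero in these fields, which does not describe
- a valid calendar date; in this sketch such input raises ValueError.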
-
- CRC-32: (4 bytes)
-
- The CRC-32 algorithm was generously contributed by
- David Schwaderer and can be found in his excellent
- book "C Programmers Guide to NetBIOS" published by
- Howard W. Sams & Co. Inc. The 'magic number' for
- the CRC is 0xdebb20e3. The proper CRC pre and post
- conditioning is used, meaning that the CRC register
- is pre-conditioned with all ones (a starting value
- of 0xffffffff) and the value is post-conditioned by
- taking the one's complement of the CRC residual.
- If bit 3 of the general purpose flag is set, this
- field is set to zero in the local header and the correct
- value is put in the data descriptor and in the central
- directory. If encrypting the central directory and general
- purpose bit flag 13 is set indicating masking, the value
- stored in the Local Header will be zero.
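-
- The conditioning steps and the magic number can be checked with a short
- bit-at-a-time implementation. The Python sketch below is not the table
- driven routine from the book cited above; it assumes the reflected
- polynomial 0xEDB88320, which this text does not state but which is the one
- used by zlib's crc32, and the function names are illustrative.
-
```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """CRC-32 with all-ones pre-conditioning and one's complement at the end."""
    crc = 0xFFFFFFFF                     # pre-conditioning
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF              # post-conditioning


def residual(data: bytes) -> int:
    """Register value left after processing data followed by its own CRC."""
    crc = crc32_bitwise(data)
    # Undo the final complement to expose the raw register contents
    return crc32_bitwise(data + crc.to_bytes(4, "little")) ^ 0xFFFFFFFF


assert crc32_bitwise(b"123456789") == zlib.crc32(b"123456789")
assert residual(b"any data at all") == 0xDEBB20E3
```
-
- The residual is the same for every input, which is why it can serve as a
- fixed check value when the stored CRC is run through the register after
- the data.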
-
- compressed size: (4 bytes)
- uncompressed size: (4 bytes)
-
- The size of the file compressed and uncompressed,
- respectively. If bit 3 of the general purpose bit flag
- is set, these fields are set to zero in the local header
- and the correct values are put in the data descriptor and
- in the central directory. If an archive is in zip64 format
- and the value in this field is 0xFFFFFFFF, the size will be
- in the corresponding 8 byte zip64 extended information
- extra field. If encrypting the central directory and general
- purpose bit flag 13 is set indicating masking, the value stored
- for the uncompressed size in the Local Header will be zero.
-
- file name length: (2 bytes)
- extra field length: (2 bytes)
- file comment length: (2 bytes)
-
- The length of the file name, extra field, and comment
- fields respectively. The combined length of any
- directory record and these three fields should not
- generally exceed 65,535 bytes. If input came from standard
- input, the file name length is set to zero.
-
- [Info-ZIP note:
- This feature is not yet supported by any PKWARE version of ZIP
- (at least not in PKZIP for DOS and PKZIP for Windows/WinNT).
- The Info-ZIP programs handle standard input differently:
- If input came from standard input, the filename is set to "-"
- (length one).]
-
-
- disk number start: (2 bytes)
-
- The number of the disk on which this file begins. If an
- archive is in zip64 format and the value in this field is
- 0xFFFF, the disk number will be in the corresponding 4 byte zip64
- extended information extra field.
-
- internal file attributes: (2 bytes)
-
- Bits 1 and 2 are reserved for use by PKWARE.
-
- The lowest bit of this field indicates, if set, that
- the file is apparently an ASCII or text file. If not
- set, the file apparently contains binary data.
- The remaining bits are unused in version 1.0.
-
- The 0x0002 bit of this field indicates, if set, that a
- 4 byte variable record length control field precedes each
- logical record indicating the length of the record. This
- flag is independent of text control characters, and if used
- in conjunction with text data, includes any control
- characters in the total length of the record. This value is
- provided for mainframe data transfer support.
-
- external file attributes: (4 bytes)
-
- The mapping of the external attributes is
- host-system dependent (see 'version made by'). For
- MS-DOS, the low order byte is the MS-DOS directory
- attribute byte. If input came from standard input, this
- field is set to zero.
-
- relative offset of local header: (4 bytes)
-
- This is the offset from the start of the first disk on
- which this file appears, to where the local header should
- be found. If an archive is in zip64 format and the value
- in this field is 0xFFFFFFFF, the offset will be in the
- corresponding 8 byte zip64 extended information extra field.
-
- file name: (Variable)
-
- The name of the file, with optional relative path.
- The path stored should not contain a drive or
- device letter, or a leading slash. All slashes
- should be forward slashes '/' as opposed to
- backwards slashes '\' for compatibility with Amiga
- and Unix file systems etc. If input came from standard
- input, there is no file name field. If encrypting
- the central directory and general purpose bit flag 13 is set
- indicating masking, the file name stored in the Local Header
- will not be the actual file name. A masking value consisting
- of a unique hexadecimal value will be stored. This value will
- be sequentially incremented for each file in the archive. See
- the section on the Strong Encryption Specification for details
- on retrieving the encrypted file name.
- [Info-ZIP discrepancy:
- If input came from standard input, the file name is set
- to "-" (without the quotes).
- As far as we know, the PKWARE specification for "input from
- stdin" is not supported by PKZIP/PKUNZIP for DOS, OS/2, Windows,
- or Windows NT.]
-
- extra field: (Variable)
-
- This is for expansion. If additional information
- needs to be stored for special needs or for specific
- platforms, it should be stored here. Earlier versions
- of the software can then safely skip this file, and
- find the next file or header. This field will be 0
- length in version 1.0.
-
- In order to allow different programs and different types
- of information to be stored in the 'extra' field in .ZIP
- files, the following structure should be used for all
- programs storing data in this field:
-
- header1+data1 + header2+data2 . . .
-
- Each header should consist of:
-
- Header ID - 2 bytes
- Data Size - 2 bytes
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- The Header ID field indicates the type of data that is in
- the following data block.
-
- Header ID's of 0 thru 31 are reserved for use by PKWARE.
- The remaining ID's can be used by third party vendors for
- proprietary usage.
-
- The current Header ID mappings defined by PKWARE are:
-
- 0x0001 ZIP64 extended information extra field
- 0x0007 AV Info
- 0x0008 Reserved for future Unicode file name data (PFS)
- 0x0009 OS/2 extended attributes (also Info-ZIP)
- 0x000a NTFS (Win9x/WinNT FileTimes)
- 0x000c OpenVMS (also Info-ZIP)
- 0x000d Unix
- 0x000e Reserved for file stream and fork descriptors
- 0x000f Patch Descriptor
- 0x0014 PKCS#7 Store for X.509 Certificates
- 0x0015 X.509 Certificate ID and Signature for
- individual file
- 0x0016 X.509 Certificate ID for Central Directory
- 0x0017 Strong Encryption Header
- 0x0018 Record Management Controls
- 0x0019 PKCS#7 Encryption Recipient Certificate List
- 0x0065 IBM S/390 (Z390), AS/400 (I400) attributes
- - uncompressed
- 0x0066 Reserved for IBM S/390 (Z390), AS/400 (I400)
- attributes - compressed
-
- The Header ID mappings defined by Info-ZIP and third parties are:
-
- 0x07c8 Info-ZIP Macintosh (old, J. Lee)
- 0x2605 ZipIt Macintosh (first version)
- 0x2705 ZipIt Macintosh v 1.3.5 and newer (w/o full filename)
- 0x2805 ZipIt Macintosh 1.3.5+
- 0x334d Info-ZIP Macintosh (new, D. Haase's 'Mac3' field)
- 0x4154 Tandem NSK
- 0x4341 Acorn/SparkFS (David Pilling)
- 0x4453 Windows NT security descriptor (binary ACL)
- 0x4704 VM/CMS
- 0x470f MVS
- 0x4854 Theos, old unofficial port
- 0x4b46 FWKCS MD5 (see below)
- 0x4c41 OS/2 access control list (text ACL)
- 0x4d49 Info-ZIP OpenVMS (obsolete)
- 0x4d63 Macintosh SmartZIP, by Marco Bambini
- 0x4f4c Xceed original location extra field
- 0x5356 AOS/VS (binary ACL)
- 0x5455 extended timestamp
- 0x554e Xceed unicode extra field
- 0x5855 Info-ZIP Unix (original; also OS/2, NT, etc.)
- 0x6542 BeOS (BeBox, PowerMac, etc.)
- 0x6854 Theos
- 0x7441 AtheOS (AtheOS/Syllable attributes)
- 0x756e ASi Unix
- 0x7855 Info-ZIP Unix (new)
- 0xfb4a SMS/QDOS
-
- Detailed descriptions of Extra Fields defined by third
- party mappings will be documented as information on
- these data structures is made available to PKWARE.
- PKWARE does not guarantee the accuracy of any published
- third party data.
-
- The Data Size field indicates the size of the following
- data block. Programs can use this value to skip to the
- next header block, passing over any data blocks that are
- not of interest.
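
The header1+data1 + header2+data2 layout described above can be walked with a short loop. The following is a minimal Python sketch, not a reference implementation; the function name `iter_extra_blocks` and the sample bytes are illustrative assumptions, and only the 2-byte Header ID and 2-byte Data Size layout comes from the text.

```python
import struct


def iter_extra_blocks(extra: bytes):
    """Yield (header_id, data) for each header+data block in an extra field."""
    pos = 0
    # Fewer than 4 trailing bytes cannot hold a header; they are ignored here.
    while pos + 4 <= len(extra):
        # Header ID and Data Size are 2-byte values in Intel (little-endian) order.
        header_id, size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        if pos + size > len(extra):
            raise ValueError("extra block runs past the end of the field")
        yield header_id, extra[pos:pos + size]
        # Data Size lets a reader skip blocks it does not understand.
        pos += size


# Hypothetical sample: a 5-byte block with ID 0x5455, then a 4-byte block 0x7855.
sample = (struct.pack("<HH", 0x5455, 5) + b"\x01\x00\x00\x00\x00"
          + struct.pack("<HH", 0x7855, 4) + struct.pack("<HH", 1000, 100))
blocks = list(iter_extra_blocks(sample))
```

A reader interested only in one Header ID can filter the yielded pairs and never needs to understand the other block types.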
-
- Note: As stated above, the size of the entire .ZIP file
- header, including the file name, comment, and extra
- field should not exceed 64K in size.
-
- In case two different programs should appropriate the same
- Header ID value, it is strongly recommended that each
- program place a unique signature of at least two bytes in
- size (and preferably 4 bytes or bigger) at the start of
- each data area. Every program should verify that its
- unique signature is present, in addition to the Header ID
- value being correct, before assuming that it is a block of
- known type.
-
- In the following descriptions, note that "Short" means two bytes,
- "Long" means four bytes, and "Long-Long" means eight bytes,
- regardless of their native sizes. Unless specifically noted, all
- integer fields should be interpreted as unsigned (non-negative)
- numbers.
-
-
- -ZIP64 Extended Information Extra Field (0x0001):
- ===============================================
-
- The following is the layout of the ZIP64 extended
- information "extra" block. If one of the size or
- offset fields in the Local or Central directory
- record is too small to hold the required data,
- a ZIP64 extended information record is created.
- The order of the fields in the ZIP64 extended
- information record is fixed, but the fields will
- only appear if the corresponding Local or Central
- directory record field is set to 0xFFFF or 0xFFFFFFFF.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (ZIP64) 0x0001 2 bytes Tag for this "extra" block type
- Size 2 bytes Size of this "extra" block
- Original
- Size 8 bytes Original uncompressed file size
- Compressed
- Size 8 bytes Size of compressed data
- Relative Header
- Offset 8 bytes Offset of local header record
- Disk Start
- Number 4 bytes Number of the disk on which
- this file starts
-
- This entry in the Local header must include BOTH original
- and compressed file sizes.
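
Because the ZIP64 fields appear only when the corresponding regular field holds the all-ones value, a reader has to consult the regular header while decoding the block. The sketch below shows one way this could be done in Python; `parse_zip64_extra` and its parameter names are assumptions made for illustration.

```python
import struct


def parse_zip64_extra(data, uncompressed, compressed, offset=0, disk=0):
    """Resolve sizes/offsets using the body of a 0x0001 block.

    data is the block body (after the 4-byte tag/size header). The other
    arguments are the values read from the regular Local or Central
    directory record.
    """
    pos = 0

    def take(fmt, width):
        nonlocal pos
        (value,) = struct.unpack_from(fmt, data, pos)
        pos += width
        return value

    # The field order is fixed; a field is present only when the regular
    # header field is set to 0xFFFFFFFF (or 0xFFFF for the disk number).
    if uncompressed == 0xFFFFFFFF:
        uncompressed = take("<Q", 8)
    if compressed == 0xFFFFFFFF:
        compressed = take("<Q", 8)
    if offset == 0xFFFFFFFF:
        offset = take("<Q", 8)
    if disk == 0xFFFF:
        disk = take("<L", 4)
    return uncompressed, compressed, offset, disk
```

If the block is shorter than the regular header implies, `struct.unpack_from` raises `struct.error`, which a caller would probably want to report as a corrupt archive.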
-
-
- -OS/2 Extended Attributes Extra Field (0x0009):
- =============================================
-
- The following is the layout of the OS/2 extended attributes "extra"
- block. (Last Revision 19960922)
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (OS/2) 0x0009 Short tag for this extra block type
- TSize Short total data size for this block
- BSize Long uncompressed EA data size
- CType Short compression type
- EACRC Long CRC value for uncompressed EA data
- (var.) variable compressed EA data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (OS/2) 0x0009 Short tag for this extra block type
- TSize Short total data size for this block (4)
- BSize Long size of uncompressed local EA data
-
- The value of CType is interpreted according to the "compression
- method" section above; i.e., 0 for stored, 8 for deflated, etc.
-
- The OS/2 extended attribute structure (FEA2LIST) is
- compressed and then stored in its entirety within this
- structure. There will only ever be one "block" of data in
- the variable-length field.
-
-
- -OS/2 Access Control List Extra Field:
- ====================================
-
- The following is the layout of the OS/2 ACL extra block.
- (Last Revision 19960922)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (ACL) 0x4c41 Short tag for this extra block type ("AL")
- TSize Short total data size for this block
- BSize Long uncompressed ACL data size
- CType Short compression type
- EACRC Long CRC value for uncompressed ACL data
- (var.) variable compressed ACL data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (ACL) 0x4c41 Short tag for this extra block type ("AL")
- TSize Short total data size for this block (4)
- BSize Long size of uncompressed local ACL data
-
- The value of CType is interpreted according to the "compression
- method" section above; i.e., 0 for stored, 8 for deflated, etc.
-
- The uncompressed ACL data consist of a text header of the form
- "ACL1:%hX,%hd\n", where the first field is the OS/2 ACCINFO acc_attr
- member and the second is acc_count, followed by acc_count strings
- of the form "%s,%hx\n", where the first field is acl_ugname (user
- group name) and the second acl_access. This block type will be
- extended for other operating systems as needed.
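
As an illustration of the text format described above, the following Python sketch parses uncompressed ACL data into its parts. The function name `parse_os2_acl` and the sample names are hypothetical; the sketch assumes the data has already been decompressed and decoded to a string.

```python
def parse_os2_acl(text: str):
    """Parse 'ACL1:%hX,%hd\\n' followed by acc_count '%s,%hx\\n' lines."""
    lines = text.split("\n")
    header = lines[0]
    if not header.startswith("ACL1:"):
        raise ValueError("missing ACL1 header")
    attr_hex, count_dec = header[5:].split(",")
    acc_attr = int(attr_hex, 16)    # %hX: hexadecimal acc_attr
    acc_count = int(count_dec)      # %hd: decimal acc_count
    entries = []
    for line in lines[1:1 + acc_count]:
        # Split on the last comma so the name itself may contain commas.
        name, access_hex = line.rsplit(",", 1)
        entries.append((name, int(access_hex, 16)))
    if len(entries) != acc_count:
        raise ValueError("fewer ACL entries than acc_count")
    return acc_attr, entries
```

For example, `parse_os2_acl("ACL1:1F,2\nadmin,3f\nguests,4\n")` yields the attribute value 0x1F and two (name, access) pairs.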
-
-
- -Windows NT Security Descriptor Extra Field (0x4453):
- ===================================================
-
- The following is the layout of the NT Security Descriptor (another
- type of ACL) extra block. (Last Revision 19960922)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (SD) 0x4453 Short tag for this extra block type ("SD")
- TSize Short total data size for this block
- BSize Long uncompressed SD data size
- Version Byte version of uncompressed SD data format
- CType Short compression type
- EACRC Long CRC value for uncompressed SD data
- (var.) variable compressed SD data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (SD) 0x4453 Short tag for this extra block type ("SD")
- TSize Short total data size for this block (4)
- BSize Long size of uncompressed local SD data
-
- The value of CType is interpreted according to the "compression
- method" section above; i.e., 0 for stored, 8 for deflated, etc.
- Version specifies how the compressed data are to be interpreted
- and allows for future expansion of this extra field type. Currently
- only version 0 is defined.
-
- For version 0, the compressed data are to be interpreted as a single
- valid Windows NT SECURITY_DESCRIPTOR data structure, in self-relative
- format.
-
-
- -PKWARE Win95/WinNT Extra Field (0x000a):
- =======================================
-
- The following description covers PKWARE's "NTFS" attributes
- "extra" block, introduced with the release of PKZIP 2.50 for
- Windows. (Last Revision 20001118)
-
- (Note: At this time the Mtime, Atime and Ctime values may
- be used on any WIN32 system.)
- [Info-ZIP note: In the current implementations, this field has
- a fixed total data size of 32 bytes and is only stored as local
- extra field.]
-
- Value Size Description
- ----- ---- -----------
- (NTFS) 0x000a Short Tag for this "extra" block type
- TSize Short Total Data Size for this block
- Reserved Long for future use
- Tag1 Short NTFS attribute tag value #1
- Size1 Short Size of attribute #1, in bytes
- (var.) SubSize1 Attribute #1 data
- .
- .
- .
- TagN Short NTFS attribute tag value #N
- SizeN Short Size of attribute #N, in bytes
- (var.) SubSizeN Attribute #N data
-
- For NTFS, values for Tag1 through TagN are as follows:
- (currently only one set of attributes is defined for NTFS)
-
- Tag Size Description
- ----- ---- -----------
- 0x0001 2 bytes Tag for attribute #1
- Size1 2 bytes Size of attribute #1, in bytes (24)
- Mtime 8 bytes 64-bit NTFS file last modification time
- Atime 8 bytes 64-bit NTFS file last access time
- Ctime 8 bytes 64-bit NTFS file creation time
-
- The attribute sub-block above is 28 bytes long (2 + 2 + 24). Together
- with the 4-byte Reserved field, this gives a fixed value of 32 for the
- TSize field of the NTFS block.
-
- The NTFS filetimes are 64-bit unsigned integers, stored in Intel
- (least significant byte first) byte order. They give the number of
- 1.0E-07 second (100-nanosecond) intervals elapsed since the WinNT
- "epoch", which is "01-Jan-1601 00:00:00 UTC".
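
A minimal Python sketch of converting one of these 8-byte values to a calendar time follows. The helper name `filetime_to_datetime` is an assumption; sub-microsecond precision is dropped because Python's `datetime` cannot represent it.

```python
from datetime import datetime, timedelta, timezone

# The WinNT epoch named in the text.
NT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Convert an 8-byte little-endian NTFS filetime to a UTC datetime."""
    if len(raw) != 8:
        raise ValueError("an NTFS filetime is exactly 8 bytes")
    ticks = int.from_bytes(raw, "little")   # count of 100 ns intervals
    return NT_EPOCH + timedelta(microseconds=ticks // 10)


# 11644473600 seconds separate 1601-01-01 from the Unix epoch (1970-01-01).
unix_epoch = filetime_to_datetime((116444736000000000).to_bytes(8, "little"))
```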
-
-
- -PKWARE OpenVMS Extra Field (0x000c):
- ===================================
-
- The following is the layout of PKWARE's OpenVMS attributes
- "extra" block. (Last Revision 12/17/91)
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (VMS) 0x000c Short Tag for this "extra" block type
- TSize Short Total Data Size for this block
- CRC Long 32-bit CRC for remainder of the block
- Tag1 Short OpenVMS attribute tag value #1
- Size1 Short Size of attribute #1, in bytes
- (var.) Size1 Attribute #1 data
- .
- .
- .
- TagN Short OpenVMS attribute tag value #N
- SizeN Short Size of attribute #N, in bytes
- (var.) SizeN Attribute #N data
-
- Rules:
-
- 1. There will be one or more attributes present, each of which
- will be preceded by the above TagX & SizeX values.
- These values are identical to the ATR$C_XXXX and
- ATR$S_XXXX constants which are defined in ATR.H under
- OpenVMS C. Neither of these values will ever be zero.
-
- 2. No word alignment or padding is performed.
-
- 3. A well-behaved PKZIP/OpenVMS program should never produce
- more than one sub-block with the same TagX value. Also,
- there will never be more than one "extra" block of type
- 0x000c in a particular directory record.
-
-
- -Info-ZIP VMS Extra Field:
- ========================
-
- The following is the layout of Info-ZIP's VMS attributes extra
- block for VAX or Alpha AXP. The local-header and central-header
- versions are identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (VMS2) 0x4d49 Short tag for this extra block type ("JM")
- TSize Short total data size for this block
- ID Long block ID
- Flags Short info bytes
- BSize Short uncompressed block size
- Reserved Long (reserved)
- (var.) variable compressed VMS file-attributes block
-
- The block ID is one of the following unterminated strings:
-
- "VFAB" struct FAB
- "VALL" struct XABALL
- "VFHC" struct XABFHC
- "VDAT" struct XABDAT
- "VRDT" struct XABRDT
- "VPRO" struct XABPRO
- "VKEY" struct XABKEY
- "VMSV" version (e.g., "V6.1"; truncated at hyphen)
- "VNAM" reserved
-
- The lower three bits of Flags indicate the compression method. The
- currently defined methods are:
-
- 0 stored (not compressed)
- 1 simple "RLE"
- 2 deflated
-
- The "RLE" method simply replaces zero-valued bytes with zero-valued
- bits and non-zero-valued bytes with a "1" bit followed by the byte
- value.
-
- The variable-length compressed data contains only the data corre-
- sponding to the indicated structure or string. Typically multiple
- VMS2 extra fields are present (each with a unique block type).
-
-
- -Info-ZIP Macintosh Extra Field:
- ==============================
-
- The following is the layout of the (old) Info-ZIP resource-fork extra
- block for Macintosh. The local-header and central-header versions
- are identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (Mac) 0x07c8 Short tag for this extra block type
- TSize Short total data size for this block
- "JLEE" beLong extra-field signature
- FInfo 16 bytes Macintosh FInfo structure
- CrDat beLong HParamBlockRec fileParam.ioFlCrDat
- MdDat beLong HParamBlockRec fileParam.ioFlMdDat
- Flags beLong info bits
- DirID beLong HParamBlockRec fileParam.ioDirID
- VolName 28 bytes volume name (optional)
-
- All fields but the first two are in native Macintosh format
- (big-endian Motorola order, not little-endian Intel). The least
- significant bit of Flags is 1 if the file is a data fork, 0 other-
- wise. In addition, if this extra field is present, the filename
- has an extra 'd' or 'r' appended to indicate data fork or resource
- fork. The 28-byte VolName field may be omitted.
-
-
- -ZipIt Macintosh Extra Field (long):
- ==================================
-
- The following is the layout of the ZipIt extra block for Macintosh.
- The local-header and central-header versions are identical.
- (Last Revision 19970130)
-
- Value Size Description
- ----- ---- -----------
- (Mac2) 0x2605 Short tag for this extra block type
- TSize Short total data size for this block
- "ZPIT" beLong extra-field signature
- FnLen Byte length of FileName
- FileName variable full Macintosh filename
- FileType Byte[4] four-byte Mac file type string
- Creator Byte[4] four-byte Mac creator string
-
-
- -ZipIt Macintosh Extra Field (short, for files):
- ==============================================
-
- The following is the layout of a shortened variant of the
- ZipIt extra block for Macintosh (without "full name" entry).
- This variant is used by ZipIt 1.3.5 and newer for entries of
- files (not directories) that do not have a MacBinary encoded
- file. The local-header and central-header versions are identical.
- (Last Revision 20030602)
-
- Value Size Description
- ----- ---- -----------
- (Mac2b) 0x2705 Short tag for this extra block type
- TSize Short total data size for this block (min. 12)
- "ZPIT" beLong extra-field signature
- FileType Byte[4] four-byte Mac file type string
- Creator Byte[4] four-byte Mac creator string
- fdFlags beShort attributes from FInfo.fdFlags,
- may be omitted
- 0x0000 beShort reserved, may be omitted
-
-
- -ZipIt Macintosh Extra Field (short, for directories):
- ====================================================
-
- The following is the layout of a shortened variant of the
- ZipIt extra block for Macintosh used only for directory
- entries. This variant is used by ZipIt 1.3.5 and newer to
- save some optional Mac-specific information about directories.
- The local-header and central-header versions are identical.
-
- Value Size Description
- ----- ---- -----------
- (Mac2c) 0x2805 Short tag for this extra block type
- TSize Short total data size for this block (12)
- "ZPIT" beLong extra-field signature
- frFlags beShort attributes from DInfo.frFlags, may
- be omitted
- View beShort ZipIt view flag, may be omitted
-
-
- The View field specifies ZipIt-internal settings as follows:
-
- Bits of the Flags:
- bit 0 if set, the folder is shown expanded (open)
- when the archive contents are viewed in ZipIt.
- bits 1-15 reserved, zero;
-
-
- -Info-ZIP Macintosh Extra Field (new):
- ====================================
-
- The following is the layout of the (new) Info-ZIP extra
- block for Macintosh, designed by Dirk Haase.
- All values are in little-endian.
- (Last Revision 19981005)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (Mac3) 0x334d Short tag for this extra block type ("M3")
- TSize Short total data size for this block
- BSize Long uncompressed finder attribute data size
- Flags Short info bits
- fdType Byte[4] Type of the File (4-byte string)
- fdCreator Byte[4] Creator of the File (4-byte string)
- (CType) Short compression type
- (CRC) Long CRC value for uncompressed MacOS data
- Attribs variable finder attribute data (see below)
-
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (Mac3) 0x334d Short tag for this extra block type ("M3")
- TSize Short total data size for this block
- BSize Long uncompressed finder attribute data size
- Flags Short info bits
- fdType Byte[4] Type of the File (4-byte string)
- fdCreator Byte[4] Creator of the File (4-byte string)
-
- The third bit of Flags in both headers indicates whether
- the LOCAL extra field is uncompressed (and therefore whether CType
- and CRC are omitted):
-
- Bits of the Flags:
- bit 0 if set, file is a data fork; otherwise unset
- bit 1 if set, filename will not be changed
- bit 2 if set, Attribs is uncompressed (no CType, CRC)
- bit 3 if set, dates and times are 64-bit;
- if zero, dates and times are 32-bit.
- bit 4 if set, timezone offsets fields for the native
- Mac times are omitted (UTC support deactivated)
- bits 5-15 reserved;
-
-
- Attributes:
-
- Attribs is a Mac-specific block of data in little-endian format with
- the following structure (if compressed, uncompress it first):
-
- Value Size Description
- ----- ---- -----------
- fdFlags Short Finder Flags
- fdLocation.v Short Finder Icon Location
- fdLocation.h Short Finder Icon Location
- fdFldr Short Folder containing file
-
- FXInfo 16 bytes Macintosh FXInfo structure
- FXInfo-Structure:
- fdIconID Short
- fdUnused[3] Short unused but reserved 6 bytes
- fdScript Byte Script flag and number
- fdXFlags Byte More flag bits
- fdComment Short Comment ID
- fdPutAway Long Home Dir ID
-
- FVersNum Byte file version number
- may not be used by MacOS
- ACUser Byte directory access rights
-
- FlCrDat ULong date and time of creation
- FlMdDat ULong date and time of last modification
- FlBkDat ULong date and time of last backup
- These time numbers are original Mac FileTime values (local time!).
- Currently, date-time width is 32-bit, but future versions may
- support 64-bit times (see flags)
-
- CrGMTOffs Long(signed!) difference "local Creat. time - UTC"
- MdGMTOffs Long(signed!) difference "local Modif. time - UTC"
- BkGMTOffs Long(signed!) difference "local Backup time - UTC"
- These "local time - UTC" differences (stored in seconds) may be
- used to support timestamp adjustment after inter-timezone transfer.
- These fields are optional; bit 4 of the flags word controls their
- presence.
-
- Charset Short TextEncodingBase (Charset)
- valid for the following two fields
-
- FullPath variable Path of the current file.
- Zero terminated string (C-String)
- Currently coded in the native Charset.
-
- Comment variable Finder Comment of the current file.
- Zero terminated string (C-String)
- Currently coded in the native Charset.
-
-
- -SmartZIP Macintosh Extra Field:
- ====================================
-
- The following is the layout of the SmartZIP extra
- block for Macintosh, designed by Marco Bambini.
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- 0x4d63 Short tag for this extra block type ("cM")
- TSize Short total data size for this block (64)
- "dZip" beLong extra-field signature
- fdType Byte[4] Type of the File (4-byte string)
- fdCreator Byte[4] Creator of the File (4-byte string)
- fdFlags beShort Finder Flags
- fdLocation.v beShort Finder Icon Location
- fdLocation.h beShort Finder Icon Location
- fdFldr beShort Folder containing file
- CrDat beLong HParamBlockRec fileParam.ioFlCrDat
- MdDat beLong HParamBlockRec fileParam.ioFlMdDat
- frScroll.v Byte vertical pos. of folder's scroll bar
- fdScript Byte Script flag and number
- frScroll.h Byte horizontal pos. of folder's scroll bar
- fdXFlags Byte More flag bits
- FileName Byte[32] full Macintosh filename (pascal string)
-
- All fields but the first two are in native Macintosh format
- (big-endian Motorola order, not little-endian Intel).
- The extra field size is fixed to 64 bytes.
- The local-header and central-header versions are identical.
-
-
- -Acorn SparkFS Extra Field:
- =========================
-
- The following is the layout of David Pilling's SparkFS extra block
- for Acorn RISC OS. The local-header and central-header versions are
- identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (Acorn) 0x4341 Short tag for this extra block type ("AC")
- TSize Short total data size for this block (20)
- "ARC0" Long extra-field signature
- LoadAddr Long load address or file type
- ExecAddr Long exec address
- Attr Long file permissions
- Zero Long reserved; always zero
-
- The following bits of Attr are associated with the given file
- permissions:
-
- bit 0 user-writable ('W')
- bit 1 user-readable ('R')
- bit 2 reserved
- bit 3 locked ('L')
- bit 4 publicly writable ('w')
- bit 5 publicly readable ('r')
- bit 6 reserved
- bit 7 reserved
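
The bit assignments above map directly onto a small lookup table. The following Python sketch renders an Attr value as a permission string; the names `ACORN_ATTR_BITS` and `acorn_permissions` are illustrative assumptions, and reserved bits are simply ignored.

```python
# Bit number -> permission letter, as listed in the table above.
ACORN_ATTR_BITS = {0: "W", 1: "R", 3: "L", 4: "w", 5: "r"}


def acorn_permissions(attr: int) -> str:
    """Return the permission letters whose bits are set in Attr."""
    return "".join(
        letter
        for bit, letter in sorted(ACORN_ATTR_BITS.items())
        if attr & (1 << bit)
    )
```

For instance, an Attr value of 0x33 (bits 0, 1, 4 and 5) gives "WRwr".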
-
-
- -VM/CMS Extra Field:
- ==================
-
- The following is the layout of the file-attributes extra block for
- VM/CMS. The local-header and central-header versions are
- identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (VM/CMS) 0x4704 Short tag for this extra block type
- TSize Short total data size for this block
- flData variable file attributes data
-
- flData is an uncompressed fldata_t struct.
-
-
- -MVS Extra Field:
- ===============
-
- The following is the layout of the file-attributes extra block for
- MVS. The local-header and central-header versions are identical.
- (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (MVS) 0x470f Short tag for this extra block type
- TSize Short total data size for this block
- flData variable file attributes data
-
- flData is an uncompressed fldata_t struct.
-
-
- -PKWARE Unix Extra Field (0x000d):
- ================================
-
- The following is the layout of PKWARE's Unix "extra" block.
- It was introduced with the release of PKZIP for Unix 2.50.
- Note: all fields are stored in Intel low-byte/high-byte order.
- (Last Revision 19980901)
-
- This field has a minimum data size of 12 bytes and is only stored
- as local extra field.
-
- Value Size Description
- ----- ---- -----------
- (Unix0) 0x000d Short Tag for this "extra" block type
- TSize Short Total Data Size for this block
- AcTime Long time of last access (UTC/GMT)
- ModTime Long time of last modification (UTC/GMT)
- UID Short Unix user ID
- GID Short Unix group ID
- (var) variable Variable length data field
-
- The variable length data field will contain file type
- specific data. Currently the only values allowed are
- the original "linked to" file names for hard or symbolic
- links, and the major and minor device node numbers for
- character and block device nodes. Since device nodes
- cannot be either symbolic or hard links, only one set of
- variable length data is stored. Link files will have the
- name of the original file stored. This name is NOT NULL
- terminated. Its size can be determined by checking TSize -
- 12. Device entries will have eight bytes stored as two 4
- byte entries (in little-endian format). The first entry
- will be the major device number, and the second the minor
- device number.
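
The fixed and variable parts of this block could be decoded as in the Python sketch below. The function name and the `is_device` parameter are assumptions: the block itself does not say whether the entry is a link or a device node, so the caller is assumed to know the file type from elsewhere (for example, the external file attributes).

```python
import struct


def parse_pkware_unix(data: bytes, is_device: bool = False) -> dict:
    """Decode the body of a 0x000d block; len(data) corresponds to TSize."""
    if len(data) < 12:
        raise ValueError("PKWARE Unix block needs at least 12 bytes")
    atime, mtime, uid, gid = struct.unpack_from("<LLHH", data, 0)
    info = {"atime": atime, "mtime": mtime, "uid": uid, "gid": gid}
    var = data[12:]                  # TSize - 12 bytes of variable data
    if is_device:
        # Two 4-byte little-endian entries: major, then minor device number.
        info["major"], info["minor"] = struct.unpack("<LL", var)
    elif var:
        # "Linked to" name, stored without a terminating NUL.
        info["link_target"] = var
    return info
```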
-
- [Info-ZIP note: The fixed part of this field has the same layout as
- Info-ZIP's abandoned "Unix1 timestamps & owner ID info" extra field;
- only the two tag bytes are different.]
-
-
- -PATCH Descriptor Extra Field (0x000f):
- =====================================
-
- The following is the layout of the Patch Descriptor "extra"
- block.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (Patch) 0x000f Short Tag for this "extra" block type
- TSize Short Size of the total "extra" block
- Version Short Version of the descriptor
- Flags Long Actions and reactions (see below)
- OldSize Long Size of the file about to be patched
- OldCRC Long 32-bit CRC of the file about to be patched
- NewSize Long Size of the resulting file
- NewCRC Long 32-bit CRC of the resulting file
-
-
- Actions and reactions
-
- Bits Description
- ---- ----------------
- 0 Use for auto detection
- 1 Treat as a self-patch
- 2-3 RESERVED
- 4-5 Action (see below)
- 6-7 RESERVED
- 8-9 Reaction (see below) to absent file
- 10-11 Reaction (see below) to newer file
- 12-13 Reaction (see below) to unknown file
- 14-15 RESERVED
- 16-31 RESERVED
-
- Actions
-
- Action Value
- ------ -----
- none 0
- add 1
- delete 2
- patch 3
-
- Reactions
-
- Reaction Value
- -------- -----
- ask 0
- skip 1
- ignore 2
- fail 3
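
The Flags layout and the two value tables above can be combined into a small decoder. This Python sketch assumes that bit 0 is the least significant bit of the 4-byte Flags value; the function and key names are illustrative only.

```python
ACTIONS = ("none", "add", "delete", "patch")     # values 0..3
REACTIONS = ("ask", "skip", "ignore", "fail")    # values 0..3


def decode_patch_flags(flags: int) -> dict:
    """Split the Patch Descriptor Flags field into its documented parts."""
    return {
        "auto_detect": bool(flags & 0x1),              # bit 0
        "self_patch": bool(flags & 0x2),               # bit 1
        "action": ACTIONS[(flags >> 4) & 0x3],         # bits 4-5
        "absent": REACTIONS[(flags >> 8) & 0x3],       # bits 8-9
        "newer": REACTIONS[(flags >> 10) & 0x3],       # bits 10-11
        "unknown": REACTIONS[(flags >> 12) & 0x3],     # bits 12-13
    }


# Hypothetical value: auto-detect, action=patch, skip/ignore/fail reactions.
example = decode_patch_flags(1 | (3 << 4) | (1 << 8) | (2 << 10) | (3 << 12))
```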
-
- Patch support is provided by PKPatchMaker(tm) technology and is
- covered under U.S. Patents and Patents Pending.
-
-
- -PKCS#7 Store for X.509 Certificates (0x0014):
- ============================================
-
- This field contains information about each of the certificates
- that files may be signed with. When the Central Directory Encryption
- feature is enabled for a ZIP file, this record will appear in
- the Archive Extra Data Record, otherwise it will appear in the
- first central directory record and will be ignored in any
- other record.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (Store) 0x0014 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of the store data
- SData TSize Data about the store
-
- SData
- Value Size Description
- ----- ---- -----------
- Version 2 bytes Version number, 0x0001 for now
- StoreD (variable) Actual store data
-
- The StoreD member is suitable for passing as the pbData
- member of a CRYPT_DATA_BLOB to the CertOpenStore() function
- in Microsoft's CryptoAPI. The SSize member above will be
- cbData + 6, where cbData is the cbData member of the same
- CRYPT_DATA_BLOB. The encoding type to pass to
- CertOpenStore() should be
- PKCS_7_ASN_ENCODING | X509_ASN_ENCODING.
-
-
- -X.509 Certificate ID and Signature for individual file (0x0015):
- ===============================================================
-
- This field contains the information about which certificate in
- the PKCS#7 store was used to sign a particular file. It also
- contains the signature data. This field can appear multiple
- times, but can only appear once per certificate.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (CID) 0x0015 2 bytes Tag for this "extra" block type
- CSize 2 bytes Size of Method
- Method (variable)
-
- Method
- Value Size Description
- ----- ---- -----------
- Version 2 bytes Version number, for now 0x0001
- AlgID 2 bytes Algorithm ID used for signing
- IDSize 2 bytes Size of Certificate ID data
- CertID (variable) Certificate ID data
- SigSize 2 bytes Size of Signature data
- Sig (variable) Signature data
-
- CertID
- Value Size Description
- ----- ---- -----------
- Size1 4 bytes Size of CertID, should be (IDSize - 4)
- Size1 4 bytes A bug in version one causes this value
- to appear twice.
- IssSize 4 bytes Issuer data size
- Issuer (variable) Issuer data
- SerSize 4 bytes Serial Number size
- Serial (variable) Serial Number data
-
- The Issuer and IssSize members are suitable for creating a
- CRYPT_DATA_BLOB to be the Issuer member of a CERT_INFO
- struct. The Serial and SerSize members would be the
- SerialNumber member of the same CERT_INFO struct. This
- struct would be used to find the certificate in the store
- the file was signed with. Those structures are from the MS
- CryptoAPI.
-
- Sig and SigSize are the actual signature data and size
- generated by signing the file with the MS CryptoAPI using a
- hash created with the given AlgID.
-
-
- -X.509 Certificate ID and Signature for central directory (0x0016):
- =================================================================
-
- This field contains the information about which certificate in
- the PKCS#7 store was used to sign the central directory structure.
- When the Central Directory Encryption feature is enabled for a
- ZIP file, this record will appear in the Archive Extra Data Record,
- otherwise it will appear in the first central directory record,
- along with the store. The data structure is the
- same as the CID, except that SigSize will be 0, and there
- will be no Sig member.
-
- This field is also kept after the last central directory
- record, as the signature data (ID 0x05054b50, it looks like
- a central directory record of a different type). This
- second copy of the data is the Signature Data member of the
- record, and will have a SigSize that is non-zero, and will
- have Sig data.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (CDID) 0x0016 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of data that follows
- TData TSize Data
-
-
- -Strong Encryption Header (0x0017) (EFS):
- ===============================
-
- Value Size Description
- ----- ---- -----------
- 0x0017 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of data that follows
- Format 2 bytes Format definition for this record
- AlgID 2 bytes Encryption algorithm identifier
- Bitlen 2 bytes Bit length of encryption key
- Flags 2 bytes Processing flags
- CertData TSize-8 Certificate decryption extra field data
- (refer to the explanation for CertData
- in the section describing the
- Certificate Processing Method under
- the Strong Encryption Specification)
-
-
- -Record Management Controls (0x0018):
- ===================================
-
- Value Size Description
- ----- ---- -----------
-(Rec-CTL) 0x0018 2 bytes Tag for this "extra" block type
- CSize 2 bytes Size of total extra block data
- Tag1 2 bytes Record control attribute 1
- Size1 2 bytes Size of attribute 1, in bytes
- Data1 Size1 Attribute 1 data
- .
- .
- .
- TagN 2 bytes Record control attribute N
- SizeN 2 bytes Size of attribute N, in bytes
- DataN SizeN Attribute N data
-
-
- -PKCS#7 Encryption Recipient Certificate List (0x0019): (EFS)
- =====================================================
-
- This field contains the information about each of the certificates
- that files may be encrypted with. This field should only appear
- in the archive extra data record. This field is not required and
- serves only to aid archive modifications by preserving public
- encryption data. Individual security requirements may dictate
- that this data be omitted to deter information exposure.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (CStore) 0x0019 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of the store data
- TData TSize Data about the store
-
- TData:
-
- Value Size Description
- ----- ---- -----------
- Version 2 bytes Format version number - must be 0x0001 at this time
- CStore (var) PKCS#7 data blob
-
-
- -MVS Extra Field (PKWARE, 0x0065):
- ================================
-
- The following is the layout of the MVS "extra" block.
- Note: Some fields are stored in Big Endian format.
- All text is in EBCDIC format unless otherwise specified.
-
- Value Size Description
- ----- ---- -----------
- (MVS) 0x0065 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size for the following data block
- ID 4 bytes EBCDIC "Z390" 0xE9F3F9F0 or
- "T4MV" for TargetFour
- (var) TSize-4 Attribute data
-
-
- -OS/400 Extra Field (0x0065):
- ===========================
-
- The following is the layout of the OS/400 "extra" block.
- Note: Some fields are stored in Big Endian format.
- All text is in EBCDIC format unless otherwise specified.
-
- Value Size Description
- ----- ---- -----------
- (OS400) 0x0065 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size for the following data block
- ID 4 bytes EBCDIC "I400" 0xC9F4F0F0 or
- "T4MV" for TargetFour
- (var) TSize-4 Attribute data
-
-
- -Extended Timestamp Extra Field:
- ==============================
-
- The following is the layout of the extended-timestamp extra block.
- (Last Revision 19970118)
-
- Local-header version:
-
-          Value         Size        Description
-          -----         ----        -----------
-  (time)  0x5455        Short       tag for this extra block type ("UT")
-          TSize         Short       total data size for this block
-          Flags         Byte        info bits
-          (ModTime)     Long        time of last modification (UTC/GMT)
-          (AcTime)      Long        time of last access (UTC/GMT)
-          (CrTime)      Long        time of original creation (UTC/GMT)
-
- Central-header version:
-
-          Value         Size        Description
-          -----         ----        -----------
-  (time)  0x5455        Short       tag for this extra block type ("UT")
-          TSize         Short       total data size for this block
-          Flags         Byte        info bits (refers to local header!)
-          (ModTime)     Long        time of last modification (UTC/GMT)
-
- The central-header extra field contains the modification time only,
- or no timestamp at all. TSize is used to flag its presence or
- absence. But note:
-
- If "Flags" indicates that Modtime is present in the local header
- field, it MUST be present in the central header field, too!
- This correspondence is required because the modification time
- value may be used to support trans-timezone freshening and
- updating operations with zip archives.
-
- The time values are in standard Unix signed-long format, indicating
- the number of seconds since 1 January 1970 00:00:00. The times
- are relative to Coordinated Universal Time (UTC), also sometimes
- referred to as Greenwich Mean Time (GMT). To convert to local time,
- the software must know the local timezone offset from UTC/GMT.
-
- The lower three bits of Flags in both headers indicate which time-
- stamps are present in the LOCAL extra field:
-
-          bit 0           if set, modification time is present
-          bit 1           if set, access time is present
-          bit 2           if set, creation time is present
-          bits 3-7        reserved for additional timestamps; not set
-
- Those times that are present will appear in the order indicated, but
- any combination of times may be omitted. (Creation time may be
- present without access time, for example.) TSize should equal
- (1 + 4*(number of set bits in Flags)), as the block is currently
- defined. Other timestamps may be added in the future.
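The local-header layout above can be read with a few lines of Python. This is only a sketch under stated assumptions: the function name `parse_extended_timestamp` and the dictionary keys are illustrative and not part of the specification, and the input is assumed to be the block body that follows the 0x5455 tag and the TSize field.

```python
import struct

# Key names are hypothetical; the order matches Flags bits 0, 1 and 2.
TIME_FIELDS = ("mtime", "atime", "ctime")


def parse_extended_timestamp(body):
    """Parse Flags plus timestamps from a local-header 0x5455 block body."""
    if not body:
        raise ValueError("empty extended-timestamp block")
    flags = body[0]
    # Only the lower three bits select timestamps; bits 3-7 are reserved.
    present = [name for bit, name in enumerate(TIME_FIELDS) if flags & (1 << bit)]
    expected_size = 1 + 4 * len(present)  # TSize rule quoted in the text
    if len(body) < expected_size:
        raise ValueError("block is shorter than its Flags byte implies")
    times = {}
    offset = 1
    for name in present:
        # Each time is a signed 32-bit count of seconds since 1970 (UTC),
        # stored low byte first.
        (times[name],) = struct.unpack_from("<l", body, offset)
        offset += 4
    return times
```

For a central-header block the same Flags byte describes the local header, so a reader would take at most the modification time from it rather than trusting the flag bits.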
-
-
- -Info-ZIP Unix Extra Field (type 1):
- ==================================
-
- The following is the layout of the old Info-ZIP extra block for
- Unix. It has been replaced by the extended-timestamp extra block
- (0x5455) and the Unix type 2 extra block (0x7855).
- (Last Revision 19970118)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix1) 0x5855 Short tag for this extra block type ("UX")
- TSize Short total data size for this block
- AcTime Long time of last access (UTC/GMT)
- ModTime Long time of last modification (UTC/GMT)
- UID Short Unix user ID (optional)
- GID Short Unix group ID (optional)
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix1) 0x5855 Short tag for this extra block type ("UX")
- TSize Short total data size for this block
- AcTime Long time of last access (GMT/UTC)
- ModTime Long time of last modification (GMT/UTC)
-
- The file access and modification times are in standard Unix signed-
- long format, indicating the number of seconds since 1 January 1970
- 00:00:00. The times are relative to Coordinated Universal Time
- (UTC), also sometimes referred to as Greenwich Mean Time (GMT). To
- convert to local time, the software must know the local timezone
- offset from UTC/GMT. The modification time may be used by non-Unix
- systems to support inter-timezone freshening and updating of zip
- archives.
-
- The local-header extra block may optionally contain UID and GID
- info for the file. The local-header TSize value is the only
- indication of this. Note that Unix UIDs and GIDs are usually
- specific to a particular machine, and they generally require root
- access to restore.
-
- This extra field type is obsolete, but it has been in use since
- mid-1994. Therefore future archiving software should continue to
- support it. Some guidelines:
-
- An archive member should either contain the old "Unix1"
- extra field block or the new extra field types "time" and/or
- "Unix2".
-
- If both the old "Unix1" block type and one or both of the new
- block types "time" and "Unix2" are found, the "Unix1" block
- should be considered invalid and ignored.
-
- Unarchiving software should recognize both old and new extra
- field block types, but the info from new types overrides the
- old "Unix1" field.
-
- Archiving software should recognize "Unix1" extra fields for
- timestamp comparison but never create it for updated, freshened
- or new archive members. When copying existing members to a new
- archive, any "Unix1" extra field blocks should be converted to
- the new "time" and/or "Unix2" types.
-
-
- -Info-ZIP Unix Extra Field (type 2):
- ==================================
-
- The following is the layout of the new Info-ZIP extra block for
- Unix. (Last Revision 19960922)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix2) 0x7855 Short tag for this extra block type ("Ux")
- TSize Short total data size for this block (4)
- UID Short Unix user ID
- GID Short Unix group ID
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix2) 0x7855 Short tag for this extra block type ("Ux")
- TSize Short total data size for this block (0)
-
- The data size of the central-header version is zero; it is used
- solely as a flag that UID/GID info is present in the local-header
- extra field. If additional fields are ever added to the local
- version, the central version may be extended to indicate this.
-
- Note that Unix UIDs and GIDs are usually specific to a particular
- machine, and they generally require root access to restore.
-
-
- -ASi Unix Extra Field:
- ====================
-
- The following is the layout of the ASi extra block for Unix. The
- local-header and central-header versions are identical.
- (Last Revision 19960916)
-
- Value Size Description
- ----- ---- -----------
- (Unix3) 0x756e Short tag for this extra block type ("nu")
- TSize Short total data size for this block
- CRC Long CRC-32 of the remaining data
- Mode Short file permissions
- SizDev Long symlink'd size OR major/minor dev num
- UID Short user ID
- GID Short group ID
- (var.) variable symbolic link filename
-
- Mode is the standard Unix st_mode field from struct stat, containing
- user/group/other permissions, setuid/setgid and symlink info, etc.
-
- If Mode indicates that this file is a symbolic link, SizDev is the
- size of the file to which the link points. Otherwise, if the file
- is a device, SizDev contains the standard Unix st_rdev field from
- struct stat (includes the major and minor numbers of the device).
- SizDev is undefined in other cases.
-
- If Mode indicates that the file is a symbolic link, the final field
- will be the name of the file to which the link points. The file-
- name length can be inferred from TSize.
-
- [Note that TSize may incorrectly refer to the data size not counting
- the CRC; i.e., it may be four bytes too small.]
-
-
- -BeOS Extra Field:
- ================
-
- The following is the layout of the file-attributes extra block for
- BeOS. (Last Revision 19970531)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (BeOS) 0x6542 Short tag for this extra block type ("Be")
- TSize Short total data size for this block
- BSize Long uncompressed file attribute data size
- Flags Byte info bits
- (CType) Short compression type
- (CRC) Long CRC value for uncompressed file attribs
- Attribs variable file attribute data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (BeOS) 0x6542 Short tag for this extra block type ("Be")
- TSize Short total data size for this block (5)
- BSize Long size of uncompr. local EF block data
- Flags Byte info bits
-
- The least significant bit of Flags in both headers indicates whether
- the LOCAL extra field is uncompressed (and therefore whether CType
- and CRC are omitted):
-
- bit 0 if set, Attribs is uncompressed (no CType, CRC)
- bits 1-7 reserved; if set, assume error or unknown data
-
- Currently the only supported compression types are deflated (type 8)
- and stored (type 0); the latter is not used by Info-ZIP's Zip but is
- supported by UnZip.
-
- Attribs is a BeOS-specific block of data in big-endian format with
- the following structure (if compressed, uncompress it first):
-
- Value Size Description
- ----- ---- -----------
- Name variable attribute name (null-terminated string)
- Type Long attribute type (32-bit unsigned integer)
- Size Long Long data size for this sub-block (64 bits)
- Data variable attribute data
-
- The attribute structure is repeated for every attribute. The Data
- field may contain anything--text, flags, bitmaps, etc.
-
-
- -AtheOS Extra Field:
- ==================
-
- The following is the layout of the file-attributes extra block for
- AtheOS. This field is a very close spin-off from the BeOS e.f.
- The only differences are:
- - a new extra field signature
- - numeric fields in the attributes data are stored in little-endian
- format ("i386" was initial hardware for AtheOS)
- (Last Revision 20040908)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (AtheOS) 0x7441 Short tag for this extra block type ("At")
- TSize Short total data size for this block
- BSize Long uncompressed file attribute data size
- Flags Byte info bits
- (CType) Short compression type
- (CRC) Long CRC value for uncompressed file attribs
- Attribs variable file attribute data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (AtheOS) 0x7441 Short tag for this extra block type ("At")
- TSize Short total data size for this block (5)
- BSize Long size of uncompr. local EF block data
- Flags Byte info bits
-
- The least significant bit of Flags in both headers indicates whether
- the LOCAL extra field is uncompressed (and therefore whether CType
- and CRC are omitted):
-
- bit 0 if set, Attribs is uncompressed (no CType, CRC)
- bits 1-7 reserved; if set, assume error or unknown data
-
- Currently the only supported compression types are deflated (type 8)
- and stored (type 0); the latter is not used by Info-ZIP's Zip but is
- supported by UnZip.
-
- Attribs is an AtheOS-specific block of data in little-endian format
- with the following structure (if compressed, uncompress it first):
-
- Value Size Description
- ----- ---- -----------
- Name variable attribute name (null-terminated string)
- Type Long attribute type (32-bit unsigned integer)
- Size Long Long data size for this sub-block (64 bits)
- Data variable attribute data
-
- The attribute structure is repeated for every attribute. The Data
- field may contain anything--text, flags, bitmaps, etc.
-
-
- -SMS/QDOS Extra Field:
- ====================
-
- The following is the layout of the file-attributes extra block for
- SMS/QDOS. The local-header and central-header versions are identical.
- (Last Revision 19960929)
-
- Value Size Description
- ----- ---- -----------
- (QDOS) 0xfb4a Short tag for this extra block type
- TSize Short total data size for this block
- LongID Long extra-field signature
- (ExtraID) Long additional signature/flag bytes
- QDirect 64 bytes qdirect structure
-
- LongID may be "QZHD" or "QDOS". In the latter case, ExtraID will
- be present. Its first three bytes are "02\0"; the last byte is
- currently undefined.
-
- QDirect contains the file's uncompressed directory info (qdirect
- struct). Its elements are in native (big-endian) format:
-
- d_length beLong file length
- d_access byte file access type
- d_type byte file type
- d_datalen beLong data length
- d_reserved beLong unused
- d_szname beShort size of filename
- d_name 36 bytes filename
- d_update beLong time of last update
- d_refdate beLong file version number
- d_backup beLong time of last backup (archive date)
-
-
- -AOS/VS Extra Field:
- ==================
-
- The following is the layout of the extra block for Data General
- AOS/VS. The local-header and central-header versions are identical.
- (Last Revision 19961125)
-
- Value Size Description
- ----- ---- -----------
- (AOSVS) 0x5356 Short tag for this extra block type ("VS")
- TSize Short total data size for this block
- "FCI\0" Long extra-field signature
- Version Byte version of AOS/VS extra block (10 = 1.0)
- Fstat variable fstat packet
- AclBuf variable raw ACL data ($MXACL bytes)
-
- Fstat contains the file's uncompressed fstat packet, which is one of
- the following:
-
- normal fstat packet (P_FSTAT struct)
- DIR/CPD fstat packet (P_FSTAT_DIR struct)
- unit (device) fstat packet (P_FSTAT_UNIT struct)
- IPC file fstat packet (P_FSTAT_IPC struct)
-
- AclBuf contains the raw ACL data; its length is $MXACL.
-
-
- -Tandem NSK Extra Field:
- ======================
-
- The following is the layout of the file-attributes extra block for
- Tandem NSK. The local-header and central-header versions are
- identical. (Last Revision 19981221)
-
- Value Size Description
- ----- ---- -----------
- (TA) 0x4154 Short tag for this extra block type ("TA")
- TSize Short total data size for this block (20)
- NSKattrs 20 Bytes NSK attributes
-
-
- -THEOS Extra Field:
- =================
-
- The following is the layout of the file-attributes extra block for
- Theos. The local-header and central-header versions are identical.
- (Last Revision 19990206)
-
- Value Size Description
- ----- ---- -----------
- (Theos) 0x6854 Short 'Th' signature
- size Short size of extra block
- flags Byte reserved for future use
- filesize Long file size
- fileorg Byte type of file (see below)
- keylen Short key length for indexed and keyed files,
- data segment size for 16 bits programs
- reclen Short record length for indexed,keyed and direct,
- text segment size for 16 bits programs
- filegrow Byte growing factor for indexed,keyed and direct
- protect Byte protections (see below)
- reserved Short reserved for future use
-
- File types
- ==========
-
- 0x80 library (keyed access list of files)
- 0x40 directory
- 0x10 stream file
- 0x08 direct file
- 0x04 keyed file
- 0x02 indexed file
- 0x0e reserved
- 0x01 16 bits real mode program (obsolete)
- 0x21 16 bits protected mode program
- 0x41 32 bits protected mode program
-
- Protection codes
- ================
-
- User protection
- ---------------
- 0x01 non readable
- 0x02 non writable
- 0x04 non executable
- 0x08 non erasable
-
- Other protection
- ----------------
- 0x10 non readable
- 0x20 non writable
- 0x40 non executable Theos before 4.0
- 0x40 modified Theos 4.x
- 0x80 not hidden
-
-
- -THEOS old unofficial Extra Field:
- ================================
-
- The following is the layout of an unofficial former version of the
- Theos file-attributes extra block. This layout was never published
- and is no longer created. However, UnZip can optionally support it
- when compiling with the option flag OLD_THEOS_EXTRA defined.
- Both the local-header and central-header versions are identical.
- (Last Revision 19990206)
-
- Value Size Description
- ----- ---- -----------
- (THS0) 0x4854 Short 'TH' signature
- size Short size of extra block
- flags Short reserved for future use
- filesize Long file size
- reclen Short record length for indexed,keyed and direct,
- text segment size for 16 bits programs
- keylen Short key length for indexed and keyed files,
- data segment size for 16 bits programs
- filegrow Byte growing factor for indexed,keyed and direct
- reserved 3 Bytes reserved for future use
-
-
- -FWKCS MD5 Extra Field (0x4b46):
- ==============================
-
- The FWKCS Contents_Signature System, used in automatically
- identifying files independent of filename, optionally adds
- and uses an extra field to support the rapid creation of
- an enhanced contents_signature.
- There is no local-header version; the following applies
- only to the central header. (Last Revision 19961207)
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (MD5) 0x4b46 Short tag for this extra block type ("FK")
- TSize Short total data size for this block (19)
- "MD5" 3 bytes extra-field signature
- MD5hash 16 bytes 128-bit MD5 hash of uncompressed data
- (low byte first)
-
- When FWKCS revises a .ZIP file central directory to add
- this extra field for a file, it also replaces the
- central directory entry for that file's uncompressed
- file length with a measured value.
-
- FWKCS provides an option to strip this extra field, if
- present, from a .ZIP file central directory. In adding
- this extra field, FWKCS preserves .ZIP file Authenticity
- Verification; if stripping this extra field, FWKCS
- preserves all versions of AV through PKZIP version 2.04g.
-
- FWKCS, and FWKCS Contents_Signature System, are
- trademarks of Frederick W. Kantor.
-
- (1) R. Rivest, RFC1321.TXT, MIT Laboratory for Computer
- Science and RSA Data Security, Inc., April 1992.
- ll.76-77: "The MD5 algorithm is being placed in the
- public domain for review and possible adoption as a
- standard."
-
-
- file comment: (Variable)
-
- The comment for this file.
-
- number of this disk: (2 bytes)
-
- The number of this disk, which contains the central
- directory end record. If an archive is in zip64 format
- and the value in this field is 0xFFFF, the disk number will
- be in the corresponding 4 byte zip64 end of central
- directory field.
-
- number of the disk with the start of the central directory: (2 bytes)
-
- The number of the disk on which the central
- directory starts. If an archive is in zip64 format
- and the value in this field is 0xFFFF, the disk number will
- be in the corresponding 4 byte zip64 end of central
- directory field.
-
- total number of entries in the central dir on this disk: (2 bytes)
-
- The number of central directory entries on this disk.
- If an archive is in zip64 format and the value in
- this field is 0xFFFF, the count will be in the
- corresponding 8 byte zip64 end of central
- directory field.
-
- total number of entries in the central dir: (2 bytes)
-
- The total number of files in the .ZIP file. If an
- archive is in zip64 format and the value in this field
- is 0xFFFF, the count will be in the corresponding 8 byte
- zip64 end of central directory field.
-
- size of the central directory: (4 bytes)
-
- The size (in bytes) of the entire central directory.
- If an archive is in zip64 format and the value in
- this field is 0xFFFFFFFF, the size will be in the
- corresponding 8 byte zip64 end of central
- directory field.
-
- offset of start of central directory with respect to
- the starting disk number: (4 bytes)
-
- Offset of the start of the central directory on the
- disk on which the central directory starts. If an
- archive is in zip64 format and the value in this
- field is 0xFFFFFFFF, the offset will be in the
- corresponding 8 byte zip64 end of central
- directory field.
-
- .ZIP file comment length: (2 bytes)
-
- The length of the comment for this .ZIP file.
-
- .ZIP file comment: (Variable)
-
- The comment for this .ZIP file. ZIP file comment data
- is stored unsecured. No encryption or data authentication
- is applied to this area at this time. Confidential information
- should not be stored in this section.
-
- zip64 extensible data sector (variable size)
-
- (currently reserved for use by PKWARE)
-
-
- K. General notes:
-
- 1) All fields unless otherwise noted are unsigned and stored
- in Intel low-byte:high-byte, low-word:high-word order.
-
- 2) String fields are not null terminated, since the
- length is given explicitly.
-
- 3) Local headers should not span disk boundaries. Also, even
- though the central directory can span disk boundaries, no
- single record in the central directory should be split
- across disks.
-
- 4) The entries in the central directory may not necessarily
- be in the same order that files appear in the .ZIP file.
-
- 5) Spanned/Split archives created using PKZIP for Windows
- (V2.50 or greater), PKZIP Command Line (V2.50 or greater),
- or PKZIP Explorer will include a special spanning
- signature as the first 4 bytes of the first segment of
- the archive. This signature (0x08074b50) will be
- followed immediately by the local header signature for
- the first file in the archive. A special spanning
- marker may also appear in spanned/split archives if the
- spanning or splitting process starts but only requires
- one segment. In this case the 0x08074b50 signature
- will be replaced with the temporary spanning marker
- signature of 0x30304b50. Spanned/split archives
- created with this special signature are compatible with
- all versions of PKZIP from PKWARE. Split archives can
- only be uncompressed by other versions of PKZIP that
- know how to create a split archive.
-
- 6) If one of the fields in the end of central directory
- record is too small to hold required data, the field
- should be set to -1 (0xFFFF or 0xFFFFFFFF) and the
- Zip64 format record should be created.
-
- 7) The end of central directory record and the
- Zip64 end of central directory locator record must
- reside on the same disk when splitting or spanning
- an archive.
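Note 6 and the end of central directory field descriptions can be illustrated with a short Python sketch. It assumes the buffer starts immediately after the record's 4-byte signature (which is not described in this part of the document), and the function and key names are hypothetical.

```python
import struct

# Four 2-byte fields, two 4-byte fields, then the 2-byte comment length,
# all stored low byte first (18 bytes in total).
EOCD_FIELDS = struct.Struct("<HHHHIIH")

NAMES_16 = ("disk_number", "cd_start_disk", "entries_on_disk", "entries_total")
NAMES_32 = ("cd_size", "cd_offset")


def parse_eocd_fields(buf):
    """Return the decoded fields and the names of those deferred to zip64."""
    values = EOCD_FIELDS.unpack_from(buf, 0)
    fields = dict(zip(NAMES_16 + NAMES_32 + ("comment_length",), values))
    # A field holding its all-ones value signals that the real value lives
    # in the corresponding zip64 end of central directory field.
    overflowed = [n for n in NAMES_16 if fields[n] == 0xFFFF]
    overflowed += [n for n in NAMES_32 if fields[n] == 0xFFFFFFFF]
    return fields, overflowed
```

A reader that gets a non-empty second result would go on to locate the zip64 record before trusting any of the listed fields.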
-
-V. UnShrinking - Method 1
--------------------------
-
-Shrinking is a Dynamic Ziv-Lempel-Welch compression algorithm
-with partial clearing. The initial code size is 9 bits, and
-the maximum code size is 13 bits. Shrinking differs from
-conventional Dynamic Ziv-Lempel-Welch implementations in several
-respects:
-
-1) The code size is controlled by the compressor, and is not
- automatically increased when codes larger than the current
- code size are created (but not necessarily used). When
- the decompressor encounters the code sequence 256
- (decimal) followed by 1, it should increase the code size
- read from the input stream to the next bit size. No
- blocking of the codes is performed, so the next code at
- the increased size should be read from the input stream
- immediately after where the previous code at the smaller
- bit size was read. Again, the decompressor should not
- increase the code size used until the sequence 256,1 is
- encountered.
-
-2) When the table becomes full, total clearing is not
- performed. Rather, when the compressor emits the code
- sequence 256,2 (decimal), the decompressor should clear
- all leaf nodes from the Ziv-Lempel tree, and continue to
- use the current code size. The nodes that are cleared
- from the Ziv-Lempel tree are then re-used, with the lowest
- code value re-used first, and the highest code value
- re-used last. The compressor can emit the sequence 256,2
- at any time.
-
-
-VI. Expanding - Methods 2-5
----------------------------
-
-The Reducing algorithm is actually a combination of two
-distinct algorithms. The first algorithm compresses repeated
-byte sequences, and the second algorithm takes the compressed
-stream from the first algorithm and applies a probabilistic
-compression method.
-
-The probabilistic compression stores an array of 'follower
-sets' S(j), for j=0 to 255, corresponding to each possible
-ASCII character. Each set contains between 0 and 32
-characters, to be denoted as S(j)[0],...,S(j)[m], where m<32.
-The sets are stored at the beginning of the data area for a
-Reduced file, in reverse order, with S(255) first, and S(0)
-last.
-
-The sets are encoded as { N(j), S(j)[0],...,S(j)[N(j)-1] },
-where N(j) is the size of set S(j). N(j) can be 0, in which
-case the follower set for S(j) is empty. Each N(j) value is
-encoded in 6 bits, followed by N(j) eight bit character values
-corresponding to S(j)[0] to S(j)[N(j)-1] respectively. If
-N(j) is 0, then no values for S(j) are stored, and the value
-for N(j-1) immediately follows.
-
-The compressed data stream immediately follows the follower
-sets. It can be interpreted for the probabilistic
-decompression as follows:
-
-
-let Last-Character <- 0.
-loop until done
-    if the follower set S(Last-Character) is empty then
-        read 8 bits from the input stream, and copy this
-        value to the output stream.
-    otherwise if the follower set S(Last-Character) is non-empty then
-        read 1 bit from the input stream.
-        if this bit is not zero then
-            read 8 bits from the input stream, and copy this
-            value to the output stream.
-        otherwise if this bit is zero then
-            read B(N(Last-Character)) bits from the input
-            stream, and assign this value to I.
-            Copy the value of S(Last-Character)[I] to the
-            output stream.
-
-    assign the last value placed on the output stream to
-    Last-Character.
-end loop
-
-
-B(N(j)) is defined as the minimal number of bits required to
-encode the value N(j)-1.
-
-
-The decompressed stream from above can then be expanded to
-re-create the original file as follows:
-
-
-let State <- 0.
-
-loop until done
-    read 8 bits from the input stream into C.
-    case State of
-        0:  if C is not equal to DLE (144 decimal) then
-                copy C to the output stream.
-            otherwise if C is equal to DLE then
-                let State <- 1.
-
-        1:  if C is non-zero then
-                let V <- C.
-                let Len <- L(V)
-                let State <- F(Len).
-            otherwise if C is zero then
-                copy the value 144 (decimal) to the output stream.
-                let State <- 0
-
-        2:  let Len <- Len + C
-            let State <- 3.
-
-        3:  move backwards D(V,C) bytes in the output stream
-            (if this position is before the start of the output
-            stream, then assume that all the data before the
-            start of the output stream is filled with zeros).
-            copy Len+3 bytes from this position to the output stream.
-            let State <- 0.
-    end case
-end loop
-
-
-The functions F, L, and D are dependent on the 'compression
-factor', 1 through 4, and are defined as follows:
-
-For compression factor 1:
- L(X) equals the lower 7 bits of X.
- F(X) equals 2 if X equals 127 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 1 bit of X) * 256 + Y + 1.
-For compression factor 2:
- L(X) equals the lower 6 bits of X.
- F(X) equals 2 if X equals 63 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 2 bits of X) * 256 + Y + 1.
-For compression factor 3:
- L(X) equals the lower 5 bits of X.
- F(X) equals 2 if X equals 31 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 3 bits of X) * 256 + Y + 1.
-For compression factor 4:
- L(X) equals the lower 4 bits of X.
- F(X) equals 2 if X equals 15 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 4 bits of X) * 256 + Y + 1.
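The expansion state machine and the per-factor definitions of L, F and D can be combined into one routine. The following Python sketch is a direct transcription of the pseudocode above, not a tested decoder for real archives; the function name `expand` is illustrative, and the input is assumed to be the byte stream already produced by the probabilistic stage.

```python
DLE = 144  # escape byte used by the repeated-sequence stage


def expand(data, factor):
    """Undo the repeated-sequence stage for compression factor 1 to 4."""
    low_bits = 8 - factor            # L(X) keeps this many low bits
    mask = (1 << low_bits) - 1       # 127, 63, 31 or 15
    out = bytearray()
    state = 0
    v = length = 0
    for c in data:
        if state == 0:
            if c != DLE:
                out.append(c)
            else:
                state = 1
        elif state == 1:
            if c != 0:
                v = c
                length = v & mask                     # L(V)
                state = 2 if length == mask else 3    # F(Len)
            else:
                out.append(DLE)                       # escaped literal 144
                state = 0
        elif state == 2:
            length += c
            state = 3
        else:
            distance = (v >> low_bits) * 256 + c + 1  # D(V, C)
            pos = len(out) - distance
            for _ in range(length + 3):
                # Positions before the start of the output read as zero.
                out.append(out[pos] if pos >= 0 else 0)
                pos += 1
            state = 0
    return bytes(out)
```

Copying one byte at a time keeps overlapping matches correct, since a match may reach into bytes that the same copy has just written.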
-
-
-VII. Imploding - Method 6
--------------------------
-
-The Imploding algorithm is actually a combination of two distinct
-algorithms. The first algorithm compresses repeated byte
-sequences using a sliding dictionary. The second algorithm is
-used to compress the encoding of the sliding dictionary output,
-using multiple Shannon-Fano trees.
-
-The Imploding algorithm can use a 4K or 8K sliding dictionary
-size. The dictionary size used can be determined by bit 1 in the
-general purpose flag word; a 0 bit indicates a 4K dictionary
-while a 1 bit indicates an 8K dictionary.
-
-The Shannon-Fano trees are stored at the start of the compressed
-file. The number of trees stored is defined by bit 2 in the
-general purpose flag word; a 0 bit indicates two trees stored, a
-1 bit indicates three trees are stored. If 3 trees are stored,
-the first Shannon-Fano tree represents the encoding of the
-Literal characters, the second tree represents the encoding of
-the Length information, the third represents the encoding of the
-Distance information. When 2 Shannon-Fano trees are stored, the
-Length tree is stored first, followed by the Distance tree.
-
-The Literal Shannon-Fano tree, if present, is used to represent
-the entire ASCII character set, and contains 256 values. This
-tree is used to compress any data not compressed by the sliding
-dictionary algorithm. When this tree is present, the Minimum
-Match Length for the sliding dictionary is 3. If this tree is
-not present, the Minimum Match Length is 2.
-
-The Length Shannon-Fano tree is used to compress the Length part
-of the (length,distance) pairs from the sliding dictionary
-output. The Length tree contains 64 values, ranging from the
-Minimum Match Length, to 63 plus the Minimum Match Length.
-
-The Distance Shannon-Fano tree is used to compress the Distance
-part of the (length,distance) pairs from the sliding dictionary
-output. The Distance tree contains 64 values, ranging from 0 to
-63, representing the upper 6 bits of the distance value. The
-distance values themselves will be between 0 and the sliding
-dictionary size, either 4K or 8K.
-
-The Shannon-Fano trees themselves are stored in a compressed
-format. The first byte of the tree data represents the number of
-bytes of data representing the (compressed) Shannon-Fano tree
-minus 1. The remaining bytes represent the Shannon-Fano tree
-data encoded as:
-
- High 4 bits: Number of values at this bit length + 1. (1 - 16)
- Low 4 bits: Bit Length needed to represent value + 1. (1 - 16)
-
-The Shannon-Fano codes can be constructed from the bit lengths
-using the following algorithm:
-
-1) Sort the Bit Lengths in ascending order, while retaining the
- order of the original lengths stored in the file.
-
-2) Generate the Shannon-Fano trees:
-
-    Code <- 0
-    CodeIncrement <- 0
-    LastBitLength <- 0
-    i <- number of Shannon-Fano codes - 1 (either 255 or 63)
-
-    loop while i >= 0
-        Code = Code + CodeIncrement
-        if BitLength(i) <> LastBitLength then
-            LastBitLength=BitLength(i)
-            CodeIncrement = 1 shifted left (16 - LastBitLength)
-        ShannonCode(i) = Code
-        i <- i - 1
-    end loop
-
-
-3) Reverse the order of all the bits in the above ShannonCode()
- vector, so that the most significant bit becomes the least
- significant bit. For example, the value 0x1234 (hex) would
- become 0x2C48 (hex).
-
-4) Restore the order of Shannon-Fano codes as originally stored
- within the file.
-
-Example:
-
- This example will show the encoding of a Shannon-Fano tree
- of size 8. Notice that the actual Shannon-Fano trees used
- for Imploding are either 64 or 256 entries in size.
-
-Example: 0x02, 0x42, 0x01, 0x13
-
- The first byte indicates 3 values in this table. Decoding the
- bytes:
- 0x42 = 5 codes of 3 bits long
- 0x01 = 1 code of 2 bits long
- 0x13 = 2 codes of 4 bits long
-
- This would generate the original bit length array of:
- (3, 3, 3, 3, 3, 2, 4, 4)
-
- There are 8 codes in this table for the values 0 thru 7. Using
- the algorithm to obtain the Shannon-Fano codes produces:
-
-                                 Reversed   Order     Original
-Val  Sorted  Constructed Code    Value      Restored  Length
----  ------  -----------------   --------   --------  ------
-0:     2     1100000000000000          11        101    3
-1:     3     1010000000000000         101        001    3
-2:     3     1000000000000000         001        110    3
-3:     3     0110000000000000         110        010    3
-4:     3     0100000000000000         010        100    3
-5:     3     0010000000000000         100         11    2
-6:     4     0001000000000000        1000       1000    4
-7:     4     0000000000000000        0000       0000    4
-
-
-The values in the Val, Order Restored and Original Length columns
-now represent the Shannon-Fano encoding tree that can be used for
-decoding the Shannon-Fano encoded data. How to parse the
-variable length Shannon-Fano values from the data stream is beyond
-the scope of this document. (See the references listed at the end of
-this document for more information.) However, traditional decoding
-schemes used for Huffman variable length decoding, such as the
-Greenlaw algorithm, can be successfully applied.
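The four construction steps and the worked example can be checked with a short Python sketch. The function names are illustrative assumptions, and the code relies on Python's `sorted` being stable to satisfy the "retain the original order" requirement of step 1.

```python
def decode_bit_lengths(tree_data):
    """Expand compressed tree bytes into one bit length per value."""
    count = tree_data[0] + 1                 # first byte is (byte count - 1)
    lengths = []
    for b in tree_data[1:1 + count]:
        repeat = (b >> 4) + 1                # high 4 bits: number of values - 1
        bit_length = (b & 0x0F) + 1          # low 4 bits: bit length - 1
        lengths.extend([bit_length] * repeat)
    return lengths


def reverse16(value):
    """Reverse the order of the 16 bits in value (step 3)."""
    result = 0
    for _ in range(16):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def build_codes(lengths):
    """Return the reversed Shannon-Fano code for each value (steps 1-4)."""
    # Step 1: sort values by bit length; equal lengths keep file order.
    order = sorted(range(len(lengths)), key=lambda v: lengths[v])
    codes = [0] * len(lengths)
    code = increment = last_length = 0
    # Step 2: walk the sorted list from the last entry down to the first.
    for i in range(len(order) - 1, -1, -1):
        code += increment
        bit_length = lengths[order[i]]
        if bit_length != last_length:
            last_length = bit_length
            increment = 1 << (16 - last_length)
        # Steps 3 and 4: reverse the bits and store under the original value.
        codes[order[i]] = reverse16(code)
    return codes


lengths = decode_bit_lengths(bytes([0x02, 0x42, 0x01, 0x13]))
codes = build_codes(lengths)
restored = [format(c, "0{}b".format(n)) for c, n in zip(codes, lengths)]
```

Here `restored` holds the codes as bit strings in value order, which can be compared against the "Order Restored" column of the example table.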
-
-The compressed data stream begins immediately after the
-compressed Shannon-Fano data. The compressed data stream can be
-interpreted as follows:
-
-loop until done
-    read 1 bit from input stream.
-
-    if this bit is non-zero then       (encoded data is literal data)
-        if Literal Shannon-Fano tree is present
-            read and decode character using Literal Shannon-Fano tree.
-        otherwise
-            read 8 bits from input stream.
-        copy character to the output stream.
-    otherwise              (encoded data is sliding dictionary match)
-        if 8K dictionary size
-            read 7 bits for offset Distance (lower 7 bits of offset).
-        otherwise
-            read 6 bits for offset Distance (lower 6 bits of offset).
-
-        using the Distance Shannon-Fano tree, read and decode the
-          upper 6 bits of the Distance value.
-
-        using the Length Shannon-Fano tree, read and decode
-          the Length value.
-
-        Length <- Length + Minimum Match Length
-
-        if Length = 63 + Minimum Match Length
-            read 8 bits from the input stream,
-            add this value to Length.
-
-        move backwards Distance+1 bytes in the output stream, and
-        copy Length characters from this position to the output
-        stream. (if this position is before the start of the output
-        stream, then assume that all the data before the start of
-        the output stream is filled with zeros).
-end loop
-
-VIII. Tokenizing - Method 7
----------------------------
-
-This method is not used by PKZIP.
-
-IX. Deflating - Method 8
-------------------------
-
-The Deflate algorithm is similar to the Implode algorithm using
-a sliding dictionary of up to 32K with secondary compression
-from Huffman/Shannon-Fano codes.
-
-The compressed data is stored in blocks with a header describing
-the block and the Huffman codes used in the data block. The header
-format is as follows:
-
-   Bit 0: Last Block bit     This bit is set to 1 if this is the last
-                             compressed block in the data.
-   Bits 1-2: Block type
-      00 (0) - Block is stored - All stored data is byte aligned.
-               Skip bits until next byte, then next word = block
-               length, followed by the one's complement of the block
-               length word. Remaining data in block is the stored
-               data.
-
-      01 (1) - Use fixed Huffman codes for literal and distance codes.
-               Lit Code    Bits             Dist Code   Bits
-               ---------   ----             ---------   ----
-                 0 - 143    8                 0 - 31      5
-               144 - 255    9
-               256 - 279    7
-               280 - 287    8
-
-               Literal codes 286-287 and distance codes 30-31 are
-               never used but participate in the Huffman construction.
-
-      10 (2) - Dynamic Huffman codes. (See expanding Huffman codes)
-
-      11 (3) - Reserved - Flag an "Error in compressed data" if seen.
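-
-The fixed code lengths for block type 01 can be written out as plain
-lists, as in the sketch below. The function names are hypothetical;
-the lengths are taken directly from the table above.
-

```python
def fixed_literal_lengths():
    """Bit lengths of the 288 fixed literal/length codes (block type 01)."""
    return [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8


def fixed_distance_lengths():
    """Bit lengths of the 32 fixed distance codes (block type 01)."""
    return [5] * 32


lit = fixed_literal_lengths()
print(len(lit), lit[0], lit[144], lit[256], lit[280])  # 288 8 9 7 8
```

-Codes 286-287 and 30-31 are included here because, as noted above,
-they take part in the code construction even though they never occur
-in valid data.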
-
-Expanding Huffman Codes
------------------------
-If the data block is stored with dynamic Huffman codes, the Huffman
-codes are sent in the following compressed format:
-
- 5 Bits: # of Literal codes sent - 257 (257 - 286)
- All other codes are never sent.
- 5 Bits: # of Dist codes - 1 (1 - 32)
- 4 Bits: # of Bit Length codes - 4 (4 - 19)
-
-The Huffman codes are sent as bit lengths and the codes are built as
-described in the implode algorithm. The bit lengths themselves are
-compressed with Huffman codes. There are 19 bit length codes:
-
- 0 - 15: Represent bit lengths of 0 - 15
- 16: Copy the previous bit length 3 - 6 times.
- The next 2 bits indicate repeat length (0 = 3, ... ,3 = 6)
- Example: Codes 8, 16 (+2 bits 11), 16 (+2 bits 10) will
- expand to 12 bit lengths of 8 (1 + 6 + 5)
- 17: Repeat a bit length of 0 for 3 - 10 times. (3 bits of length)
- 18: Repeat a bit length of 0 for 11 - 138 times (7 bits of length)
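-
-The run-length expansion of codes 16, 17 and 18 can be sketched as
-shown here. This assumes the bit-length codes and their extra bits
-have already been read from the stream and are supplied as
-(code, extra) pairs; that input shape and the function name are
-hypothetical conveniences, not part of the format.
-

```python
def expand_code_lengths(symbols):
    """Expand (code, extra_bits_value) pairs into a list of bit lengths."""
    lengths = []
    for code, extra in symbols:
        if 0 <= code <= 15:
            lengths.append(code)                      # literal bit length
        elif code == 16:
            if not lengths:
                raise ValueError("code 16 has no previous length to copy")
            lengths.extend([lengths[-1]] * (3 + extra))   # 2 extra bits: 3-6
        elif code == 17:
            lengths.extend([0] * (3 + extra))         # 3 extra bits: 3-10
        elif code == 18:
            lengths.extend([0] * (11 + extra))        # 7 extra bits: 11-138
        else:
            raise ValueError("invalid bit length code: %d" % code)
    return lengths


# The example from the text: 8, 16 (+2 bits 11), 16 (+2 bits 10)
print(expand_code_lengths([(8, 0), (16, 0b11), (16, 0b10)]))
# [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]
```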
-
-The lengths of the bit length codes are sent packed 3 bits per value
-(0 - 7) in the following order:
-
- 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15
-
-The Huffman codes should be built as described in the Implode algorithm
-except codes are assigned starting at the shortest bit length, i.e. the
-shortest code should be all 0's rather than all 1's. Also, codes with
-a bit length of zero do not participate in the tree construction. The
-codes are then used to decode the bit lengths for the literal and
-distance tables.
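-
-One way to assign codes that start at the shortest bit length with
-all 0's is sketched below. It is an assumption of this sketch that
-the intended assignment matches the canonical construction commonly
-used for Deflate (count the codes of each length, compute the first
-code of each length, then assign in symbol order); the function name
-is hypothetical.
-

```python
def build_codes(lengths):
    """Map each symbol with a non-zero bit length to its code string."""
    max_len = max(lengths, default=0)
    if max_len == 0:
        return {}
    # Count how many codes exist for each bit length (zero is skipped).
    bl_count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            bl_count[n] += 1
    # Compute the numerically smallest code for each bit length.
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code
    # Hand out consecutive codes in symbol order within each length.
    codes = {}
    for symbol, n in enumerate(lengths):
        if n:
            codes[symbol] = format(next_code[n], "0%db" % n)
            next_code[n] += 1
    return codes


print(build_codes([3, 3, 3, 3, 3, 2, 4, 4]))
```

-For the bit lengths (3, 3, 3, 3, 3, 2, 4, 4) the shortest code (the
-2 bit code for value 5) comes out as 00 and the longest codes end at
-1111.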
-
-The bit lengths for the literal tables are sent first with the number
-of entries sent described by the 5 bits sent earlier. There are up
-to 286 literal characters; the first 256 represent the respective 8
-bit character, code 256 represents the End-Of-Block code, the remaining
-29 codes represent copy lengths of 3 thru 258. There are up to 30
-distance codes representing distances from 1 thru 32k as described
-below.
-
-                              Length Codes
-                              ------------
-       Extra              Extra               Extra               Extra
-  Code Bits Length   Code Bits Lengths   Code Bits Lengths   Code Bits Length(s)
-  ---- ---- ------   ---- ---- -------   ---- ---- -------   ---- ---- ---------
-   257   0     3     265   1   11,12     273   3   35-42     281   5   131-162
-   258   0     4     266   1   13,14     274   3   43-50     282   5   163-194
-   259   0     5     267   1   15,16     275   3   51-58     283   5   195-226
-   260   0     6     268   1   17,18     276   3   59-66     284   5   227-258
-   261   0     7     269   2   19-22     277   4   67-82     285   0     258
-   262   0     8     270   2   23-26     278   4   83-98
-   263   0     9     271   2   27-30     279   4   99-114
-   264   0    10     272   2   31-34     280   4  115-130
-
-                             Distance Codes
-                             --------------
-       Extra            Extra              Extra                Extra
-  Code Bits Dist   Code Bits  Dist    Code Bits Distance   Code Bits Distance
-  ---- ---- ----   ---- ---- ------   ---- ---- ---------  ---- ---- -----------
-    0    0    1      8    3   17-24    16    7   257-384    24   11   4097-6144
-    1    0    2      9    3   25-32    17    7   385-512    25   11   6145-8192
-    2    0    3     10    4   33-48    18    8   513-768    26   12   8193-12288
-    3    0    4     11    4   49-64    19    8   769-1024   27   12  12289-16384
-    4    1   5,6    12    5   65-96    20    9  1025-1536   28   13  16385-24576
-    5    1   7,8    13    5   97-128   21    9  1537-2048   29   13  24577-32768
-    6    2   9-12   14    6  129-192   22   10  2049-3072
-    7    2  13-16   15    6  193-256   23   10  3073-4096
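-
-Both tables follow a regular pattern, so a decoder does not have to
-store them literally. The sketch below regenerates the number of
-extra bits and the base (lowest) value for each code; the actual
-length or distance is the base plus the value of the extra bits. The
-function names are hypothetical.
-

```python
def length_table():
    """Return {code: (extra_bits, base_length)} for codes 257-285."""
    table = {}
    base = 3
    for code in range(257, 285):
        extra = 0 if code < 265 else (code - 261) // 4
        table[code] = (extra, base)
        base += 1 << extra          # each code covers 2**extra lengths
    table[285] = (0, 258)           # code 285 is the fixed length 258
    return table


def distance_table():
    """Return {code: (extra_bits, base_distance)} for codes 0-29."""
    table = {}
    base = 1
    for code in range(30):
        extra = 0 if code < 4 else code // 2 - 1
        table[code] = (extra, base)
        base += 1 << extra          # each code covers 2**extra distances
    return table


print(length_table()[265], length_table()[284])      # (1, 11) (5, 227)
print(distance_table()[8], distance_table()[29])     # (3, 17) (13, 24577)
```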
-
-The compressed data stream begins immediately after the
-compressed header data. The compressed data stream can be
-interpreted as follows:
-
-do
-    read header from input stream.
-
-    if stored block
-        skip bits until byte aligned
-        read count and 1's complement of count
-        copy count bytes data block
-    otherwise
-        loop until end of block code sent
-            decode literal character from input stream
-            if literal < 256
-                copy character to the output stream
-            otherwise
-                if literal = end of block
-                    break from loop
-                otherwise
-                    derive length from the literal code and its extra bits
-                    decode distance from input stream
-
-                    move backwards distance bytes in the output stream, and
-                    copy length characters from this position to the output
-                    stream.
-        end loop
-while not last block
-
-if data descriptor exists
-    skip bits until byte aligned
-    check data descriptor signature
-    read crc and sizes
-endif
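-
-The stored block case is simple enough to build by hand. The sketch
-below assembles a single stored block and checks it with Python's
-zlib module in raw Deflate mode (negative window bits). The helper
-name make_stored_block is hypothetical; zlib is used here only as a
-convenient independent decoder.
-

```python
import struct
import zlib


def make_stored_block(payload, last=True):
    """Wrap payload in one stored (type 00) Deflate block."""
    if len(payload) > 0xFFFF:
        raise ValueError("a stored block holds at most 65535 bytes")
    # Bit 0 = last block flag, bits 1-2 = 00 (stored); the remaining
    # bits of this byte are the padding up to the byte boundary.
    header = bytes([1 if last else 0])
    length = len(payload)
    # Block length word followed by its one's complement, low byte first.
    return header + struct.pack("<HH", length, length ^ 0xFFFF) + payload


raw = make_stored_block(b"hello")
print(raw.hex())                   # 010500faff68656c6c6f
print(zlib.decompress(raw, -15))   # b'hello'
```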
-
-X. Enhanced Deflating - Method 9
---------------------------------
-
-The Enhanced Deflating algorithm is similar to Deflate but
-uses a sliding dictionary of up to 64K. Deflate64(tm) is supported
-by the Deflate extractor.
-
-[This description is unofficial. It has been deduced by Info-ZIP from
-close inspection of PKZIP 4.x Deflate64(tm) compressed output.]
-
-The Deflate64 algorithm is almost identical to the normal Deflate algorithm.
-Differences are:
-
-- The sliding window size is 64k.
-
-- The previously unused distance codes 30 and 31 are now used to describe
-  match distances from 32k-48k and 48k-64k.
-          Extra
-     Code Bits Distance
-     ---- ---- -----------
-      ..   ..      ...
-      29   13  24577-32768
-      30   14  32769-49152
-      31   14  49153-65536
-
-- The semantics of code 285, which in Deflate denotes the maximum match
-  length of 258, have been changed to allow the specification of
-  arbitrarily large match lengths (up to 64k).
-          Extra
-     Code Bits Lengths
-     ---- ---- -------
-     ...   ..     ...
-     284    5  227-258
-     285   16  3-65538
-
-Whereas the first two modifications fit into the framework of Deflate,
-this last change breaks compatibility with Deflate method 8. Thus, a
-Deflate64 decompressor cannot decode normal deflated data.
-
-XI. BZIP2 - Method 12
----------------------
-
-BZIP2 is an open-source data compression algorithm developed by
-Julian Seward. Information and source code for this algorithm
-can be found on the internet.
-
-
-XII. Traditional PKWARE Encryption
-----------------------------------
-
-The following information discusses the decryption steps
-required to support traditional PKWARE encryption. This
-form of encryption is considered weak by today's standards
-and its use is recommended only for situations with
-low security needs or for compatibility with older .ZIP
-applications.
-
-XIII. Decryption
-----------------
-
-The encryption used in PKZIP was generously supplied by Roger
-Schlafly. PKWARE is grateful to Mr. Schlafly for his expert
-help and advice in the field of data encryption.
-
-PKZIP encrypts the compressed data stream. Encrypted files must
-be decrypted before they can be extracted.
-
-Each encrypted file has an extra 12 bytes stored at the start of
-the data area defining the encryption header for that file. The
-encryption header is originally set to random values, and then
-itself encrypted, using three 32-bit keys. The key values are
-initialized using the supplied encryption password. After each byte
-is encrypted, the keys are then updated using pseudo-random number
-generation techniques in combination with the same CRC-32 algorithm
-used in PKZIP and described elsewhere in this document.
-
-The following are the basic steps required to decrypt a file:
-
-1) Initialize the three 32-bit keys with the password.
-2) Read and decrypt the 12-byte encryption header, further
- initializing the encryption keys.
-3) Read and decrypt the compressed data stream using the
- encryption keys.
-
-
-Step 1 - Initializing the encryption keys
------------------------------------------
-
-Key(0) <- 305419896
-Key(1) <- 591751049
-Key(2) <- 878082192
-
-loop for i <- 0 to length(password)-1
- update_keys(password(i))
-end loop
-
-
-Where update_keys() is defined as:
-
-
-update_keys(char):
-    Key(0) <- crc32(Key(0),char)
-    Key(1) <- Key(1) + (Key(0) & 000000ffH)
-    Key(1) <- Key(1) * 134775813 + 1
-    Key(2) <- crc32(Key(2),Key(1) >> 24)
-end update_keys
-
-
-Where crc32(old_crc,char) is a routine that given a CRC value and a
-character, returns an updated CRC value after applying the CRC-32
-algorithm described elsewhere in this document.
-
-
-Step 2 - Decrypting the encryption header
------------------------------------------
-
-The purpose of this step is to further initialize the encryption
-keys, based on random data, to render a plaintext attack on the
-data ineffective.
-
-
-Read the 12-byte encryption header into Buffer, in locations
-Buffer(0) thru Buffer(11).
-
-loop for i <- 0 to 11
-    C <- Buffer(i) ^ decrypt_byte()
-    update_keys(C)
-    Buffer(i) <- C
-end loop
-
-
-Where decrypt_byte() is defined as:
-
-
-unsigned char decrypt_byte()
-    local unsigned short temp
-    temp <- Key(2) | 2
-    decrypt_byte <- (temp * (temp ^ 1)) >> 8
-end decrypt_byte
-
-
-After the header is decrypted, the last 1 or 2 bytes in Buffer
-should be the high-order word/byte of the CRC for the file being
-decrypted, stored in Intel low-byte/high-byte order, or the high-order
-byte of the file time if bit 3 of the general purpose bit flag is set.
-Versions of PKZIP prior to 2.0 used a 2 byte CRC check; a 1 byte CRC check is
-used on versions after 2.0. This can be used to test if the password
-supplied is correct or not.
-
-
-Step 3 - Decrypting the compressed data stream
-----------------------------------------------
-
-The compressed data stream can be decrypted as follows:
-
-
-loop until done
-    read a character into C
-    Temp <- C ^ decrypt_byte()
-    update_keys(Temp)
-    output Temp
-end loop
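-
-The three steps can be put together as in the sketch below. Several
-points are assumptions of this sketch rather than statements of the
-specification: crc32() is taken to be the raw table-driven CRC-32
-update (polynomial 0xEDB88320) with no initial or final inversion,
-all key arithmetic is taken modulo 2^32, and the encrypt() method is
-simply the inverse inferred from the decryption loop. The class and
-method names are hypothetical.
-

```python
def _make_crc_table():
    """Build the 256-entry table for the reflected CRC-32 polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


_CRC_TABLE = _make_crc_table()


def crc32_update(crc, byte):
    """Raw one-byte CRC-32 update (no pre/post inversion)."""
    return _CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)


class TraditionalCipher:
    def __init__(self, password):
        # Step 1: fixed starting values, then mix in the password bytes.
        self.keys = [305419896, 591751049, 878082192]
        for b in password:
            self.update_keys(b)

    def update_keys(self, byte):
        k = self.keys
        k[0] = crc32_update(k[0], byte)
        k[1] = (k[1] + (k[0] & 0xFF)) & 0xFFFFFFFF
        k[1] = (k[1] * 134775813 + 1) & 0xFFFFFFFF
        k[2] = crc32_update(k[2], k[1] >> 24)

    def decrypt_byte(self):
        temp = (self.keys[2] | 2) & 0xFFFF      # unsigned short
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

    def decrypt(self, data):
        # Steps 2 and 3: header and data are one continuous stream.
        out = bytearray()
        for c in data:
            plain = c ^ self.decrypt_byte()
            self.update_keys(plain)
            out.append(plain)
        return bytes(out)

    def encrypt(self, data):
        # Inverse of decrypt(); keys are always updated with plaintext.
        out = bytearray()
        for plain in data:
            out.append(plain ^ self.decrypt_byte())
            self.update_keys(plain)
        return bytes(out)


header_and_data = bytes(range(12)) + b"compressed bytes"
secret = TraditionalCipher(b"password").encrypt(header_and_data)
print(TraditionalCipher(b"password").decrypt(secret) == header_and_data)  # True
```

-A real extractor would decrypt the first 12 bytes, compare the last
-byte or bytes against the CRC (or file time) as described in Step 2,
-and only then continue with the compressed data.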
-
-
-XIV. Strong Encryption Specification (EFS)
-------------------------------------------
-
-Version 5.x of this specification introduced support for strong
-encryption algorithms. These algorithms can be used with either
-a password or an X.509v3 digital certificate to encrypt each file.
-This format specification supports either password or certificate
-based encryption to meet the security needs of today, to enable
-interoperability between users within both PKI and non-PKI
-environments, and to ensure interoperability between different
-computing platforms that are running a ZIP program.
-
-Password based encryption is the most common form of encryption
-people are familiar with. However, inherent weaknesses with
-passwords (e.g. susceptibility to dictionary/brute force attack)
-as well as password management and support issues make certificate
-based encryption a more secure and scalable option. Industry
-efforts and support are defining and moving towards more advanced
-security solutions built around X.509v3 digital certificates and
-Public Key Infrastructures (PKI) because of the greater scalability,
-administrative options, and more robust security over traditional
-password-based encryption.
-
-Most standard encryption algorithms are supported with this
-specification. Reference implementations for many of these
-algorithms are available from either commercial or open source
-distributors. Readily available cryptographic toolkits make
-implementation of the encryption features straightforward.
-This document is not intended to provide a treatise on data
-encryption principles or theory. Its purpose is to document the
-data structures required for implementing interoperable data
-encryption within the .ZIP format. It is strongly recommended that
-you have a good understanding of data encryption before reading
-further.
-
-The algorithms introduced in Version 5.0 of this specification
-include:
-
- RC2 40 bit, 64 bit, and 128 bit
- RC4 40 bit, 64 bit, and 128 bit
- DES
- 3DES 112 bit and 168 bit
-
-Version 5.1 adds support for the following:
-
- AES 128 bit, 192 bit, and 256 bit
-
-Version 6.1 introduces encryption data changes to support
-interoperability with SmartCard and USB Token certificate storage
-methods which do not support the OAEP strengthening standard.
-
-Version 6.2 introduces support for encrypting metadata by compressing
-and encrypting the central directory data structure to reduce information
-leakage. Information leakage can occur in legacy ZIP applications
-through exposure of information about a file even though that file is
-stored encrypted. The information exposed consists of file
-characteristics stored within the records and fields defined by this
-specification. This includes data such as a file's name, its original
-size, timestamp and CRC32 value.
-
-Central Directory Encryption provides greater protection against
-information leakage by encrypting the Central Directory structure and
-by masking key values that are replicated in the unencrypted Local
-Header. ZIP compatible programs that cannot interpret an encrypted
-Central Directory structure cannot rely on the data in the corresponding
-Local Header for decompression information.
-
-Extra Field records that may contain information about a file that should
-not be exposed should not be stored in the Local Header and should only
-be written to the Central Directory where they can be encrypted. This
-design currently does not support streaming. Information in the End of
-Central Directory record, the ZIP64 End of Central Directory Locator,
-and the ZIP64 End of Central Directory record are not encrypted. Access
-to view data on files within a ZIP file with an encrypted Central Directory
-requires the appropriate password or private key for decryption prior to
-viewing any files, or any information about the files, in the archive.
-
-Older ZIP compatible programs not familiar with the Central Directory
-Encryption feature will no longer be able to recognize the Central
-Directory and may assume the ZIP file is corrupt. Programs that
-attempt streaming access using Local Headers will see invalid
-information for each file. Central Directory Encryption need not be
-used for every ZIP file. Its use is recommended for greater security.
-ZIP files not using Central Directory Encryption should operate as
-in the past.
-
-The details of the strong encryption specification for certificates
-remain under development as design and testing issues are worked out
-for the range of algorithms, encryption methods, certificate processing
-and cross-platform support necessary to meet the advanced security needs
-of .ZIP file users today and in the future.
-
-This feature specification is intended to support basic encryption needs
-of today, such as password support. However, this specification is also
-designed to lay the foundation for future advanced security needs.
-
-Encryption provides data confidentiality and privacy. It is
-recommended that you combine X.509 digital signing with encryption
-to add authentication and non-repudiation.
-
-
-Single Password Symmetric Encryption Method:
--------------------------------------------
-
-The Single Password Symmetric Encryption Method using strong
-encryption algorithms operates similarly to the traditional
-PKWARE encryption defined in this format. Additional data
-structures are added to support the processing needs of the
-strong algorithms.
-
-The Strong Encryption data structures are:
-
-1. General Purpose Bits - Bits 0 and 6 of the General Purpose bit
-flag in both local and central header records. Both bits set
-indicates strong encryption. Bit 13, when set, indicates the Central
-Directory is encrypted and that selected fields in the Local Header
-are masked to hide their actual value.
-
-
-2. Extra Field 0x0017 in central header only.
-
- Fields to consider in this record are:
-
- Format - the data format identifier for this record. The only
- value allowed at this time is the integer value 2.
-
- AlgId - integer identifier of the encryption algorithm from the
- following range
-
- 0x6601 - DES
- 0x6602 - RC2 (version needed to extract < 5.2)
- 0x6603 - 3DES 168
- 0x6609 - 3DES 112
- 0x660E - AES 128
- 0x660F - AES 192
- 0x6610 - AES 256
- 0x6702 - RC2 (version needed to extract >= 5.2)
- 0x6801 - RC4
- 0xFFFF - Unknown algorithm
-
- Bitlen - Explicit bit length of key
-
- 40
- 56
- 64
- 112
- 128
- 168
- 192
- 256
-
- Flags - Processing flags needed for decryption
-
- 0x0001 - Password is required to decrypt
- 0x0002 - Certificates only
- 0x0003 - Password or certificate required to decrypt
-
- Values > 0x0003 reserved for certificate processing
-
-
-3. Decryption header record preceding compressed file data.
-
- -Decryption Header:
-
-    Value     Size     Description
-    -----     ----     -----------
-    IVSize    2 bytes  Size of initialization vector (IV)
-    IVData    IVSize   Initialization vector for this file
-    Size      4 bytes  Size of remaining decryption header data
-    Format    2 bytes  Format definition for this record
-    AlgID     2 bytes  Encryption algorithm identifier
-    Bitlen    2 bytes  Bit length of encryption key
-    Flags     2 bytes  Processing flags
-    ErdSize   2 bytes  Size of Encrypted Random Data
-    ErdData   ErdSize  Encrypted Random Data
-    Reserved1 4 bytes  Reserved certificate processing data
-    Reserved2 (var)    Reserved for certificate processing data
-    VSize     2 bytes  Size of password validation data
-    VData     VSize-4  Password validation data
-    VCRC32    4 bytes  Standard ZIP CRC32 of password validation data
-
- IVData - The size of the IV should match the algorithm block size.
- The IVData can be completely random data. If the size of
- the randomly generated data does not match the block size
- it should be padded with zeros or truncated as
- necessary. If IVSize is 0, then IV = CRC32 + Uncompressed
- File Size (as a 64 bit little-endian, unsigned integer value).
-
- Format - the data format identifier for this record. The only
- value allowed at this time is the integer value 3.
-
- AlgId - integer identifier of the encryption algorithm from the
- following range
-
- 0x6601 - DES
- 0x6602 - RC2 (version needed to extract < 5.2)
- 0x6603 - 3DES 168
- 0x6609 - 3DES 112
- 0x660E - AES 128
- 0x660F - AES 192
- 0x6610 - AES 256
- 0x6702 - RC2 (version needed to extract >= 5.2)
- 0x6801 - RC4
- 0xFFFF - Unknown algorithm
-
- Bitlen - Explicit bit length of key
-
- 40
- 56
- 64
- 112
- 128
- 168
- 192
- 256
-
- Flags - Processing flags needed for decryption
-
- 0x0001 - Password is required to decrypt
- 0x0002 - Certificates only
- 0x0003 - Password or certificate required to decrypt
-
- Values > 0x0003 reserved for certificate processing
-
- ErdData - Encrypted random data is used to generate a file
- session key for encrypting each file. SHA1 is
- used to calculate hash data used to derive keys.
- File session keys are derived from a master session
- key generated from the user-supplied password.
- If the Flags field in the decryption header contains
- the value 0x4000, then the ErdData field must be
- decrypted using 3DES.
-
- Reserved1 - Reserved for certificate processing, if value is
- zero, then Reserved2 data is absent. See the explanation
- under the Certificate Processing Method for details on
- this data structure.
-
- Reserved2 - If present, the size of the Reserved2 data structure
- is located by skipping the first 4 bytes of this field
- and using the next 2 bytes as the remaining size. See
- the explanation under the Certificate Processing Method
- for details on this data structure.
-
- VSize - This size value will always include the 4 bytes of the
- VCRC32 data and will be greater than 4 bytes.
-
- VData - Random data for password validation. VData together with
- VCRC32 is VSize bytes in length, and VSize must be a multiple
- of the encryption block size. VCRC32 is a checksum value of
- VData. VData and VCRC32 are stored encrypted and start the
- stream of encrypted data for a file.
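-
-The layout of the decryption header can be walked with a few
-struct calls, as in the sketch below. It assumes all integer fields
-are stored in Intel low-byte/high-byte (little-endian) order and it
-only handles the password case where Reserved1 is zero, so Reserved2
-is absent. VData and VCRC32 are returned as raw bytes because they
-are stored encrypted. The function name and the dictionary keys are
-hypothetical.
-

```python
import struct


def parse_decryption_header(buf):
    """Split a password-only decryption header into its fields."""
    pos = 0
    (iv_size,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    iv_data = buf[pos:pos + iv_size]
    pos += iv_size
    # Size, Format, AlgID, Bitlen, Flags, ErdSize (4 + 5 * 2 = 14 bytes)
    size, fmt, alg_id, bitlen, flags, erd_size = struct.unpack_from(
        "<IHHHHH", buf, pos)
    pos += 14
    erd_data = buf[pos:pos + erd_size]
    pos += erd_size
    (reserved1,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    if reserved1 != 0:
        # Certificate data (Reserved2) follows; not handled in this sketch.
        raise NotImplementedError("certificate processing data present")
    (v_size,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    if v_size <= 4:
        raise ValueError("VSize must be greater than 4")
    v_data = buf[pos:pos + v_size - 4]      # still encrypted
    pos += v_size - 4
    v_crc32 = buf[pos:pos + 4]              # still encrypted
    pos += 4
    if pos > len(buf):
        raise ValueError("buffer is shorter than the header it describes")
    return {
        "iv": iv_data, "size": size, "format": fmt, "alg_id": alg_id,
        "bitlen": bitlen, "flags": flags, "erd": erd_data,
        "v_data": v_data, "v_crc32": v_crc32, "header_length": pos,
    }
```

-The returned header_length gives the offset at which parsing stopped,
-which is convenient for locating the rest of the encrypted stream.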
-
-4. Single Password Central Directory Encryption
-
-Central Directory Encryption is achieved within the .ZIP format by
-encrypting the Central Directory structure. This encapsulates the metadata
-most often used for processing .ZIP files. Additional metadata is stored for
-redundancy in the Local Header for each file. The process of concealing
-metadata by encrypting the Central Directory does not protect the data within
-the Local Header. To avoid information leakage from the exposed metadata
-in the Local Header, the fields containing information about a file are masked.
-
-Local Header:
-
-Masking replaces the true content of the fields for a file in the Local
-Header with false information. When masked, the Local Header is not
-suitable for streaming access and the options for data recovery of damaged
-archives are reduced. Extra Data fields that may contain confidential
-data should not be stored within the Local Header. The value set into
-the Version needed to extract field should be the correct value needed to
-extract the file without regard to Central Directory Encryption. The fields
-within the Local Header targeted for masking when the Central Directory is
-encrypted are:
-
-    Field Name                   Mask Value
-    ------------------           ---------------------------
-    compression method           0
-    last mod file time           0
-    last mod file date           0
-    crc-32                       0
-    compressed size              0
-    uncompressed size            0
-    file name (variable size)    Base 16 value from the
-                                 range 1 - FFFFFFFFFFFFFFFF
-                                 represented as a string whose
-                                 size will be set into the
-                                 file name length field
-
-The Base 16 value assigned as a masked file name is simply a sequentially
-incremented value for each file starting with 1 for the first file.
-Modifications to a ZIP file may cause different values to be stored for
-each file. For compatibility, the file name field in the Local Header
-should never be left blank. As of Version 6.2 of this specification,
-the Compression Method and Compressed Size fields are not yet masked.
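-
-Generating the masked file names amounts to formatting a counter in
-base 16, as in the sketch below. The use of upper-case digits without
-zero padding is an assumption of this sketch; the text above only
-fixes the range and the sequential numbering. The function name is
-hypothetical.
-

```python
def masked_file_name(index):
    """Return the masked Local Header file name for the index-th file."""
    if not 1 <= index <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("index must be in the range 1 - FFFFFFFFFFFFFFFF")
    return format(index, "X")     # base 16, upper case, no padding


print([masked_file_name(i) for i in (1, 2, 10, 255)])
# ['1', '2', 'A', 'FF']
```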
-
-Encrypting the Central Directory:
-
-Encryption of the Central Directory does not include encryption of the
-Central Directory Signature data, the ZIP64 End of Central Directory
-record, the ZIP64 End of Central Directory Locator, or the End
-of Central Directory record. The ZIP file comment data is never
-encrypted.
-
-Before encrypting the Central Directory, it may optionally be compressed.
-Compression is not required, but for storage efficiency it is assumed
-this structure will be compressed before encrypting. Similarly, this
-specification supports compressing the Central Directory without
-requiring that it also be encrypted. Early implementations of this
-feature will assume the encryption method applied to files matches the
-encryption applied to the Central Directory.
-
-Encryption of the Central Directory is done in a manner similar to
-that of file encryption. The encrypted data is preceded by a
-decryption header. The decryption header is known as the Archive
-Decryption Header. The fields of this record are identical to
-the decryption header preceding each encrypted file. The location
-of the Archive Decryption Header is determined by the value in the
-Start of the Central Directory field in the ZIP64 End of Central
-Directory record. When the Central Directory is encrypted, the
-ZIP64 End of Central Directory record will always be present.
-
-The layout of the ZIP64 End of Central Directory record for all
-versions starting with 6.2 of this specification will follow the
-Version 2 format. The Version 2 format is as follows:
-
-The first 48 bytes will remain identical to those of Version 1.
-The record signature for both Version 1 and Version 2 will be
-0x06064b50. The new fields defining Version 2 of this record begin
-immediately after the 48th byte, which marks the end of the field
-known as the Offset of Start of Central Directory With Respect to
-the Starting Disk Number.
-
-New fields for Version 2:
-
-Note: all fields stored in Intel low-byte/high-byte order.
-
-    Value                 Size       Description
-    -----                 ----       -----------
-    Compression Method    2 bytes    Method used to compress the
-                                     Central Directory
-    Compressed Size       8 bytes    Size of the compressed data
-    Original Size         8 bytes    Original uncompressed size
-    AlgId                 2 bytes    Encryption algorithm ID
-    BitLen                2 bytes    Encryption key length
-    Flags                 2 bytes    Encryption flags
-    HashID                2 bytes    Hash algorithm identifier
-    Hash Length           2 bytes    Length of hash data
-    Hash Data             (variable) Hash data
-
-The Compression Method accepts the same range of values as the
-corresponding field in the Central Header.
-
-The Compressed Size and Original Size values will not include the
-data of the Central Directory Signature which is compressed or
-encrypted.
-
-The AlgId, BitLen, and Flags fields accept the same range of values
-as the corresponding fields within the 0x0017 record.
-
-Hash ID identifies the algorithm used to hash the Central Directory
-data. This data does not have to be hashed, in which case the
-values for both the HashID and Hash Length will be 0. Possible
-values for HashID are:
-
- Value Algorithm
- ------ ---------
- 0x0000 none
- 0x0001 CRC32
- 0x8003 MD5
- 0x8004 SHA1
-
-When the Central Directory data is signed, the same hash algorithm
-used to hash the Central Directory for signing should be used.
-This is recommended for processing efficiency, however, it is
-permissible for any of the above algorithms to be used independent
-of the signing process.
-
-The Hash Data will contain the hash data for the Central Directory.
-The length of this data will vary depending on the algorithm used.
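-
-The Version 2 fields can be unpacked as in the sketch below. It
-assumes the buffer passed in starts at the first new field
-(Compression Method) and that all fields are little-endian, as noted
-above. The function name, dictionary keys and the HASH_IDS mapping
-name are hypothetical; the HashID values come from the table above.
-

```python
import struct

HASH_IDS = {0x0000: "none", 0x0001: "CRC32", 0x8003: "MD5", 0x8004: "SHA1"}

# Compression Method, Compressed Size, Original Size, AlgId, BitLen,
# Flags, HashID, Hash Length: 2 + 8 + 8 + 2 * 5 = 28 bytes.
_V2_FIXED = struct.Struct("<HQQHHHHH")


def parse_v2_fields(buf):
    """Parse the Version 2 extension of the ZIP64 End of Central Directory."""
    (method, compressed_size, original_size, alg_id, bitlen, flags,
     hash_id, hash_length) = _V2_FIXED.unpack_from(buf, 0)
    hash_data = buf[_V2_FIXED.size:_V2_FIXED.size + hash_length]
    if len(hash_data) != hash_length:
        raise ValueError("buffer ends before the end of Hash Data")
    return {
        "compression_method": method,
        "compressed_size": compressed_size,
        "original_size": original_size,
        "alg_id": alg_id,
        "bitlen": bitlen,
        "flags": flags,
        "hash": HASH_IDS.get(hash_id, "unknown"),
        "hash_data": hash_data,
    }
```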
-
-The Version Needed to Extract should be set to 62.
-
-The value for the Total Number of Entries on the Current Disk will
-be 0. These records will no longer support random access when
-encrypting the Central Directory.
-
-When the Central Directory is compressed and/or encrypted, the
-End of Central Directory record will store the value 0xFFFFFFFF
-as the value for the Total Number of Entries in the Central
-Directory. The value stored in the Total Number of Entries in
-the Central Directory on this Disk field will be 0. The actual
-values will be stored in the equivalent fields of the ZIP64
-End of Central Directory record.
-
-Decrypting and decompressing the Central Directory is accomplished
-in the same manner as decrypting and decompressing a file.
-
-
-5. Useful Tips
-
-Strong Encryption is always applied to a file after compression. The
-block oriented algorithms all operate in Cipher Block Chaining (CBC)
-mode. The block size used for AES encryption is 16. All other block
-algorithms use a block size of 8. Two IDs are defined for RC2 to
-account for a discrepancy found in the implementation of the RC2
-algorithm in the cryptographic library on Windows XP SP1 and all
-earlier versions of Windows.
-
-A pseudo-code representation of the encryption process is as follows:
-
-Password = GetUserPassword()
-RD = Random()
-ERD = Encrypt(RD,DeriveKey(SHA1(Password)))
-For Each File
- IV = Random()
- VData = Random()
- FileSessionKey = DeriveKey(SHA1(IV + RD))
- Encrypt(VData + VCRC32 + FileData,FileSessionKey)
-Done
-
-The function names and parameter requirements will depend on
-the choice of the cryptographic toolkit selected. Almost any
-toolkit supporting the reference implementations for each
-algorithm can be used. The RSA BSAFE(r), OpenSSL, and Microsoft
-CryptoAPI libraries are all known to work well.
-
-
-Certificate Processing Method:
------------------------------
-
-The Certificate Processing Method for ZIP file encryption remains
-under development. The information provided here serves as a guide
-to those interested in certificate-based data decryption. This
-information is subject to change without notice in future versions
-of this specification.
-
-OAEP Processing with Certificate-based Encryption:
-
-Versions of PKZIP available during this development phase of the
-certificate processing method may set a value of 61 into the
-version needed to extract field for a file. This indicates that
-non-OAEP key wrapping is used. This affects certificate encryption
-only, and password encryption functions should not be affected by
-this value. This means values of 61 may be found on files encrypted
-with certificates only, or on files encrypted with both password
-encryption and certificate encryption. Files encrypted with both
-methods can safely be decrypted using the password methods documented.
-
-OAEP stands for Optimal Asymmetric Encryption Padding. It is a
-strengthening technique used for small encoded items such as decryption
-keys. This is commonly applied in cryptographic key-wrapping techniques
-and is supported by PKCS #1. Versions 5.0 and 6.0 of this specification
-were designed to support OAEP key-wrapping for certificate-based
-decryption keys for additional security.
-
-Support for private keys stored on Smart Cards or Tokens introduced
-a conflict with this OAEP logic. Most card and token products do
-not support the additional strengthening applied to OAEP key-wrapped
-data. In order to resolve this conflict, versions 6.1 and above of this
-specification will no longer support OAEP when encrypting using
-digital certificates.
-
-Certificate Processing Data Fields:
-
-The Certificate Processing Method of this specification defines the
-following additional data fields:
-
-
-1. Certificate Flag Values
-
-Additional processing flags that can be present in the Flags field of both
-the 0x0017 field of the central directory Extra Field and the Decryption
-header record preceding compressed file data are:
-
- 0x0007 - reserved for future use
- 0x000F - reserved for future use
- 0x0100 - Indicates non-OAEP key wrapping was used. If
- this field is set, the version needed to extract must
- be at least 61. This means OAEP key wrapping is not
- used when generating a Master Session Key using
- ErdData.
- 0x4000 - ErdData must be decrypted using 3DES-168, otherwise use the
- same algorithm used for encrypting the file contents.
- 0x8000 - reserved for future use
-
-
-2. CertData - Extra Field 0x0017 record certificate data structure
-
-The data structure used to store certificate data within the section
-of the Extra Field defined by the CertData field of the 0x0017
-record is as shown:
-
-    Value     Size     Description
-    -----     ----     -----------
-    RCount    4 bytes  Number of recipients.
-    HashAlg   2 bytes  Hash algorithm identifier
-    HSize     2 bytes  Hash size
-    SRList    (var)    Simple list of recipients' hashed public keys
-
-
- RCount This defines the number of intended recipients whose
- public keys were used for encryption. This identifies
- the number of elements in the SRList.
-
- HashAlg This defines the hash algorithm used to calculate
- the public key hash of each public key used
- for encryption. This field currently supports
- only the following value for SHA-1
-
- 0x8004 - SHA1
-
- HSize This defines the size of a hashed public key.
-
- SRList This is a variable length list of the hashed
- public keys for each intended recipient. Each
- element in this list is HSize. The total size of
- SRList is determined using RCount * HSize.
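As an illustration of the CertData layout above, here is a minimal Python sketch that splits a CertData byte string into its fields. The function name parse_cert_data is hypothetical, and little-endian byte order is assumed from the general rule for ZIP fields rather than stated in this section.

```python
import struct


def parse_cert_data(buf):
    """Split CertData into RCount, HashAlg, HSize and SRList (sketch only)."""
    if len(buf) < 8:
        raise ValueError("CertData shorter than its 8-byte fixed part")
    # RCount: 4 bytes, HashAlg: 2 bytes, HSize: 2 bytes (assumed little-endian).
    rcount, hash_alg, hsize = struct.unpack_from("<IHH", buf, 0)
    end = 8 + rcount * hsize          # total SRList size is RCount * HSize
    if len(buf) < end:
        raise ValueError("SRList is truncated")
    sr_list = [buf[8 + i * hsize: 8 + (i + 1) * hsize] for i in range(rcount)]
    return {"RCount": rcount, "HashAlg": hash_alg, "HSize": hsize, "SRList": sr_list}
```

For SHA-1 (HashAlg 0x8004) each SRList element would be 20 bytes, although the sketch trusts whatever HSize the record declares.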
-
-
-3. Reserved1 - Certificate Decryption Header Reserved1 Data:
-
- Value Size Description
- ----- ---- -----------
- RCount 4 bytes Number of recipients.
-
- RCount This defines the number of intended recipients whose
- public keys were used for encryption. This defines
- the number of elements in the REList field defined below.
-
-
-4. Reserved2 - Certificate Decryption Header Reserved2 Data Structures:
-
-
- Value Size Description
- ----- ---- -----------
- HashAlg 2 bytes Hash algorithm identifier
- HSize 2 bytes Hash size
- REList (var) List of recipient data elements
-
-
- HashAlg This defines the hash algorithm used to calculate
- the public key hash of each public key used
- for encryption. This field currently supports
- only the following value for SHA-1
-
- 0x8004 - SHA1
-
- HSize This defines the size of a hashed public key
- defined in REHData.
-
- REList This is a variable length list of recipient data.
- Each element in this list consists of a Recipient
- Element data structure as follows:
-
-
- Recipient Element (REList) Data Structure:
-
- Value Size Description
- ----- ---- -----------
- RESize 2 bytes Size of REHData + REKData
- REHData HSize Hash of recipient's public key
- REKData (var) Simple key blob
-
-
- RESize This defines the size of an individual REList
- element. This value is the combined size of the
- REHData field + REKData field. REHData is defined by
- HSize. REKData is variable and can be calculated
- for each REList element using RESize and HSize.
-
- REHData Hashed public key for this recipient.
-
- REKData Simple Key Blob. The format of this data structure
- is identical to that defined in the Microsoft
- CryptoAPI and generated using the CryptExportKey()
- function. The version of the Simple Key Blob
- supported at this time is 0x02 as defined by
- Microsoft.
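The Reserved2 layout and its Recipient Elements can be walked as follows. This is a hedged Python sketch: parse_reserved2 is a hypothetical name, rcount is assumed to have been read from Reserved1 beforehand, little-endian byte order is assumed, and the Simple Key Blob (REKData) is returned as opaque bytes without being interpreted.

```python
import struct


def parse_reserved2(buf, rcount):
    """Return (HashAlg, HSize, [(REHData, REKData), ...]) from Reserved2 data."""
    hash_alg, hsize = struct.unpack_from("<HH", buf, 0)
    offset = 4
    recipients = []
    for _ in range(rcount):
        (resize,) = struct.unpack_from("<H", buf, offset)
        offset += 2
        # RESize covers REHData (HSize bytes) plus REKData (the remainder).
        if resize < hsize or offset + resize > len(buf):
            raise ValueError("malformed Recipient Element")
        reh_data = buf[offset: offset + hsize]
        rek_data = buf[offset + hsize: offset + resize]
        recipients.append((reh_data, rek_data))
        offset += resize
    return hash_alg, hsize, recipients
```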
-
-5. Certificate Processing - Central Directory Encryption:
-
-Central Directory Encryption using Digital Certificates will
-operate in a manner similar to that of Single Password Central
-Directory Encryption. This record will only be present when there
-is data to place into it. Currently, data is placed into this
-record when digital certificates are used for either encrypting
-or signing the files within a ZIP file. When only password
-encryption is used with no certificate encryption or digital
-signing, this record is not currently needed. When present, this
-record will appear before the start of the actual Central Directory
-data structure and will be located immediately after the Archive
-Decryption Header if the Central Directory is encrypted.
-
-The Archive Extra Data record will be used to store the following
-information. Additional data may be added in future versions.
-
-Extra Data Fields:
-
-0x0014 - PKCS#7 Store for X.509 Certificates
-0x0016 - X.509 Certificate ID and Signature for central directory
-0x0019 - PKCS#7 Encryption Recipient Certificate List
-
-The 0x0014 and 0x0016 Extra Data records are the records that otherwise
-would be located in the first record of the Central Directory for digital
-certificate processing. When encrypting or compressing the Central
-Directory, the 0x0014 and 0x0016 records must be located in the
-Archive Extra Data record and they should not remain in the first
-Central Directory record. The Archive Extra Data record will also
-be used to store the 0x0019 data.
-
-When present, the size of the Archive Extra Data record will be
-included in the size of the Central Directory. The data of the
-Archive Extra Data record will also be compressed and encrypted
-along with the Central Directory data structure.
-
-6. Certificate Processing Differences:
-
-The Certificate Processing Method of encryption differs from the
-Single Password Symmetric Encryption Method as follows. Instead
-of using a user-defined password to generate a master session key,
-cryptographically random data is used. The key material is then
-wrapped using standard key-wrapping techniques. This key material
-is wrapped using the public key of each recipient that will need
-to decrypt the file using their corresponding private key.
-
-This specification currently assumes digital certificates will follow
-the X.509 V3 format for 1024 bit and higher RSA format digital
-certificates. Implementation of this Certificate Processing Method
-requires supporting logic for key access and management. This logic
-is outside the scope of this specification.
-
-
-License Agreement:
------------------
-
-The features set forth in this Section XIV (the "Strong Encryption
-Specification") are covered by a pending patent application. Portions of
-this Strong Encryption technology are available for use at no charge
-under the following terms and conditions.
-
-1. License Grant.
-
- a. NOTICE TO USER. PLEASE READ THIS ENTIRE SECTION XIV OF THE
- APPNOTE (THE "AGREEMENT") CAREFULLY. BY USING ALL OR ANY PORTION OF THE
- LICENSED TECHNOLOGY, YOU ACCEPT ALL THE TERMS AND CONDITIONS OF THIS
- AGREEMENT AND YOU AGREE THAT THIS AGREEMENT IS ENFORCEABLE LIKE ANY
- WRITTEN NEGOTIATED AGREEMENT SIGNED BY YOU. IF YOU DO NOT AGREE, DO NOT
- USE THE LICENSED TECHNOLOGY.
-
- b. Definitions.
-
- i. "Licensed Technology" shall mean that proprietary technology now or
- hereafter owned or controlled by PKWare, Inc. ("PKWARE") or any
- subsidiary or affiliate that covers or is necessary to be used to give
- software the ability to a) extract and decrypt data from zip files
- encrypted using any methods of data encryption and key processing which
- are published in this APPNOTE or any prior APPNOTE, as supplemented by
- any Additional Compatibility Information; and b) encrypt file contents
- as part of .ZIP file processing using only the Single Password Symmetric
- Encryption Method as published in this APPNOTE or any prior APPNOTE, as
- supplemented by any Additional Compatibility Information. For purposes
- of this AGREEMENT, "Additional Compatibility Information" means, with
- regard to any method of data encryption and key processing published in
- this or any prior APPNOTE, any corrections, additions, or clarifications
- to the information in such APPNOTE that are required in order to give
- software the ability to successfully extract and decrypt zip files (or,
- but solely in the case of the Single Password Symmetric Encryption Method,
- to successfully encrypt zip files) in a manner interoperable with the
- actual implementation of such method in any PKWARE product that is
- documented or publicly described by PKWARE as being able to create, or
- to extract and decrypt, zip files using that method.
-
- ii. "Licensed Products" shall mean any products you produce that
- incorporate the Licensed Technology.
-
- c. License to Licensed Technology.
-
- PKWARE hereby grants to you a non-exclusive license to use the Licensed
- Technology for the purpose of manufacturing, offering, selling and using
- Licensed Products, which license shall extend to permit the practice of all
- claims in any patent or patent application (collectively, "Patents") now or
- hereafter owned or controlled by PKWARE in any jurisdiction in the world
- that are infringed by implementation of the Licensed Technology. You have
- the right to sublicense rights you receive under the terms of this AGREEMENT
- for the purpose of allowing sublicensee to manufacture, offer, sell and use
- products that incorporate all or a portion of any of your Licensed Products,
- but if you do, you agree to i) impose the same restrictions on any such
- sublicensee as these terms impose on you and ii) notify the sublicensee,
- by means chosen by you in your unfettered discretion, including a notice on
- your web site, of the terms of this AGREEMENT and make available to each
- sublicensee the full text of this APPNOTE. Further, PKWARE hereby grants to
- you a non-exclusive right to reproduce and distribute, in any form, copies of
- this APPNOTE, without modification. Notwithstanding anything to the contrary
- in this AGREEMENT, you have the right to sublicense the rights, without any of
- the restrictions described above or elsewhere in this AGREEMENT, to use, offer
- to sell and sell Licensed Technology as incorporated in executable object code
- or byte code forms of your Licensed Products. Any sublicense to use the
- Licensed Technology incorporated in a Licensed Product granted by you shall
- survive the termination of this AGREEMENT for any reason. PKWARE warrants that
- this license shall continue to encumber the Licensed Technology regardless of
- changes in ownership of the Licensed Technology.
-
- d. Proprietary Notices.
-
- i. With respect to any Licensed Product that is distributed by you either
- in source code form or in the form of an object code library of externally
- callable functions that has been designed by you for incorporation into third
- party products, you agree to include, in the source code, or in the case of
- an object code library, in accompanying documentation, a notice using the
- words "patent pending" until a patent is issued to PKWARE covering any
- portion of the Licensed Technology or PKWARE provides notice, by means
- chosen by PKWARE in its unfettered discretion, that it no longer has any
- patent pending covering any portion of the Licensed Technology. With respect
- to any Licensed Product, upon your becoming aware that at least one patent has
- been granted covering the Licensed Technology, you agree to include in any
- revisions made by you to the documentation (or any source code distributed
- by you) the words "Pat. No.", or "Patent Number" and the patent number or
- numbers of the applicable patent or patents. PKWARE shall, from time to time,
- inform you of the patent number or numbers of the patents covering the
- Licensed Technology, by means chosen by PKWARE in its unfettered discretion,
- including a notice on its web site. It shall be a violation of the terms of
- this AGREEMENT for you to sell Licensed Products without complying with the
- foregoing marking provisions.
-
- ii. You acknowledge that the terms of this AGREEMENT do not grant you any
- license or other right to use any PKWARE trademark in connection with the sale,
- offering for sale, distribution and delivery of the Licensed Products, or in
- connection with the advertising, promotion and offering of the Licensed Products.
- You acknowledge PKWARE's ownership of the PKZIP trademark and all other marks
- owned by PKWARE.
-
- e. Covenant of Compliance and Remedies.
-
- To the extent that you have elected to implement portions of the Licensed
- Technology, you agree to use reasonable diligence to comply with those portions
- of this Section XIV, as modified or supplemented by Additional Compatibility
- Information available to you, describing the portions of the Licensed Technology
- that you have elected to implement. Upon reasonable request by PKWARE, you will
- provide written notice to PKWARE identifying which version of this APPNOTE you
- have relied upon for your implementation of any specified Licensed Product.
-
- If any substantial non-compliance with the terms of this AGREEMENT is determined
- to exist, you will make such changes as necessary to bring your Licensed Products
- into substantial compliance with the terms of this AGREEMENT. If, within sixty
- days of receipt of notice that a Licensed Product fails to comply with the terms
- of this AGREEMENT, you fail to make such changes as necessary to bring your
- Licensed Products into compliance with the terms of this AGREEMENT, PKWARE may
- terminate your rights under this AGREEMENT. PKWARE does not waive and expressly
- reserves the right to pursue any and all additional remedies that are or may
- become available to PKWARE.
-
- f. Warranty and Indemnification Regarding Exportation.
-
- You realize and acknowledge that, as between yourself and PKWARE, you are fully
- responsible for compliance with the import and export laws and regulations of
- any country in or to which you import or export any Licensed Products, and you
- agree to hold PKWARE harmless from any claim of violation of any such import
- or export laws.
-
- g. Patent Infringement.
-
- You agree that you will not bring or threaten to bring any action against PKWARE
- for infringement of the claims of any patent owned or controlled by you solely
- as a result of PKWARE's own implementation of the Licensed Technology. As its
- exclusive remedy for your breach of the foregoing agreement, PKWARE reserves
- the right to suspend or terminate all rights granted under the terms of this
- AGREEMENT if you bring or threaten to bring any such action against PKWARE,
- effective immediately upon delivery of written notice of suspension or
- termination to you.
-
- h. Governing Law.
-
- The license granted in this AGREEMENT shall be governed by and construed under
- the laws of the State of Wisconsin and the United States.
-
- i. Revisions and Notice.
-
- The license granted in this APPNOTE is irrevocable, except as expressly set
- forth above. You agree and understand that any changes which PKWARE determines
- to make to this APPNOTE shall be posted at the same location as the current
- APPNOTE or at a location which will be identified by means chosen by PKWARE,
- including a notice on its web site, and shall be available for adoption by you
- immediately upon such posting, or at such other time as PKWARE shall determine.
- Any changes to the terms of the license published in a subsequent version of
- this AGREEMENT shall be binding upon you only with respect to your products
- that (i) incorporate any Licensed Technology (as defined in the subsequent
- AGREEMENT) that is not otherwise included in the definition of Licensed
- Technology under this AGREEMENT, or (ii) that you expressly identify are to
- be licensed under the subsequent AGREEMENT, which identification shall be by
- written notice with reference to the APPNOTE (version and release date or other
- unique identifier) in which the subsequent AGREEMENT is published. PKWARE
- agrees to identify each change to this APPNOTE by using a unique version and
- release date identifier or other unique identifier.
-
- j. Warranty by PKWARE
-
- PKWare, Inc. warrants that it has the right to grant the license hereunder.
-
-XV. Change Process
-------------------
-
-In order for the .ZIP file format to remain a viable definition, this
-specification should be considered as open for periodic review and
-revision. Although this format was originally designed with a
-certain level of extensibility, not all changes in technology
-(present or future) were or will be necessarily considered in its
-design. If your application requires new definitions to the
-extensible sections in this format, or if you would like to
-submit new data structures, please forward your request to
-zipformat@pkware.com. All submissions will be reviewed by the
-ZIP File Specification Committee for possible inclusion into
-future versions of this specification. Periodic revisions
-to this specification will be published to ensure interoperability.
-We encourage comments and feedback that may help improve clarity
-or content.
-
-
-XVI. Acknowledgements
----------------------
-
-In addition to the above mentioned contributors to PKZIP and PKUNZIP,
-I would like to extend special thanks to Robert Mahoney for suggesting
-the extension .ZIP for this software.
-
-
-XVII. References
-----------------
-
- Fiala, Edward R., and Greene, Daniel H., "Data compression with
- finite windows", Communications of the ACM, Volume 32, Number 4,
- April 1989, pages 490-505.
-
- Held, Gilbert, "Data Compression, Techniques and Applications,
- Hardware and Software Considerations", John Wiley & Sons, 1987.
-
- Huffman, D.A., "A method for the construction of minimum-redundancy
- codes", Proceedings of the IRE, Volume 40, Number 9, September 1952,
- pages 1098-1101.
-
- Nelson, Mark, "LZW Data Compression", Dr. Dobb's Journal, Volume 14,
- Number 10, October 1989, pages 29-37.
-
- Nelson, Mark, "The Data Compression Book", M&T Books, 1991.
-
- Storer, James A., "Data Compression, Methods and Theory",
- Computer Science Press, 1988.
-
- Welch, Terry, "A Technique for High-Performance Data Compression",
- IEEE Computer, Volume 17, Number 6, June 1984, pages 8-19.
-
- Ziv, J. and Lempel, A., "A universal algorithm for sequential data
- compression", IEEE Transactions on Information Theory,
- Volume 23, Number 3, May 1977, pages 337-343.
-
- Ziv, J. and Lempel, A., "Compression of individual sequences via
- variable-rate coding", IEEE Transactions on Information Theory,
- Volume 24, Number 5, September 1978, pages 530-536.
diff --git a/docs/appnote.txt b/docs/appnote.txt
deleted file mode 100644
index 985f6f8..0000000
--- a/docs/appnote.txt
+++ /dev/null
@@ -1,3497 +0,0 @@
-File: APPNOTE.TXT - .ZIP File Format Specification
-Version: 6.3.4
-Status: Final - replaces version 6.3.3
-Revised: October 1, 2014
-Copyright (c) 1989 - 2014 PKWARE Inc., All Rights Reserved.
-
-1.0 Introduction
----------------
-
-1.1 Purpose
------------
-
- 1.1.1 This specification is intended to define a cross-platform,
- interoperable file storage and transfer format. Since its
- first publication in 1989, PKWARE, Inc. ("PKWARE") has remained
- committed to ensuring the interoperability of the .ZIP file
- format through periodic publication and maintenance of this
- specification. We trust that all .ZIP compatible vendors and
- application developers that use and benefit from this format
- will share and support this commitment to interoperability.
-
-1.2 Scope
----------
-
- 1.2.1 ZIP is one of the most widely used compressed file formats. It is
- universally used to aggregate, compress, and encrypt files into a single
- interoperable container. No specific use or application need is
- defined by this format and no specific implementation guidance is
- provided. This document provides details on the storage format for
- creating ZIP files. Information is provided on the records and
- fields that describe what a ZIP file is.
-
-1.3 Trademarks
---------------
-
- 1.3.1 PKWARE, PKZIP, SecureZIP, and PKSFX are registered trademarks of
- PKWARE, Inc. in the United States and elsewhere. PKPatchMaker,
- Deflate64, and ZIP64 are trademarks of PKWARE, Inc. Other marks
- referenced within this document appear for identification
- purposes only and are the property of their respective owners.
-
-
-1.4 Permitted Use
------------------
-
- 1.4.1 This document, "APPNOTE.TXT - .ZIP File Format Specification" is the
- exclusive property of PKWARE. Use of the information contained in this
- document is permitted solely for the purpose of creating products,
- programs and processes that read and write files in the ZIP format
- subject to the terms and conditions herein.
-
- 1.4.2 Use of the content of this document within other publications is
- permitted only through reference to this document. Any reproduction
- or distribution of this document in whole or in part without prior
- written permission from PKWARE is strictly prohibited.
-
- 1.4.3 Certain technological components provided in this document are the
- patented proprietary technology of PKWARE and as such require a
- separate, executed license agreement from PKWARE. Applicable
- components are marked with the following, or similar, statement:
- 'Refer to the section in this document entitled "Incorporating
- PKWARE Proprietary Technology into Your Product" for more information'.
-
-1.5 Contacting PKWARE
----------------------
-
- 1.5.1 If you have questions on this format, its use, or licensing, or if you
- wish to report defects, request changes or additions, please contact:
-
- PKWARE, Inc.
- 201 E. Pittsburgh Avenue, Suite 400
- Milwaukee, WI 53204
- +1-414-289-9788
- +1-414-289-9789 FAX
- zipformat@pkware.com
-
- 1.5.2 Information about this format and copies of this document are publicly
- available at:
-
- http://www.pkware.com/appnote
-
-1.6 Disclaimer
---------------
-
- 1.6.1 Although PKWARE will attempt to supply current and accurate
- information relating to its file formats, algorithms, and the
- subject programs, the possibility of error or omission cannot
- be eliminated. PKWARE therefore expressly disclaims any warranty
- that the information contained in the associated materials relating
- to the subject programs and/or the format of the files created or
- accessed by the subject programs and/or the algorithms used by
- the subject programs, or any other matter, is current, correct or
- accurate as delivered. Any risk of damage due to any possible
- inaccurate information is assumed by the user of the information.
- Furthermore, the information relating to the subject programs
- and/or the file formats created or accessed by the subject
- programs and/or the algorithms used by the subject programs is
- subject to change without notice.
-
-2.0 Revisions
---------------
-
-2.1 Document Status
---------------------
-
- 2.1.1 If the STATUS of this file is marked as DRAFT, the content
- defines proposed revisions to this specification which may consist
- of changes to the ZIP format itself, or that may consist of other
- content changes to this document. Versions of this document and
- the format in DRAFT form may be subject to modification prior to
- publication STATUS of FINAL. DRAFT versions are published periodically
- to provide notification to the ZIP community of pending changes and to
- provide opportunity for review and comment.
-
- 2.1.2 Versions of this document having a STATUS of FINAL are
- considered to be in the final form for that version of the document
- and are not subject to further change until a new, higher version
- numbered document is published. Newer versions of this format
- specification are intended to remain interoperable with all prior
- versions whenever technically possible.
-
-2.2 Change Log
---------------
-
- Version Change Description Date
- ------- ------------------ ----------
- 5.2 -Single Password Symmetric Encryption 06/02/2003
- storage
-
- 6.1.0 -Smartcard compatibility 01/20/2004
- -Documentation on certificate storage
-
- 6.2.0 -Introduction of Central Directory 04/26/2004
- Encryption for encrypting metadata
- -Added OS X to Version Made By values
-
- 6.2.1 -Added Extra Field placeholder for 04/01/2005
- POSZIP using ID 0x4690
-
- -Clarified size field on
- "zip64 end of central directory record"
-
- 6.2.2 -Documented Final Feature Specification 01/06/2006
- for Strong Encryption
-
- -Clarifications and typographical
- corrections
-
- 6.3.0 -Added tape positioning storage 09/29/2006
- parameters
-
- -Expanded list of supported hash algorithms
-
- -Expanded list of supported compression
- algorithms
-
- -Expanded list of supported encryption
- algorithms
-
- -Added option for Unicode filename
- storage
-
- -Clarifications for consistent use
- of Data Descriptor records
-
- -Added additional "Extra Field"
- definitions
-
- 6.3.1 -Corrected standard hash values for 04/11/2007
- SHA-256/384/512
-
- 6.3.2 -Added compression method 97 09/28/2007
-
- -Documented InfoZIP "Extra Field"
- values for UTF-8 file name and
- file comment storage
-
- 6.3.3 -Formatting changes to support 09/01/2012
- easier referencing of this APPNOTE
- from other documents and standards
-
- 6.3.4 -Address change 10/01/2014
-
-
-3.0 Notations
--------------
-
- 3.1 Use of the term MUST or SHALL indicates a required element.
-
- 3.2 MAY NOT or SHALL NOT indicates an element is prohibited from use.
-
- 3.3 SHOULD indicates a RECOMMENDED element.
-
- 3.4 SHOULD NOT indicates an element NOT RECOMMENDED for use.
-
- 3.5 MAY indicates an OPTIONAL element.
-
-
-4.0 ZIP Files
--------------
-
-4.1 What is a ZIP file
-----------------------
-
- 4.1.1 ZIP files MAY be identified by the standard .ZIP file extension
- although use of a file extension is not required. Use of the
- extension .ZIPX is also recognized and MAY be used for ZIP files.
- Other common file extensions using the ZIP format include .JAR, .WAR,
- .DOCX, .XLSX, .PPTX, .ODT, .ODS, .ODP and others. Programs reading or
- writing ZIP files SHOULD rely on internal record signatures described
- in this document to identify files in this format.
-
- 4.1.2 ZIP files SHOULD contain at least one file and MAY contain
- multiple files.
-
- 4.1.3 Data compression MAY be used to reduce the size of files
- placed into a ZIP file, but is not required. This format supports the
- use of multiple data compression algorithms. When compression is used,
- one of the documented compression algorithms MUST be used. Implementors
- are advised to experiment with their data to determine which of the
- available algorithms provides the best compression for their needs.
- Compression method 8 (Deflate) is the method used by default by most
- ZIP compatible application programs.
-
-
- 4.1.4 Data encryption MAY be used to protect files within a ZIP file.
- Keying methods supported for encryption within this format include
- passwords and public/private keys. Either MAY be used individually
- or in combination. Encryption MAY be applied to individual files.
- Additional security MAY be used through the encryption of ZIP file
- metadata stored within the Central Directory. See the section on the
- Strong Encryption Specification for information. Refer to the section
- in this document entitled "Incorporating PKWARE Proprietary Technology
- into Your Product" for more information.
-
- 4.1.5 Data integrity MUST be provided for each file using CRC32.
-
- 4.1.6 Additional data integrity MAY be included through the use of
- digital signatures. Individual files MAY be signed with one or more
- digital signatures. The Central Directory, if signed, MUST use a
- single signature.
-
- 4.1.7 Files MAY be placed within a ZIP file uncompressed or stored.
- The term "stored" as used in the context of this document means the file
- is copied into the ZIP file uncompressed.
-
- 4.1.8 Each data file placed into a ZIP file MAY be compressed, stored,
- encrypted or digitally signed independent of how other data files in the
- same ZIP file are archived.
-
- 4.1.9 ZIP files MAY be streamed, split into segments (on fixed or on
- removable media) or "self-extracting". Self-extracting ZIP
- files MUST include extraction code for a target platform within
- the ZIP file.
-
- 4.1.10 Extensibility is provided for platform or application specific
- needs through extra data fields that MAY be defined for custom
- purposes. Extra data definitions MUST NOT conflict with existing
- documented record definitions.
-
- 4.1.11 Common uses for ZIP MAY also include the use of manifest files.
- Manifest files store application specific information within a file stored
- within the ZIP file. This manifest file SHOULD be the first file in the
- ZIP file. This specification does not provide any information or guidance on
- the use of manifest files within ZIP files. Refer to the application developer
- for information on using manifest files and for any additional profile
- information on using ZIP within an application.
-
- 4.1.12 ZIP files MAY be placed within other ZIP files.
-
-4.2 ZIP Metadata
-----------------
-
- 4.2.1 ZIP files are identified by metadata consisting of defined record types
- containing the storage information necessary for maintaining the files
- placed into a ZIP file. Each record type MUST be identified using a header
- signature that identifies the record type. Signature values begin with the
- two byte constant marker of 0x4b50, representing the characters "PK".
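To make the signature rule concrete, the Python sketch below reads a 4-byte little-endian signature and checks that its low 16 bits are the 0x4b50 marker. The function name record_signature is hypothetical.

```python
import struct


def record_signature(buf, offset=0):
    """Return the 32-bit signature at offset, or None if it lacks the "PK" marker."""
    if len(buf) < offset + 4:
        return None
    (sig,) = struct.unpack_from("<I", buf, offset)
    # Stored little-endian, the bytes "P" (0x50) and "K" (0x4b) form 0x4b50.
    if sig & 0xFFFF != 0x4B50:
        return None
    return sig
```

Under this reading, the bytes "PK\x03\x04" yield 0x04034b50, the local file header signature defined in this specification.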
-
-
-4.3 General Format of a .ZIP file
----------------------------------
-
- 4.3.1 A ZIP file MUST contain an "end of central directory record". A ZIP
- file containing only an "end of central directory record" is considered an
- empty ZIP file. Files may be added or replaced within a ZIP file, or deleted.
- A ZIP file MUST have only one "end of central directory record". Other
- records defined in this specification MAY be used as needed to support
- storage requirements for individual ZIP files.
-
- 4.3.2 Each file placed into a ZIP file MUST be preceded by a "local
- file header" record for that file. Each "local file header" MUST be
- accompanied by a corresponding "central directory header" record within
- the central directory section of the ZIP file.
-
- 4.3.3 Files MAY be stored in arbitrary order within a ZIP file. A ZIP
- file MAY span multiple volumes or it MAY be split into user-defined
- segment sizes. All values MUST be stored in little-endian byte order unless
- otherwise specified in this document for a specific data element.
-
- 4.3.4 Compression MUST NOT be applied to a "local file header", an "encryption
- header", or an "end of central directory record". Individual "central
- directory records" must not be compressed, but the aggregate of all central
- directory records MAY be compressed.
-
- 4.3.5 File data MAY be followed by a "data descriptor" for the file. Data
- descriptors are used to facilitate ZIP file streaming.
-
-
- 4.3.6 Overall .ZIP file format:
-
- [local file header 1]
- [encryption header 1]
- [file data 1]
- [data descriptor 1]
- .
- .
- .
- [local file header n]
- [encryption header n]
- [file data n]
- [data descriptor n]
- [archive decryption header]
- [archive extra data record]
- [central directory header 1]
- .
- .
- .
- [central directory header n]
- [zip64 end of central directory record]
- [zip64 end of central directory locator]
- [end of central directory record]
-
-
- 4.3.7 Local file header:
-
- local file header signature 4 bytes (0x04034b50)
- version needed to extract 2 bytes
- general purpose bit flag 2 bytes
- compression method 2 bytes
- last mod file time 2 bytes
- last mod file date 2 bytes
- crc-32 4 bytes
- compressed size 4 bytes
- uncompressed size 4 bytes
- file name length 2 bytes
- extra field length 2 bytes
-
- file name (variable size)
- extra field (variable size)
-
- 4.3.8 File data
-
- Immediately following the local header for a file
- SHOULD be placed the compressed or stored data for the file.
- If the file is encrypted, the encryption header for the file
- SHOULD be placed after the local header and before the file
- data. The series of [local file header][encryption header]
- [file data][data descriptor] repeats for each file in the
- .ZIP archive.
-
- Zero-byte files, directories, and other file types that
- contain no content MUST NOT include file data.
-
- 4.3.9 Data descriptor:
-
- crc-32 4 bytes
- compressed size 4 bytes
- uncompressed size 4 bytes
-
- 4.3.9.1 This descriptor MUST exist if bit 3 of the general
- purpose bit flag is set (see below). It is byte aligned
- and immediately follows the last byte of compressed data.
- This descriptor SHOULD be used only when it was not possible to
- seek in the output .ZIP file, e.g., when the output .ZIP file
- was standard output or a non-seekable device. For ZIP64(tm) format
- archives, the compressed and uncompressed sizes are 8 bytes each.
-
- 4.3.9.2 When compressing files, compressed and uncompressed sizes
- should be stored in ZIP64 format (as 8 byte values) when a
- file's size exceeds 0xFFFFFFFF. However ZIP64 format may be
- used regardless of the size of a file. When extracting, if
- the zip64 extended information extra field is present for
- the file the compressed and uncompressed sizes will be 8
- byte values.
-
- 4.3.9.3 Although not originally assigned a signature, the value
- 0x08074b50 has commonly been adopted as a signature value
- for the data descriptor record. Implementers should be
- aware that ZIP files may be encountered with or without this
- signature marking data descriptors and SHOULD account for
- either case when reading ZIP files to ensure compatibility.
-
- 4.3.9.4 When writing ZIP files, implementers SHOULD include the
- signature value marking the data descriptor record. When
- the signature is used, the fields currently defined for
- the data descriptor record will immediately follow the
- signature.
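
The reading rule in 4.3.9.3 and 4.3.9.4 can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `parse_data_descriptor` and the `zip64` parameter are hypothetical, and treating a leading 0x08074b50 as the signature is a heuristic, since a CRC-32 could in principle have the same value.

```python
import struct

DATA_DESCRIPTOR_SIG = 0x08074B50  # optional signature described in 4.3.9.3


def parse_data_descriptor(buf, zip64=False):
    """Return (crc32, compressed_size, uncompressed_size, bytes_consumed).

    Hypothetical helper: accepts the record with or without the signature.
    """
    offset = 0
    # Heuristic: a leading 0x08074b50 is taken to be the signature.
    # Robust readers usually cross-check against the central directory.
    if len(buf) >= 4 and struct.unpack_from("<I", buf, 0)[0] == DATA_DESCRIPTOR_SIG:
        offset = 4
    # ZIP64 archives store the two sizes as 8 byte values (4.3.9.1).
    fmt = "<IQQ" if zip64 else "<III"
    crc, csize, usize = struct.unpack_from(fmt, buf, offset)
    return crc, csize, usize, offset + struct.calcsize(fmt)
```

The last element of the result tells the caller how many bytes to skip before the next local file header.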
-
- 4.3.9.5 An extensible data descriptor will be released in a
- future version of this APPNOTE. This new record is intended to
- resolve conflicts with the use of this record going forward,
- and to provide better support for streamed file processing.
-
- 4.3.9.6 When the Central Directory Encryption method is used,
- the data descriptor record is not required, but MAY be used.
- If present, and bit 3 of the general purpose bit field is set to
- indicate its presence, the values in fields of the data descriptor
- record MUST be set to binary zeros. See the section on the Strong
- Encryption Specification for information. Refer to the section in
- this document entitled "Incorporating PKWARE Proprietary Technology
- into Your Product" for more information.
-
-
- 4.3.10 Archive decryption header:
-
- 4.3.10.1 The Archive Decryption Header is introduced in version 6.2
- of the ZIP format specification. This record exists in support
- of the Central Directory Encryption Feature implemented as part of
- the Strong Encryption Specification as described in this document.
- When the Central Directory Structure is encrypted, this decryption
- header MUST precede the encrypted data segment.
-
- 4.3.10.2 The encrypted data segment SHALL consist of the Archive
- extra data record (if present) and the encrypted Central Directory
- Structure data. The format of this data record is identical to the
- Decryption header record preceding compressed file data. If the
- central directory structure is encrypted, the location of the start of
- this data record is determined using the Start of Central Directory
- field in the Zip64 End of Central Directory record. See the
- section on the Strong Encryption Specification for information
- on the fields used in the Archive Decryption Header record.
- Refer to the section in this document entitled "Incorporating
- PKWARE Proprietary Technology into Your Product" for more information.
-
-
- 4.3.11 Archive extra data record:
-
- archive extra data signature 4 bytes (0x08064b50)
- extra field length 4 bytes
- extra field data (variable size)
-
- 4.3.11.1 The Archive Extra Data Record is introduced in version 6.2
- of the ZIP format specification. This record MAY be used in support
- of the Central Directory Encryption Feature implemented as part of
- the Strong Encryption Specification as described in this document.
- When present, this record MUST immediately precede the central
- directory data structure.
-
- 4.3.11.2 The size of this data record SHALL be included in the
- Size of the Central Directory field in the End of Central
- Directory record. If the central directory structure is compressed,
- but not encrypted, the location of the start of this data record is
- determined using the Start of Central Directory field in the Zip64
- End of Central Directory record. Refer to the section in this document
- entitled "Incorporating PKWARE Proprietary Technology into Your
- Product" for more information.
-
- 4.3.12 Central directory structure:
-
- [central directory header 1]
- .
- .
- .
- [central directory header n]
- [digital signature]
-
- File header:
-
- central file header signature 4 bytes (0x02014b50)
- version made by 2 bytes
- version needed to extract 2 bytes
- general purpose bit flag 2 bytes
- compression method 2 bytes
- last mod file time 2 bytes
- last mod file date 2 bytes
- crc-32 4 bytes
- compressed size 4 bytes
- uncompressed size 4 bytes
- file name length 2 bytes
- extra field length 2 bytes
- file comment length 2 bytes
- disk number start 2 bytes
- internal file attributes 2 bytes
- external file attributes 4 bytes
- relative offset of local header 4 bytes
-
- file name (variable size)
- extra field (variable size)
- file comment (variable size)
-
- 4.3.13 Digital signature:
-
- header signature 4 bytes (0x05054b50)
- size of data 2 bytes
- signature data (variable size)
-
- With the introduction of the Central Directory Encryption
- feature in version 6.2 of this specification, the Central
- Directory Structure MAY be stored both compressed and encrypted.
- Although not required, it is assumed that when the
- Central Directory Structure is encrypted, it will also be compressed
- for greater storage efficiency. Information on the
- Central Directory Encryption feature can be found in the section
- describing the Strong Encryption Specification. The Digital
- Signature record will be neither compressed nor encrypted.
-
- 4.3.14 Zip64 end of central directory record
-
- zip64 end of central dir
- signature 4 bytes (0x06064b50)
- size of zip64 end of central
- directory record 8 bytes
- version made by 2 bytes
- version needed to extract 2 bytes
- number of this disk 4 bytes
- number of the disk with the
- start of the central directory 4 bytes
- total number of entries in the
- central directory on this disk 8 bytes
- total number of entries in the
- central directory 8 bytes
- size of the central directory 8 bytes
- offset of start of central
- directory with respect to
- the starting disk number 8 bytes
- zip64 extensible data sector (variable size)
-
- 4.3.14.1 The value stored into the "size of zip64 end of central
- directory record" should be the size of the remaining
- record and should not include the leading 12 bytes.
-
- Size = SizeOfFixedFields + SizeOfVariableData - 12.
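
As a worked example of the formula above, the fixed fields listed in 4.3.14 add up to 56 bytes, so a Version 1 record with an empty extensible data sector stores 44 in its size field. The names below are illustrative only.

```python
# Byte widths of the fixed fields of the zip64 end of central directory
# record, in the order listed in 4.3.14 (signature first).
ZIP64_EOCD_FIXED_FIELD_SIZES = [4, 8, 2, 2, 4, 4, 8, 8, 8, 8]


def zip64_eocd_size_field(extensible_data_len=0):
    """Value to store in 'size of zip64 end of central directory record'."""
    fixed = sum(ZIP64_EOCD_FIXED_FIELD_SIZES)  # 56 bytes
    # The leading 12 bytes (signature + the size field itself) are excluded.
    return fixed + extensible_data_len - 12
```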
-
- 4.3.14.2 The above record structure defines Version 1 of the
- zip64 end of central directory record. Version 1 was
- implemented in versions of this specification preceding
- 6.2 in support of the ZIP64 large file feature. The
- introduction of the Central Directory Encryption feature
- implemented in version 6.2 as part of the Strong Encryption
- Specification defines Version 2 of this record structure.
- Refer to the section describing the Strong Encryption
- Specification for details on the version 2 format for
- this record. Refer to the section in this document entitled
- "Incorporating PKWARE Proprietary Technology into Your Product"
- for more information applicable to use of Version 2 of this
- record.
-
- 4.3.14.3 Special purpose data MAY reside in the zip64 extensible
- data sector field following either a V1 or V2 version of this
- record. To ensure identification of this special purpose data
- it must include an identifying header block consisting of the
- following:
-
- Header ID - 2 bytes
- Data Size - 4 bytes
-
- The Header ID field indicates the type of data that is in the
- data block that follows.
-
- Data Size identifies the number of bytes that follow for this
- data block type.
-
- 4.3.14.4 Multiple special purpose data blocks MAY be present.
- Each MUST be preceded by a Header ID and Data Size field. Current
- mappings of Header ID values supported in this field are as
- defined in APPENDIX C.
-
- 4.3.15 Zip64 end of central directory locator
-
- zip64 end of central dir locator
- signature 4 bytes (0x07064b50)
- number of the disk with the
- start of the zip64 end of
- central directory 4 bytes
- relative offset of the zip64
- end of central directory record 8 bytes
- total number of disks 4 bytes
-
- 4.3.16 End of central directory record:
-
- end of central dir signature 4 bytes (0x06054b50)
- number of this disk 2 bytes
- number of the disk with the
- start of the central directory 2 bytes
- total number of entries in the
- central directory on this disk 2 bytes
- total number of entries in
- the central directory 2 bytes
- size of the central directory 4 bytes
- offset of start of central
- directory with respect to
- the starting disk number 4 bytes
- .ZIP file comment length 2 bytes
- .ZIP file comment (variable size)
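
A reader typically starts from this record. The sketch below locates it by scanning backwards for the signature and checking that the comment length runs exactly to the end of the data. The function name and the dictionary keys are hypothetical, and ZIP64 and multi-disk archives are not handled.

```python
import struct

EOCD_SIG = b"PK\x05\x06"   # 0x06054b50 as stored on disk (little-endian)
EOCD_FMT = "<IHHHHIIH"     # the eight fixed fields of 4.3.16
EOCD_SIZE = struct.calcsize(EOCD_FMT)  # 22 bytes


def find_eocd(data):
    """Return the fields of the end of central directory record, or None."""
    pos = data.rfind(EOCD_SIG)
    while pos != -1:
        if pos + EOCD_SIZE <= len(data):
            f = struct.unpack_from(EOCD_FMT, data, pos)
            comment_len = f[7]
            # Accept the candidate only if the comment ends the archive.
            if pos + EOCD_SIZE + comment_len == len(data):
                return {
                    "disk_number": f[1],
                    "cd_start_disk": f[2],
                    "entries_on_disk": f[3],
                    "total_entries": f[4],
                    "cd_size": f[5],
                    "cd_offset": f[6],
                    "comment": data[pos + EOCD_SIZE:],
                }
        pos = data.rfind(EOCD_SIG, 0, pos)  # try an earlier occurrence
    return None
```

Scanning from the end matters because the signature bytes can also occur inside file data or inside the comment itself.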
-
-4.4 Explanation of fields
---------------------------
-
- 4.4.1 General notes on fields
-
- 4.4.1.1 All fields unless otherwise noted are unsigned and stored
- in Intel low-byte:high-byte, low-word:high-word order.
-
- 4.4.1.2 String fields are not null terminated, since the length
- is given explicitly.
-
- 4.4.1.3 The entries in the central directory may not necessarily
- be in the same order that files appear in the .ZIP file.
-
- 4.4.1.4 If one of the fields in the end of central directory
- record is too small to hold required data, the field should be
- set to -1 (0xFFFF or 0xFFFFFFFF) and the ZIP64 format record
- should be created.
-
- 4.4.1.5 The end of central directory record and the Zip64 end
- of central directory locator record MUST reside on the same
- disk when splitting or spanning an archive.
-
- 4.4.2 version made by (2 bytes)
-
- 4.4.2.1 The upper byte indicates the compatibility of the file
- attribute information. If the external file attributes
- are compatible with MS-DOS and can be read by PKZIP for
- DOS version 2.04g then this value will be zero. If these
- attributes are not compatible, then this value will
- identify the host system on which the attributes are
- compatible. Software can use this information to determine
- the line record format for text files etc.
-
- 4.4.2.2 The current mappings are:
-
- 0 - MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)
- 1 - Amiga 2 - OpenVMS
- 3 - UNIX 4 - VM/CMS
- 5 - Atari ST 6 - OS/2 H.P.F.S.
- 7 - Macintosh 8 - Z-System
- 9 - CP/M 10 - Windows NTFS
- 11 - MVS (OS/390 - Z/OS) 12 - VSE
- 13 - Acorn Risc 14 - VFAT
- 15 - alternate MVS 16 - BeOS
- 17 - Tandem 18 - OS/400
- 19 - OS X (Darwin) 20 thru 255 - unused
-
- 4.4.2.3 The lower byte indicates the ZIP specification version
- (the version of this document) supported by the software
- used to encode the file. The value/10 indicates the major
- version number, and the value mod 10 is the minor version
- number.
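
The two bytes of "version made by" can be split as described in 4.4.2.1 and 4.4.2.3. This is a small sketch; the function name is hypothetical and the table holds only a subset of the mappings in 4.4.2.2.

```python
# Subset of the host system mappings from 4.4.2.2.
HOST_SYSTEMS = {
    0: "MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)",
    3: "UNIX",
    10: "Windows NTFS",
    19: "OS X (Darwin)",
}


def decode_version_made_by(value):
    """Return (host system name, major version, minor version)."""
    host = (value >> 8) & 0xFF   # upper byte: attribute compatibility
    spec = value & 0xFF          # lower byte: specification version
    name = HOST_SYSTEMS.get(host, "host %d" % host)
    return name, spec // 10, spec % 10
```

For instance, 0x033F decodes to UNIX with specification version 6.3.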
-
- 4.4.3 version needed to extract (2 bytes)
-
- 4.4.3.1 The minimum supported ZIP specification version needed
- to extract the file, mapped as above. This value is based on
- the specific format features a ZIP program MUST support to
- be able to extract the file. If multiple features are
- applied to a file, the minimum version MUST be set to the
- feature having the highest value. New features or feature
- changes affecting the published format specification will be
- implemented using higher version numbers than the last
- published value to avoid conflict.
-
- 4.4.3.2 Current minimum feature versions are as defined below:
-
- 1.0 - Default value
- 1.1 - File is a volume label
- 2.0 - File is a folder (directory)
- 2.0 - File is compressed using Deflate compression
- 2.0 - File is encrypted using traditional PKWARE encryption
- 2.1 - File is compressed using Deflate64(tm)
- 2.5 - File is compressed using PKWARE DCL Implode
- 2.7 - File is a patch data set
- 4.5 - File uses ZIP64 format extensions
- 4.6 - File is compressed using BZIP2 compression*
- 5.0 - File is encrypted using DES
- 5.0 - File is encrypted using 3DES
- 5.0 - File is encrypted using original RC2 encryption
- 5.0 - File is encrypted using RC4 encryption
- 5.1 - File is encrypted using AES encryption
- 5.1 - File is encrypted using corrected RC2 encryption**
- 5.2 - File is encrypted using corrected RC2-64 encryption**
- 6.1 - File is encrypted using non-OAEP key wrapping***
- 6.2 - Central directory encryption
- 6.3 - File is compressed using LZMA
- 6.3 - File is compressed using PPMd+
- 6.3 - File is encrypted using Blowfish
- 6.3 - File is encrypted using Twofish
-
- 4.4.3.3 Notes on version needed to extract
-
- * Early 7.x (pre-7.2) versions of PKZIP incorrectly set the
- version needed to extract for BZIP2 compression to be 50
- when it should have been 46.
-
- ** Refer to the section on Strong Encryption Specification
- for additional information regarding RC2 corrections.
-
- *** Certificate encryption using non-OAEP key wrapping is the
- intended mode of operation for all versions beginning with 6.1.
- Support for OAEP key wrapping MUST only be used for
- backward compatibility when sending ZIP files to be opened by
- versions of PKZIP older than 6.1 (5.0 or 6.0).
-
- + Files compressed using PPMd MUST set the version
- needed to extract field to 6.3; however, not all ZIP
- programs enforce this and may be unable to decompress
- data files compressed using PPMd if this value is set.
-
- When using ZIP64 extensions, the corresponding value in the
- zip64 end of central directory record MUST also be set.
- This field should be set appropriately to indicate whether
- Version 1 or Version 2 format is in use.
-
-
- 4.4.4 general purpose bit flag: (2 bytes)
-
- Bit 0: If set, indicates that the file is encrypted.
-
- (For Method 6 - Imploding)
- Bit 1: If the compression method used was type 6,
- Imploding, then this bit, if set, indicates
- an 8K sliding dictionary was used. If clear,
- then a 4K sliding dictionary was used.
-
- Bit 2: If the compression method used was type 6,
- Imploding, then this bit, if set, indicates
- 3 Shannon-Fano trees were used to encode the
- sliding dictionary output. If clear, then 2
- Shannon-Fano trees were used.
-
- (For Methods 8 and 9 - Deflating)
- Bit 2 Bit 1
- 0 0 Normal (-en) compression option was used.
- 0 1 Maximum (-exx/-ex) compression option was used.
- 1 0 Fast (-ef) compression option was used.
- 1 1 Super Fast (-es) compression option was used.
-
- (For Method 14 - LZMA)
- Bit 1: If the compression method used was type 14,
- LZMA, then this bit, if set, indicates
- an end-of-stream (EOS) marker is used to
- mark the end of the compressed data stream.
- If clear, then an EOS marker is not present
- and the compressed data size must be known
- to extract.
-
- Note: Bits 1 and 2 are undefined if the compression
- method is any other.
-
- Bit 3: If this bit is set, the fields crc-32, compressed
- size and uncompressed size are set to zero in the
- local header. The correct values are put in the
- data descriptor immediately following the compressed
- data. (Note: PKZIP version 2.04g for DOS only
- recognizes this bit for method 8 compression; newer
- versions of PKZIP recognize this bit for any
- compression method.)
-
- Bit 4: Reserved for use with method 8, for enhanced
- deflating.
-
- Bit 5: If this bit is set, this indicates that the file is
- compressed patched data. (Note: Requires PKZIP
- version 2.70 or greater)
-
- Bit 6: Strong encryption. If this bit is set, you MUST
- set the version needed to extract value to at least
- 50 and you MUST also set bit 0. If AES encryption
- is used, the version needed to extract value MUST
- be at least 51. See the section describing the Strong
- Encryption Specification for details. Refer to the
- section in this document entitled "Incorporating PKWARE
- Proprietary Technology into Your Product" for more
- information.
-
- Bit 7: Currently unused.
-
- Bit 8: Currently unused.
-
- Bit 9: Currently unused.
-
- Bit 10: Currently unused.
-
- Bit 11: Language encoding flag (EFS). If this bit is set,
- the filename and comment fields for this file
- MUST be encoded using UTF-8. (see APPENDIX D)
-
- Bit 12: Reserved by PKWARE for enhanced compression.
-
- Bit 13: Set when encrypting the Central Directory to indicate
- selected data values in the Local Header are masked to
- hide their actual values. See the section describing
- the Strong Encryption Specification for details. Refer
- to the section in this document entitled "Incorporating
- PKWARE Proprietary Technology into Your Product" for
- more information.
-
- Bit 14: Reserved by PKWARE.
-
- Bit 15: Reserved by PKWARE.
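
Because bits 1 and 2 change meaning with the compression method, a decoder needs both values. The following sketch covers the bits defined above; the function name and the dictionary keys are hypothetical.

```python
# (bit 2, bit 1) -> compression option for methods 8 and 9.
DEFLATE_OPTIONS = {
    (False, False): "normal",
    (False, True): "maximum",
    (True, False): "fast",
    (True, True): "super fast",
}


def decode_general_purpose_flags(flags, method):
    """Interpret the general purpose bit flag for a given compression method."""
    info = {
        "encrypted": bool(flags & 0x0001),                # bit 0
        "data_descriptor": bool(flags & 0x0008),          # bit 3
        "compressed_patched_data": bool(flags & 0x0020),  # bit 5
        "strong_encryption": bool(flags & 0x0040),        # bit 6
        "utf8_names": bool(flags & 0x0800),               # bit 11
        "local_header_masked": bool(flags & 0x2000),      # bit 13
    }
    bit1 = bool(flags & 0x0002)
    bit2 = bool(flags & 0x0004)
    if method == 6:  # Imploding
        info["sliding_dictionary"] = "8K" if bit1 else "4K"
        info["shannon_fano_trees"] = 3 if bit2 else 2
    elif method in (8, 9):  # Deflating
        info["deflate_option"] = DEFLATE_OPTIONS[(bit2, bit1)]
    elif method == 14:  # LZMA
        info["lzma_eos_marker"] = bit1
    # For any other method, bits 1 and 2 are undefined and left out.
    return info
```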
-
- 4.4.5 compression method: (2 bytes)
-
- 0 - The file is stored (no compression)
- 1 - The file is Shrunk
- 2 - The file is Reduced with compression factor 1
- 3 - The file is Reduced with compression factor 2
- 4 - The file is Reduced with compression factor 3
- 5 - The file is Reduced with compression factor 4
- 6 - The file is Imploded
- 7 - Reserved for Tokenizing compression algorithm
- 8 - The file is Deflated
- 9 - Enhanced Deflating using Deflate64(tm)
- 10 - PKWARE Data Compression Library Imploding (old IBM TERSE)
- 11 - Reserved by PKWARE
- 12 - File is compressed using BZIP2 algorithm
- 13 - Reserved by PKWARE
- 14 - LZMA (EFS)
- 15 - Reserved by PKWARE
- 16 - Reserved by PKWARE
- 17 - Reserved by PKWARE
- 18 - File is compressed using IBM TERSE (new)
- 19 - IBM LZ77 z Architecture (PFS)
- 97 - WavPack compressed data
- 98 - PPMd version I, Rev 1
-
-
- 4.4.6 date and time fields: (2 bytes each)
-
- The date and time are encoded in standard MS-DOS format.
- If input came from standard input, the date and time are
- those at which compression was started for this data.
- If encrypting the central directory and general purpose bit
- flag 13 is set indicating masking, the value stored in the
- Local Header will be zero.
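
This section does not spell out the bit layout of the MS-DOS format. The sketch below assumes the conventional layout (date: 7 bits of years since 1980, 4 bits of month, 5 bits of day; time: 5 bits of hour, 6 bits of minute, 5 bits of seconds divided by two). The function name is hypothetical.

```python
import datetime


def decode_dos_datetime(dos_date, dos_time):
    """Convert the two 16-bit MS-DOS fields to a datetime (assumed layout)."""
    year = ((dos_date >> 9) & 0x7F) + 1980
    month = (dos_date >> 5) & 0x0F
    day = dos_date & 0x1F
    hour = (dos_time >> 11) & 0x1F
    minute = (dos_time >> 5) & 0x3F
    second = (dos_time & 0x1F) * 2   # stored with 2-second resolution
    return datetime.datetime(year, month, day, hour, minute, second)
```

A masked local header stores zero in both fields, which is not a valid calendar date; `datetime.datetime` raises `ValueError` in that case, so callers may want to check for zero first.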
-
- 4.4.7 CRC-32: (4 bytes)
-
- The CRC-32 algorithm was generously contributed by
- David Schwaderer and can be found in his excellent
- book "C Programmers Guide to NetBIOS" published by
- Howard W. Sams & Co. Inc. The 'magic number' for
- the CRC is 0xdebb20e3. The proper CRC pre and post
- conditioning is used, meaning that the CRC register
- is pre-conditioned with all ones (a starting value
- of 0xffffffff) and the value is post-conditioned by
- taking the one's complement of the CRC residual.
- If bit 3 of the general purpose flag is set, this
- field is set to zero in the local header and the correct
- value is put in the data descriptor and in the central
- directory. When encrypting the central directory, if the
- local header is not in ZIP64 format and general purpose
- bit flag 13 is set indicating masking, the value stored
- in the Local Header will be zero.
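
The pre- and post-conditioning can be illustrated with a bit-at-a-time implementation. The reflected polynomial 0xEDB88320 is not stated in this section; it is assumed here because it is the one used by common CRC-32 implementations such as zlib, against which the sketch is compared. The function names are hypothetical.

```python
import zlib


def crc32_bitwise(data):
    """Bit-at-a-time CRC-32 with the conditioning described in 4.4.7."""
    reg = 0xFFFFFFFF                       # pre-condition with all ones
    for byte in data:
        reg ^= byte
        for _ in range(8):
            # Assumed reflected polynomial 0xEDB88320.
            reg = (reg >> 1) ^ (0xEDB88320 if reg & 1 else 0)
    return reg ^ 0xFFFFFFFF                # post-condition: one's complement


def crc32_residual(data):
    """Register contents after processing data followed by its own CRC."""
    crc = crc32_bitwise(data)
    # zlib applies the final complement, so undo it to see the raw register.
    return zlib.crc32(data + crc.to_bytes(4, "little")) ^ 0xFFFFFFFF
```

Under these assumptions `crc32_residual` returns the 'magic number' 0xdebb20e3 for any input, which is one way to sanity-check a CRC routine.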
-
- 4.4.8 compressed size: (4 bytes)
- 4.4.9 uncompressed size: (4 bytes)
-
- The size of the file compressed (4.4.8) and uncompressed
- (4.4.9), respectively. When a decryption header is present it
- will be placed in front of the file data and the value of the
- compressed file size will include the bytes of the decryption
- header. If bit 3 of the general purpose bit flag is set,
- these fields are set to zero in the local header and the
- correct values are put in the data descriptor and
- in the central directory. If an archive is in ZIP64 format
- and the value in this field is 0xFFFFFFFF, the size will be
- in the corresponding 8 byte ZIP64 extended information
- extra field. When encrypting the central directory, if the
- local header is not in ZIP64 format and general purpose bit
- flag 13 is set indicating masking, the value stored for the
- uncompressed size in the Local Header will be zero.
-
- 4.4.10 file name length: (2 bytes)
- 4.4.11 extra field length: (2 bytes)
- 4.4.12 file comment length: (2 bytes)
-
- The length of the file name, extra field, and comment
- fields respectively. The combined length of any
- directory record and these three fields should not
- generally exceed 65,535 bytes. If input came from standard
- input, the file name length is set to zero.
-
-
- 4.4.13 disk number start: (2 bytes)
-
- The number of the disk on which this file begins. If an
- archive is in ZIP64 format and the value in this field is
- 0xFFFF, the disk number will be in the corresponding 4 byte zip64
- extended information extra field.
-
- 4.4.14 internal file attributes: (2 bytes)
-
- Bits 1 and 2 are reserved for use by PKWARE.
-
- 4.4.14.1 The lowest bit of this field indicates, if set,
- that the file is apparently an ASCII or text file. If not
- set, the file apparently contains binary data.
- The remaining bits are unused in version 1.0.
-
- 4.4.14.2 The 0x0002 bit of this field indicates, if set, that
- a 4 byte variable record length control field precedes each
- logical record indicating the length of the record. The
- record length control field is stored in little-endian byte
- order. This flag is independent of text control characters,
- and if used in conjunction with text data, includes any
- control characters in the total length of the record. This
- value is provided for mainframe data transfer support.
-
- 4.4.15 external file attributes: (4 bytes)
-
- The mapping of the external attributes is
- host-system dependent (see 'version made by'). For
- MS-DOS, the low order byte is the MS-DOS directory
- attribute byte. If input came from standard input, this
- field is set to zero.
-
- 4.4.16 relative offset of local header: (4 bytes)
-
- This is the offset from the start of the first disk on
- which this file appears, to where the local header should
- be found. If an archive is in ZIP64 format and the value
- in this field is 0xFFFFFFFF, the offset will be in the
- corresponding 8 byte zip64 extended information extra field.
-
- 4.4.17 file name: (Variable)
-
- 4.4.17.1 The name of the file, with optional relative path.
- The path stored MUST NOT contain a drive or
- device letter, or a leading slash. All slashes
- MUST be forward slashes '/' as opposed to
- backwards slashes '\' for compatibility with Amiga
- and UNIX file systems etc. If input came from standard
- input, there is no file name field.
-
- 4.4.17.2 If using the Central Directory Encryption Feature and
- general purpose bit flag 13 is set indicating masking, the file
- name stored in the Local Header will not be the actual file name.
- A masking value consisting of a unique hexadecimal value will
- be stored. This value will be sequentially incremented for each
- file in the archive. See the section on the Strong Encryption
- Specification for details on retrieving the encrypted file name.
- Refer to the section in this document entitled "Incorporating PKWARE
- Proprietary Technology into Your Product" for more information.
-
-
- 4.4.18 file comment: (Variable)
-
- The comment for this file.
-
- 4.4.19 number of this disk: (2 bytes)
-
- The number of this disk, which contains the end of central
- directory record. If an archive is in ZIP64 format
- and the value in this field is 0xFFFF, the disk number will
- be in the corresponding 4 byte zip64 end of central
- directory field.
-
-
- 4.4.20 number of the disk with the start of the central
- directory: (2 bytes)
-
- The number of the disk on which the central
- directory starts. If an archive is in ZIP64 format
- and the value in this field is 0xFFFF, the disk number will
- be in the corresponding 4 byte zip64 end of central
- directory field.
-
- 4.4.21 total number of entries in the central dir on
- this disk: (2 bytes)
-
- The number of central directory entries on this disk.
- If an archive is in ZIP64 format and the value in
- this field is 0xFFFF, the count will be in the
- corresponding 8 byte zip64 end of central
- directory field.
-
- 4.4.22 total number of entries in the central dir: (2 bytes)
-
- The total number of files in the .ZIP file. If an
- archive is in ZIP64 format and the value in this field
- is 0xFFFF, the count will be in the corresponding 8 byte
- zip64 end of central directory field.
-
- 4.4.23 size of the central directory: (4 bytes)
-
- The size (in bytes) of the entire central directory.
- If an archive is in ZIP64 format and the value in
- this field is 0xFFFFFFFF, the size will be in the
- corresponding 8 byte zip64 end of central
- directory field.
-
- 4.4.24 offset of start of central directory with respect to
- the starting disk number: (4 bytes)
-
- Offset of the start of the central directory on the
- disk on which the central directory starts. If an
- archive is in ZIP64 format and the value in this
- field is 0xFFFFFFFF, the offset will be in the
- corresponding 8 byte zip64 end of central
- directory field.
-
- 4.4.25 .ZIP file comment length: (2 bytes)
-
- The length of the comment for this .ZIP file.
-
- 4.4.26 .ZIP file comment: (Variable)
-
- The comment for this .ZIP file. ZIP file comment data
- is stored unsecured. No encryption or data authentication
- is applied to this area at this time. Confidential information
- should not be stored in this section.
-
- 4.4.27 zip64 extensible data sector (variable size)
-
- (currently reserved for use by PKWARE)
-
-
- 4.4.28 extra field: (Variable)
-
- This SHOULD be used for storage expansion. If additional
- information needs to be stored within a ZIP file for special
- application or platform needs, it SHOULD be stored here.
- Programs supporting earlier versions of this specification can
- then safely skip the file, and find the next file or header.
- This field will be 0 length in version 1.0.
-
- Existing extra fields are defined in the section
- Extensible data fields that follows.
-
-4.5 Extensible data fields
---------------------------
-
- 4.5.1 In order to allow different programs and different types
- of information to be stored in the 'extra' field in .ZIP
- files, the following structure MUST be used for all
- programs storing data in this field:
-
- header1+data1 + header2+data2 . . .
-
- Each header should consist of:
-
- Header ID - 2 bytes
- Data Size - 2 bytes
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- The Header ID field indicates the type of data that is in
- the following data block.
-
- Header IDs of 0 thru 31 are reserved for use by PKWARE.
- The remaining IDs can be used by third party vendors for
- proprietary usage.
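
Walking the header+data sequence described in 4.5.1 might look like the sketch below. The generator name is hypothetical, and raising on a truncated block is a design choice of this example rather than something the text mandates.

```python
import struct


def iter_extra_fields(extra):
    """Yield (header_id, data) for each block in an 'extra' field."""
    offset = 0
    while offset + 4 <= len(extra):
        # Header ID and Data Size, both 2 bytes, low byte first.
        header_id, size = struct.unpack_from("<HH", extra, offset)
        offset += 4
        data = extra[offset:offset + size]
        if len(data) != size:
            raise ValueError("truncated extra field block 0x%04x" % header_id)
        yield header_id, data
        offset += size
```

Blocks with an unrecognized Header ID can simply be skipped by the caller, since the Data Size field gives the distance to the next header.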
-
- 4.5.2 The current Header ID mappings defined by PKWARE are:
-
- 0x0001 Zip64 extended information extra field
- 0x0007 AV Info
- 0x0008 Reserved for extended language encoding data (PFS)
- (see APPENDIX D)
- 0x0009 OS/2
- 0x000a NTFS
- 0x000c OpenVMS
- 0x000d UNIX
- 0x000e Reserved for file stream and fork descriptors
- 0x000f Patch Descriptor
- 0x0014 PKCS#7 Store for X.509 Certificates
- 0x0015 X.509 Certificate ID and Signature for
- individual file
- 0x0016 X.509 Certificate ID for Central Directory
- 0x0017 Strong Encryption Header
- 0x0018 Record Management Controls
- 0x0019 PKCS#7 Encryption Recipient Certificate List
- 0x0065 IBM S/390 (Z390), AS/400 (I400) attributes
- - uncompressed
- 0x0066 Reserved for IBM S/390 (Z390), AS/400 (I400)
- attributes - compressed
- 0x4690 POSZIP 4690 (reserved)
-
-
- 4.5.3 -Zip64 Extended Information Extra Field (0x0001):
-
- The following is the layout of the zip64 extended
- information "extra" block. If one of the size or
- offset fields in the Local or Central directory
- record is too small to hold the required data,
- a Zip64 extended information record is created.
- The order of the fields in the zip64 extended
- information record is fixed, but the fields MUST
- only appear if the corresponding Local or Central
- directory record field is set to 0xFFFF or 0xFFFFFFFF.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(ZIP64) 0x0001 2 bytes Tag for this "extra" block type
- Size 2 bytes Size of this "extra" block
- Original
- Size 8 bytes Original uncompressed file size
- Compressed
- Size 8 bytes Size of compressed data
- Relative Header
- Offset 8 bytes Offset of local header record
- Disk Start
- Number 4 bytes Number of the disk on which
- this file starts
-
- This entry in the Local header MUST include BOTH original
- and compressed file size fields. If encrypting the
- central directory and bit 13 of the general purpose bit
- flag is set indicating masking, the value stored in the
- Local Header for the original file size will be zero.
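
For a central directory record, the rule that fields appear in fixed order, and only when the corresponding header field holds 0xFFFF or 0xFFFFFFFF, can be sketched as follows. The function name, parameters, and dictionary keys are hypothetical; `data` is the block payload after the 4-byte tag and size header. Local headers differ in that both size fields must be present together.

```python
import struct

# (key, struct format, value that signals "see the zip64 block"), in the
# fixed order given by the layout table in 4.5.3.
ZIP64_FIELDS = (
    ("uncompressed_size", "<Q", 0xFFFFFFFF),
    ("compressed_size", "<Q", 0xFFFFFFFF),
    ("local_header_offset", "<Q", 0xFFFFFFFF),
    ("disk_start", "<I", 0xFFFF),
)


def apply_zip64_extra(data, header):
    """Return a copy of header with saturated fields replaced from data."""
    out = dict(header)
    pos = 0
    for key, fmt, sentinel in ZIP64_FIELDS:
        if out[key] == sentinel:
            (out[key],) = struct.unpack_from(fmt, data, pos)
            pos += struct.calcsize(fmt)
    return out
```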
-
-
- 4.5.4 -OS/2 Extra Field (0x0009):
-
- The following is the layout of the OS/2 attributes "extra"
- block. (Last Revision 09/05/95)
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(OS/2) 0x0009 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size for the following data block
- BSize 4 bytes Uncompressed Block Size
- CType 2 bytes Compression type
- EACRC 4 bytes CRC value for uncompressed block
- (var) variable Compressed block
-
- The OS/2 extended attribute structure (FEA2LIST) is
- compressed and then stored in its entirety within this
- structure. There will only ever be one "block" of data in
- VarFields[].
-
- 4.5.5 -NTFS Extra Field (0x000a):
-
- The following is the layout of the NTFS attributes
- "extra" block. (Note: At this time the Mtime, Atime
- and Ctime values MAY be used on any WIN32 system.)
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(NTFS) 0x000a 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of the total "extra" block
- Reserved 4 bytes Reserved for future use
- Tag1 2 bytes NTFS attribute tag value #1
- Size1 2 bytes Size of attribute #1, in bytes
- (var) Size1 Attribute #1 data
- .
- .
- .
- TagN 2 bytes NTFS attribute tag value #N
- SizeN 2 bytes Size of attribute #N, in bytes
- (var) SizeN Attribute #N data
-
- For NTFS, values for Tag1 through TagN are as follows:
- (currently only one set of attributes is defined for NTFS)
-
- Tag Size Description
- ----- ---- -----------
- 0x0001 2 bytes Tag for attribute #1
- Size1 2 bytes Size of attribute #1, in bytes
- Mtime 8 bytes File last modification time
- Atime 8 bytes File last access time
- Ctime 8 bytes File creation time
-
- 4.5.6 -OpenVMS Extra Field (0x000c):
-
- The following is the layout of the OpenVMS attributes
- "extra" block.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (VMS) 0x000c 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of the total "extra" block
- CRC 4 bytes 32-bit CRC for remainder of the block
- Tag1 2 bytes OpenVMS attribute tag value #1
- Size1 2 bytes Size of attribute #1, in bytes
- (var) Size1 Attribute #1 data
- .
- .
- .
- TagN 2 bytes OpenVMS attribute tag value #N
- SizeN 2 bytes Size of attribute #N, in bytes
- (var) SizeN Attribute #N data
-
- OpenVMS Extra Field Rules:
-
- 4.5.6.1. There will be one or more attributes present, which
- will each be preceded by the above TagX & SizeX values.
- These values are identical to the ATR$C_XXXX and ATR$S_XXXX
- constants which are defined in ATR.H under OpenVMS C. Neither
- of these values will ever be zero.
-
- 4.5.6.2. No word alignment or padding is performed.
-
- 4.5.6.3. A well-behaved PKZIP/OpenVMS program should never produce
- more than one sub-block with the same TagX value. Also, there will
- never be more than one "extra" block of type 0x000c in a particular
- directory record.
-
- 4.5.7 -UNIX Extra Field (0x000d):
-
- The following is the layout of the UNIX "extra" block.
- Note: all fields are stored in Intel low-byte/high-byte
- order.
-
- Value Size Description
- ----- ---- -----------
-(UNIX) 0x000d 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size for the following data block
- Atime 4 bytes File last access time
- Mtime 4 bytes File last modification time
- Uid 2 bytes File user ID
- Gid 2 bytes File group ID
- (var) variable Variable length data field
-
- The variable length data field will contain file type
- specific data. Currently the only values allowed are
- the original "linked to" file names for hard or symbolic
- links, and the major and minor device node numbers for
- character and block device nodes. Since device nodes
- cannot be either symbolic or hard links, only one set of
- variable length data is stored. Link files will have the
- name of the original file stored. This name is NOT NULL
- terminated. Its size can be determined by checking TSize -
- 12. Device entries will have eight bytes stored as two 4
- byte entries (in little endian format). The first entry
- will be the major device number, and the second the minor
- device number.
-
- 4.5.8 -PATCH Descriptor Extra Field (0x000f):
-
- 4.5.8.1 The following is the layout of the Patch Descriptor
- "extra" block.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(Patch) 0x000f 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of the total "extra" block
- Version 2 bytes Version of the descriptor
- Flags 4 bytes Actions and reactions (see below)
- OldSize 4 bytes Size of the file about to be patched
- OldCRC 4 bytes 32-bit CRC of the file to be patched
- NewSize 4 bytes Size of the resulting file
- NewCRC 4 bytes 32-bit CRC of the resulting file
-
- 4.5.8.2 Actions and reactions
-
- Bits Description
- ---- ----------------
- 0 Use for auto detection
- 1 Treat as a self-patch
- 2-3 RESERVED
- 4-5 Action (see below)
- 6-7 RESERVED
- 8-9 Reaction (see below) to absent file
- 10-11 Reaction (see below) to newer file
- 12-13 Reaction (see below) to unknown file
- 14-15 RESERVED
- 16-31 RESERVED
-
- 4.5.8.2.1 Actions
-
- Action Value
- ------ -----
- none 0
- add 1
- delete 2
- patch 3
-
- 4.5.8.2.2 Reactions
-
- Reaction Value
- -------- -----
- ask 0
- skip 1
- ignore 2
- fail 3
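
As an illustration of how the Flags bit fields and the Action and Reaction tables fit together, the sketch below splits a 32-bit Flags value into its documented parts. The function name and the dictionary keys are hypothetical; reserved bits are simply ignored.

```python
ACTIONS = ("none", "add", "delete", "patch")
REACTIONS = ("ask", "skip", "ignore", "fail")

def decode_patch_flags(flags: int) -> dict:
    """Split the Patch Descriptor Flags field into named parts."""
    return {
        "auto_detect": bool(flags & 0x1),             # bit 0
        "self_patch": bool((flags >> 1) & 0x1),       # bit 1
        "action": ACTIONS[(flags >> 4) & 0x3],        # bits 4-5
        "absent": REACTIONS[(flags >> 8) & 0x3],      # bits 8-9
        "newer": REACTIONS[(flags >> 10) & 0x3],      # bits 10-11
        "unknown": REACTIONS[(flags >> 12) & 0x3],    # bits 12-13
    }
```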
-
- 4.5.8.3 Patch support is provided by PKPatchMaker(tm) technology
- and is covered under U.S. Patents and Patents Pending. The use or
- implementation in a product of certain technological aspects set
- forth in the current APPNOTE, including those with regard to
- strong encryption or patching requires a license from PKWARE.
- Refer to the section in this document entitled "Incorporating
- PKWARE Proprietary Technology into Your Product" for more
- information.
-
- 4.5.9 -PKCS#7 Store for X.509 Certificates (0x0014):
-
- This field MUST contain information about each of the certificates
- that files may be signed with. When the Central Directory Encryption
- feature is enabled for a ZIP file, this record will appear in
- the Archive Extra Data Record, otherwise it will appear in the
- first central directory record and will be ignored in any
- other record.
-
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(Store) 0x0014 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of the store data
- TData TSize Data about the store
-
-
- 4.5.10 -X.509 Certificate ID and Signature for individual file (0x0015):
-
- This field contains the information about which certificate in
- the PKCS#7 store was used to sign a particular file. It also
- contains the signature data. This field can appear multiple
- times, but can only appear once per certificate.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(CID) 0x0015 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of data that follows
- TData TSize Signature Data
-
- 4.5.11 -X.509 Certificate ID and Signature for central directory (0x0016):
-
- This field contains the information about which certificate in
- the PKCS#7 store was used to sign the central directory structure.
- When the Central Directory Encryption feature is enabled for a
- ZIP file, this record will appear in the Archive Extra Data Record,
- otherwise it will appear in the first central directory record.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(CDID) 0x0016 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of data that follows
- TData TSize Data
-
- 4.5.12 -Strong Encryption Header (0x0017):
-
- Value Size Description
- ----- ---- -----------
- 0x0017 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of data that follows
- Format 2 bytes Format definition for this record
- AlgID 2 bytes Encryption algorithm identifier
- Bitlen 2 bytes Bit length of encryption key
- Flags 2 bytes Processing flags
- CertData TSize-8 Certificate decryption extra field data
- (refer to the explanation for CertData
- in the section describing the
- Certificate Processing Method under
- the Strong Encryption Specification)
-
- See the section describing the Strong Encryption Specification
- for details. Refer to the section in this document entitled
- "Incorporating PKWARE Proprietary Technology into Your Product"
- for more information.
-
- 4.5.13 -Record Management Controls (0x0018):
-
- Value Size Description
- ----- ---- -----------
-(Rec-CTL) 0x0018 2 bytes Tag for this "extra" block type
- CSize 2 bytes Size of total extra block data
- Tag1 2 bytes Record control attribute 1
- Size1 2 bytes Size of attribute 1, in bytes
- Data1 Size1 Attribute 1 data
- .
- .
- .
- TagN 2 bytes Record control attribute N
- SizeN 2 bytes Size of attribute N, in bytes
- DataN SizeN Attribute N data
-
-
- 4.5.14 -PKCS#7 Encryption Recipient Certificate List (0x0019):
-
- This field MAY contain information about each of the certificates
- used in encryption processing and it can be used to identify who is
- allowed to decrypt encrypted files. This field should only appear
- in the archive extra data record. This field is not required and
- serves only to aid archive modifications by preserving public
- encryption key data. Individual security requirements may dictate
- that this data be omitted to deter information exposure.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
-(CStore) 0x0019 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size of the store data
- TData TSize Data about the store
-
- TData:
-
- Value Size Description
- ----- ---- -----------
- Version 2 bytes Format version number - must be 0x0001 at this time
- CStore (var) PKCS#7 data blob
-
- See the section describing the Strong Encryption Specification
- for details. Refer to the section in this document entitled
- "Incorporating PKWARE Proprietary Technology into Your Product"
- for more information.
-
- 4.5.15 -MVS Extra Field (0x0065):
-
- The following is the layout of the MVS "extra" block.
- Note: Some fields are stored in Big Endian format.
- All text is in EBCDIC format unless otherwise specified.
-
- Value Size Description
- ----- ---- -----------
-(MVS) 0x0065 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size for the following data block
- ID 4 bytes EBCDIC "Z390" 0xE9F3F9F0 or
- "T4MV" for TargetFour
- (var) TSize-4 Attribute data (see APPENDIX B)
-
-
- 4.5.16 -OS/400 Extra Field (0x0065):
-
- The following is the layout of the OS/400 "extra" block.
- Note: Some fields are stored in Big Endian format.
- All text is in EBCDIC format unless otherwise specified.
-
- Value Size Description
- ----- ---- -----------
-(OS400) 0x0065 2 bytes Tag for this "extra" block type
- TSize 2 bytes Size for the following data block
- ID 4 bytes EBCDIC "I400" 0xC9F4F0F0 or
- "T4MV" for TargetFour
- (var) TSize-4 Attribute data (see APPENDIX A)
-
-4.6 Third Party Mappings
-------------------------
-
- 4.6.1 Third party mappings commonly used are:
-
- 0x07c8 Macintosh
- 0x2605 ZipIt Macintosh
- 0x2705 ZipIt Macintosh 1.3.5+
- 0x2805 ZipIt Macintosh 1.3.5+
- 0x334d Info-ZIP Macintosh
- 0x4341 Acorn/SparkFS
- 0x4453 Windows NT security descriptor (binary ACL)
- 0x4704 VM/CMS
- 0x470f MVS
- 0x4b46 FWKCS MD5 (see below)
- 0x4c41 OS/2 access control list (text ACL)
- 0x4d49 Info-ZIP OpenVMS
- 0x4f4c Xceed original location extra field
- 0x5356 AOS/VS (ACL)
- 0x5455 extended timestamp
- 0x554e Xceed unicode extra field
- 0x5855 Info-ZIP UNIX (original, also OS/2, NT, etc)
- 0x6375 Info-ZIP Unicode Comment Extra Field
- 0x6542 BeOS/BeBox
- 0x7075 Info-ZIP Unicode Path Extra Field
- 0x756e ASi UNIX
- 0x7855 Info-ZIP UNIX (new)
- 0xa220 Microsoft Open Packaging Growth Hint
- 0xfd4a SMS/QDOS
-
- Detailed descriptions of Extra Fields defined by third
- party mappings will be documented as information on
- these data structures is made available to PKWARE.
- PKWARE does not guarantee the accuracy of any published
- third party data.
-
- 4.6.2 Third-party Extra Fields must include a Header ID using
- the format defined in the section of this document
- titled Extensible Data Fields (section 4.5).
-
- The Data Size field indicates the size of the following
- data block. Programs can use this value to skip to the
- next header block, passing over any data blocks that are
- not of interest.
-
- Note: As stated above, the size of the entire .ZIP file
- header, including the file name, comment, and extra
- field should not exceed 64K in size.
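
The skipping behaviour described in 4.6.2 can be sketched as a small generator that walks an extra field and yields each block in turn. This is an illustrative sketch under the assumption that the whole extra field is already in memory; the name `iter_extra_fields` is hypothetical.

```python
import struct

def iter_extra_fields(extra: bytes):
    """Yield (header_id, data) for each block in an extra field."""
    pos = 0
    while pos + 4 <= len(extra):
        # Header ID and Data Size are 2 bytes each, low byte first
        header_id, size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        if pos + size > len(extra):
            raise ValueError("extra field block overruns the extra field")
        yield header_id, extra[pos:pos + size]
        pos += size  # Data Size lets us step over blocks we do not understand
```

A reader interested only in, say, 0x7075 can loop over the pairs and ignore every other Header ID without knowing its layout.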
-
- 4.6.3 In case two different programs should appropriate the same
- Header ID value, each program SHOULD place a unique
- signature of at least two bytes in
- size (and preferably 4 bytes or bigger) at the start of
- each data area. Every program SHOULD verify that its
- unique signature is present, in addition to the Header ID
- value being correct, before assuming that it is a block of
- known type.
-
- Third-party Mappings:
-
- 4.6.4 -ZipIt Macintosh Extra Field (long) (0x2605):
-
- The following is the layout of the ZipIt extra block
- for Macintosh. The local-header and central-header versions
- are identical. This block must be present if the file is
- stored MacBinary-encoded and it should not be used if the file
- is not stored MacBinary-encoded.
-
- Value Size Description
- ----- ---- -----------
- (Mac2) 0x2605 Short tag for this extra block type
- TSize Short total data size for this block
- "ZPIT" beLong extra-field signature
- FnLen Byte length of FileName
- FileName variable full Macintosh filename
- FileType Byte[4] four-byte Mac file type string
- Creator Byte[4] four-byte Mac creator string
-
-
- 4.6.5 -ZipIt Macintosh Extra Field (short, for files) (0x2705):
-
- The following is the layout of a shortened variant of the
- ZipIt extra block for Macintosh (without "full name" entry).
- This variant is used by ZipIt 1.3.5 and newer for entries of
- files (not directories) that do not have a MacBinary encoded
- file. The local-header and central-header versions are identical.
-
- Value Size Description
- ----- ---- -----------
- (Mac2b) 0x2705 Short tag for this extra block type
- TSize Short total data size for this block (12)
- "ZPIT" beLong extra-field signature
- FileType Byte[4] four-byte Mac file type string
- Creator Byte[4] four-byte Mac creator string
- fdFlags beShort attributes from FInfo.fdFlags,
- may be omitted
- 0x0000 beShort reserved, may be omitted
-
-
- 4.6.6 -ZipIt Macintosh Extra Field (short, for directories) (0x2805):
-
- The following is the layout of a shortened variant of the
- ZipIt extra block for Macintosh used only for directory
- entries. This variant is used by ZipIt 1.3.5 and newer to
- save some optional Mac-specific information about directories.
- The local-header and central-header versions are identical.
-
- Value Size Description
- ----- ---- -----------
- (Mac2c) 0x2805 Short tag for this extra block type
- TSize Short total data size for this block (12)
- "ZPIT" beLong extra-field signature
- frFlags beShort attributes from DInfo.frFlags, may
- be omitted
- View beShort ZipIt view flag, may be omitted
-
-
- The View field specifies ZipIt-internal settings as follows:
-
- Bits of the Flags:
- bit 0 if set, the folder is shown expanded (open)
- when the archive contents are viewed in ZipIt.
- bits 1-15 reserved, zero;
-
-
- 4.6.7 -FWKCS MD5 Extra Field (0x4b46):
-
- The FWKCS Contents_Signature System, used in
- automatically identifying files independent of file name,
- optionally adds and uses an extra field to support the
- rapid creation of an enhanced contents_signature:
-
- Header ID = 0x4b46
- Data Size = 0x0013
- Preface = 'M','D','5'
- followed by 16 bytes containing the uncompressed file's
- 128_bit MD5 hash(1), low byte first.
-
- When FWKCS revises a .ZIP file central directory to add
- this extra field for a file, it also replaces the
- central directory entry for that file's uncompressed
- file length with a measured value.
-
- FWKCS provides an option to strip this extra field, if
- present, from a .ZIP file central directory. In adding
- this extra field, FWKCS preserves .ZIP file Authenticity
- Verification; if stripping this extra field, FWKCS
- preserves all versions of AV through PKZIP version 2.04g.
-
- FWKCS, and FWKCS Contents_Signature System, are
- trademarks of Frederick W. Kantor.
-
- (1) R. Rivest, RFC1321.TXT, MIT Laboratory for Computer
- Science and RSA Data Security, Inc., April 1992.
- ll.76-77: "The MD5 algorithm is being placed in the
- public domain for review and possible adoption as a
- standard."
-
-
- 4.6.8 -Info-ZIP Unicode Comment Extra Field (0x6375):
-
- Stores the UTF-8 version of the file comment as stored in the
- central directory header. (Last Revision 20070912)
-
- Value Size Description
- ----- ---- -----------
- (UCom) 0x6375 Short tag for this extra block type ("uc")
- TSize Short total data size for this block
- Version 1 byte version of this extra field, currently 1
- ComCRC32 4 bytes Comment Field CRC32 Checksum
- UnicodeCom Variable UTF-8 version of the entry comment
-
- Currently Version is set to the number 1. If there is a need
- to change this field, the version will be incremented. Changes
- may not be backward compatible so this extra field should not be
- used if the version is not recognized.
-
- The ComCRC32 is the standard zip CRC32 checksum of the File Comment
- field in the central directory header. This is used to verify that
- the comment field has not changed since the Unicode Comment extra field
- was created. This can happen if a utility changes the File Comment
- field but does not update the UTF-8 Comment extra field. If the CRC
- check fails, this Unicode Comment extra field should be ignored and
- the File Comment field in the header should be used instead.
-
- The UnicodeCom field is the UTF-8 version of the File Comment field
- in the header. As UnicodeCom is defined to be UTF-8, no UTF-8 byte
- order mark (BOM) is used. The length of this field is determined by
- subtracting the size of the previous fields from TSize. If both the
- File Name and Comment fields are UTF-8, the new General Purpose Bit
- Flag, bit 11 (Language encoding flag (EFS)), can be used to indicate
- both the header File Name and Comment fields are UTF-8 and, in this
- case, the Unicode Path and Unicode Comment extra fields are not
- needed and should not be created. Note that, for backward
- compatibility, bit 11 should only be used if the native character set
- of the paths and comments being zipped up are already in UTF-8. It is
- expected that the same file comment storage method, either general
- purpose bit 11 or extra fields, be used in both the Local and Central
- Directory Header for a file.
-
-
- 4.6.9 -Info-ZIP Unicode Path Extra Field (0x7075):
-
- Stores the UTF-8 version of the file name field as stored in the
- local header and central directory header. (Last Revision 20070912)
-
- Value Size Description
- ----- ---- -----------
- (UPath) 0x7075 Short tag for this extra block type ("up")
- TSize Short total data size for this block
- Version 1 byte version of this extra field, currently 1
- NameCRC32 4 bytes File Name Field CRC32 Checksum
- UnicodeName Variable UTF-8 version of the entry File Name
-
- Currently Version is set to the number 1. If there is a need
- to change this field, the version will be incremented. Changes
- may not be backward compatible so this extra field should not be
- used if the version is not recognized.
-
- The NameCRC32 is the standard zip CRC32 checksum of the File Name
- field in the header. This is used to verify that the header
- File Name field has not changed since the Unicode Path extra field
- was created. This can happen if a utility renames the File Name but
- does not update the UTF-8 path extra field. If the CRC check fails,
- this UTF-8 Path Extra Field should be ignored and the File Name field
- in the header should be used instead.
-
- The UnicodeName is the UTF-8 version of the contents of the File Name
- field in the header. As UnicodeName is defined to be UTF-8, no UTF-8
- byte order mark (BOM) is used. The length of this field is determined
- by subtracting the size of the previous fields from TSize. If both
- the File Name and Comment fields are UTF-8, the new General Purpose
- Bit Flag, bit 11 (Language encoding flag (EFS)), can be used to
- indicate that both the header File Name and Comment fields are UTF-8
- and, in this case, the Unicode Path and Unicode Comment extra fields
- are not needed and should not be created. Note that, for backward
- compatibility, bit 11 should only be used if the native character set
- of the paths and comments being zipped up are already in UTF-8. It is
- expected that the same file name storage method, either general
- purpose bit 11 or extra fields, be used in both the Local and Central
- Directory Header for a file.
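
The version and CRC checks described for the Unicode Path extra field can be sketched as follows. This is an assumption-laden illustration, not a reference implementation: `unicode_path` is a hypothetical name, `data` is the block contents after the 4-byte tag/size header, and `header_name` is the raw File Name bytes from the header. Python's `zlib.crc32` computes the standard zip CRC32.

```python
import struct
import zlib

def unicode_path(data: bytes, header_name: bytes):
    """Return the UTF-8 name from a 0x7075 block, or None to fall back."""
    if len(data) < 5:
        return None
    version, name_crc = struct.unpack_from("<BI", data, 0)
    if version != 1:
        return None  # unrecognized version: do not use the field
    if zlib.crc32(header_name) & 0xFFFFFFFF != name_crc:
        return None  # header File Name changed since the field was written
    # UnicodeName length is TSize minus the 5 bytes of Version and NameCRC32
    return data[5:].decode("utf-8")
```

When the function returns None, the caller would use the File Name field from the header instead. The Unicode Comment extra field (0x6375) has the same shape, with the CRC taken over the File Comment field.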
-
-
- 4.6.10 -Microsoft Open Packaging Growth Hint (0xa220):
-
- Value Size Description
- ----- ---- -----------
- 0xa220 Short tag for this extra block type
- TSize Short size of Sig + PadVal + Padding
- Sig Short verification signature (A028)
- PadVal Short Initial padding value
- Padding variable filled with NULL characters
-
-4.7 Manifest Files
-------------------
-
- 4.7.1 Applications using ZIP files may have a need for additional
- information that must be included with the files placed into
- a ZIP file. Application specific information that cannot be
- stored using the defined ZIP storage records SHOULD be stored
- using the extensible Extra Field convention defined in this
- document. However, some applications may use a manifest
- file as a means for storing additional information. One
- example is the META-INF/MANIFEST.MF file used in ZIP formatted
- files having the .JAR extension (JAR files).
-
- 4.7.2 A manifest file is a file created for the application process
- that requires this information. A manifest file MAY be of any
- file type required by the defining application process. It is
- placed within the same ZIP file as files to which this information
- applies. By convention, this file is typically the first file placed
- into the ZIP file and it may include a defined directory path.
-
- 4.7.3 Manifest files may be compressed or encrypted as needed for
- application processing of the files inside the ZIP files.
-
- Manifest files are outside of the scope of this specification.
-
-
-5.0 Explanation of compression methods
---------------------------------------
-
-
-5.1 UnShrinking - Method 1
---------------------------
-
- 5.1.1 Shrinking is a Dynamic Ziv-Lempel-Welch compression algorithm
- with partial clearing. The initial code size is 9 bits, and the
- maximum code size is 13 bits. Shrinking differs from conventional
- Dynamic Ziv-Lempel-Welch implementations in several respects:
-
- 5.1.2 The code size is controlled by the compressor, and is
- not automatically increased when codes larger than the current
- code size are created (but not necessarily used). When
- the decompressor encounters the code sequence 256
- (decimal) followed by 1, it should increase the code size
- read from the input stream to the next bit size. No
- blocking of the codes is performed, so the next code at
- the increased size should be read from the input stream
- immediately after where the previous code at the smaller
- bit size was read. Again, the decompressor should not
- increase the code size used until the sequence 256,1 is
- encountered.
-
- 5.1.3 When the table becomes full, total clearing is not
- performed. Rather, when the compressor emits the code
- sequence 256,2 (decimal), the decompressor should clear
- all leaf nodes from the Ziv-Lempel tree, and continue to
- use the current code size. The nodes that are cleared
- from the Ziv-Lempel tree are then re-used, with the lowest
- code value re-used first, and the highest code value
- re-used last. The compressor can emit the sequence 256,2
- at any time.
-
-5.2 Expanding - Methods 2-5
----------------------------
-
- 5.2.1 The Reducing algorithm is actually a combination of two
- distinct algorithms. The first algorithm compresses repeated
- byte sequences, and the second algorithm takes the compressed
- stream from the first algorithm and applies a probabilistic
- compression method.
-
- 5.2.2 The probabilistic compression stores an array of 'follower
- sets' S(j), for j=0 to 255, corresponding to each possible
- ASCII character. Each set contains between 0 and 32
- characters, to be denoted as S(j)[0],...,S(j)[m], where m<32.
- The sets are stored at the beginning of the data area for a
- Reduced file, in reverse order, with S(255) first, and S(0)
- last.
-
- 5.2.3 The sets are encoded as { N(j), S(j)[0],...,S(j)[N(j)-1] },
- where N(j) is the size of set S(j). N(j) can be 0, in which
- case the follower set for S(j) is empty. Each N(j) value is
- encoded in 6 bits, followed by N(j) eight bit character values
- corresponding to S(j)[0] to S(j)[N(j)-1] respectively. If
- N(j) is 0, then no values for S(j) are stored, and the value
- for N(j-1) immediately follows.
-
- 5.2.4 The compressed data stream immediately follows the follower
- sets. It can be interpreted for the probabilistic decompression
- as follows:
-
- let Last-Character <- 0.
- loop until done
- if the follower set S(Last-Character) is empty then
- read 8 bits from the input stream, and copy this
- value to the output stream.
- otherwise if the follower set S(Last-Character) is non-empty then
- read 1 bit from the input stream.
- if this bit is not zero then
- read 8 bits from the input stream, and copy this
- value to the output stream.
- otherwise if this bit is zero then
- read B(N(Last-Character)) bits from the input
- stream, and assign this value to I.
- Copy the value of S(Last-Character)[I] to the
- output stream.
-
- assign the last value placed on the output stream to
- Last-Character.
- end loop
-
- B(N(j)) is defined as the minimal number of bits required to
- encode the value N(j)-1.
-
- 5.2.5 The decompressed stream from above can then be expanded to
- re-create the original file as follows:
-
- let State <- 0.
-
- loop until done
- read 8 bits from the input stream into C.
- case State of
- 0: if C is not equal to DLE (144 decimal) then
- copy C to the output stream.
- otherwise if C is equal to DLE then
- let State <- 1.
-
- 1: if C is non-zero then
- let V <- C.
- let Len <- L(V)
- let State <- F(Len).
- otherwise if C is zero then
- copy the value 144 (decimal) to the output stream.
- let State <- 0
-
- 2: let Len <- Len + C
- let State <- 3.
-
- 3: move backwards D(V,C) bytes in the output stream
- (if this position is before the start of the output
- stream, then assume that all the data before the
- start of the output stream is filled with zeros).
- copy Len+3 bytes from this position to the output stream.
- let State <- 0.
- end case
- end loop
-
- The functions F,L, and D are dependent on the 'compression
- factor', 1 through 4, and are defined as follows:
-
- For compression factor 1:
- L(X) equals the lower 7 bits of X.
- F(X) equals 2 if X equals 127 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 1 bit of X) * 256 + Y + 1.
- For compression factor 2:
- L(X) equals the lower 6 bits of X.
- F(X) equals 2 if X equals 63 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 2 bits of X) * 256 + Y + 1.
- For compression factor 3:
- L(X) equals the lower 5 bits of X.
- F(X) equals 2 if X equals 31 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 3 bits of X) * 256 + Y + 1.
- For compression factor 4:
- L(X) equals the lower 4 bits of X.
- F(X) equals 2 if X equals 15 otherwise F(X) equals 3.
- D(X,Y) equals the (upper 4 bits of X) * 256 + Y + 1.
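
The state machine of 5.2.5 and the factor-dependent functions L, F and D can be sketched in a few lines. This is a minimal sketch, assuming the input is the already probabilistically decompressed byte stream; the name `expand_reduced` is hypothetical, and `factor` is the compression factor 1 through 4.

```python
DLE = 144

def expand_reduced(data: bytes, factor: int) -> bytes:
    """Expand the output of the probabilistic stage (section 5.2.5)."""
    low_bits = 8 - factor              # L(X) keeps the lower 8-factor bits
    low_mask = (1 << low_bits) - 1     # 127, 63, 31 or 15
    out = bytearray()
    state, v, length = 0, 0, 0
    for c in data:
        if state == 0:
            if c != DLE:
                out.append(c)
            else:
                state = 1
        elif state == 1:
            if c != 0:
                v = c
                length = v & low_mask                   # Len <- L(V)
                state = 2 if length == low_mask else 3  # State <- F(Len)
            else:
                out.append(DLE)                         # escaped literal 144
                state = 0
        elif state == 2:
            length += c
            state = 3
        else:  # state 3
            distance = (v >> low_bits) * 256 + c + 1    # D(V, C)
            start = len(out) - distance
            for i in range(length + 3):
                pos = start + i
                # data before the start of the output counts as zeros
                out.append(out[pos] if pos >= 0 else 0)
            state = 0
    return bytes(out)
```

The copy is done one byte at a time so that a match may overlap the bytes it is producing.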
-
-5.3 Imploding - Method 6
-------------------------
-
- 5.3.1 The Imploding algorithm is actually a combination of two
- distinct algorithms. The first algorithm compresses repeated byte
- sequences using a sliding dictionary. The second algorithm is
- used to compress the encoding of the sliding dictionary output,
- using multiple Shannon-Fano trees.
-
- 5.3.2 The Imploding algorithm can use a 4K or 8K sliding dictionary
- size. The dictionary size used can be determined by bit 1 in the
- general purpose flag word; a 0 bit indicates a 4K dictionary
- while a 1 bit indicates an 8K dictionary.
-
- 5.3.3 The Shannon-Fano trees are stored at the start of the
- compressed file. The number of trees stored is defined by bit 2 in
- the general purpose flag word; a 0 bit indicates two trees stored,
- a 1 bit indicates three trees are stored. If 3 trees are stored,
- the first Shannon-Fano tree represents the encoding of the
- Literal characters, the second tree represents the encoding of
- the Length information, the third represents the encoding of the
- Distance information. When 2 Shannon-Fano trees are stored, the
- Length tree is stored first, followed by the Distance tree.
-
- 5.3.4 The Literal Shannon-Fano tree, if present, is used to represent
- the entire ASCII character set, and contains 256 values. This
- tree is used to compress any data not compressed by the sliding
- dictionary algorithm. When this tree is present, the Minimum
- Match Length for the sliding dictionary is 3. If this tree is
- not present, the Minimum Match Length is 2.
-
- 5.3.5 The Length Shannon-Fano tree is used to compress the Length
- part of the (length,distance) pairs from the sliding dictionary
- output. The Length tree contains 64 values, ranging from the
- Minimum Match Length, to 63 plus the Minimum Match Length.
-
- 5.3.6 The Distance Shannon-Fano tree is used to compress the Distance
- part of the (length,distance) pairs from the sliding dictionary
- output. The Distance tree contains 64 values, ranging from 0 to
- 63, representing the upper 6 bits of the distance value. The
- distance values themselves will be between 0 and the sliding
- dictionary size, either 4K or 8K.
-
- 5.3.7 The Shannon-Fano trees themselves are stored in a compressed
- format. The first byte of the tree data represents the number of
- bytes of data representing the (compressed) Shannon-Fano tree
- minus 1. The remaining bytes represent the Shannon-Fano tree
- data encoded as:
-
- High 4 bits: Number of values at this bit length, stored minus 1. (1 - 16)
- Low 4 bits: Bit length needed to represent value, stored minus 1. (1 - 16)
-
- 5.3.8 The Shannon-Fano codes can be constructed from the bit lengths
- using the following algorithm:
-
- 1) Sort the Bit Lengths in ascending order, while retaining the
- order of the original lengths stored in the file.
-
- 2) Generate the Shannon-Fano trees:
-
- Code <- 0
- CodeIncrement <- 0
- LastBitLength <- 0
- i <- number of Shannon-Fano codes - 1 (either 255 or 63)
-
- loop while i >= 0
- Code = Code + CodeIncrement
- if BitLength(i) <> LastBitLength then
- LastBitLength=BitLength(i)
- CodeIncrement = 1 shifted left (16 - LastBitLength)
- ShannonCode(i) = Code
- i <- i - 1
- end loop
-
- 3) Reverse the order of all the bits in the above ShannonCode()
- vector, so that the most significant bit becomes the least
- significant bit. For example, the value 0x1234 (hex) would
- become 0x2C48 (hex).
-
- 4) Restore the order of Shannon-Fano codes as originally stored
- within the file.
-
- Example:
-
- This example will show the encoding of a Shannon-Fano tree
- of size 8. Notice that the actual Shannon-Fano trees used
- for Imploding are either 64 or 256 entries in size.
-
- Example: 0x02, 0x42, 0x01, 0x13
-
- The first byte (0x02) indicates that 3 bytes of tree data follow.
- Decoding the bytes:
- 0x42 = 5 codes, each 3 bits long
- 0x01 = 1 code, 2 bits long
- 0x13 = 2 codes, each 4 bits long
-
- This would generate the original bit length array of:
- (3, 3, 3, 3, 3, 2, 4, 4)
-
- There are 8 codes in this table for the values 0 thru 7. Using
- the algorithm to obtain the Shannon-Fano codes produces:
-
-                                       Reversed     Order     Original
-    Val  Sorted   Constructed Code      Value      Restored    Length
-    ---  ------   -----------------    --------    --------    ------
-    0:     2      1100000000000000         11        101          3
-    1:     3      1010000000000000        101        001          3
-    2:     3      1000000000000000        001        110          3
-    3:     3      0110000000000000        110        010          3
-    4:     3      0100000000000000        010        100          3
-    5:     3      0010000000000000        100         11          2
-    6:     4      0001000000000000       1000       1000          4
-    7:     4      0000000000000000       0000       0000          4
-
- The values in the Val, Order Restored and Original Length columns
- now represent the Shannon-Fano encoding tree that can be used for
- decoding the Shannon-Fano encoded data. How to parse the
- variable length Shannon-Fano values from the data stream is beyond
- the scope of this document. (See the references listed at the end of
- this document for more information.) However, traditional decoding
- schemes used for Huffman variable length decoding, such as the
- Greenlaw algorithm, can be successfully applied.
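
The four construction steps of 5.3.8 can be checked against the worked example with a short sketch. The function names are hypothetical, and the codes are returned as (reversed code value, bit length) pairs in the original file order.

```python
def read_bit_lengths(tree: bytes) -> list:
    """Expand compressed tree data into one bit length per value."""
    count = tree[0] + 1                      # first byte = data bytes minus 1
    lengths = []
    for b in tree[1:1 + count]:
        # high nibble + 1 = number of values, low nibble + 1 = bit length
        lengths += [(b & 0x0F) + 1] * ((b >> 4) + 1)
    return lengths

def reverse16(x: int) -> int:
    """Reverse the order of the 16 bits of x."""
    return int(format(x, "016b")[::-1], 2)

def shannon_fano_codes(lengths: list) -> list:
    """Return (code, bit_length) per value, with the code bit-reversed."""
    # Step 1: stable sort by bit length keeps the original relative order
    order = sorted(range(len(lengths)), key=lambda v: lengths[v])
    codes = [None] * len(lengths)
    code = increment = last = 0
    # Step 2: walk the sorted lengths from the last entry down to the first
    for i in reversed(range(len(order))):
        code += increment
        n = lengths[order[i]]
        if n != last:
            last = n
            increment = 1 << (16 - n)
        # Steps 3 and 4: reverse the bits, store under the original value
        codes[order[i]] = (reverse16(code), n)
    return codes
```

Applied to the example bytes 0x02, 0x42, 0x01, 0x13, the second element of each pair matches the Original Length column and the first element, written in that many bits, matches the Order Restored column.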
-
- 5.3.9 The compressed data stream begins immediately after the
- compressed Shannon-Fano data. The compressed data stream can be
- interpreted as follows:
-
- loop until done
- read 1 bit from input stream.
-
- if this bit is non-zero then (encoded data is literal data)
- if Literal Shannon-Fano tree is present
- read and decode character using Literal Shannon-Fano tree.
- otherwise
- read 8 bits from input stream.
- copy character to the output stream.
- otherwise (encoded data is sliding dictionary match)
- if 8K dictionary size
- read 7 bits for offset Distance (lower 7 bits of offset).
- otherwise
- read 6 bits for offset Distance (lower 6 bits of offset).
-
- using the Distance Shannon-Fano tree, read and decode the
- upper 6 bits of the Distance value.
-
- using the Length Shannon-Fano tree, read and decode
- the Length value.
-
- Length <- Length + Minimum Match Length
-
- if Length = 63 + Minimum Match Length
- read 8 bits from the input stream,
- add this value to Length.
-
- move backwards Distance+1 bytes in the output stream, and
- copy Length characters from this position to the output
- stream. (if this position is before the start of the output
- stream, then assume that all the data before the start of
- the output stream is filled with zeros).
- end loop
-
-5.4 Tokenizing - Method 7
--------------------------
-
- 5.4.1 This method is not used by PKZIP.
-
-5.5 Deflating - Method 8
-------------------------
-
- 5.5.1 The Deflate algorithm is similar to the Implode algorithm using
- a sliding dictionary of up to 32K with secondary compression
- from Huffman/Shannon-Fano codes.
-
- 5.5.2 The compressed data is stored in blocks with a header describing
- the block and the Huffman codes used in the data block. The header
- format is as follows:
-
- Bit 0: Last Block bit This bit is set to 1 if this is the last
- compressed block in the data.
- Bits 1-2: Block type
- 00 (0) - Block is stored - All stored data is byte aligned.
- Skip bits until next byte, then next word = block
- length, followed by the one's complement of the block
- length word. Remaining data in block is the stored
- data.
-
- 01 (1) - Use fixed Huffman codes for literal and distance codes.
- Lit Code Bits Dist Code Bits
- --------- ---- --------- ----
- 0 - 143 8 0 - 31 5
- 144 - 255 9
- 256 - 279 7
- 280 - 287 8
-
- Literal codes 286-287 and distance codes 30-31 are
- never used but participate in the Huffman construction.
-
- 10 (2) - Dynamic Huffman codes. (See expanding Huffman codes)
-
- 11 (3) - Reserved - Flag an "Error in compressed data" if seen.
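
For block type 01, the fixed literal bit lengths in the table above determine the codes once an assignment rule is chosen. The sketch below assumes the rule stated in 5.5.3 (codes assigned starting at the shortest bit length, with the shortest code all 0's) and implements it in the way commonly used for Deflate; both function names are hypothetical.

```python
def fixed_literal_lengths() -> list:
    """Bit lengths for literal codes 0-287 of a fixed Huffman block."""
    return [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8

def canonical_codes(lengths: list) -> list:
    """Assign codes in order of increasing bit length, then value."""
    max_len = max(lengths)
    count = [0] * (max_len + 1)
    for n in lengths:
        if n:                       # zero lengths take no part in the tree
            count[n] += 1
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + count[bits - 1]) << 1
        next_code[bits] = code      # first code of this bit length
    codes = []
    for n in lengths:
        if n:
            codes.append(next_code[n])
            next_code[n] += 1
        else:
            codes.append(None)
    return codes
```

With these lengths the 7-bit codes start at 0, the 8-bit codes at binary 00110000 and the 9-bit codes at binary 110010000.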
-
- 5.5.3 Expanding Huffman Codes
-
- If the data block is stored with dynamic Huffman codes, the Huffman
- codes are sent in the following compressed format:
-
- 5 Bits: # of Literal codes sent - 257 (257 - 286)
- All other codes are never sent.
- 5 Bits: # of Dist codes - 1 (1 - 32)
- 4 Bits: # of Bit Length codes - 3 (3 - 19)
-
- The Huffman codes are sent as bit lengths and the codes are built as
- described in the implode algorithm. The bit lengths themselves are
- compressed with Huffman codes. There are 19 bit length codes:
-
- 0 - 15: Represent bit lengths of 0 - 15
- 16: Copy the previous bit length 3 - 6 times.
- The next 2 bits indicate repeat length (0 = 3, ... ,3 = 6)
- Example: Codes 8, 16 (+2 bits 11), 16 (+2 bits 10) will
- expand to 12 bit lengths of 8 (1 + 6 + 5)
- 17: Repeat a bit length of 0 for 3 - 10 times. (3 bits of length)
- 18: Repeat a bit length of 0 for 11 - 138 times (7 bits of length)
-
- The lengths of the bit length codes are sent packed 3 bits per value
- (0 - 7) in the following order:
-
- 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15
-
- The Huffman codes should be built as described in the Implode algorithm
- except codes are assigned starting at the shortest bit length, i.e. the
- shortest code should be all 0's rather than all 1's. Also, codes with
- a bit length of zero do not participate in the tree construction. The
- codes are then used to decode the bit lengths for the literal and
- distance tables.
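-
- The code assignment described above can be sketched in Python as
- follows. This is an illustration only: it assumes that codes of equal
- length are assigned in increasing symbol order, as in RFC 1951, which
- this section does not state explicitly. The name build_codes is
- hypothetical, and codes are returned as bit strings for readability.
-
```python
def build_codes(lengths):
    """Map each symbol with a non-zero bit length to its code string."""
    max_len = max(lengths, default=0)

    # Count how many codes exist for each bit length (zero lengths skipped).
    bl_count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            bl_count[n] += 1

    # The first code of each length follows on from the shorter lengths,
    # so the shortest code is all 0's.
    next_code = [0] * (max_len + 2)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code

    codes = {}
    for symbol, n in enumerate(lengths):
        if n:
            codes[symbol] = format(next_code[n], "0{}b".format(n))
            next_code[n] += 1
    return codes
```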
-
- The bit lengths for the literal tables are sent first with the number
- of entries sent described by the 5 bits sent earlier. There are up
- to 286 literal characters; the first 256 represent the respective 8
- bit character, code 256 represents the End-Of-Block code, the remaining
- 29 codes represent copy lengths of 3 thru 258. There are up to 30
- distance codes representing distances from 1 thru 32k as described
- below.
-
- Length Codes
- ------------
-      Extra                 Extra                 Extra                 Extra
- Code Bits Length(s)   Code Bits Length(s)   Code Bits Length(s)   Code Bits Length(s)
- ---- ---- ---------   ---- ---- ---------   ---- ---- ---------   ---- ---- ---------
-  257   0  3            265   1  11,12        273   3  35-42        281   5  131-162
-  258   0  4            266   1  13,14        274   3  43-50        282   5  163-194
-  259   0  5            267   1  15,16        275   3  51-58        283   5  195-226
-  260   0  6            268   1  17,18        276   3  59-66        284   5  227-257
-  261   0  7            269   2  19-22        277   4  67-82        285   0  258
-  262   0  8            270   2  23-26        278   4  83-98
-  263   0  9            271   2  27-30        279   4  99-114
-  264   0  10           272   2  31-34        280   4  115-130
-
- Distance Codes
- --------------
-      Extra                Extra                Extra                 Extra
- Code Bits Distance   Code Bits Distance   Code Bits Distance    Code Bits Distance
- ---- ---- --------   ---- ---- --------   ---- ---- ---------   ---- ---- -----------
-    0   0  1             8   3  17-24        16   7  257-384       24  11  4097-6144
-    1   0  2             9   3  25-32        17   7  385-512       25  11  6145-8192
-    2   0  3            10   4  33-48        18   8  513-768       26  12  8193-12288
-    3   0  4            11   4  49-64        19   8  769-1024      27  12  12289-16384
-    4   1  5,6          12   5  65-96        20   9  1025-1536     28  13  16385-24576
-    5   1  7,8          13   5  97-128       21   9  1537-2048     29  13  24577-32768
-    6   2  9-12         14   6  129-192      22  10  2049-3072
-    7   2  13-16        15   6  193-256      23  10  3073-4096
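-
- Both tables follow a regular pattern, so a decoder does not have to
- store them literally. The Python sketch below regenerates the
- (extra bits, base value) pair for every code; the decoded value is
- the base plus the value of the extra bits. The function names are
- hypothetical.
-
```python
def length_table():
    """Return {code: (extra_bits, base_length)} for codes 257..285."""
    table = {}
    base = 3
    for code in range(257, 285):
        # Codes 257-264 carry no extra bits; after that the number of
        # extra bits grows by one every four codes.
        extra = 0 if code < 265 else (code - 261) // 4
        table[code] = (extra, base)
        base += 1 << extra
    table[285] = (0, 258)   # special case: length 258 with no extra bits
    return table


def distance_table():
    """Return {code: (extra_bits, base_distance)} for codes 0..29."""
    table = {}
    base = 1
    for code in range(30):
        # Codes 0-3 carry no extra bits; after that the number of
        # extra bits grows by one every two codes.
        extra = 0 if code < 4 else code // 2 - 1
        table[code] = (extra, base)
        base += 1 << extra
    return table


def decode_value(table, code, extra_value):
    """Combine a code with the value of its extra bits."""
    extra_bits, base = table[code]
    if extra_value >> extra_bits:
        raise ValueError("extra value does not fit in the extra bits")
    return base + extra_value
```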
-
- 5.5.4 The compressed data stream begins immediately after the
- compressed header data. The compressed data stream can be
- interpreted as follows:
-
- do
-    read header from input stream.
-
-    if stored block
-       skip bits until byte aligned
-       read count and 1's complement of count
-       copy count bytes data block
-    otherwise
-       loop until end of block code sent
-          decode literal character from input stream
-          if literal < 256
-             copy character to the output stream
-          otherwise
-             if literal = end of block
-                break from loop
-             otherwise
-                decode distance from input stream
-
-                move backwards distance bytes in the output stream, and
-                copy length characters from this position to the output
-                stream.
-       end loop
- while not last block
-
- if data descriptor exists
-    skip bits until byte aligned
-    read crc and sizes
- endif
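-
- The copy step in the loop above can reference bytes that the same
- copy is still producing, which happens whenever the distance is
- smaller than the length. A minimal Python sketch of that step,
- copying one byte at a time so overlapping copies behave correctly;
- the name copy_match is hypothetical.
-
```python
def copy_match(output, distance, length):
    """Append `length` bytes taken from `distance` bytes back in output.

    output is a bytearray holding everything decoded so far.
    """
    if distance < 1 or distance > len(output):
        raise ValueError("distance points before the start of the output")
    start = len(output) - distance
    for i in range(length):
        # Read after each append so a copy may overlap its own result.
        output.append(output[start + i])
    return output
```
-
- For example, with b"abc" already written, a distance of 3 and a
- length of 5 produce b"abcabcab".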
-
-5.6 Enhanced Deflating - Method 9
----------------------------------
-
- 5.6.1 The Enhanced Deflating algorithm is similar to Deflate but uses
- a sliding dictionary of up to 64K. Deflate64(tm) is supported
- by the Deflate extractor.
-
-5.7 BZIP2 - Method 12
----------------------
-
- 5.7.1 BZIP2 is an open-source data compression algorithm developed by
- Julian Seward. Information and source code for this algorithm
- can be found on the internet.
-
-5.8 LZMA - Method 14
----------------------
-
- 5.8.1 LZMA is a block-oriented, general purpose data compression
- algorithm developed and maintained by Igor Pavlov. It is a derivative
- of LZ77 that utilizes Markov chains and a range coder. Information and
- source code for this algorithm can be found on the internet. Consult
- with the author of this algorithm for information on terms or
- restrictions on use.
-
- Support for LZMA within the ZIP format is defined as follows:
-
- 5.8.2 The Compression method field within the ZIP Local and Central
- Header records will be set to the value 14 to indicate data was
- compressed using LZMA.
-
- 5.8.3 The Version needed to extract field within the ZIP Local and
- Central Header records will be set to 6.3 to indicate the minimum
- ZIP format version supporting this feature.
-
- 5.8.4 File data compressed using the LZMA algorithm must be placed
- immediately following the Local Header for the file. If a standard
- ZIP encryption header is required, it will follow the Local Header
- and will precede the LZMA compressed file data segment. The location
- of LZMA compressed data segment within the ZIP format will be as shown:
-
- [local header file 1]
- [encryption header file 1]
- [LZMA compressed data segment for file 1]
- [data descriptor 1]
- [local header file 2]
-
- 5.8.5 The encryption header and data descriptor records may
- be conditionally present. The LZMA Compressed Data Segment
- will consist of an LZMA Properties Header followed by the
- LZMA Compressed Data as shown:
-
- [LZMA properties header for file 1]
- [LZMA compressed data for file 1]
-
- 5.8.6 The LZMA Compressed Data will be stored as provided by the
- LZMA compression library. Compressed size, uncompressed size and
- other file characteristics about the file being compressed must be
- stored in standard ZIP storage format.
-
- 5.8.7 The LZMA Properties Header will store specific data required
- to decompress the LZMA compressed Data. This data is set by the
- LZMA compression engine using the function WriteCoderProperties()
- as documented within the LZMA SDK.
-
- 5.8.8 Storage fields for the property information within the LZMA
- Properties Header are as follows:
-
- LZMA Version Information 2 bytes
- LZMA Properties Size 2 bytes
- LZMA Properties Data variable, defined by "LZMA Properties Size"
-
- 5.8.8.1 LZMA Version Information - this field identifies which version
- of the LZMA SDK was used to compress a file. The first byte will
- store the major version number of the LZMA SDK and the second
- byte will store the minor number.
-
- 5.8.8.2 LZMA Properties Size - this field defines the size of the
- remaining property data. Typically this size should be determined by
- the version of the SDK. This size field is included as a convenience
- and to help avoid any ambiguity should it arise in the future due
- to changes in this compression algorithm.
-
- 5.8.8.3 LZMA Property Data - this variable sized field records the
- required values for the decompressor as defined by the LZMA SDK.
- The data stored in this field should be obtained using the
- WriteCoderProperties() in the version of the SDK defined by
- the "LZMA Version Information" field.
-
- 5.8.8.4 The layout of the "LZMA Properties Data" field is a function of
- the LZMA compression algorithm. It is possible that this layout may be
- changed by the author over time. The data layout in version 4.3 of the
- LZMA SDK defines a 5 byte array that uses 4 bytes to store the dictionary
- size in little-endian order. This is preceded by a single packed byte as
- the first element of the array that contains the following fields:
-
- PosStateBits
- LiteralPosStateBits
- LiteralContextBits
-
- Refer to the LZMA documentation for a more detailed explanation of
- these fields.
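-
- Reading the LZMA Properties Header might look like the Python sketch
- below. Two points are assumptions not stated in this section: the
- "LZMA Properties Size" field is read as little-endian, following the
- usual ZIP convention, and the packed byte is unpacked with the LZMA
- SDK formula (PosStateBits * 5 + LiteralPosStateBits) * 9 +
- LiteralContextBits. The function name is hypothetical.
-
```python
import struct


def parse_lzma_properties_header(buf):
    """Parse version, properties size and (if 5 bytes) the properties."""
    major, minor, size = struct.unpack_from("<BBH", buf, 0)
    props = bytes(buf[4:4 + size])
    if len(props) != size:
        raise ValueError("truncated LZMA properties data")

    info = {
        "version": (major, minor),
        "props_size": size,
        "header_len": 4 + size,   # compressed data starts after this
    }
    if size == 5:
        packed = props[0]
        if packed >= 9 * 5 * 5:
            raise ValueError("invalid packed properties byte")
        info["lc"] = packed % 9            # LiteralContextBits
        info["lp"] = (packed // 9) % 5     # LiteralPosStateBits
        info["pb"] = packed // 45          # PosStateBits
        info["dict_size"] = struct.unpack_from("<I", props, 1)[0]
    return info
```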
-
- 5.8.9 Data compressed with method 14, LZMA, may include an end-of-stream
- (EOS) marker ending the compressed data stream. This marker is not
- required, but its use is highly recommended to facilitate processing
- and implementers should include the EOS marker whenever possible.
- When the EOS marker is used, general purpose bit 1 must be set. If
- general purpose bit 1 is not set, the EOS marker is not present.
-
-5.9 WavPack - Method 97
------------------------
-
- 5.9.1 Information describing the use of compression method 97 is
- provided by WinZIP International, LLC. This method relies on the
- open source WavPack audio compression utility developed by David Bryant.
- Information on WavPack is available at www.wavpack.com. Please consult
- with the author of this algorithm for information on terms and
- restrictions on use.
-
- 5.9.2 WavPack data for a file begins immediately after the end of the
- local header data. This data is the output from WavPack compression
- routines. Within the ZIP file, the use of WavPack compression is
- indicated by setting the compression method field to a value of 97
- in both the local header and the central directory header. The Version
- needed to extract and version made by fields use the same values as are
- used for data compressed using the Deflate algorithm.
-
- 5.9.3 An implementation note for storing digital sample data when using
- WavPack compression within ZIP files is that all of the bytes of
- the sample data should be compressed. This includes any unused
- bits up to the byte boundary. An example is a 2 byte sample that
- uses only 12 bits for the sample data with 4 unused bits. If only
- 12 bits are passed as the sample size to the WavPack routines, the 4
- unused bits will be set to 0 on extraction regardless of their original
- state. To avoid this, the full 16 bits of the sample data size
- should be provided.
-
-5.10 PPMd - Method 98
----------------------
-
- 5.10.1 PPMd is a data compression algorithm developed by Dmitry Shkarin
- which includes a carryless rangecoder developed by Dmitry Subbotin.
- This algorithm is based on prediction by partial matching on multiple
- order contexts. Information and source code for this algorithm
- can be found on the internet. Consult with the author of this
- algorithm for information on terms or restrictions on use.
-
- 5.10.2 Support for PPMd within the ZIP format currently is provided only
- for version I, revision 1 of the algorithm. Storage requirements
- for using this algorithm are as follows:
-
- 5.10.3 Parameters needed to control the algorithm are stored in the two
- bytes immediately preceding the compressed data. These bytes are
- used to store the following fields:
-
- Model order - sets the maximum model order, default is 8, possible
- values are from 2 to 16 inclusive
-
- Sub-allocator size - sets the size of sub-allocator in MB, default is 50,
- possible values are from 1MB to 256MB inclusive
-
- Model restoration method - sets the method used to restart context
- model at memory insufficiency, values are:
-
- 0 - restarts model from scratch - default
- 1 - cut off model - decreases performance by as much as 2x
- 2 - freeze context tree - not recommended
-
- 5.10.4 An example for packing these fields into the 2 byte storage field is
- illustrated below. These values are stored in Intel low-byte/high-byte
- order.
-
- wPPMd = (Model order - 1) +
- ((Sub-allocator size - 1) << 4) +
- (Model restoration method << 12)
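-
- The packing formula above, and its inverse, can be sketched in
- Python as follows. The function names and the range checks are
- illustrative additions; the field widths (4, 8 and 4 bits) follow
- from the shifts in the formula.
-
```python
import struct


def pack_ppmd_params(order=8, mem_mb=50, restore=0):
    """Pack the three PPMd parameters into the 2-byte storage field."""
    if not 2 <= order <= 16:
        raise ValueError("model order must be 2..16")
    if not 1 <= mem_mb <= 256:
        raise ValueError("sub-allocator size must be 1..256 MB")
    if restore not in (0, 1, 2):
        raise ValueError("model restoration method must be 0, 1 or 2")
    value = (order - 1) + ((mem_mb - 1) << 4) + (restore << 12)
    return struct.pack("<H", value)   # low byte first


def unpack_ppmd_params(two_bytes):
    """Return (order, mem_mb, restore) from the 2-byte storage field."""
    (value,) = struct.unpack("<H", two_bytes)
    order = (value & 0x0F) + 1
    mem_mb = ((value >> 4) & 0xFF) + 1
    restore = value >> 12
    return order, mem_mb, restore
```
-
- With the default values (order 8, 50MB, method 0) the field holds
- 0x0317, stored as the bytes 17 03.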
-
-
-6.0 Traditional PKWARE Encryption
-----------------------------------
-
- 6.0.1 The following information discusses the decryption steps
- required to support traditional PKWARE encryption. This
- form of encryption is considered weak by today's standards
- and its use is recommended only for situations with
- low security needs or for compatibility with older .ZIP
- applications.
-
-6.1 Traditional PKWARE Decryption
----------------------------------
-
- 6.1.1 PKWARE is grateful to Mr. Roger Schlafly for his expert
- contribution towards the development of PKWARE's traditional
- encryption.
-
- 6.1.2 PKZIP encrypts the compressed data stream. Encrypted files
- must be decrypted before they can be extracted to their original
- form.
-
- 6.1.3 Each encrypted file has an extra 12 bytes stored at the start
- of the data area defining the encryption header for that file. The
- encryption header is originally set to random values, and then
- itself encrypted, using three, 32-bit keys. The key values are
- initialized using the supplied encryption password. After each byte
- is encrypted, the keys are then updated using pseudo-random number
- generation techniques in combination with the same CRC-32 algorithm
- used in PKZIP and described elsewhere in this document.
-
- 6.1.4 The following are the basic steps required to decrypt a file:
-
- 1) Initialize the three 32-bit keys with the password.
- 2) Read and decrypt the 12-byte encryption header, further
- initializing the encryption keys.
- 3) Read and decrypt the compressed data stream using the
- encryption keys.
-
- 6.1.5 Initializing the encryption keys
-
- Key(0) <- 305419896
- Key(1) <- 591751049
- Key(2) <- 878082192
-
- loop for i <- 0 to length(password)-1
- update_keys(password(i))
- end loop
-
- Where update_keys() is defined as:
-
- update_keys(char):
-   Key(0) <- crc32(Key(0),char)
-   Key(1) <- Key(1) + (Key(0) & 000000ffH)
-   Key(1) <- Key(1) * 134775813 + 1
-   Key(2) <- crc32(Key(2),Key(1) >> 24)
- end update_keys
-
- Where crc32(old_crc,char) is a routine that given a CRC value and a
- character, returns an updated CRC value after applying the CRC-32
- algorithm described elsewhere in this document.
-
- 6.1.6 Decrypting the encryption header
-
- The purpose of this step is to further initialize the encryption
- keys, based on random data, to render a plaintext attack on the
- data ineffective.
-
- Read the 12-byte encryption header into Buffer, in locations
- Buffer(0) thru Buffer(11).
-
- loop for i <- 0 to 11
-     C <- Buffer(i) ^ decrypt_byte()
-     update_keys(C)
-     Buffer(i) <- C
- end loop
-
- Where decrypt_byte() is defined as:
-
- unsigned char decrypt_byte()
- local unsigned short temp
- temp <- Key(2) | 2
- decrypt_byte <- (temp * (temp ^ 1)) >> 8
- end decrypt_byte
-
- After the header is decrypted, the last 1 or 2 bytes in Buffer
- should be the high-order word/byte of the CRC for the file being
- decrypted, stored in Intel low-byte/high-byte order. Versions of
- PKZIP prior to 2.0 used a 2 byte CRC check; a 1 byte CRC check is
- used on versions after 2.0. This can be used to test if the password
- supplied is correct or not.
-
- 6.1.7 Decrypting the compressed data stream
-
- The compressed data stream can be decrypted as follows:
-
- loop until done
-     read a character into C
-     Temp <- C ^ decrypt_byte()
-     update_keys(Temp)
-     output Temp
- end loop
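-
- Steps 6.1.5 through 6.1.7 can be put together as in the Python
- sketch below. It is an illustration under stated assumptions, not a
- reference implementation: the class and function names are
- hypothetical, all key arithmetic is truncated to 32 bits, and
- crc32() is taken to be a single-byte update of the standard CRC-32
- (polynomial 0xEDB88320) with no pre- or post-inversion. The encrypt
- method is not described in this section; it is the inverse of
- decrypt and is included only so the sketch can be exercised.
-
```python
def crc32_update(crc, byte):
    """Update a raw CRC-32 value with one byte (no inversion applied)."""
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
    return crc


class TraditionalKeys:
    """The three 32-bit keys of traditional PKWARE encryption."""

    def __init__(self, password):
        self.k0, self.k1, self.k2 = 305419896, 591751049, 878082192
        for ch in password:          # password is a bytes object
            self.update(ch)

    def update(self, ch):
        self.k0 = crc32_update(self.k0, ch)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = crc32_update(self.k2, self.k1 >> 24)

    def decrypt_byte(self):
        temp = (self.k2 | 2) & 0xFFFF        # unsigned short
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

    def decrypt(self, data):
        """Decrypt the 12-byte header or the compressed data stream."""
        out = bytearray()
        for c in data:
            plain = c ^ self.decrypt_byte()
            self.update(plain)
            out.append(plain)
        return bytes(out)

    def encrypt(self, data):
        """Inverse of decrypt (assumption, for round-trip testing)."""
        out = bytearray()
        for plain in data:
            out.append(plain ^ self.decrypt_byte())
            self.update(plain)
        return bytes(out)
```
-
- A caller would create one TraditionalKeys object per file, decrypt
- the 12-byte encryption header first, compare its last byte (or last
- two bytes for versions of PKZIP prior to 2.0) with the high-order
- part of the CRC, and then decrypt the rest of the stream with the
- same object.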
-
-
-7.0 Strong Encryption Specification
------------------------------------
-
- 7.0.1 Portions of the Strong Encryption technology defined in this
- specification are covered under patents and pending patent applications.
- Refer to the section in this document entitled "Incorporating
- PKWARE Proprietary Technology into Your Product" for more information.
-
-7.1 Strong Encryption Overview
-------------------------------
-
- 7.1.1 Version 5.x of this specification introduced support for strong
- encryption algorithms. These algorithms can be used with either
- a password or an X.509v3 digital certificate to encrypt each file.
- This format specification supports either password or certificate
- based encryption to meet the security needs of today, to enable
- interoperability between users within both PKI and non-PKI
- environments, and to ensure interoperability between different
- computing platforms that are running a ZIP program.
-
- 7.1.2 Password based encryption is the most common form of encryption
- people are familiar with. However, inherent weaknesses with
- passwords (e.g. susceptibility to dictionary/brute force attack)
- as well as password management and support issues make certificate
- based encryption a more secure and scalable option. Industry
- efforts and support are defining and moving towards more advanced
- security solutions built around X.509v3 digital certificates and
- Public Key Infrastructures (PKI) because of the greater scalability,
- administrative options, and more robust security over traditional
- password based encryption.
-
- 7.1.3 Most standard encryption algorithms are supported with this
- specification. Reference implementations for many of these
- algorithms are available from either commercial or open source
- distributors. Readily available cryptographic toolkits make
- implementation of the encryption features straightforward.
- This document is not intended to provide a treatise on data
- encryption principles or theory. Its purpose is to document the
- data structures required for implementing interoperable data
- encryption within the .ZIP format. It is strongly recommended that
- you have a good understanding of data encryption before reading
- further.
-
- 7.1.4 The algorithms introduced in Version 5.0 of this specification
- include:
-
- RC2 40 bit, 64 bit, and 128 bit
- RC4 40 bit, 64 bit, and 128 bit
- DES
- 3DES 112 bit and 168 bit
-
- Version 5.1 adds support for the following:
-
- AES 128 bit, 192 bit, and 256 bit
-
-
- 7.1.5 Version 6.1 introduces encryption data changes to support
- interoperability with Smartcard and USB Token certificate storage
- methods which do not support the OAEP strengthening standard.
-
- 7.1.6 Version 6.2 introduces support for encrypting metadata by compressing
- and encrypting the central directory data structure to reduce information
- leakage. Information leakage can occur in legacy ZIP applications
- through exposure of information about a file even though that file is
- stored encrypted. The information exposed consists of file
- characteristics stored within the records and fields defined by this
- specification. This includes data such as a file's name, its original
- size, timestamp and CRC32 value.
-
- 7.1.7 Version 6.3 introduces support for encrypting data using the Blowfish
- and Twofish algorithms. These are symmetric block ciphers developed
- by Bruce Schneier. Blowfish supports using a variable length key from
- 32 to 448 bits. Block size is 64 bits. Implementations should use 16
- rounds and the only mode supported within ZIP files is CBC. Twofish
- supports key sizes 128, 192 and 256 bits. Block size is 128 bits.
- Implementations should use 16 rounds and the only mode supported within
- ZIP files is CBC. Information and source code for both Blowfish and
- Twofish algorithms can be found on the internet. Consult with the author
- of these algorithms for information on terms or restrictions on use.
-
- 7.1.8 Central Directory Encryption provides greater protection against
- information leakage by encrypting the Central Directory structure and
- by masking key values that are replicated in the unencrypted Local
- Header. ZIP compatible programs that cannot interpret an encrypted
- Central Directory structure cannot rely on the data in the corresponding
- Local Header for decompression information.
-
- 7.1.9 Extra Field records that may contain information about a file that should
- not be exposed should not be stored in the Local Header and should only
- be written to the Central Directory where they can be encrypted. This
- design currently does not support streaming. Information in the End of
- Central Directory record, the Zip64 End of Central Directory Locator,
- and the Zip64 End of Central Directory records are not encrypted. Access
- to view data on files within a ZIP file with an encrypted Central Directory
- requires the appropriate password or private key for decryption prior to
- viewing any files, or any information about the files, in the archive.
-
- 7.1.10 Older ZIP compatible programs not familiar with the Central Directory
- Encryption feature will no longer be able to recognize the Central
- Directory and may assume the ZIP file is corrupt. Programs that
- attempt streaming access using Local Headers will see invalid
- information for each file. Central Directory Encryption need not be
- used for every ZIP file. Its use is recommended for greater security.
- ZIP files not using Central Directory Encryption should operate as
- in the past.
-
- 7.1.11 This strong encryption feature specification is intended to provide for
- scalable, cross-platform encryption needs ranging from simple password
- encryption to authenticated public/private key encryption.
-
- 7.1.12 Encryption provides data confidentiality and privacy. It is
- recommended that you combine X.509 digital signing with encryption
- to add authentication and non-repudiation.
-
-
-7.2 Single Password Symmetric Encryption Method
------------------------------------------------
-
- 7.2.1 The Single Password Symmetric Encryption Method using strong
- encryption algorithms operates similarly to the traditional
- PKWARE encryption defined in this format. Additional data
- structures are added to support the processing needs of the
- strong algorithms.
-
- The Strong Encryption data structures are:
-
- 7.2.2 General Purpose Bits - Bits 0 and 6 of the General Purpose bit
- flag in both local and central header records. Both bits set
- indicates strong encryption. Bit 13, when set, indicates the Central
- Directory is encrypted and that selected fields in the Local Header
- are masked to hide their actual value.
-
-
- 7.2.3 Extra Field 0x0017 in central header only.
-
- Fields to consider in this record are:
-
- 7.2.3.1 Format - the data format identifier for this record. The only
- value allowed at this time is the integer value 2.
-
- 7.2.3.2 AlgId - integer identifier of the encryption algorithm from the
- following range
-
- 0x6601 - DES
- 0x6602 - RC2 (version needed to extract < 5.2)
- 0x6603 - 3DES 168
- 0x6609 - 3DES 112
- 0x660E - AES 128
- 0x660F - AES 192
- 0x6610 - AES 256
- 0x6702 - RC2 (version needed to extract >= 5.2)
- 0x6720 - Blowfish
- 0x6721 - Twofish
- 0x6801 - RC4
- 0xFFFF - Unknown algorithm
-
- 7.2.3.3 Bitlen - Explicit bit length of key
-
- 32 - 448 bits
-
- 7.2.3.4 Flags - Processing flags needed for decryption
-
- 0x0001 - Password is required to decrypt
- 0x0002 - Certificates only
- 0x0003 - Password or certificate required to decrypt
-
- Values > 0x0003 reserved for certificate processing
-
-
- 7.2.4 Decryption header record preceding compressed file data.
-
- -Decryption Header:
-
- Value       Size      Description
- -----       ----      -----------
- IVSize      2 bytes   Size of initialization vector (IV)
- IVData      IVSize    Initialization vector for this file
- Size        4 bytes   Size of remaining decryption header data
- Format      2 bytes   Format definition for this record
- AlgID       2 bytes   Encryption algorithm identifier
- Bitlen      2 bytes   Bit length of encryption key
- Flags       2 bytes   Processing flags
- ErdSize     2 bytes   Size of Encrypted Random Data
- ErdData     ErdSize   Encrypted Random Data
- Reserved1   4 bytes   Reserved for certificate processing data
- Reserved2   (var)     Reserved for certificate processing data
- VSize       2 bytes   Size of password validation data
- VData       VSize-4   Password validation data
- VCRC32      4 bytes   Standard ZIP CRC32 of password validation data
-
- 7.2.4.1 IVData - The size of the IV should match the algorithm block size.
- The IVData can be completely random data. If the size of
- the randomly generated data does not match the block size
- it should be padded with zeros or truncated as
- necessary. If IVSize is 0, then IV = CRC32 + Uncompressed
- File Size (as a 64 bit little-endian, unsigned integer value).
-
- 7.2.4.2 Format - the data format identifier for this record. The only
- value allowed at this time is the integer value 3.
-
- 7.2.4.3 AlgId - integer identifier of the encryption algorithm from the
- following range
-
- 0x6601 - DES
- 0x6602 - RC2 (version needed to extract < 5.2)
- 0x6603 - 3DES 168
- 0x6609 - 3DES 112
- 0x660E - AES 128
- 0x660F - AES 192
- 0x6610 - AES 256
- 0x6702 - RC2 (version needed to extract >= 5.2)
- 0x6720 - Blowfish
- 0x6721 - Twofish
- 0x6801 - RC4
- 0xFFFF - Unknown algorithm
-
- 7.2.4.4 Bitlen - Explicit bit length of key
-
- 32 - 448 bits
-
- 7.2.4.5 Flags - Processing flags needed for decryption
-
- 0x0001 - Password is required to decrypt
- 0x0002 - Certificates only
- 0x0003 - Password or certificate required to decrypt
-
- Values > 0x0003 reserved for certificate processing
-
- 7.2.4.6 ErdData - Encrypted random data is used to store random data that
- is used to generate a file session key for encrypting
- each file. SHA1 is used to calculate hash data used to
- derive keys. File session keys are derived from a master
- session key generated from the user-supplied password.
- If the Flags field in the decryption header contains
- the value 0x4000, then the ErdData field must be
- decrypted using 3DES. If the value 0x4000 is not set,
- then the ErdData field must be decrypted using AlgId.
-
-
- 7.2.4.7 Reserved1 - Reserved for certificate processing, if value is
- zero, then Reserved2 data is absent. See the explanation
- under the Certificate Processing Method for details on
- this data structure.
-
- 7.2.4.8 Reserved2 - If present, the size of the Reserved2 data structure
- is located by skipping the first 4 bytes of this field
- and using the next 2 bytes as the remaining size. See
- the explanation under the Certificate Processing Method
- for details on this data structure.
-
- 7.2.4.9 VSize - This size value will always include the 4 bytes of the
- VCRC32 data and will be greater than 4 bytes.
-
- 7.2.4.10 VData - Random data for password validation. This data is VSize
- in length and VSize must be a multiple of the encryption
- block size. VCRC32 is a checksum value of VData.
- VData and VCRC32 are stored encrypted and start the
- stream of encrypted data for a file.
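-
- The Decryption Header layout in 7.2.4 can be split into its fields
- as in the Python sketch below. It covers only the password case in
- which Reserved1 is zero and Reserved2 is therefore absent, and it
- assumes all multi-byte values are little-endian, following the usual
- ZIP convention. VData and VCRC32 are returned together and still
- encrypted. The function name and the dictionary keys are
- hypothetical.
-
```python
import struct


def parse_decryption_header(buf):
    """Split a decryption header into its fields (Reserved1 == 0 only)."""
    pos = 0
    (iv_size,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    iv_data = bytes(buf[pos:pos + iv_size])
    pos += iv_size

    size, fmt, alg_id, bitlen, flags, erd_size = struct.unpack_from(
        "<IHHHHH", buf, pos)
    pos += 14
    erd_data = bytes(buf[pos:pos + erd_size])
    pos += erd_size

    (reserved1,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    if reserved1 != 0:
        raise NotImplementedError("Reserved2 data is not handled here")

    (v_size,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    if v_size <= 4:
        raise ValueError("VSize must be greater than 4")
    validation = bytes(buf[pos:pos + v_size])   # encrypted VData + VCRC32
    pos += v_size

    return {
        "iv": iv_data, "size": size, "format": fmt, "alg_id": alg_id,
        "bitlen": bitlen, "flags": flags, "erd": erd_data,
        "validation": validation, "header_len": pos,
    }
```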
-
-
- 7.2.5 Useful Tips
-
- 7.2.5.1 Strong Encryption is always applied to a file after compression. The
- block oriented algorithms all operate in Cipher Block Chaining (CBC)
- mode. The block size used for AES encryption is 16. All other block
- algorithms use a block size of 8. Two IDs are defined for RC2 to
- account for a discrepancy found in the implementation of the RC2
- algorithm in the cryptographic library on Windows XP SP1 and all
- earlier versions of Windows. It is recommended that zero length files
- not be encrypted, however programs should be prepared to extract them
- if they are found within a ZIP file.
-
- 7.2.5.2 A pseudo-code representation of the encryption process is as follows:
-
- Password = GetUserPassword()
- MasterSessionKey = DeriveKey(SHA1(Password))
- RD = CryptographicStrengthRandomData()
- For Each File
-    IV = CryptographicStrengthRandomData()
-    VData = CryptographicStrengthRandomData()
-    VCRC32 = CRC32(VData)
-    FileSessionKey = DeriveKey(SHA1(IV + RD))
-    ErdData = Encrypt(RD,MasterSessionKey,IV)
-    Encrypt(VData + VCRC32 + FileData, FileSessionKey,IV)
- Done
-
- 7.2.5.3 The function names and parameter requirements will depend on
- the choice of the cryptographic toolkit selected. Almost any
- toolkit supporting the reference implementations for each
- algorithm can be used. The RSA BSAFE(r), OpenSSL, and Microsoft
- CryptoAPI libraries are all known to work well.
-
-
-7.3 Single Password - Central Directory Encryption
---------------------------------------------------
-
- 7.3.1 Central Directory Encryption is achieved within the .ZIP format by
- encrypting the Central Directory structure. This encapsulates the metadata
- most often used for processing .ZIP files. Additional metadata is stored for
- redundancy in the Local Header for each file. The process of concealing
- metadata by encrypting the Central Directory does not protect the data within
- the Local Header. To avoid information leakage from the exposed metadata
- in the Local Header, the fields containing information about a file are masked.
-
- 7.3.2 Local Header
-
- Masking replaces the true content of the fields for a file in the Local
- Header with false information. When masked, the Local Header is not
- suitable for streaming access and the options for data recovery of damaged
- archives are reduced. Extra Data fields that may contain confidential
- data should not be stored within the Local Header. The value set into
- the Version needed to extract field should be the correct value needed to
- extract the file without regard to Central Directory Encryption. The fields
- within the Local Header targeted for masking when the Central Directory is
- encrypted are:
-
- Field Name                   Mask Value
- ------------------           ---------------------------
- compression method           0
- last mod file time           0
- last mod file date           0
- crc-32                       0
- compressed size              0
- uncompressed size            0
- file name (variable size)    Base 16 value from the
-                              range 1 - 0xFFFFFFFFFFFFFFFF
-                              represented as a string whose
-                              size will be set into the
-                              file name length field
-
- The Base 16 value assigned as a masked file name is simply a sequentially
- incremented value for each file starting with 1 for the first file.
- Modifications to a ZIP file may cause different values to be stored for
- each file. For compatibility, the file name field in the Local Header
- should never be left blank. As of Version 6.2 of this specification,
- the Compression Method and Compressed Size fields are not yet masked.
- Fields having a value of 0xFFFF or 0xFFFFFFFF for the ZIP64 format
- should not be masked.
-
- 7.3.3 Encrypting the Central Directory
-
- Encryption of the Central Directory does not include encryption of the
- Central Directory Signature data, the Zip64 End of Central Directory
- record, the Zip64 End of Central Directory Locator, or the End
- of Central Directory record. The ZIP file comment data is never
- encrypted.
-
- Before encrypting the Central Directory, it may optionally be compressed.
- Compression is not required, but for storage efficiency it is assumed
- this structure will be compressed before encrypting. Similarly, this
- specification supports compressing the Central Directory without
- requiring that it also be encrypted. Early implementations of this
- feature will assume the encryption method applied to files matches the
- encryption applied to the Central Directory.
-
- Encryption of the Central Directory is done in a manner similar to
- that of file encryption. The encrypted data is preceded by a
- decryption header. The decryption header is known as the Archive
- Decryption Header. The fields of this record are identical to
- the decryption header preceding each encrypted file. The location
- of the Archive Decryption Header is determined by the value in the
- Start of the Central Directory field in the Zip64 End of Central
- Directory record. When the Central Directory is encrypted, the
- Zip64 End of Central Directory record will always be present.
-
- The layout of the Zip64 End of Central Directory record for all
- versions starting with 6.2 of this specification will follow the
- Version 2 format. The Version 2 format is as follows:
-
- The leading fixed size fields within the Version 1 format for this
- record remain unchanged. The record signature for both Version 1
- and Version 2 will be 0x06064b50. Immediately following the last
- byte of the field known as the Offset of Start of Central
- Directory With Respect to the Starting Disk Number will begin the
- new fields defining Version 2 of this record.
-
- 7.3.4 New fields for Version 2
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value                 Size         Description
- -----                 ----         -----------
- Compression Method    2 bytes      Method used to compress the
-                                    Central Directory
- Compressed Size       8 bytes      Size of the compressed data
- Original Size         8 bytes      Original uncompressed size
- AlgId                 2 bytes      Encryption algorithm ID
- BitLen                2 bytes      Encryption key length
- Flags                 2 bytes      Encryption flags
- HashID                2 bytes      Hash algorithm identifier
- Hash Length           2 bytes      Length of hash data
- Hash Data             (variable)   Hash data
-
- The Compression Method accepts the same range of values as the
- corresponding field in the Central Header.
-
- The Compressed Size and Original Size values will not include the
- data of the Central Directory Signature which is compressed or
- encrypted.
-
- The AlgId, BitLen, and Flags fields accept the same range of values
- as the corresponding fields within the 0x0017 record.
-
- Hash ID identifies the algorithm used to hash the Central Directory
- data. This data does not have to be hashed, in which case the
- values for both the HashID and Hash Length will be 0. Possible
- values for HashID are:
-
- Value Algorithm
- ------ ---------
- 0x0000 none
- 0x0001 CRC32
- 0x8003 MD5
- 0x8004 SHA1
- 0x8007 RIPEMD160
- 0x800C SHA256
- 0x800D SHA384
- 0x800E SHA512
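-
- The new Version 2 fields can be read as in the Python sketch below.
- It assumes the caller has already located the first byte after the
- Version 1 fields of the Zip64 End of Central Directory record and
- passes that position as offset. The function name, the offset
- parameter and the dictionary keys are hypothetical.
-
```python
import struct

# HashID values listed in 7.3.4
HASH_IDS = {
    0x0000: "none", 0x0001: "CRC32", 0x8003: "MD5", 0x8004: "SHA1",
    0x8007: "RIPEMD160", 0x800C: "SHA256", 0x800D: "SHA384",
    0x800E: "SHA512",
}

# Fixed-size part: 2 + 8 + 8 + 2 + 2 + 2 + 2 + 2 = 28 bytes,
# stored in Intel low-byte/high-byte order.
V2_FIXED = struct.Struct("<HQQHHHHH")


def parse_v2_fields(buf, offset=0):
    """Parse the Version 2 fields that follow the Version 1 fields."""
    (method, comp_size, orig_size, alg_id, bitlen, flags,
     hash_id, hash_len) = V2_FIXED.unpack_from(buf, offset)
    start = offset + V2_FIXED.size
    hash_data = bytes(buf[start:start + hash_len])
    if len(hash_data) != hash_len:
        raise ValueError("truncated hash data")
    return {
        "compression_method": method,
        "compressed_size": comp_size,
        "original_size": orig_size,
        "alg_id": alg_id,
        "bitlen": bitlen,
        "flags": flags,
        "hash_id": hash_id,
        "hash_name": HASH_IDS.get(hash_id, "unknown"),
        "hash_data": hash_data,
    }
```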
-
- 7.3.5 When the Central Directory data is signed, the same hash algorithm
- used to hash the Central Directory for signing should be used.
- This is recommended for processing efficiency; however, it is
- permissible for any of the above algorithms to be used independent
- of the signing process.
-
- The Hash Data will contain the hash data for the Central Directory.
- The length of this data will vary depending on the algorithm used.
-
- The Version Needed to Extract should be set to 62.
-
- The value for the Total Number of Entries on the Current Disk will
- be 0. These records will no longer support random access when
- encrypting the Central Directory.
-
- 7.3.6 When the Central Directory is compressed and/or encrypted, the
- End of Central Directory record will store the value 0xFFFFFFFF
- as the value for the Total Number of Entries in the Central
- Directory. The value stored in the Total Number of Entries in
- the Central Directory on this Disk field will be 0. The actual
- values will be stored in the equivalent fields of the Zip64
- End of Central Directory record.
-
- 7.3.7 Decrypting and decompressing the Central Directory is accomplished
- in the same manner as decrypting and decompressing a file.
-
- 7.4 Certificate Processing Method
- ---------------------------------
-
- The Certificate Processing Method for ZIP file encryption
- defines the following additional data fields:
-
- 7.4.1 Certificate Flag Values
-
- Additional processing flags that can be present in the Flags field of both
- the 0x0017 field of the central directory Extra Field and the Decryption
- header record preceding compressed file data are:
-
- 0x0007 - reserved for future use
- 0x000F - reserved for future use
- 0x0100 - Indicates non-OAEP key wrapping was used. If this
- field is set, the version needed to extract must
- be at least 61. This means OAEP key wrapping is not
- used when generating a Master Session Key using
- ErdData.
- 0x4000 - ErdData must be decrypted using 3DES-168, otherwise use the
- same algorithm used for encrypting the file contents.
- 0x8000 - reserved for future use
-
-
- 7.4.2 CertData - Extra Field 0x0017 record certificate data structure
-
- The data structure used to store certificate data within the section
- of the Extra Field defined by the CertData field of the 0x0017
- record is as shown:
-
- Value     Size     Description
- -----     ----     -----------
- RCount    4 bytes  Number of recipients.
- HashAlg   2 bytes  Hash algorithm identifier
- HSize     2 bytes  Hash size
- SRList    (var)    Simple list of recipients' hashed public keys
-
-
- RCount This defines the number of intended recipients whose
- public keys were used for encryption. This identifies
- the number of elements in the SRList.
-
- HashAlg This defines the hash algorithm used to calculate
- the public key hash of each public key used
- for encryption. This field currently supports
- only the following value for SHA-1
-
- 0x8004 - SHA1
-
- HSize This defines the size of a hashed public key.
-
- SRList This is a variable length list of the hashed
- public keys for each intended recipient. Each
- element in this list is HSize. The total size of
- SRList is determined using RCount * HSize.
-
-
- 7.4.3 Reserved1 - Certificate Decryption Header Reserved1 Data
-
- Value Size Description
- ----- ---- -----------
- RCount 4 bytes Number of recipients.
-
- RCount This defines the number of intended recipients whose
- public keys were used for encryption. This defines
- the number of elements in the REList field defined below.
-
-
- 7.4.4 Reserved2 - Certificate Decryption Header Reserved2 Data Structures
-
-
- Value     Size     Description
- -----     ----     -----------
- HashAlg   2 bytes  Hash algorithm identifier
- HSize     2 bytes  Hash size
- REList    (var)    List of recipient data elements
-
-
- HashAlg This defines the hash algorithm used to calculate
- the public key hash of each public key used
- for encryption. This field currently supports
- only the following value for SHA-1
-
- 0x8004 - SHA1
-
- HSize This defines the size of a hashed public key
- defined in REHData.
-
- REList This is a variable-length list of recipient data.
- Each element in this list consists of a Recipient
- Element data structure as follows:
-
-
- Recipient Element (REList) Data Structure:
-
- Value     Size     Description
- -----     ----     -----------
- RESize    2 bytes  Size of REHData + REKData
- REHData   HSize    Hash of recipient's public key
- REKData   (var)    Simple key blob
-
-
- RESize This defines the size of an individual REList
- element. This value is the combined size of the
- REHData field + REKData field. REHData is defined by
- HSize. REKData is variable and can be calculated
- for each REList element using RESize and HSize.
-
- REHData Hashed public key for this recipient.
-
- REKData Simple Key Blob. The format of this data structure
- is identical to that defined in the Microsoft
- CryptoAPI and generated using the CryptExportKey()
- function. The version of the Simple Key Blob
- supported at this time is 0x02 as defined by
- Microsoft.
-
-7.5 Certificate Processing - Central Directory Encryption
----------------------------------------------------------
-
- 7.5.1 Central Directory Encryption using Digital Certificates will
- operate in a manner similar to that of Single Password Central
- Directory Encryption. This record will only be present when there
- is data to place into it. Currently, data is placed into this
- record when digital certificates are used for either encrypting
- or signing the files within a ZIP file. When only password
- encryption is used with no certificate encryption or digital
- signing, this record is not currently needed. When present, this
- record will appear before the start of the actual Central Directory
- data structure and will be located immediately after the Archive
- Decryption Header if the Central Directory is encrypted.
-
- 7.5.2 The Archive Extra Data record will be used to store the following
- information. Additional data may be added in future versions.
-
- Extra Data Fields:
-
- 0x0014 - PKCS#7 Store for X.509 Certificates
- 0x0016 - X.509 Certificate ID and Signature for central directory
- 0x0019 - PKCS#7 Encryption Recipient Certificate List
-
- The 0x0014 and 0x0016 Extra Data records would otherwise be
- located in the first record of the Central Directory for digital
- certificate processing. When encrypting or compressing the Central
- Directory, the 0x0014 and 0x0016 records must be located in the
- Archive Extra Data record and they should not remain in the first
- Central Directory record. The Archive Extra Data record will also
- be used to store the 0x0019 data.
-
- 7.5.3 When present, the size of the Archive Extra Data record will be
- included in the size of the Central Directory. The data of the
- Archive Extra Data record will also be compressed and encrypted
- along with the Central Directory data structure.
-
-7.6 Certificate Processing Differences
---------------------------------------
-
- 7.6.1 The Certificate Processing Method of encryption differs from the
- Single Password Symmetric Encryption Method as follows. Instead
- of using a user-defined password to generate a master session key,
- cryptographically random data is used. The key material is then
- wrapped using standard key-wrapping techniques. This key material
- is wrapped using the public key of each recipient that will need
- to decrypt the file using their corresponding private key.
-
- 7.6.2 This specification currently assumes digital certificates will follow
- the X.509 V3 format for 1024 bit and higher RSA format digital
- certificates. Implementation of this Certificate Processing Method
- requires supporting logic for key access and management. This logic
- is outside the scope of this specification.
-
-7.7 OAEP Processing with Certificate-based Encryption
------------------------------------------------------
-
- 7.7.1 OAEP stands for Optimal Asymmetric Encryption Padding. It is a
- strengthening technique used for small encoded items such as decryption
- keys. This is commonly applied in cryptographic key-wrapping techniques
- and is supported by PKCS #1. Versions 5.0 and 6.0 of this specification
- were designed to support OAEP key-wrapping for certificate-based
- decryption keys for additional security.
-
- 7.7.2 Support for private keys stored on Smartcards or Tokens introduced
- a conflict with this OAEP logic. Most card and token products do
- not support the additional strengthening applied to OAEP key-wrapped
- data. In order to resolve this conflict, versions 6.1 and above of this
- specification will no longer support OAEP when encrypting using
- digital certificates.
-
- 7.7.3 Versions of PKZIP available during initial development of the
- certificate processing method set a value of 61 into the
- version needed to extract field for a file. This indicates that
- non-OAEP key wrapping is used. This affects certificate encryption
- only, and password encryption functions should not be affected by
- this value. This means values of 61 may be found on files encrypted
- with certificates only, or on files encrypted with both password
- encryption and certificate encryption. Files encrypted with both
- methods can safely be decrypted using the password methods documented.
-
-8.0 Splitting and Spanning ZIP files
--------------------------------------
-
- 8.1 Spanned ZIP files
-
- 8.1.1 Spanning is the process of segmenting a ZIP file across
- multiple removable media. This support has typically only
- been provided for DOS formatted floppy diskettes.
-
- 8.2 Split ZIP files
-
- 8.2.1 File splitting is a newer derivation of spanning.
- Splitting follows the same segmentation process as
- spanning; however, it does not require writing each
- segment to a unique removable medium and instead supports
- placing all pieces onto local or non-removable locations
- such as file systems, local drives, folders, etc.
-
- 8.3 File Naming Differences
-
- 8.3.1 A key difference between spanned and split ZIP files is
- that all pieces of a spanned ZIP file have the same name.
- Since each piece is written to a separate volume, no name
- collisions occur and each segment can reuse the original
- .ZIP file name given to the archive.
-
- 8.3.2 Sequence ordering for DOS spanned archives uses the DOS
- volume label to determine segment numbers. Volume labels
- for each segment are written using the form PKBACK#xxx,
- where xxx is the segment number written as a decimal
- value from 001 - nnn.
-
- 8.3.3 Split ZIP files are typically written to the same location
- and are subject to name collisions if the spanned name
- format is used since each segment will reside on the same
- drive. To avoid name collisions, split archives are named
- as follows.
-
- Segment 1 = filename.z01
- Segment n-1 = filename.z(n-1)
- Segment n = filename.zip
-
- 8.3.4 The .ZIP extension is used on the last segment to support
- quickly reading the central directory. The segment number
- n should be a decimal value.
-
- 8.4 Spanned Self-extracting ZIP Files
-
- 8.4.1 Spanned ZIP files may be PKSFX Self-extracting ZIP files.
- PKSFX files may also be split; however, in this case
- the first segment must be named filename.exe. The first
- segment of a split PKSFX archive must be large enough to
- include the entire executable program.
-
- 8.5 Capacities and Markers
-
- 8.5.1 Capacities for split archives are as follows:
-
- Maximum number of segments = 4,294,967,295 - 1
- Maximum .ZIP segment size = 4,294,967,295 bytes
- Minimum segment size = 64K
- Maximum PKSFX segment size = 2,147,483,647 bytes
-
- Segment sizes may differ; however, by convention, all
- segment sizes should be the same with the exception of the
- last, which may be smaller. Local and central directory
- header records must never be split across a segment boundary.
- When writing a header record, if the number of bytes remaining
- within a segment is less than the size of the header record,
- end the current segment and write the header at the start
- of the next segment. The central directory may span segment
- boundaries, but no single record in the central directory
- should be split across segments.
-
- 8.5.3 Spanned/Split archives created using PKZIP for Windows
- (V2.50 or greater), PKZIP Command Line (V2.50 or greater),
- or PKZIP Explorer will include a special spanning
- signature as the first 4 bytes of the first segment of
- the archive. This signature (0x08074b50) will be
- followed immediately by the local header signature for
- the first file in the archive.
-
- 8.5.4 A special spanning marker may also appear in spanned/split
- archives if the spanning or splitting process starts but
- only requires one segment. In this case the 0x08074b50
- signature will be replaced with the temporary spanning
- marker signature of 0x30304b50. Split archives can
- only be uncompressed by other versions of PKZIP that
- know how to create a split archive.
-
- 8.5.5 The signature value 0x08074b50 is also used by some
- ZIP implementations as a marker for the Data Descriptor
- record. Conflict in this alternate assignment can be
- avoided by checking the position of the signature
- within the ZIP file to determine the use for which it
- is intended.
-
-9.0 Change Process
-------------------
-
- 9.1 In order for the .ZIP file format to remain a viable technology, this
- specification should be considered as open for periodic review and
- revision. Although this format was originally designed with a
- certain level of extensibility, not all changes in technology
- (present or future) were or will be necessarily considered in its
- design.
-
- 9.2 If your application requires new definitions to the
- extensible sections in this format, or if you would like to
- submit new data structures or new capabilities, please forward
- your request to zipformat@pkware.com. All submissions will be
- reviewed by the ZIP File Specification Committee for possible
- inclusion into future versions of this specification.
-
- 9.3 Periodic revisions to this specification will be published as
- DRAFT or as FINAL status to ensure interoperability. We encourage
- comments and feedback that may help improve clarity or content.
-
-
-10.0 Incorporating PKWARE Proprietary Technology into Your Product
-------------------------------------------------------------------
-
- 10.1 The Use or Implementation in a product of APPNOTE technological
- components pertaining to either strong encryption or patching requires
- a separate, executed license agreement from PKWARE. Please contact
- PKWARE at zipformat@pkware.com or +1-414-289-9788 with regard to
- acquiring such a license.
-
- 10.2 Additional information regarding PKWARE proprietary technology is
- available at http://www.pkware.com/appnote.
-
-11.0 Acknowledgements
----------------------
-
- In addition to the above mentioned contributors to PKZIP and PKUNZIP,
- PKWARE would like to extend special thanks to Robert Mahoney for
- suggesting the extension .ZIP for this software.
-
-12.0 References
----------------
-
- Fiala, Edward R., and Greene, Daniel H., "Data compression with
- finite windows", Communications of the ACM, Volume 32, Number 4,
- April 1989, pages 490-505.
-
- Held, Gilbert, "Data Compression, Techniques and Applications,
- Hardware and Software Considerations", John Wiley & Sons, 1987.
-
- Huffman, D.A., "A method for the construction of minimum-redundancy
- codes", Proceedings of the IRE, Volume 40, Number 9, September 1952,
- pages 1098-1101.
-
- Nelson, Mark, "LZW Data Compression", Dr. Dobbs Journal, Volume 14,
- Number 10, October 1989, pages 29-37.
-
- Nelson, Mark, "The Data Compression Book", M&T Books, 1991.
-
- Storer, James A., "Data Compression, Methods and Theory",
- Computer Science Press, 1988.
-
- Welch, Terry, "A Technique for High-Performance Data Compression",
- IEEE Computer, Volume 17, Number 6, June 1984, pages 8-19.
-
- Ziv, J. and Lempel, A., "A universal algorithm for sequential data
- compression", IEEE Transactions on Information Theory, Volume 23,
- Number 3, May 1977, pages 337-343.
-
- Ziv, J. and Lempel, A., "Compression of individual sequences via
- variable-rate coding", IEEE Transactions on Information Theory,
- Volume 24, Number 5, September 1978, pages 530-536.
-
-
-APPENDIX A - AS/400 Extra Field (0x0065) Attribute Definitions
---------------------------------------------------------------
-
-A.1 Field Definition Structure:
-
- a. field length including length 2 bytes
- b. field code 2 bytes
- c. data x bytes
-
-A.2 Field Code Description
-
- 4001 Source type i.e. CLP etc
- 4002 The text description of the library
- 4003 The text description of the file
- 4004 The text description of the member
- 4005 x'F0' or 0 is PF-DTA, x'F1' or 1 is PF_SRC
- 4007 Database Type Code 1 byte
- 4008 Database file and fields definition
- 4009 GZIP file type 2 bytes
- 400B IFS code page 2 bytes
- 400C IFS Creation Time 4 bytes
- 400D IFS Access Time 4 bytes
- 400E IFS Modification time 4 bytes
- 005C Length of the records in the file 2 bytes
- 0068 GZIP two words 8 bytes
-
-APPENDIX B - z/OS Extra Field (0x0065) Attribute Definitions
-------------------------------------------------------------
-
-B.1 Field Definition Structure:
-
- a. field length including length 2 bytes
- b. field code 2 bytes
- c. data x bytes
-
-B.2 Field Code Description
-
- 0001 File Type 2 bytes
- 0002 NonVSAM Record Format 1 byte
- 0003 Reserved
- 0004 NonVSAM Block Size 2 bytes Big Endian
- 0005 Primary Space Allocation 3 bytes Big Endian
- 0006 Secondary Space Allocation 3 bytes Big Endian
- 0007 Space Allocation Type 1 byte flag
- 0008 Modification Date Retired with PKZIP 5.0 +
- 0009 Expiration Date Retired with PKZIP 5.0 +
- 000A PDS Directory Block Allocation 3 bytes Big Endian binary value
- 000B NonVSAM Volume List variable
- 000C UNIT Reference Retired with PKZIP 5.0 +
- 000D DF/SMS Management Class 8 bytes EBCDIC Text Value
- 000E DF/SMS Storage Class 8 bytes EBCDIC Text Value
- 000F DF/SMS Data Class 8 bytes EBCDIC Text Value
- 0010 PDS/PDSE Member Info. 30 bytes
- 0011 VSAM sub-filetype 2 bytes
- 0012 VSAM LRECL 13 bytes EBCDIC "(num_avg num_max)"
- 0013 VSAM Cluster Name Retired with PKZIP 5.0 +
- 0014 VSAM KSDS Key Information 13 bytes EBCDIC "(num_length num_position)"
- 0015 VSAM Average LRECL 5 bytes EBCDIC num_value padded with blanks
- 0016 VSAM Maximum LRECL 5 bytes EBCDIC num_value padded with blanks
- 0017 VSAM KSDS Key Length 5 bytes EBCDIC num_value padded with blanks
- 0018 VSAM KSDS Key Position 5 bytes EBCDIC num_value padded with blanks
- 0019 VSAM Data Name 1-44 bytes EBCDIC text string
- 001A VSAM KSDS Index Name 1-44 bytes EBCDIC text string
- 001B VSAM Catalog Name 1-44 bytes EBCDIC text string
- 001C VSAM Data Space Type 9 bytes EBCDIC text string
- 001D VSAM Data Space Primary 9 bytes EBCDIC num_value left-justified
- 001E VSAM Data Space Secondary 9 bytes EBCDIC num_value left-justified
- 001F VSAM Data Volume List variable EBCDIC text list of 6-character Volume IDs
- 0020 VSAM Data Buffer Space 8 bytes EBCDIC num_value left-justified
- 0021 VSAM Data CISIZE 5 bytes EBCDIC num_value left-justified
- 0022 VSAM Erase Flag 1 byte flag
- 0023 VSAM Free CI % 3 bytes EBCDIC num_value left-justified
- 0024 VSAM Free CA % 3 bytes EBCDIC num_value left-justified
- 0025 VSAM Index Volume List variable EBCDIC text list of 6-character Volume IDs
- 0026 VSAM Ordered Flag 1 byte flag
- 0027 VSAM REUSE Flag 1 byte flag
- 0028 VSAM SPANNED Flag 1 byte flag
- 0029 VSAM Recovery Flag 1 byte flag
- 002A VSAM WRITECHK Flag 1 byte flag
- 002B VSAM Cluster/Data SHROPTS 3 bytes EBCDIC "n,y"
- 002C VSAM Index SHROPTS 3 bytes EBCDIC "n,y"
- 002D VSAM Index Space Type 9 bytes EBCDIC text string
- 002E VSAM Index Space Primary 9 bytes EBCDIC num_value left-justified
- 002F VSAM Index Space Secondary 9 bytes EBCDIC num_value left-justified
- 0030 VSAM Index CISIZE 5 bytes EBCDIC num_value left-justified
- 0031 VSAM Index IMBED 1 byte flag
- 0032 VSAM Index Ordered Flag 1 byte flag
- 0033 VSAM REPLICATE Flag 1 byte flag
- 0034 VSAM Index REUSE Flag 1 byte flag
- 0035 VSAM Index WRITECHK Flag 1 byte flag Retired with PKZIP 5.0 +
- 0036 VSAM Owner 8 bytes EBCDIC text string
- 0037 VSAM Index Owner 8 bytes EBCDIC text string
- 0038 Reserved
- 0039 Reserved
- 003A Reserved
- 003B Reserved
- 003C Reserved
- 003D Reserved
- 003E Reserved
- 003F Reserved
- 0040 Reserved
- 0041 Reserved
- 0042 Reserved
- 0043 Reserved
- 0044 Reserved
- 0045 Reserved
- 0046 Reserved
- 0047 Reserved
- 0048 Reserved
- 0049 Reserved
- 004A Reserved
- 004B Reserved
- 004C Reserved
- 004D Reserved
- 004E Reserved
- 004F Reserved
- 0050 Reserved
- 0051 Reserved
- 0052 Reserved
- 0053 Reserved
- 0054 Reserved
- 0055 Reserved
- 0056 Reserved
- 0057 Reserved
- 0058 PDS/PDSE Member TTR Info. 6 bytes Big Endian
- 0059 PDS 1st LMOD Text TTR 3 bytes Big Endian
- 005A PDS LMOD EP Rec # 4 bytes Big Endian
- 005B Reserved
- 005C Max Length of records 2 bytes Big Endian
- 005D PDSE Flag 1 byte flag
- 005E Reserved
- 005F Reserved
- 0060 Reserved
- 0061 Reserved
- 0062 Reserved
- 0063 Reserved
- 0064 Reserved
- 0065 Last Date Referenced 4 bytes Packed Hex "yyyymmdd"
- 0066 Date Created 4 bytes Packed Hex "yyyymmdd"
- 0068 GZIP two words 8 bytes
- 0071 Extended NOTE Location 12 bytes Big Endian
- 0072 Archive device UNIT 6 bytes EBCDIC
- 0073 Archive 1st Volume 6 bytes EBCDIC
- 0074 Archive 1st VOL File Seq# 2 bytes Binary
-
-APPENDIX C - Zip64 Extensible Data Sector Mappings
----------------------------------------------------
-
- -Z390 Extra Field:
-
- The following is the general layout of the attributes for the
- ZIP 64 "extra" block for extended tape operations.
-
- Note: some fields stored in Big Endian format. All text is
- in EBCDIC format unless otherwise specified.
-
- Value Size Description
- ----- ---- -----------
- (Z390) 0x0065 2 bytes Tag for this "extra" block type
- Size 4 bytes Size for the following data block
- Tag 4 bytes EBCDIC "Z390"
- Length71 2 bytes Big Endian
- Subcode71 2 bytes Enote type code
- FMEPos 1 byte
- Length72 2 bytes Big Endian
- Subcode72 2 bytes Unit type code
- Unit 1 byte Unit
- Length73 2 bytes Big Endian
- Subcode73 2 bytes Volume1 type code
- FirstVol 1 byte Volume
- Length74 2 bytes Big Endian
- Subcode74 2 bytes FirstVol file sequence
- FileSeq 2 bytes Sequence
-
-APPENDIX D - Language Encoding (EFS)
-------------------------------------
-
-D.1 The ZIP format has historically supported only the original IBM PC character
-encoding set, commonly referred to as IBM Code Page 437. This limits storing
-file name characters to only those within the original MS-DOS range of values
-and does not properly support file names in other character encodings, or
-languages. To address this limitation, this specification will support the
-following change.
-
-D.2 If general purpose bit 11 is unset, the file name and comment should conform
-to the original ZIP character encoding. If general purpose bit 11 is set, the
-filename and comment must support The Unicode Standard, Version 4.1.0 or
-greater using the character encoding form defined by the UTF-8 storage
-specification. The Unicode Standard is published by The Unicode
-Consortium (www.unicode.org). UTF-8 encoded data stored within ZIP files
-is expected to not include a byte order mark (BOM).
-
-D.3 Applications may choose to supplement this file name storage through the use
-of the 0x0008 Extra Field. Storage for this optional field is currently
-undefined; however, it will be used to allow storing extended information
-on source or target encoding that may further assist applications with file
-name, or file content encoding tasks. Please contact PKWARE with any
-requirements on how this field should be used.
-
-D.4 The 0x0008 Extra Field storage may be used with either setting for general
-purpose bit 11. An example of the intended usage for this field is to store
-whether "modified-UTF-8" (JAVA) is used, or UTF-8-MAC. Similarly, other
-commonly used character encoding (code page) designations can be indicated
-through this field. Formalized values for use of the 0x0008 record remain
-undefined at this time. The definition for the layout of the 0x0008 field
-will be published when available. Use of the 0x0008 Extra Field provides
-for storing data within a ZIP file in an encoding other than IBM Code
-Page 437 or UTF-8.
-
-D.5 General purpose bit 11 will not imply any encoding of file content or
-password. Values defining character encoding for file content or
-password must be stored within the 0x0008 Extended Language Encoding
-Extra Field.
-
-D.6 Ed Gordon of the Info-ZIP group has defined a pair of "extra field" records
-that can be used to store UTF-8 file name and file comment fields. These
-records can be used for cases when the general purpose bit 11 method
-for storing UTF-8 data in the standard file name and comment fields is
-not desirable. A common case for this alternate method is if backward
-compatibility with older programs is required.
-
-D.7 Definitions for the record structure of these fields are included above
-in the section on 3rd party mappings for "extra field" records. These
-records are identified by Header ID's 0x6375 (Info-ZIP Unicode Comment
-Extra Field) and 0x7075 (Info-ZIP Unicode Path Extra Field).
-
-D.8 The choice of which storage method to use when writing a ZIP file is left
-to the implementation. Developers should expect that a ZIP file may
-contain either method and should provide support for reading data in
-either format. Use of general purpose bit 11 reduces storage requirements
-for file name data by not requiring additional "extra field" data for
-each file, but can result in older ZIP programs not being able to extract
-files. Use of the 0x6375 and 0x7075 records will result in a ZIP file
-that should always be readable by older ZIP programs, but requires more
-storage per file to write file name and/or file comment fields.
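The encoding rule in D.2 can be illustrated with a short Python sketch.
This is only an illustration under stated assumptions, not part of the
specification: the helper names decode_zip_text and encode_zip_text are
hypothetical, and the policy of preferring Code Page 437 whenever the text
is representable in it is one possible choice that D.8 leaves to the
implementation. Only the bit position (bit 11, mask 0x0800) and the two
encodings come from the text above.

```python
# Hypothetical helpers; only the flag bit and the two encodings are taken
# from Appendix D. Everything else is an assumption made for illustration.
UTF8_FLAG = 0x0800  # general purpose bit 11 (1 << 11)


def decode_zip_text(raw: bytes, general_purpose_flags: int) -> str:
    """Decode a file name or comment field taken from a ZIP header."""
    if general_purpose_flags & UTF8_FLAG:
        # Bit 11 set: the field is stored as UTF-8.
        return raw.decode("utf-8")
    # Bit 11 unset: original ZIP encoding, IBM Code Page 437.
    return raw.decode("cp437")


def encode_zip_text(text: str) -> tuple[bytes, int]:
    """Return (encoded bytes, flag bits to OR into the general purpose field).

    Assumed policy: keep the legacy encoding when every character fits in
    Code Page 437, and fall back to UTF-8 with bit 11 set otherwise.
    """
    try:
        return text.encode("cp437"), 0
    except UnicodeEncodeError:
        return text.encode("utf-8"), UTF8_FLAG
```

A reader that also wants to honor the 0x7075 and 0x6375 records described
in D.6 and D.7 would check for them before falling back to this rule; that
step is omitted here.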
diff --git a/docs/extrafld.txt b/docs/extrafld.txt
deleted file mode 100644
index 624e05c..0000000
--- a/docs/extrafld.txt
+++ /dev/null
@@ -1,1372 +0,0 @@
-The following are the known types of zipfile extra fields as of this
-writing. Extra fields are documented in PKWARE's appnote.txt and are
-intended to allow for backward- and forward-compatible extensions to
-the zipfile format. Multiple extra-field types may be chained together,
-provided that the total length of all extra-field data is less than 64KB.
-(In fact, PKWARE requires that the total length of the entire file header,
-including timestamp, file attributes, filename, comment, extra field, etc.,
-be no more than 64KB.)
-
-Each extra-field type (or subblock) must contain a four-byte header
-consisting of a two-byte header ID and a two-byte length (little-endian) for
-the remaining data in the subblock. If there are additional subblocks
-within the extra field, the header for each one will appear immediately
-following the data for the previous subblock (i.e., with no padding for
-alignment).
-
-All integer fields in the descriptions below are in little-endian (Intel)
-format unless otherwise specified. Note that "Short" means two bytes,
-"Long" means four bytes, and "Long-Long" means eight bytes, regardless
-of their native sizes. Unless specifically noted, all integer fields should
-be interpreted as unsigned (non-negative) numbers.
-
-Christian Spieler, 20010517
-
-Updated to include the Unicode extra fields. Added new Unix extra field.
-
-Ed Gordon, 20060819, 20070607, 20070909, 20080426, 20080509
-
- -------------------------
-
- Header ID's of 0 thru 31 are reserved for use by PKWARE.
- The remaining ID's can be used by third party vendors for
- proprietary usage.
-
- The current Header ID mappings defined by PKWARE are:
-
- 0x0001 ZIP64 extended information extra field
- 0x0007 AV Info
- 0x0009 OS/2 extended attributes (also Info-ZIP)
- 0x000a NTFS (Win9x/WinNT FileTimes)
- 0x000c OpenVMS (also Info-ZIP)
- 0x000d Unix
- 0x000f Patch Descriptor
- 0x0014 PKCS#7 Store for X.509 Certificates
- 0x0015 X.509 Certificate ID and Signature for
- individual file
- 0x0016 X.509 Certificate ID for Central Directory
-
- The Header ID mappings defined by Info-ZIP and third parties are:
-
- 0x0065 IBM S/390 attributes - uncompressed
- 0x0066 IBM S/390 attributes - compressed
- 0x07c8 Info-ZIP Macintosh (old, J. Lee)
- 0x2605 ZipIt Macintosh (first version)
- 0x2705 ZipIt Macintosh v 1.3.5 and newer (w/o full filename)
- 0x334d Info-ZIP Macintosh (new, D. Haase's 'Mac3' field )
- 0x4154 Tandem NSK
- 0x4341 Acorn/SparkFS (David Pilling)
- 0x4453 Windows NT security descriptor (binary ACL)
- 0x4704 VM/CMS
- 0x470f MVS
- 0x4854 Theos, old unofficial port
- 0x4b46 FWKCS MD5 (see below)
- 0x4c41 OS/2 access control list (text ACL)
- 0x4d49 Info-ZIP OpenVMS (obsolete)
- 0x4d63 Macintosh SmartZIP, by Marco Bambini
- 0x4f4c Xceed original location extra field
- 0x5356 AOS/VS (binary ACL)
- 0x5455 extended timestamp
- 0x5855 Info-ZIP Unix (original; also OS/2, NT, etc.)
- 0x554e Xceed unicode extra field
- 0x6375 Info-ZIP Unicode Comment
- 0x6542 BeOS (BeBox, PowerMac, etc.)
- 0x6854 Theos
- 0x7075 Info-ZIP Unicode Path
- 0x756e ASi Unix
- 0x7855 Info-ZIP Unix (previous new)
- 0x7875 Info-ZIP Unix (new)
- 0xfb4a SMS/QDOS
-
-The following are detailed descriptions of the known extra-field block types:
-
- -OS/2 Extended Attributes Extra Field:
- ====================================
-
- The following is the layout of the OS/2 extended attributes "extra"
- block. (Last Revision 19960922)
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (OS/2) 0x0009 Short tag for this extra block type
- TSize Short total data size for this block
- BSize Long uncompressed EA data size
- CType Short compression type
- EACRC Long CRC value for uncompressed EA data
- (var.) variable compressed EA data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (OS/2) 0x0009 Short tag for this extra block type
- TSize Short total data size for this block (4)
- BSize Long size of uncompressed local EA data
-
- The value of CType is interpreted according to the "compression
- method" section above; i.e., 0 for stored, 8 for deflated, etc.
-
- The OS/2 extended attribute structure (FEA2LIST) is compressed and
- then stored in its entirety within this structure. There will only
- ever be one block of data in the variable-length field.
-
-
- -OS/2 Access Control List Extra Field:
- ====================================
-
- The following is the layout of the OS/2 ACL extra block.
- (Last Revision 19960922)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (ACL) 0x4c41 Short tag for this extra block type ("AL")
- TSize Short total data size for this block
- BSize Long uncompressed ACL data size
- CType Short compression type
- EACRC Long CRC value for uncompressed ACL data
- (var.) variable compressed ACL data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (ACL) 0x4c41 Short tag for this extra block type ("AL")
- TSize Short total data size for this block (4)
- BSize Long size of uncompressed local ACL data
-
- The value of CType is interpreted according to the "compression
- method" section above; i.e., 0 for stored, 8 for deflated, etc.
-
- The uncompressed ACL data consist of a text header of the form
- "ACL1:%hX,%hd\n", where the first field is the OS/2 ACCINFO acc_attr
- member and the second is acc_count, followed by acc_count strings
- of the form "%s,%hx\n", where the first field is acl_ugname (user
- group name) and the second acl_access. This block type will be
- extended for other operating systems as needed.
-
-
- -Windows NT Security Descriptor Extra Field:
- ==========================================
-
- The following is the layout of the NT Security Descriptor (another
- type of ACL) extra block. (Last Revision 19960922)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (SD) 0x4453 Short tag for this extra block type ("SD")
- TSize Short total data size for this block
- BSize Long uncompressed SD data size
- Version Byte version of uncompressed SD data format
- CType Short compression type
- EACRC Long CRC value for uncompressed SD data
- (var.) variable compressed SD data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (SD) 0x4453 Short tag for this extra block type ("SD")
- TSize Short total data size for this block (4)
- BSize Long size of uncompressed local SD data
-
- The value of CType is interpreted according to the "compression
- method" section above; i.e., 0 for stored, 8 for deflated, etc.
- Version specifies how the compressed data are to be interpreted
- and allows for future expansion of this extra field type. Currently
- only version 0 is defined.
-
- For version 0, the compressed data are to be interpreted as a single
- valid Windows NT SECURITY_DESCRIPTOR data structure, in self-relative
- format.
-
-
- -PKWARE Win95/WinNT Extra Field:
- ==============================
-
- The following description covers PKWARE's "NTFS" attributes
- "extra" block, introduced with the release of PKZIP 2.50 for
- Windows. (Last Revision 20001118)
-
- (Note: At this time the Mtime, Atime and Ctime values may
- be used on any WIN32 system.)
- [Info-ZIP note: In the current implementations, this field has
- a fixed total data size of 32 bytes and is stored only as a local
- extra field.]
-
- Value Size Description
- ----- ---- -----------
- (NTFS) 0x000a Short Tag for this "extra" block type
- TSize Short Total Data Size for this block
- Reserved Long for future use
- Tag1 Short NTFS attribute tag value #1
- Size1 Short Size of attribute #1, in bytes
- (var.) SubSize1 Attribute #1 data
- .
- .
- .
- TagN Short NTFS attribute tag value #N
- SizeN Short Size of attribute #N, in bytes
- (var.) SubSizeN Attribute #N data
-
- For NTFS, values for Tag1 through TagN are as follows:
- (currently only one set of attributes is defined for NTFS)
-
- Tag Size Description
- ----- ---- -----------
- 0x0001 2 bytes Tag for attribute #1
- Size1 2 bytes Size of attribute #1, in bytes (24)
- Mtime 8 bytes 64-bit NTFS file last modification time
- Atime 8 bytes 64-bit NTFS file last access time
- Ctime 8 bytes 64-bit NTFS file creation time
-
- The total length of this attribute sub-block is 28 bytes (2 bytes
- each for Tag1 and Size1 plus 24 bytes of time data). Together with
- the 4-byte Reserved field, this results in a fixed value of 32 for
- the TSize field of the NTFS block.
-
- The NTFS filetimes are 64-bit unsigned integers, stored in Intel
- (least significant byte first) byte order. They give the number of
- 100-nanosecond intervals (1.0E-07 seconds) elapsed since the WinNT
- "epoch", which is "01-Jan-1601 00:00:00 UTC".
-
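As an illustration of the time format above, the following sketch converts an NTFS filetime to a Python `datetime` and pulls the three timestamps out of the block payload. The function names are hypothetical, and `data` is assumed to be the payload that follows the tag and TSize fields (Reserved first, then the attribute sub-blocks).

```python
import struct
from datetime import datetime, timedelta, timezone

NT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(filetime):
    """Convert a count of 100 ns intervals since 1601-01-01 UTC to a datetime."""
    # Integer microseconds avoid float rounding; the sub-microsecond
    # remainder is dropped because datetime cannot represent it.
    return NT_EPOCH + timedelta(microseconds=filetime // 10)

def parse_ntfs_times(data):
    """Return (mtime, atime, ctime) from attribute tag 0x0001, or None if absent."""
    pos = 4  # skip the 4-byte Reserved field
    while pos + 4 <= len(data):
        tag, size = struct.unpack_from("<HH", data, pos)
        pos += 4
        if tag == 0x0001 and size >= 24:
            mtime, atime, ctime = struct.unpack_from("<QQQ", data, pos)
            return tuple(filetime_to_datetime(t) for t in (mtime, atime, ctime))
        pos += size  # unknown attribute: skip its data
    return None
```

The value 116444736000000000 corresponds to the Unix epoch, which makes a convenient sanity check.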
-
- -PKWARE OpenVMS Extra Field:
- ==========================
-
- The following is the layout of PKWARE's OpenVMS attributes "extra"
- block. (Last Revision 12/17/91)
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (VMS) 0x000c Short Tag for this "extra" block type
- TSize Short Total Data Size for this block
- CRC Long 32-bit CRC for remainder of the block
- Tag1 Short OpenVMS attribute tag value #1
- Size1 Short Size of attribute #1, in bytes
- (var.) Size1 Attribute #1 data
- .
- .
- .
- TagN Short OpenVMS attribute tag value #N
- SizeN Short Size of attribute #N, in bytes
- (var.) SizeN Attribute #N data
-
- Rules:
-
- 1. There will be one or more attributes present, which
- will each be preceded by the above TagX & SizeX values.
- These values are identical to the ATR$C_XXXX and
- ATR$S_XXXX constants which are defined in ATR.H under
- OpenVMS C. Neither of these values will ever be zero.
-
- 2. No word alignment or padding is performed.
-
- 3. A well-behaved PKZIP/OpenVMS program should never produce
- more than one sub-block with the same TagX value. Also,
- there will never be more than one "extra" block of type
- 0x000c in a particular directory record.
-
-
- -Info-ZIP VMS Extra Field:
- ========================
-
- The following is the layout of Info-ZIP's VMS attributes extra
- block for VAX or Alpha AXP. The local-header and central-header
- versions are identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (VMS2) 0x4d49 Short tag for this extra block type ("JM")
- TSize Short total data size for this block
- ID Long block ID
- Flags Short info bytes
- BSize Short uncompressed block size
- Reserved Long (reserved)
- (var.) variable compressed VMS file-attributes block
-
- The block ID is one of the following unterminated strings:
-
- "VFAB" struct FAB
- "VALL" struct XABALL
- "VFHC" struct XABFHC
- "VDAT" struct XABDAT
- "VRDT" struct XABRDT
- "VPRO" struct XABPRO
- "VKEY" struct XABKEY
- "VMSV" version (e.g., "V6.1"; truncated at hyphen)
- "VNAM" reserved
-
- The lower three bits of Flags indicate the compression method. The
- currently defined methods are:
-
- 0 stored (not compressed)
- 1 simple "RLE"
- 2 deflated
-
- The "RLE" method simply replaces zero-valued bytes with zero-valued
- bits and non-zero-valued bytes with a "1" bit followed by the byte
- value.
-
- The variable-length compressed data contains only the data corre-
- sponding to the indicated structure or string. Typically multiple
- VMS2 extra fields are present (each with a unique block type).
-
-
- -Info-ZIP Macintosh Extra Field:
- ==============================
-
- The following is the layout of the (old) Info-ZIP resource-fork extra
- block for Macintosh. The local-header and central-header versions
- are identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (Mac) 0x07c8 Short tag for this extra block type
- TSize Short total data size for this block
- "JLEE" beLong extra-field signature
- FInfo 16 bytes Macintosh FInfo structure
- CrDat beLong HParamBlockRec fileParam.ioFlCrDat
- MdDat beLong HParamBlockRec fileParam.ioFlMdDat
- Flags beLong info bits
- DirID beLong HParamBlockRec fileParam.ioDirID
- VolName 28 bytes volume name (optional)
-
- All fields but the first two are in native Macintosh format
- (big-endian Motorola order, not little-endian Intel). The least
- significant bit of Flags is 1 if the file is a data fork, 0 other-
- wise. In addition, if this extra field is present, the filename
- has an extra 'd' or 'r' appended to indicate data fork or resource
- fork. The 28-byte VolName field may be omitted.
-
-
- -ZipIt Macintosh Extra Field (long):
- ==================================
-
- The following is the layout of the ZipIt extra block for Macintosh.
- The local-header and central-header versions are identical.
- (Last Revision 19970130)
-
- Value Size Description
- ----- ---- -----------
- (Mac2) 0x2605 Short tag for this extra block type
- TSize Short total data size for this block
- "ZPIT" beLong extra-field signature
- FnLen Byte length of FileName
- FileName variable full Macintosh filename
- FileType Byte[4] four-byte Mac file type string
- Creator Byte[4] four-byte Mac creator string
-
-
- -ZipIt Macintosh Extra Field (short):
- ===================================
-
- The following is the layout of a shortened variant of the
- ZipIt extra block for Macintosh (without "full name" entry).
- This variant is used by ZipIt 1.3.5 and newer for entries that
- do not need a "full Mac filename" record.
- The local-header and central-header versions are identical.
- (Last Revision 19980903)
-
- Value Size Description
- ----- ---- -----------
- (Mac2b) 0x2705 Short tag for this extra block type
- TSize Short total data size for this block (12)
- "ZPIT" beLong extra-field signature
- FileType Byte[4] four-byte Mac file type string
- Creator Byte[4] four-byte Mac creator string
-
-
- -Info-ZIP Macintosh Extra Field (new):
- ====================================
-
- The following is the layout of the (new) Info-ZIP extra
- block for Macintosh, designed by Dirk Haase.
- All values are in little-endian.
- (Last Revision 19981005)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (Mac3) 0x334d Short tag for this extra block type ("M3")
- TSize Short total data size for this block
- BSize Long uncompressed finder attribute data size
- Flags Short info bits
- fdType Byte[4] Type of the File (4-byte string)
- fdCreator Byte[4] Creator of the File (4-byte string)
- (CType) Short compression type
- (CRC) Long CRC value for uncompressed MacOS data
- Attribs variable finder attribute data (see below)
-
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (Mac3) 0x334d Short tag for this extra block type ("M3")
- TSize Short total data size for this block
- BSize Long uncompressed finder attribute data size
- Flags Short info bits
- fdType Byte[4] Type of the File (4-byte string)
- fdCreator Byte[4] Creator of the File (4-byte string)
-
- The third bit of Flags in both headers indicates whether
- the LOCAL extra field is uncompressed (and therefore whether CType
- and CRC are omitted):
-
- Bits of the Flags:
- bit 0 if set, file is a data fork; otherwise unset
- bit 1 if set, filename will not be changed
- bit 2 if set, Attribs is uncompressed (no CType, CRC)
- bit 3 if set, dates and times are 64-bit;
- if unset, dates and times are 32-bit.
- bit 4 if set, timezone offsets fields for the native
- Mac times are omitted (UTC support deactivated)
- bits 5-15 reserved;
-
-
- Attributes:
-
- Attribs is a Mac-specific block of data in little-endian format with
- the following structure (if compressed, uncompress it first):
-
- Value Size Description
- ----- ---- -----------
- fdFlags Short Finder Flags
- fdLocation.v Short Finder Icon Location
- fdLocation.h Short Finder Icon Location
- fdFldr Short Folder containing file
-
- FXInfo 16 bytes Macintosh FXInfo structure
- FXInfo-Structure:
- fdIconID Short
- fdUnused[3] Short unused but reserved 6 bytes
- fdScript Byte Script flag and number
- fdXFlags Byte More flag bits
- fdComment Short Comment ID
- fdPutAway Long Home Dir ID
-
- FVersNum Byte file version number
- may not be used by MacOS
- ACUser Byte directory access rights
-
- FlCrDat ULong date and time of creation
- FlMdDat ULong date and time of last modification
- FlBkDat ULong date and time of last backup
- These time numbers are original Mac FileTime values (local time!).
- Currently, date-time width is 32-bit, but future versions may
- support 64-bit times (see flags)
-
- CrGMTOffs Long(signed!) difference "local Creat. time - UTC"
- MdGMTOffs Long(signed!) difference "local Modif. time - UTC"
- BkGMTOffs Long(signed!) difference "local Backup time - UTC"
- These "local time - UTC" differences (stored in seconds) may be
- used to support timestamp adjustment after inter-timezone transfer.
- These fields are optional; bit 4 of the flags word controls their
- presence.
-
- Charset Short TextEncodingBase (Charset)
- valid for the following two fields
-
- FullPath variable Path of the current file.
- Zero terminated string (C-String)
- Currently coded in the native Charset.
-
- Comment variable Finder Comment of the current file.
- Zero terminated string (C-String)
- Currently coded in the native Charset.
-
-
- -SmartZIP Macintosh Extra Field:
- ====================================
-
- The following is the layout of the SmartZIP extra
- block for Macintosh, designed by Marco Bambini.
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- 0x4d63 Short tag for this extra block type ("cM")
- TSize Short total data size for this block (64)
- "dZip" beLong extra-field signature
- fdType Byte[4] Type of the File (4-byte string)
- fdCreator Byte[4] Creator of the File (4-byte string)
- fdFlags beShort Finder Flags
- fdLocation.v beShort Finder Icon Location
- fdLocation.h beShort Finder Icon Location
- fdFldr beShort Folder containing file
- CrDat beLong HParamBlockRec fileParam.ioFlCrDat
- MdDat beLong HParamBlockRec fileParam.ioFlMdDat
- frScroll.v Byte vertical pos. of folder's scroll bar
- fdScript Byte Script flag and number
- frScroll.h Byte horizontal pos. of folder's scroll bar
- fdXFlags Byte More flag bits
- FileName Byte[32] full Macintosh filename (pascal string)
-
- All fields but the first two are in native Macintosh format
- (big-endian Motorola order, not little-endian Intel).
- The extra field size is fixed to 64 bytes.
- The local-header and central-header versions are identical.
-
-
- -Acorn SparkFS Extra Field:
- =========================
-
- The following is the layout of David Pilling's SparkFS extra block
- for Acorn RISC OS. The local-header and central-header versions are
- identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (Acorn) 0x4341 Short tag for this extra block type ("AC")
- TSize Short total data size for this block (20)
- "ARC0" Long extra-field signature
- LoadAddr Long load address or file type
- ExecAddr Long exec address
- Attr Long file permissions
- Zero Long reserved; always zero
-
- The following bits of Attr are associated with the given file
- permissions:
-
- bit 0 user-writable ('W')
- bit 1 user-readable ('R')
- bit 2 reserved
- bit 3 locked ('L')
- bit 4 publicly writable ('w')
- bit 5 publicly readable ('r')
- bit 6 reserved
- bit 7 reserved
-
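The Attr bit assignments above can be decoded with a small lookup table. This is only a sketch: the function name is hypothetical, bit 0 is assumed to be the least significant bit, and the order of letters in the returned string is an arbitrary choice rather than a RISC OS convention.

```python
ACORN_ATTR_BITS = (
    (0, "W"),  # user-writable
    (1, "R"),  # user-readable
    (3, "L"),  # locked
    (4, "w"),  # publicly writable
    (5, "r"),  # publicly readable
)

def acorn_attr_string(attr):
    """Return the permission letters whose bits are set in Attr."""
    return "".join(letter for bit, letter in ACORN_ATTR_BITS if attr & (1 << bit))
```

Reserved bits (2, 6 and 7) are ignored by this helper.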
-
- -VM/CMS Extra Field:
- ==================
-
- The following is the layout of the file-attributes extra block for
- VM/CMS. The local-header and central-header versions are
- identical. (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (VM/CMS) 0x4704 Short tag for this extra block type
- TSize Short total data size for this block
- flData variable file attributes data
-
- flData is an uncompressed fldata_t struct.
-
-
- -MVS Extra Field:
- ===============
-
- The following is the layout of the file-attributes extra block for
- MVS. The local-header and central-header versions are identical.
- (Last Revision 19960922)
-
- Value Size Description
- ----- ---- -----------
- (MVS) 0x470f Short tag for this extra block type
- TSize Short total data size for this block
- flData variable file attributes data
-
- flData is an uncompressed fldata_t struct.
-
-
- -PKWARE Unix Extra Field:
- ========================
-
- The following is the layout of PKWARE's Unix "extra" block.
- It was introduced with the release of PKZIP for Unix 2.50.
- Note: all fields are stored in Intel low-byte/high-byte order.
- (Last Revision 19980901)
-
- This field has a minimum data size of 12 bytes and is stored only
- as a local extra field.
-
- Value Size Description
- ----- ---- -----------
- (Unix0) 0x000d Short Tag for this "extra" block type
- TSize Short Total Data Size for this block
- AcTime Long time of last access (UTC/GMT)
- ModTime Long time of last modification (UTC/GMT)
- UID Short Unix user ID
- GID Short Unix group ID
- (var) variable Variable length data field
-
- The variable length data field will contain file type
- specific data. Currently the only values allowed are
- the original "linked to" file names for hard or symbolic
- links, and the major and minor device node numbers for
- character and block device nodes. Since device nodes
- cannot be either symbolic or hard links, only one set of
- variable length data is stored. Link files will have the
- name of the original file stored. This name is NOT NULL
- terminated. Its size can be determined by checking TSize -
- 12. Device entries will have eight bytes stored as two 4
- byte entries (in little-endian format). The first entry
- will be the major device number, and the second the minor
- device number.
-
- [Info-ZIP note: The fixed part of this field has the same layout as
- Info-ZIP's abandoned "Unix1 timestamps & owner ID info" extra field;
- only the two tag bytes are different.]
-
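A reader for the PKWARE Unix block might look like the sketch below. The function name and the `is_device` parameter are hypothetical: the block itself does not say whether the entry is a device node, so the caller is assumed to know this from the file attributes. The two time fields are read as unsigned 32-bit values, which is also an assumption.

```python
import struct

def parse_pkware_unix(data, is_device=False):
    """Parse the TSize-byte payload that follows the tag and TSize fields."""
    if len(data) < 12:
        raise ValueError("PKWARE Unix block needs at least 12 bytes")
    atime, mtime, uid, gid = struct.unpack_from("<IIHH", data, 0)
    info = {"atime": atime, "mtime": mtime, "uid": uid, "gid": gid}
    extra = data[12:]  # variable part; its length is TSize - 12
    if is_device:
        if len(extra) != 8:
            raise ValueError("device entries carry exactly 8 bytes")
        info["major"], info["minor"] = struct.unpack("<II", extra)
    elif extra:
        info["link_target"] = extra  # raw bytes, not NUL-terminated
    return info
```

The link name is returned as bytes because the text does not specify a character encoding.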
-
- -PATCH Descriptor Extra Field:
- ============================
-
- The following is the layout of the Patch Descriptor "extra"
- block.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (Patch) 0x000f Short Tag for this "extra" block type
- TSize Short Size of the total "extra" block
- Version Short Version of the descriptor
- Flags Long Actions and reactions (see below)
- OldSize Long Size of the file about to be patched
- OldCRC Long 32-bit CRC of the file about to be patched
- NewSize Long Size of the resulting file
- NewCRC Long 32-bit CRC of the resulting file
-
-
- Actions and reactions
-
- Bits Description
- ---- ----------------
- 0 Use for autodetection
- 1 Treat as selfpatch
- 2-3 RESERVED
- 4-5 Action (see below)
- 6-7 RESERVED
- 8-9 Reaction (see below) to absent file
- 10-11 Reaction (see below) to newer file
- 12-13 Reaction (see below) to unknown file
- 14-15 RESERVED
- 16-31 RESERVED
-
- Actions
-
- Action Value
- ------ -----
- none 0
- add 1
- delete 2
- patch 3
-
- Reactions
-
- Reaction Value
- -------- -----
- ask 0
- skip 1
- ignore 2
- fail 3
-
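The Flags layout in the tables above can be unpacked with shifts and masks. The following is a sketch under the assumption that bit 0 is the least significant bit of the 32-bit Flags value; the function name and the dictionary keys are hypothetical.

```python
ACTIONS = ("none", "add", "delete", "patch")
REACTIONS = ("ask", "skip", "ignore", "fail")

def decode_patch_flags(flags):
    """Split the Patch Descriptor Flags field into its named parts."""
    return {
        "autodetect": bool(flags & 0x1),               # bit 0
        "selfpatch": bool(flags & 0x2),                # bit 1
        "action": ACTIONS[(flags >> 4) & 0x3],         # bits 4-5
        "if_absent": REACTIONS[(flags >> 8) & 0x3],    # bits 8-9
        "if_newer": REACTIONS[(flags >> 10) & 0x3],    # bits 10-11
        "if_unknown": REACTIONS[(flags >> 12) & 0x3],  # bits 12-13
    }
```

Reserved bits are masked out and not reported.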
-
- -PKCS#7 Store for X.509 Certificates:
- ===================================
-
- This field contains information about each certificate
- a file is signed with. This field should only appear in
- the first central directory record, and will be ignored
- in any other record.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (Store) 0x0014 2 bytes Tag for this "extra" block type
- SSize 2 bytes Size of the store data
- SData (variable) Data about the store
-
- SData
- Value Size Description
- ----- ---- -----------
- Version 2 bytes Version number, 0x0001 for now
- StoreD (variable) Actual store data
-
- The StoreD member is suitable for passing as the pbData
- member of a CRYPT_DATA_BLOB to the CertOpenStore() function
- in Microsoft's CryptoAPI. The SSize member above will be
- cbData + 6, where cbData is the cbData member of the same
- CRYPT_DATA_BLOB. The encoding type to pass to
- CertOpenStore() should be
- PKCS_7_ASN_ENCODING | X509_ASN_ENCODING.
-
-
- -X.509 Certificate ID and Signature for individual file:
- ======================================================
-
- This field contains the information about which certificate
- in the PKCS#7 Store was used to sign the particular file.
- It also contains the signature data. This field can appear
- multiple times, but can only appear once per certificate.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (CID) 0x0015 2 bytes Tag for this "extra" block type
- CSize 2 bytes Size of Method
- Method (variable)
-
- Method
- Value Size Description
- ----- ---- -----------
- Version 2 bytes Version number, for now 0x0001
- AlgID 2 bytes Algorithm ID used for signing
- IDSize 2 bytes Size of Certificate ID data
- CertID (variable) Certificate ID data
- SigSize 2 bytes Size of Signature data
- Sig (variable) Signature data
-
- CertID
- Value Size Description
- ----- ---- -----------
- Size1 4 bytes Size of CertID, should be (IDSize - 4)
- Size1 4 bytes A bug in version one causes this value
- to appear twice.
- IssSize 4 bytes Issuer data size
- Issuer (variable) Issuer data
- SerSize 4 bytes Serial Number size
- Serial (variable) Serial Number data
-
- The Issuer and IssSize members are suitable for creating a
- CRYPT_DATA_BLOB to be the Issuer member of a CERT_INFO
- struct. The Serial and SerSize members would be the
- SerialNumber member of the same CERT_INFO struct. This
- struct would be used to find the certificate in the store
- the file was signed with. Those structures are from the MS
- CryptoAPI.
-
- Sig and SigSize are the actual signature data and size
- generated by signing the file with the MS CryptoAPI using a
- hash created with the given AlgID.
-
-
- -X.509 Certificate ID and Signature for central directory:
- ========================================================
-
- This field contains the information about which certificate
- in the PKCS#7 Store was used to sign the central directory.
- It should only appear with the first central directory
- record, along with the store. The data structure is the
- same as the CID, except that SigSize will be 0, and there
- will be no Sig member.
-
- This field is also kept after the last central directory
- record, as the signature data record (signature 0x05054b50;
- it looks like a central directory record of a different
- type). This second copy of the data is the Signature Data
- member of that record; it has a non-zero SigSize and
- contains Sig data.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (CDID) 0x0016 2 bytes Tag for this "extra" block type
- CSize 2 bytes Size of Method
- Method (variable)
-
-
- -ZIP64 Extended Information Extra Field:
- ======================================
-
- The following is the layout of the ZIP64 extended
- information "extra" block. If one of the size or
- offset fields in the Local or Central directory
- record is too small to hold the required data,
- a ZIP64 extended information record is created.
- The order of the fields in the ZIP64 extended
- information record is fixed, but the fields will
- only appear if the corresponding Local or Central
- directory record field is set to 0xFFFF or 0xFFFFFFFF.
-
- Note: all fields stored in Intel low-byte/high-byte order.
-
- Value Size Description
- ----- ---- -----------
- (ZIP64) 0x0001 2 bytes Tag for this "extra" block type
- Size 2 bytes Size of this "extra" block
- Original
- Size 8 bytes Original uncompressed file size
- Compressed
- Size 8 bytes Size of compressed data
- Relative Header
- Offset 8 bytes Offset of local header record
- Disk Start
- Number 4 bytes Number of the disk on which
- this file starts
-
- This entry in the Local header must include BOTH original
- and compressed file sizes.
-
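Because the ZIP64 fields appear only when the matching header field holds its all-ones sentinel, a reader has to consult the header values while walking the block. A minimal sketch, with a hypothetical function name and with `data` assumed to be the payload after the tag and Size fields:

```python
import struct

def parse_zip64_extra(data, usize, csize, offset, disk):
    """Resolve the real values; arguments after data come from the header record."""
    fields = [
        ("uncompressed_size", usize, 0xFFFFFFFF, "<Q"),
        ("compressed_size", csize, 0xFFFFFFFF, "<Q"),
        ("header_offset", offset, 0xFFFFFFFF, "<Q"),
        ("disk_start", disk, 0xFFFF, "<I"),
    ]
    out = {}
    pos = 0
    for name, value, sentinel, fmt in fields:
        if value == sentinel:
            # The field is present in the extra block; read it in fixed order.
            (value,) = struct.unpack_from(fmt, data, pos)
            pos += struct.calcsize(fmt)
        out[name] = value
    return out
```

Header values that are not sentinels pass through unchanged, so the result can be used directly in place of the 16-bit and 32-bit header fields.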
-
- -Extended Timestamp Extra Field:
- ==============================
-
- The following is the layout of the extended-timestamp extra block.
- (Last Revision 19970118)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (time) 0x5455 Short tag for this extra block type ("UT")
- TSize Short total data size for this block
- Flags Byte info bits
- (ModTime) Long time of last modification (UTC/GMT)
- (AcTime) Long time of last access (UTC/GMT)
- (CrTime) Long time of original creation (UTC/GMT)
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (time) 0x5455 Short tag for this extra block type ("UT")
- TSize Short total data size for this block
- Flags Byte info bits (refers to local header!)
- (ModTime) Long time of last modification (UTC/GMT)
-
- The central-header extra field contains the modification time only,
- or no timestamp at all. TSize is used to flag its presence or
- absence. But note:
-
- If "Flags" indicates that Modtime is present in the local header
- field, it MUST be present in the central header field, too!
- This correspondence is required because the modification time
- value may be used to support trans-timezone freshening and
- updating operations with zip archives.
-
- The time values are in standard Unix signed-long format, indicating
- the number of seconds since 1 January 1970 00:00:00. The times
- are relative to Coordinated Universal Time (UTC), also sometimes
- referred to as Greenwich Mean Time (GMT). To convert to local time,
- the software must know the local timezone offset from UTC/GMT.
-
- The lower three bits of Flags in both headers indicate which time-
- stamps are present in the LOCAL extra field:
-
- bit 0 if set, modification time is present
- bit 1 if set, access time is present
- bit 2 if set, creation time is present
- bits 3-7 reserved for additional timestamps; not set
-
- Those times that are present will appear in the order indicated, but
- any combination of times may be omitted. (Creation time may be
- present without access time, for example.) TSize should equal
- (1 + 4*(number of set bits in Flags)), as the block is currently
- defined. Other timestamps may be added in the future.
-
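The Flags-driven layout described above can be parsed as in the following sketch. The function name and the `central` parameter are hypothetical; `data` is assumed to be the TSize-byte payload after the tag and TSize fields.

```python
import struct

def parse_extended_timestamp(data, central=False):
    """Return (flags, times) for a "UT" block payload."""
    flags = data[0]
    names = ("mtime", "atime", "ctime")  # bit 0, bit 1, bit 2, in stored order
    if central:
        names = names[:1]  # the central header carries at most the modification time
    times = {}
    pos = 1
    for bit, name in enumerate(names):
        # Flags describe the LOCAL field, so also check that the data is there.
        if flags & (1 << bit) and pos + 4 <= len(data):
            (times[name],) = struct.unpack_from("<i", data, pos)  # signed Unix time
            pos += 4
    return flags, times
```

For a local-header block, `len(data)` should equal 1 + 4 times the number of set bits among the lower three bits of Flags.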
-
- -Info-ZIP Unix Extra Field (type 1):
- ==================================
-
- The following is the layout of the old Info-ZIP extra block for
- Unix. It has been replaced by the extended-timestamp extra block
- (0x5455) and the Unix type 2 extra block (0x7855).
- (Last Revision 19970118)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix1) 0x5855 Short tag for this extra block type ("UX")
- TSize Short total data size for this block
- AcTime Long time of last access (UTC/GMT)
- ModTime Long time of last modification (UTC/GMT)
- UID Short Unix user ID (optional)
- GID Short Unix group ID (optional)
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix1) 0x5855 Short tag for this extra block type ("UX")
- TSize Short total data size for this block
- AcTime Long time of last access (GMT/UTC)
- ModTime Long time of last modification (GMT/UTC)
-
- The file access and modification times are in standard Unix signed-
- long format, indicating the number of seconds since 1 January 1970
- 00:00:00. The times are relative to Coordinated Universal Time
- (UTC), also sometimes referred to as Greenwich Mean Time (GMT). To
- convert to local time, the software must know the local timezone
- offset from UTC/GMT. The modification time may be used by non-Unix
- systems to support inter-timezone freshening and updating of zip
- archives.
-
- The local-header extra block may optionally contain UID and GID
- info for the file. The local-header TSize value is the only
- indication of this. Note that Unix UIDs and GIDs are usually
- specific to a particular machine, and they generally require root
- access to restore.
-
- This extra field type is obsolete, but it has been in use since
- mid-1994. Therefore future archiving software should continue to
- support it. Some guidelines:
-
- An archive member should either contain the old "Unix1"
- extra field block or the new extra field types "time" and/or
- "Unix2".
-
- If both the old "Unix1" block type and one or both of the new
- block types "time" and "Unix2" are found, the "Unix1" block
- should be considered invalid and ignored.
-
- Unarchiving software should recognize both old and new extra
- field block types, but the info from new types overrides the
- old "Unix1" field.
-
- Archiving software should recognize "Unix1" extra fields for
- timestamp comparison but never create it for updated, freshened
- or new archive members. When copying existing members to a new
- archive, any "Unix1" extra field blocks should be converted to
- the new "time" and/or "Unix2" types.
-
-
- -Info-ZIP Unix Extra Field (type 2):
- ==================================
-
- The following is the layout of the new Info-ZIP extra block for
- Unix. (Last Revision 19960922)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix2) 0x7855 Short tag for this extra block type ("Ux")
- TSize Short total data size for this block (4)
- UID Short Unix user ID
- GID Short Unix group ID
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (Unix2) 0x7855 Short tag for this extra block type ("Ux")
- TSize Short total data size for this block (0)
-
- The data size of the central-header version is zero; it is used
- solely as a flag that UID/GID info is present in the local-header
- extra field. If additional fields are ever added to the local
- version, the central version may be extended to indicate this.
-
- Note that Unix UIDs and GIDs are usually specific to a particular
- machine, and they generally require root access to restore.
-
-
- -ASi Unix Extra Field:
- ====================
-
- The following is the layout of the ASi extra block for Unix. The
- local-header and central-header versions are identical.
- (Last Revision 19960916)
-
- Value Size Description
- ----- ---- -----------
- (Unix3) 0x756e Short tag for this extra block type ("nu")
- TSize Short total data size for this block
- CRC Long CRC-32 of the remaining data
- Mode Short file permissions
- SizDev Long symlink'd size OR major/minor dev num
- UID Short user ID
- GID Short group ID
- (var.) variable symbolic link filename
-
- Mode is the standard Unix st_mode field from struct stat, containing
- user/group/other permissions, setuid/setgid and symlink info, etc.
-
- If Mode indicates that this file is a symbolic link, SizDev is the
- size of the file to which the link points. Otherwise, if the file
- is a device, SizDev contains the standard Unix st_rdev field from
- struct stat (includes the major and minor numbers of the device).
- SizDev is undefined in other cases.
-
- If Mode indicates that the file is a symbolic link, the final field
- will be the name of the file to which the link points. The file-
- name length can be inferred from TSize.
-
- [Note that TSize may incorrectly refer to the data size not counting
- the CRC; i.e., it may be four bytes too small.]
-
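A sketch of reading the ASi block follows. The function name is hypothetical. Two assumptions are made that the text above does not state: the fields are little-endian like the rest of the ZIP format, and the CRC is the standard CRC-32 that `zlib.crc32` computes. Because TSize may be four bytes too small, the caller is assumed to have already decided how many payload bytes to pass in.

```python
import stat
import struct
import zlib

def parse_asi_unix(data):
    """Parse the payload (CRC onward) of an ASi Unix extra block."""
    if len(data) < 14:
        raise ValueError("ASi Unix block needs at least 14 bytes")
    (crc,) = struct.unpack_from("<I", data, 0)
    body = data[4:]  # "the remaining data" covered by the CRC
    mode, sizdev, uid, gid = struct.unpack_from("<HIHH", body, 0)
    return {
        "crc_ok": (zlib.crc32(body) & 0xFFFFFFFF) == crc,
        "mode": mode,
        "sizdev": sizdev,
        "uid": uid,
        "gid": gid,
        # The link name is present only when Mode marks a symbolic link.
        "link_target": body[10:] if stat.S_ISLNK(mode) else b"",
    }
```

A failed CRC is reported rather than raised, since a mismatch may simply indicate the TSize discrepancy noted above.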
-
- -BeOS Extra Field:
- ================
-
- The following is the layout of the file-attributes extra block for
- BeOS. (Last Revision 19970531)
-
- Local-header version:
-
- Value Size Description
- ----- ---- -----------
- (BeOS) 0x6542 Short tag for this extra block type ("Be")
- TSize Short total data size for this block
- BSize Long uncompressed file attribute data size
- Flags Byte info bits
- (CType) Short compression type
- (CRC) Long CRC value for uncompressed file attribs
- Attribs variable file attribute data
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (BeOS) 0x6542 Short tag for this extra block type ("Be")
- TSize Short total data size for this block (5)
- BSize Long size of uncompr. local EF block data
- Flags Byte info bits
-
- The least significant bit of Flags in both headers indicates whether
- the LOCAL extra field is uncompressed (and therefore whether CType
- and CRC are omitted):
-
- bit 0 if set, Attribs is uncompressed (no CType, CRC)
- bits 1-7 reserved; if set, assume error or unknown data
-
- Currently the only supported compression types are deflated (type 8)
- and stored (type 0); the latter is not used by Info-ZIP's Zip but is
- supported by UnZip.
-
- Attribs is a BeOS-specific block of data in big-endian format with
- the following structure (if compressed, uncompress it first):
-
- Value Size Description
- ----- ---- -----------
- Name variable attribute name (null-terminated string)
- Type Long attribute type (32-bit unsigned integer)
- Size Long Long data size for this sub-block (64 bits)
- Data variable attribute data
-
- The attribute structure is repeated for every attribute. The Data
- field may contain anything--text, flags, bitmaps, etc.
-
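The repeated attribute structure above lends itself to a simple loop. This sketch assumes `buf` holds the already uncompressed Attribs data; the function name is hypothetical.

```python
import struct

def parse_beos_attribs(buf):
    """Return a list of (name, type, data) tuples from BeOS Attribs data."""
    attrs = []
    pos = 0
    while pos < len(buf):
        end = buf.index(b"\0", pos)  # Name is a null-terminated string
        name = buf[pos:end]
        pos = end + 1
        # Type is a 32-bit and Size a 64-bit unsigned integer, both big-endian.
        attr_type, size = struct.unpack_from(">IQ", buf, pos)
        pos += 12
        if pos + size > len(buf):
            raise ValueError("attribute data runs past the end of the block")
        attrs.append((name, attr_type, buf[pos:pos + size]))
        pos += size
    return attrs
```

Attribute data is returned as raw bytes because its interpretation depends on the attribute type.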
-
- -SMS/QDOS Extra Field:
- ====================
-
- The following is the layout of the file-attributes extra block for
- SMS/QDOS. The local-header and central-header versions are identical.
- (Last Revision 19960929)
-
- Value Size Description
- ----- ---- -----------
- (QDOS) 0xfb4a Short tag for this extra block type
- TSize Short total data size for this block
- LongID Long extra-field signature
- (ExtraID) Long additional signature/flag bytes
- QDirect 64 bytes qdirect structure
-
- LongID may be "QZHD" or "QDOS". In the latter case, ExtraID will
- be present. Its first three bytes are "02\0"; the last byte is
- currently undefined.
-
- QDirect contains the file's uncompressed directory info (qdirect
- struct). Its elements are in native (big-endian) format:
-
- d_length beLong file length
- d_access byte file access type
- d_type byte file type
- d_datalen beLong data length
- d_reserved beLong unused
- d_szname beShort size of filename
- d_name 36 bytes filename
- d_update beLong time of last update
- d_refdate beLong file version number
- d_backup beLong time of last backup (archive date)
-
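The qdirect layout listed above adds up to exactly 64 bytes, which can be checked with Python's `struct` module. This is a sketch: the names `QDIRECT`, `QDIRECT_FIELDS` and `parse_qdirect` are hypothetical, and the Long fields are read as unsigned, which the text does not specify.

```python
import struct

# Big-endian, no padding: 4+1+1+4+4+2+36+4+4+4 = 64 bytes.
QDIRECT = struct.Struct(">IBBIIH36sIII")
QDIRECT_FIELDS = ("d_length", "d_access", "d_type", "d_datalen", "d_reserved",
                  "d_szname", "d_name", "d_update", "d_refdate", "d_backup")

def parse_qdirect(raw):
    """Unpack a 64-byte qdirect structure into a dict keyed by field name."""
    info = dict(zip(QDIRECT_FIELDS, QDIRECT.unpack(raw)))
    # Trim the fixed 36-byte name buffer to the stored name length.
    info["d_name"] = info["d_name"][:info["d_szname"]]
    return info
```

`QDIRECT.unpack` raises `struct.error` if the input is not exactly 64 bytes long, which doubles as a length check.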
-
- -AOS/VS Extra Field:
- ==================
-
- The following is the layout of the extra block for Data General
- AOS/VS. The local-header and central-header versions are identical.
- (Last Revision 19961125)
-
- Value Size Description
- ----- ---- -----------
- (AOSVS) 0x5356 Short tag for this extra block type ("VS")
- TSize Short total data size for this block
- "FCI\0" Long extra-field signature
- Version Byte version of AOS/VS extra block (10 = 1.0)
- Fstat variable fstat packet
- AclBuf variable raw ACL data ($MXACL bytes)
-
- Fstat contains the file's uncompressed fstat packet, which is one of
- the following:
-
- normal fstat packet (P_FSTAT struct)
- DIR/CPD fstat packet (P_FSTAT_DIR struct)
- unit (device) fstat packet (P_FSTAT_UNIT struct)
- IPC file fstat packet (P_FSTAT_IPC struct)
-
- AclBuf contains the raw ACL data; its length is $MXACL.
-
-
- -Tandem NSK Extra Field:
- ======================
-
- The following is the layout of the file-attributes extra block for
- Tandem NSK. The local-header and central-header versions are
- identical. (Last Revision 19981221)
-
- Value Size Description
- ----- ---- -----------
- (TA) 0x4154 Short tag for this extra block type ("TA")
- TSize Short total data size for this block (20)
- NSKattrs 20 Bytes NSK attributes
-
-
- -THEOS Extra Field:
- =================
-
- The following is the layout of the file-attributes extra block for
- Theos. The local-header and central-header versions are identical.
- (Last Revision 19990206)
-
-          Value         Size        Description
-          -----         ----        -----------
-  (Theos) 0x6854        Short       'Th' signature
-          size          Short       size of extra block
-          flags         Byte        reserved for future use
-          filesize      Long        file size
-          fileorg       Byte        type of file (see below)
-          keylen        Short       key length for indexed and keyed files,
-                                    data segment size for 16-bit programs
-          reclen        Short       record length for indexed, keyed and direct,
-                                    text segment size for 16-bit programs
-          filegrow      Byte        growing factor for indexed, keyed and direct
-          protect       Byte        protections (see below)
-          reserved      Short       reserved for future use
-
- File types
- ==========
-
- 0x80 library (keyed access list of files)
- 0x40 directory
- 0x10 stream file
- 0x08 direct file
- 0x04 keyed file
- 0x02 indexed file
- 0x0e reserved
- 0x01 16-bit real mode program (obsolete)
- 0x21 16-bit protected mode program
- 0x41 32-bit protected mode program
-
- Protection codes
- ================
-
- User protection
- ---------------
- 0x01 non readable
- 0x02 non writable
- 0x04 non executable
- 0x08 non erasable
-
- Other protection
- ----------------
- 0x10 non readable
- 0x20 non writable
- 0x40 non executable (Theos before 4.0)
- 0x40 modified (Theos 4.x)
- 0x80 not hidden
-
-
- -THEOS old unofficial Extra Field:
- ================================
-
- The following is the layout of an unofficial former version of the
- Theos file-attributes extra block. This layout was never published
- and is no longer created. However, UnZip can optionally support it
- when compiled with the option flag OLD_THEOS_EXTRA defined.
- The local-header and central-header versions are identical.
- (Last Revision 19990206)
-
-          Value         Size        Description
-          -----         ----        -----------
-  (THS0)  0x4854        Short       'TH' signature
-          size          Short       size of extra block
-          flags         Short       reserved for future use
-          filesize      Long        file size
-          reclen        Short       record length for indexed, keyed and direct,
-                                    text segment size for 16-bit programs
-          keylen        Short       key length for indexed and keyed files,
-                                    data segment size for 16-bit programs
-          filegrow      Byte        growing factor for indexed, keyed and direct
-          reserved      3 Bytes     reserved for future use
-
-
- -FWKCS MD5 Extra Field:
- =====================
-
- The FWKCS Contents_Signature System, used in automatically
- identifying files independent of filename, optionally adds
- and uses an extra field to support the rapid creation of
- an enhanced contents_signature.
- There is no local-header version; the following applies
- only to the central header. (Last Revision 19961207)
-
- Central-header version:
-
- Value Size Description
- ----- ---- -----------
- (MD5) 0x4b46 Short tag for this extra block type ("FK")
- TSize Short total data size for this block (19)
- "MD5" 3 bytes extra-field signature
- MD5hash 16 bytes 128-bit MD5 hash of uncompressed data
- (low byte first)
-
- When FWKCS revises a .ZIP file central directory to add
- this extra field for a file, it also replaces the
- central directory entry for that file's uncompressed
- file length with a measured value.
-
- FWKCS provides an option to strip this extra field, if
- present, from a .ZIP file central directory. In adding
- this extra field, FWKCS preserves .ZIP file Authenticity
- Verification; if stripping this extra field, FWKCS
- preserves all versions of AV through PKZIP version 2.04g.
-
- FWKCS, and FWKCS Contents_Signature System, are
- trademarks of Frederick W. Kantor.
-
- (1) R. Rivest, RFC1321.TXT, MIT Laboratory for Computer
- Science and RSA Data Security, Inc., April 1992.
- ll.76-77: "The MD5 algorithm is being placed in the
- public domain for review and possible adoption as a
- standard."
-
-
- -Info-ZIP Unicode Path Extra Field:
- =================================
-
- Stores the UTF-8 version of the entry path as stored in the
- local header and central directory header.
- (Last Revision 20070912)
-
- Value Size Description
- ----- ---- -----------
- (UPath) 0x7075 Short tag for this extra block type ("up")
- TSize Short total data size for this block
- Version 1 byte version of this extra field, currently 1
- NameCRC32 4 bytes File Name Field CRC32 Checksum
- UnicodeName Variable UTF-8 version of the entry File Name
-
- Currently Version is set to the number 1. If there is a need
- to change this field, the version will be incremented. Changes
- may not be backward compatible so this extra field should not be
- used if the version is not recognized.
-
- The NameCRC32 is the standard zip CRC32 checksum of the File Name
- field in the header. This is used to verify that the header
- File Name field has not changed since the Unicode Path extra field
- was created. This can happen if a utility renames the entry but
- does not update the UTF-8 path extra field. If the CRC check fails,
- this UTF-8 Path Extra Field should be ignored and the File Name field
- in the header used instead.
-
- The UnicodeName is the UTF-8 version of the contents of the File Name
- field in the header. As UnicodeName is defined to be UTF-8, no UTF-8
- byte order mark (BOM) is used. The length of this field is determined
- by subtracting the size of the previous fields from TSize. If both
- the File Name and Comment fields are UTF-8, the new General Purpose
- Bit Flag, bit 11 (Language encoding flag (EFS)), can be used to
- indicate that both the header File Name and Comment fields are UTF-8
- and, in this case, the Unicode Path and Unicode Comment extra fields
- are not needed and should not be created. Note that, for backward
- compatibility, bit 11 should only be used if the native character set
- of the paths and comments being zipped up is already UTF-8. The
- same method, either bit 11 or extra fields, should be used in both
- the local and central directory headers.
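
The version and CRC checks described above can be sketched as follows. This is an illustration rather than Info-ZIP's code: the function name `unicode_path` is hypothetical, `payload` is assumed to be the extra-field data that follows the tag and TSize, and NameCRC32 is assumed to be stored little-endian like other multi-byte ZIP header fields.

```python
import struct
import zlib

def unicode_path(payload: bytes, header_name: bytes):
    """Return the UTF-8 path as str, or None if the extra field should be ignored."""
    if len(payload) < 5:
        return None
    # Version (1 byte) followed by NameCRC32 (4 bytes, assumed little-endian)
    version, name_crc = struct.unpack_from("<BI", payload, 0)
    if version != 1:
        return None  # unrecognized version: do not use the field
    if zlib.crc32(header_name) & 0xFFFFFFFF != name_crc:
        return None  # header File Name changed since the field was created
    try:
        # UnicodeName length = TSize minus the 5 bytes of the preceding fields
        return payload[5:].decode("utf-8")
    except UnicodeDecodeError:
        return None
```

When the function returns None, the caller falls back to the File Name field in the header, as the text above requires.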
-
-
- -Info-ZIP Unicode Comment Extra Field:
- ====================================
-
- Stores the UTF-8 version of the entry comment as stored in the
- central directory header.
- (Last Revision 20070912)
-
- Value Size Description
- ----- ---- -----------
- (UCom) 0x6375 Short tag for this extra block type ("uc")
- TSize Short total data size for this block
- Version 1 byte version of this extra field, currently 1
- ComCRC32 4 bytes Comment Field CRC32 Checksum
- UnicodeCom Variable UTF-8 version of the entry comment
-
- Currently Version is set to the number 1. If there is a need
- to change this field, the version will be incremented. Changes
- may not be backward compatible so this extra field should not be
- used if the version is not recognized.
-
- The ComCRC32 is the standard zip CRC32 checksum of the Comment
- field in the central directory header. This is used to verify that
- the comment field has not changed since the Unicode Comment extra field
- was created. This can happen if a utility changes the Comment field
- but does not update the UTF-8 Comment extra field. If the CRC check
- fails, this Unicode Comment extra field should be ignored and the
- Comment field in the header used instead.
-
- The UnicodeCom field is the UTF-8 version of the entry comment field
- in the header. As UnicodeCom is defined to be UTF-8, no UTF-8 byte
- order mark (BOM) is used. The length of this field is determined by
- subtracting the size of the previous fields from TSize. If both the
- File Name and Comment fields are UTF-8, the new General Purpose Bit
- Flag, bit 11 (Language encoding flag (EFS)), can be used to indicate
- that both the header File Name and Comment fields are UTF-8 and, in
- this case, the Unicode Path and Unicode Comment extra fields are not
- needed and should not be created. Note that, for backward
- compatibility, bit 11 should only be used if the native character set
- of the paths and comments being zipped up is already UTF-8. The
- same method, either bit 11 or extra fields, should be used in both
- the local and central directory headers.
-
-
- -Info-ZIP New Unix Extra Field:
- =============================
-
- Currently stores Unix UIDs/GIDs up to 32 bits.
- (Last Revision 20080509)
-
- Value Size Description
- ----- ---- -----------
- (UnixN) 0x7875 Short tag for this extra block type ("ux")
- TSize Short total data size for this block
- Version 1 byte version of this extra field, currently 1
- UIDSize 1 byte Size of UID field
- UID Variable UID for this entry
- GIDSize 1 byte Size of GID field
- GID Variable GID for this entry
-
- Currently Version is set to the number 1. If there is a need
- to change this field, the version will be incremented. Changes
- may not be backward compatible so this extra field should not be
- used if the version is not recognized.
-
- UIDSize is the size of the UID field in bytes. This size should
- match the size of the UID field on the target OS.
-
- UID is the UID for this entry in standard little endian format.
-
- GIDSize is the size of the GID field in bytes. This size should
- match the size of the GID field on the target OS.
-
- GID is the GID for this entry in standard little endian format.
-
- If both the old 16-bit Unix extra field (tag 0x7855, Info-ZIP Unix)
- and this extra field are present, the values in this extra field
- supersede the values in that extra field.
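
Because UIDSize and GIDSize are stored in the field itself, a reader does not need to know the ID width in advance. A minimal sketch of reading the block, assuming `payload` is the data following the tag and TSize and using the hypothetical name `parse_new_unix`:

```python
def parse_new_unix(payload: bytes):
    """Return (uid, gid), or None if the version is not recognized."""
    if len(payload) < 1 or payload[0] != 1:
        return None  # unknown version: the field should not be used
    pos = 1
    ids = []
    for _ in range(2):  # UID first, then GID
        if pos >= len(payload):
            raise ValueError("truncated extra field")
        size = payload[pos]  # UIDSize / GIDSize: 1 byte
        pos += 1
        field = payload[pos:pos + size]
        if len(field) != size:
            raise ValueError("truncated extra field")
        # IDs are stored in little-endian format
        ids.append(int.from_bytes(field, "little"))
        pos += size
    return tuple(ids)
```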