PZZ (Gotcha Force)

Revision as of 11:11, 7 September 2022

This article is about the Gotcha Force PZZ file format and the ongoing research on it.

PZZ files are archives that pack several files together; each packed file can optionally be compressed.

Format

PZZ files have a fixed-length header of 0x800 bytes (2048 bytes).

The packed files are stored directly after the header, one after the other (see Note). Each file can be stored compressed or uncompressed. In theory a pzz file can hold at most 511 files (see Header). Gotcha Force pzz files are big endian (this is not the case in Wii games).

Note: Before being packed, every file is padded with null bytes ("\x00") to a multiple of 0x800 bytes (2048).
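
The alignment rule in the note can be sketched in a few lines of Python. This is only an illustration: the name pad_to_block is hypothetical and is not taken from pzztool.py.

```python
BLOCK_SIZE = 0x800  # 2048 bytes, the alignment unit described in the note


def pad_to_block(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append null bytes until len(data) is a multiple of block_size.

    Hypothetical helper written for this article, not part of pzztool.py.
    """
    remainder = len(data) % block_size
    if remainder == 0:
        # Already aligned (this includes the empty file): nothing to add.
        return data
    return data + b"\x00" * (block_size - remainder)
```

With this helper a 12 000 byte file grows to 12 288 bytes, which is 6 blocks of 0x800 bytes.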

Header

Header:

  • 4 bytes - uint - file_count: total number of files packed in the pzz.
  • 4 bytes[file_count] - uint - file_descriptors: length and compression flag of each packed file.

Note: The header length is 0x800 bytes (2048) and each file_descriptor takes 4 bytes, which leaves room for at most 2048/4 - 1 = 511 file_descriptors, so a PZZ cannot contain more than 511 files. All pzz files from Gotcha Force contain fewer than 20 files.
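
As a minimal sketch of how this header could be read, the Python below unpacks file_count and the file_descriptors as big-endian unsigned integers (the Gotcha Force case), then derives where each packed file starts. It assumes that each stored file occupies exactly the length given by its descriptor. The function names read_descriptors and file_offsets are hypothetical and are not taken from pzztool.py.

```python
import struct
from typing import List

HEADER_SIZE = 0x800               # fixed header length in bytes
MAX_FILES = HEADER_SIZE // 4 - 1  # 511 descriptors fit after file_count


def read_descriptors(header: bytes) -> List[int]:
    """Return the file_descriptors stored in a big-endian PZZ header."""
    if len(header) < HEADER_SIZE:
        raise ValueError("a PZZ header is 0x800 bytes long")
    (file_count,) = struct.unpack_from(">I", header, 0)
    if file_count > MAX_FILES:
        raise ValueError("file_count does not fit in the header")
    # The descriptors follow file_count directly, 4 bytes each.
    return list(struct.unpack_from(">%dI" % file_count, header, 4))


def file_offsets(descriptors: List[int]) -> List[int]:
    """Return the start offset of each packed file in the archive.

    Assumption: files are laid out back to back after the header and
    each one occupies (descriptor & 0x3FFFFFFF) * 0x800 bytes.
    """
    offsets = []
    position = HEADER_SIZE
    for descriptor in descriptors:
        offsets.append(position)
        position += (descriptor & 0x3FFFFFFF) * 0x800
    return offsets
```

For Wii games, which the article says are not big endian, the ">" in the format strings would presumably need to become "<"; this has not been checked here.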

File descriptors format

If we assign an index to each bit of a file_descriptor from 0 (least significant bit) to 31 (most significant bit):

  • Bit 31 - Unused.
  • Bit 30 - Compression flag: set to 1 if the file is compressed else 0.
  • Bits 0 to 29 (30 bits) - File length divided by 0x800 bytes (2048).

The largest value that fits in 30 bits is 2^30 - 1, so the maximum length of a file packed in a pzz is (2^30 - 1) * 2048 bytes: just under 2 TiB (2 199 023 253 504 bytes). In other words, there is no practical file length restriction.

The file length (in bytes) is computed as follows:

  • (file_descriptor & 0x3FFFFFFF) * 0x800

Here 0x3FFFFFFF is a mask that keeps the 30 least significant bits of the 32-bit file_descriptor.

The compression flag is retrieved with another mask:

  • file_descriptor & 0x40000000
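
The two masks can be combined into a small decoding helper. This is a sketch only; decode_descriptor and the constant names are hypothetical and are not taken from pzztool.py.

```python
LENGTH_MASK = 0x3FFFFFFF       # bits 0 to 29: length in 0x800-byte blocks
COMPRESSION_FLAG = 0x40000000  # bit 30: set when the file is compressed
BLOCK_SIZE = 0x800             # 2048 bytes


def decode_descriptor(file_descriptor: int):
    """Split a file_descriptor into (length_in_bytes, is_compressed)."""
    length = (file_descriptor & LENGTH_MASK) * BLOCK_SIZE
    is_compressed = bool(file_descriptor & COMPRESSION_FLAG)
    return length, is_compressed
```

Bit 31 is ignored by both masks, which matches its description as unused.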

To better understand how a file descriptor works, here is an example:

The first file to be packed has a length of 12 kB (12 000 bytes) and is packed compressed. Its file descriptor is therefore equal to:

  • upper_round(12000/0x800)+0x40000000 = 0x40000006

So the bytes ``40 00 00 06`` are stored just after the header file count field. A file descriptor sometimes describes an empty file; in that case the file_descriptor is equal to ``00 00 00 00`` and is still counted in the header file count field.
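
The example can be reproduced with the following sketch. It assumes that the length passed in is the length of the data as it is stored in the archive, and that upper_round means rounding up to the next integer. The name encode_descriptor is hypothetical and is not taken from pzztool.py.

```python
BLOCK_SIZE = 0x800             # 2048 bytes
LENGTH_MASK = 0x3FFFFFFF       # 30 bits available for the block count
COMPRESSION_FLAG = 0x40000000  # bit 30


def encode_descriptor(stored_length: int, compressed: bool) -> int:
    """Build a file_descriptor from a byte length and a compression flag."""
    # Round up to a whole number of 0x800-byte blocks (integer ceiling).
    blocks = -(-stored_length // BLOCK_SIZE)
    if blocks > LENGTH_MASK:
        raise ValueError("length does not fit in 30 bits of blocks")
    return blocks | (COMPRESSION_FLAG if compressed else 0)


descriptor = encode_descriptor(12000, compressed=True)
# Big-endian byte layout, as stored in a Gotcha Force header.
descriptor_bytes = descriptor.to_bytes(4, "big")
```

Here descriptor is 0x40000006 and descriptor_bytes is the 4-byte sequence 40 00 00 06, while an empty uncompressed file gives 0.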

Files padding

When a file is stored uncompressed in a pzz, padding is appended to its end because of the 0x800 alignment. In this case it is impossible to know the padding length, and therefore to remove the padding exactly, since the file itself could end with null bytes.
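
The ambiguity can be shown with a made-up file whose real content ends with two null bytes. Stripping every trailing null byte is only one naive strategy, used here for illustration; it is not claimed to be what pzztool.py does, and strip_padding_naive is a hypothetical name.

```python
BLOCK_SIZE = 0x800  # 2048 bytes


def strip_padding_naive(stored: bytes) -> bytes:
    """Remove every trailing null byte (may also remove real data)."""
    return stored.rstrip(b"\x00")


# Made-up file content that legitimately ends with two null bytes.
original = b"DATA\x00\x00"
# What ends up in the archive once the file is padded to 0x800 bytes.
stored = original + b"\x00" * (BLOCK_SIZE - len(original))

recovered = strip_padding_naive(stored)
lost_bytes = len(original) - len(recovered)
```

The stored block contains nothing that separates the file's own null bytes from the padding, so recovered is shorter than original by the two legitimate null bytes.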

Compression algorithm

The compression algorithm still has to be investigated.

Observations

Files and compression

In the NTSC/USA version, file 000 of every stxx.pzz is stored uncompressed, as are files 000, 001, 002 and 005 of firstld.pzz. All other files packed in pzz archives are stored compressed.


stxx.pzz files (40)

  • 001 -> same files as hitxxx.bin
  • 002 -> same files as hitxxx.bin
  • 003 -> same files as hitxxx.bin
  • 004 -> ?

The same position can match several hitxxx.bin files; indeed, hitxxx.bin files are sometimes duplicated in afs_data.afs. HITS files start with the magic number "STIH" (big endian string).

gets.pzz files

  • 000 -> ?
  • 001 -> ?
  • 002 -> ?
  • 003 -> ?
  • 004 -> ?
  • 005 -> ?
  • 006 -> ?
  • 007 -> ?
  • 008 -> hit000.bin from afs_data
  • 009 -> hit001.bin from afs_data
  • 010 -> hit002.bin from afs_data


firstld.pzz file

  • 000 -> snd_com04.tsb (padding problem when unpacked because stored uncompressed "U")
  • 001 -> snd_com04.chd (padding problem when unpacked because stored uncompressed "U")
  • 002 -> ?
  • 003 -> icon.bin from afs_data
  • 004 -> ?
  • 005 -> mc_msg00.mdt from afs_data (unpacked name has to be implemented in pzztool.py)
  • 006 -> as_icon.tpl from afs_data (unpacked name has to be implemented in pzztool.py)

efct.tpl file

  • 000 -> efct00.tpl from afs_data
  • 001 -> efct00_mdl.arc from afs_data
  • 002 -> efct01_mdl.arc from afs_data

cmn_data.pzz file

  • 000 -> comhit.bin (unpacked name has to be implemented in pzztool.py)
  • 001 -> comhit2.bin (unpacked name has to be implemented in pzztool.py)
  • 002 -> dm0000mot.bin (unpacked name has to be implemented in pzztool.py)
  • 003 -> plcmndata.bin (unpacked name has to be implemented in pzztool.py)

Software

Virtual World RE provides the Python script pzztool.py, which handles unpacking, decompressing, compressing and packing PZZ files. It was inspired by a PS2 pzz file handling algorithm.