
PZZ (Gotcha Force)

Latest revision as of 13:48, 7 October 2023


This article is about the Gotcha Force PZZ file format and the ongoing research on it.


This file format is almost completely documented.
The structure of this file is well known.


PZZ files are archive files that pack several files together, each of which can optionally be compressed.

Format

PZZ files have a fixed-length header of 0x800 bytes (2048 bytes).

The packed files follow the header directly, one after another (see the note below). Each file can be stored compressed or uncompressed. In theory a PZZ file can hold at most 511 files (see Header). In Gotcha Force, PZZ files are big endian (this is not the case in Wii games).

Note: Before being packed, every file is aligned to a multiple of 0x800 bytes (2048) with null padding ("\x00").
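
The alignment rule in the note above can be illustrated with a minimal sketch. The names BLOCK_SIZE and pad_to_block are hypothetical and are not taken from pzztool.py; the only facts assumed are the 0x800 block size and the "\x00" padding byte stated in this article.

```python
BLOCK_SIZE = 0x800  # 2048 bytes, the alignment unit described in this article


def pad_to_block(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Return data extended with null bytes up to the next block_size multiple."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data  # already aligned (this includes the empty file)
    return data + b"\x00" * (block_size - remainder)


padded = pad_to_block(b"\x01" * 12000)
print(len(padded))  # 12288, i.e. 6 blocks of 0x800 bytes
```

A file whose length is already a multiple of 0x800 is returned unchanged, so no extra block is added in that case.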

Header

Header:

  • 4 bytes - file_count - uint - Total number of files packed in the PZZ.
  • 4 bytes[file_count] - file_descriptors - uint - Describe the length and compression of each file.

Note: The header is 0x800 bytes (2048) long and each file_descriptor takes 4 bytes. After the 4-byte file_count field, this leaves room for 2048/4 - 1 = 511 file_descriptors, so a PZZ cannot contain more than 511 files. All PZZ files from Gotcha Force contain fewer than 20 files.
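
As an illustration of the header layout above, here is a minimal parsing sketch using Python's standard struct module. The function name parse_header and the constants are hypothetical, and the sketch assumes the big endian layout used by Gotcha Force.

```python
import struct

HEADER_SIZE = 0x800               # fixed header length (2048 bytes)
MAX_FILES = HEADER_SIZE // 4 - 1  # 511 descriptors fit after file_count


def parse_header(header: bytes) -> list[int]:
    """Return the raw file_descriptors stored in a PZZ header."""
    if len(header) < HEADER_SIZE:
        raise ValueError("a PZZ header is 0x800 bytes long")
    # file_count is a big endian uint32 at offset 0.
    (file_count,) = struct.unpack_from(">I", header, 0)
    if file_count > MAX_FILES:
        raise ValueError(f"file_count {file_count} exceeds {MAX_FILES}")
    # The descriptors are file_count big endian uint32 values at offset 4.
    return list(struct.unpack_from(f">{file_count}I", header, 4))


# Hand-built header: 2 files, one compressed 6-block file and one empty file.
example = struct.pack(">3I", 2, 0x40000006, 0).ljust(HEADER_SIZE, b"\x00")
print([hex(d) for d in parse_header(example)])  # ['0x40000006', '0x0']
```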

File descriptors format

If we assign an index to each bit of a file_descriptor from 0 (least significant bit) to 31 (most significant bit):

  • Bit 31 - Unused.
  • Bit 30 - Compression flag: set to 1 if the file is compressed, otherwise 0.
  • Bits 0 to 29 (30 bits) - File length divided by 0x800 bytes (2048).

The length field can hold values up to 2^30 - 1, so the maximum length of a file packed in a PZZ is (2^30 - 1) * 2048 = 2 199 023 253 504 bytes, just under 2 TiB. In other words, there is no practical file length restriction.

The file length (in bytes) is computed as follows:

  • (file_descriptor & 0x3FFFFFFF) * 0x800

Here 0x3FFFFFFF is a mask that keeps the 30 least significant bits of the 32-bit file_descriptor.

The compression flag is retrieved with another mask:

  • file_descriptor & 0x40000000
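
The two masks can be combined into a small decoding helper. This is only a sketch: decode_descriptor and the constant names are hypothetical, while the mask values and the 0x800 multiplier come from the formulas above.

```python
BLOCK_SIZE = 0x800             # length unit of a file_descriptor
LENGTH_MASK = 0x3FFFFFFF       # bits 0 to 29
COMPRESSION_FLAG = 0x40000000  # bit 30


def decode_descriptor(file_descriptor: int) -> tuple[int, bool]:
    """Return (length in bytes, compression flag) for a file_descriptor."""
    length = (file_descriptor & LENGTH_MASK) * BLOCK_SIZE
    is_compressed = bool(file_descriptor & COMPRESSION_FLAG)
    return length, is_compressed


print(decode_descriptor(0x40000006))  # (12288, True)
print(decode_descriptor(0x00000000))  # (0, False)
```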

To better understand how a file descriptor works, here is an example:

The first file to be packed has a length of 12 kB (12 000 bytes) and is stored compressed. Its file descriptor is therefore equal to:

  • ceil(12000/0x800) + 0x40000000 = 0x40000006

The bytes 40 00 00 06 are therefore stored just after the file count field of the header. A file descriptor sometimes describes an empty file. In that case the file_descriptor is equal to "00 00 00 00", and it is still counted in the file count field of the header.
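
The worked example above can be reproduced with a short sketch. encode_descriptor is a hypothetical helper, and it assumes that the length passed in is the length of the data as written to the archive before padding.

```python
import struct

BLOCK_SIZE = 0x800
LENGTH_MASK = 0x3FFFFFFF
COMPRESSION_FLAG = 0x40000000


def encode_descriptor(file_length: int, compressed: bool) -> int:
    """Build a file_descriptor from a byte length and a compression flag."""
    blocks = -(-file_length // BLOCK_SIZE)  # integer ceiling division
    if blocks > LENGTH_MASK:
        raise ValueError("length does not fit in the 30-bit length field")
    return blocks | (COMPRESSION_FLAG if compressed else 0)


descriptor = encode_descriptor(12000, compressed=True)
print(hex(descriptor))                         # 0x40000006
print(struct.pack(">I", descriptor).hex(" "))  # 40 00 00 06
```

Packing with the ">I" format gives the big endian byte order used by Gotcha Force. An empty uncompressed file encodes to 0, matching the "00 00 00 00" case described above.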

Files padding

When a file packed in a PZZ is stored uncompressed, padding is present at its end because of the 0x800 alignment. In this case it is impossible to know the exact padding length, and therefore how much to remove, since the original file could itself end with null bytes.
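
The padding ambiguity can be demonstrated with a minimal extraction sketch. split_archive is a hypothetical name and is not how pzztool.py is known to work; it only assumes what this article states: a 0x800-byte big endian header, then the files stored one after another in descriptor order.

```python
import struct

HEADER_SIZE = 0x800
BLOCK_SIZE = 0x800


def split_archive(pzz: bytes) -> list[tuple[bytes, bool]]:
    """Return (stored bytes, compression flag) for each packed file."""
    (file_count,) = struct.unpack_from(">I", pzz, 0)
    descriptors = struct.unpack_from(f">{file_count}I", pzz, 4)
    entries = []
    offset = HEADER_SIZE  # the first file starts right after the header
    for descriptor in descriptors:
        length = (descriptor & 0x3FFFFFFF) * BLOCK_SIZE
        compressed = bool(descriptor & 0x40000000)
        entries.append((pzz[offset:offset + length], compressed))
        offset += length
    return entries


# Hand-built archive holding one uncompressed file that ends with a real null byte.
original = b"abc\x00"
header = struct.pack(">2I", 1, 1).ljust(HEADER_SIZE, b"\x00")
archive = header + original.ljust(BLOCK_SIZE, b"\x00")

data, compressed = split_archive(archive)[0]
print(len(data), compressed)             # 2048 False
print(data.rstrip(b"\x00") == original)  # False: stripping removes a real byte too
```

Stripping every trailing null byte loses the legitimate final byte of the original file, which is why the exact original length cannot be recovered from the archive alone.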

Compression algorithm

The compression algorithm has to be investigated.

Observations

Files and compression

In the NTSC/USA version, the 000 file of every stxx.pzz is stored uncompressed, as are files 000, 001, 002 and 005 of firstld.pzz. All other files packed in PZZ archives are stored compressed.

stxx.pzz files (40)

  • 001 -> same file as a hitxxx.bin
  • 002 -> same file as a hitxxx.bin
  • 003 -> same file as a hitxxx.bin
  • 004 -> ?

A given position can match several hitxxx.bin files at once, because some hitxxx.bin files are duplicated in afs_data.afs. HITS files start with the magic number "STIH" (big endian string).
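
When trying to identify unpacked entries, the magic number mentioned above gives a quick check. This is only a sketch: looks_like_hits is a hypothetical helper, and it assumes the four ASCII bytes "STIH" sit at offset 0 of the file.

```python
HITS_MAGIC = b"STIH"  # magic number of HITS files, as described in this article


def looks_like_hits(data: bytes) -> bool:
    """Return True if data starts with the HITS magic number."""
    return data[:4] == HITS_MAGIC


print(looks_like_hits(b"STIH" + b"\x00" * 12))  # True
print(looks_like_hits(b"\x00" * 16))            # False
```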

gets.pzz file

  • 000 -> ?
  • 001 -> ?
  • 002 -> ?
  • 003 -> ?
  • 004 -> ?
  • 005 -> ?
  • 006 -> ?
  • 007 -> ?
  • 008 -> hit000.bin from afs_data
  • 009 -> hit001.bin from afs_data
  • 010 -> hit002.bin from afs_data


firstld.pzz file

  • 000 -> snd_com04.tsb (padding problem when unpacked, because it is stored uncompressed, "U")
  • 001 -> snd_com04.chd (padding problem when unpacked, because it is stored uncompressed, "U")
  • 002 -> ?
  • 003 -> icon.bin from afs_data
  • 004 -> ?
  • 005 -> mc_msg00.mdt from afs_data (the unpacked name still has to be implemented in pzztool.py)
  • 006 -> as_icon.tpl from afs_data (the unpacked name still has to be implemented in pzztool.py)

efct.tpl file

  • 000 -> efct00.tpl from afs_data
  • 001 -> efct00_mdl.arc from afs_data
  • 002 -> efct01_mdl.arc from afs_data

cmn_data.pzz file

  • 000 -> comhit.bin (the unpacked name still has to be implemented in pzztool.py)
  • 001 -> comhit2.bin (the unpacked name still has to be implemented in pzztool.py)
  • 002 -> dm0000mot.bin (the unpacked name still has to be implemented in pzztool.py)
  • 003 -> plcmndata.bin (the unpacked name still has to be implemented in pzztool.py)

Software

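Virtual World RE provides the Python script pzztool.py (https://github.com/Virtual-World-RE/NeoGF/tree/main/pzztool), which unpacks, decompresses, compresses and packs PZZ files. It is inspired by a PS2 PZZ file handling algorithm (https://github.com/infval/pzzcompressor_jojo).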