Revision as of 23:43, 1 May 2021

PICO-8 can save cartridges in two file formats: the .p8 format, and the .p8.png format. The save command will use the format that corresponds to the filename extension.

The .p8.png format is a binary format based on the PNG image format. A .p8.png file is an image that can be viewed in any image viewer (such as a web browser). The image appears as the picture of a game cartridge. PICO-8 generates this image using the most recent screenshot taken when pressing the F7 key while the cart is running. If the first two lines of Lua code are comments (a title and byline), it also puts the text of these comments on the label image.

The cart data is stored using a steganographic process. Each PICO-8 byte is spread across one pixel: two bits go into the two least significant bits of each of the four color channels, ordered ARGB (that is, the A channel holds the two most significant bits of the byte and the B channel the two least significant). The image is 160 pixels wide and 205 pixels high, for a possible storage of 32,800 bytes. Of these, only the first 32,773 bytes are used.
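As a rough illustration of the channel packing described above, the following Python sketch rebuilds cart bytes from already-decoded pixels. The function names are hypothetical, the pixels are assumed to arrive as (r, g, b, a) tuples, and the row-major pixel order (left to right, top to bottom) is an assumption that the text above does not state.

```python
def pixel_to_byte(r, g, b, a):
    """Rebuild one cart byte from the two low bits of each channel (order A, R, G, B)."""
    return ((a & 3) << 6) | ((r & 3) << 4) | ((g & 3) << 2) | (b & 3)


def decode_cart_bytes(pixels, count=32773):
    """Decode the first `count` cart bytes from a sequence of (r, g, b, a) tuples.

    Assumption: `pixels` is in row-major order, as most PNG decoders return it.
    """
    return bytes(pixel_to_byte(*p) for p in list(pixels)[:count])


# Example: this pixel hides 0x41, i.e. bit pairs 01 (A), 00 (R), 00 (G), 01 (B)
example = decode_cart_bytes([(0xF0, 0x10, 0x05, 0xFD), (0, 0, 0, 0)], count=2)
```

Only the low two bits of each channel matter, which is why the visible cartridge picture is barely affected by the hidden data.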

Graphics and sound

Bytes 0x0000-0x42ff are the spritesheet, map, flags, music, and sound effects data. These are copied directly into memory when the cart runs. See Memory for a complete explanation of the order and format of this data.

Lua code

Bytes 0x4300-0x7fff are the Lua code.

If the first four bytes (0x4300-0x4303) are a null (\x00) followed by pxa, then the code is stored in the new (v0.2.0+) compressed format. (See below)

If the first four bytes (0x4300-0x4303) are :c: followed by a null (\x00), then the code is stored in the old (pre-v0.2.0) compressed format. (See below)

In all other cases, the code is stored as plaintext (ASCII), up to the first null byte.
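The three cases above can be told apart by looking at the first four bytes of the code region. The Python sketch below assumes `code` holds cart bytes 0x4300-0x7fff; the function names and the returned labels are illustrative only.

```python
def detect_code_format(code: bytes) -> str:
    """Classify the code region by its first four bytes."""
    magic = code[:4]
    if magic == b"\x00pxa":
        return "new"        # v0.2.0+ compressed format
    if magic == b":c:\x00":
        return "old"        # pre-v0.2.0 compressed format
    return "plaintext"


def read_plaintext(code: bytes) -> str:
    """Return uncompressed code, which runs up to the first null byte."""
    end = code.find(b"\x00")
    if end == -1:
        end = len(code)
    # The text above describes the code as ASCII; other bytes are replaced here.
    return code[:end].decode("ascii", errors="replace")
```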

New Compressed Format

  • The first four bytes (0x4300-0x4303) are \x00pxa.
  • The next two bytes (0x4304-0x4305) are the length of the decompressed code, stored MSB first.
  • The next two bytes (0x4306-0x4307) are the length of the compressed data + 8 for this 8-byte header, stored MSB first.
  • The remainder (0x4308-0x7fff) is the compressed data.
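For illustration, the 8-byte header described above could be unpacked as follows. This is a minimal Python sketch; `parse_pxa_header` is a hypothetical name, and `code` is assumed to hold the bytes starting at 0x4300.

```python
import struct


def parse_pxa_header(code: bytes):
    """Return (decompressed_length, compressed_data_length) from the 8-byte header."""
    # ">4sHH": 4-byte magic, then two 16-bit unsigned integers stored MSB first
    magic, raw_len, packed_len = struct.unpack(">4sHH", code[:8])
    if magic != b"\x00pxa":
        raise ValueError("not new-format compressed code")
    # The stored compressed length includes the 8 header bytes
    return raw_len, packed_len - 8
```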

The decompression algorithm maintains a "move-to-front" mapping of the 256 possible bytes. Initially, each of the 256 possible bytes maps to itself.
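A move-to-front mapping of this kind can be modelled as a plain list in which every looked-up entry is moved to position 0. The Python sketch below is only a model of the idea; `mtf_lookup` is a hypothetical helper name.

```python
def mtf_lookup(mapping, index):
    """Return the byte at `index` and move it to the front of `mapping`."""
    byte = mapping.pop(index)
    mapping.insert(0, byte)
    return byte


# Initially each of the 256 possible bytes maps to itself
mapping = list(range(256))
first = mtf_lookup(mapping, 3)   # mapping now starts 3, 0, 1, 2, 4, 5, ...
```

Because recently used bytes drift toward the front, frequently repeated characters end up with small indices, which take fewer bits to encode.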

The decompression algorithm processes the compressed data bit by bit, going from the LSB to the MSB of each byte, until the expected number of decompressed characters has been emitted.
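The bit order described above can be sketched with a small reader class in Python. The class name is hypothetical, and the way `read_bits` assembles multi-bit values (first bit read becomes the least significant bit) is an assumption that the text above does not state explicitly.

```python
class BitReader:
    """Reads bits from a byte string, starting at the LSB of each byte."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # absolute bit position in the stream

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (self.pos & 7)) & 1
        self.pos += 1
        return bit

    def read_bits(self, n: int) -> int:
        # Assumption: the first bit read is the least significant bit of the value
        value = 0
        for i in range(n):
            value |= self.read_bit() << i
        return value
```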

Each group of bits starts with a single header bit, specifying the group's type.

  • If that header bit is 1, an index is read via the following:
-- read a unary value: count the 1 bits up to the terminating 0 bit
unary = 0
while read_bit() == 1 do unary += 1 end

-- unary_mask << 4 offsets the index so that each value of 'unary' covers its own range
-- (unary 0: indices 0-15, unary 1: 16-47, unary 2: 48-111, ...)
unary_mask = ((1 << unary) - 1)
index = read_bits(4 + unary) + (unary_mask << 4)

This index is used as a 0-based index to the move-to-front mapping. The byte mapped by the index is written to the output stream.

This byte is then moved to the front of the move-to-front mapping. (E.g. if the mapping is 0,1,2,3,4,5,... and the index is 3, the mapping is updated to be 3,0,1,2,4,5,...)

  • Otherwise, if the header bit is 0, an offset and a length are read via the following:
-- read the offset: prefix bits 1,1 select 5 offset bits, 1,0 select 10, and 0 selects 15
offset_bits = read_bit() == 1 and (read_bit() == 1 and 5 or 10) or 15
offset = read_bits(offset_bits) + 1

-- read the length
length = 3
repeat
  part = read_bits(3)
  length += part
until part != 7

Then we go back "offset" characters in the output stream, and copy "length" characters to the end of the output stream. "length" may be larger than "offset", in which case we effectively repeat a pattern of "offset" characters.
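Putting the two group types together, a decoder for the new format might look like the Python sketch below. It implements only the steps described in this section and is not the official implementation. The function name is hypothetical, `code` is assumed to hold the bytes starting at 0x4300, and multi-bit values are assumed to be assembled least significant bit first.

```python
def decompress_pxa(code: bytes) -> bytes:
    """Decompress new-format code following the steps described in this section."""
    if code[:4] != b"\x00pxa":
        raise ValueError("not new-format compressed code")
    length = int.from_bytes(code[4:6], "big")  # decompressed length, MSB first
    data = code[8:]
    bitpos = 0

    def read_bit():
        nonlocal bitpos
        bit = (data[bitpos >> 3] >> (bitpos & 7)) & 1  # LSB of each byte first
        bitpos += 1
        return bit

    def read_bits(n):
        value = 0
        for i in range(n):
            value |= read_bit() << i  # assumption: first bit is the low bit
        return value

    mtf = list(range(256))  # move-to-front mapping, initially the identity
    out = bytearray()
    while len(out) < length:
        if read_bit() == 1:
            # Header bit 1: index into the move-to-front mapping
            unary = 0
            while read_bit() == 1:
                unary += 1
            unary_mask = (1 << unary) - 1
            index = read_bits(4 + unary) + (unary_mask << 4)
            byte = mtf.pop(index)
            mtf.insert(0, byte)
            out.append(byte)
        else:
            # Header bit 0: copy `count` bytes from `offset` bytes back
            if read_bit() == 1:
                offset_bits = 5 if read_bit() == 1 else 10
            else:
                offset_bits = 15
            offset = read_bits(offset_bits) + 1
            count = 3
            while True:
                part = read_bits(3)
                count += part
                if part != 7:
                    break
            start = len(out) - offset
            for i in range(count):
                # Byte-by-byte copy, so count > offset repeats the pattern
                out.append(out[start + i])
    return bytes(out[:length])
```

Copying one byte at a time is what makes the "length larger than offset" case work: the bytes appended early in the copy become the source for the later ones.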

Old Compressed Format

  • The first four bytes (0x4300-0x4303) are :c:\x00.
  • The next two bytes (0x4304-0x4305) are the length of the decompressed code, stored MSB first.
  • The next two bytes (0x4306-0x4307) are always zero.
  • The remainder (0x4308-0x7fff) is the compressed data.

The decompression algorithm processes the compressed data one byte at a time, and performs an action based on the value, until the expected number of decompressed characters has been emitted:

  • 0x00: Copy the next byte directly to the output stream.
  • 0x01-0x3b: Emit a character from a lookup table, listed here in order starting at 0x01: newline, space, 0123456789abcdefghijklmnopqrstuvwxyz!#%(){}[]<>+=/*:;.,~_
  • 0x3c-0xff: Calculate an offset and length from this byte and the next byte, then copy those bytes from what has already been emitted. In other words, go back "offset" characters in the output stream, copy "length" characters, then paste them to the end of the output stream. Offset and length are calculated as:
offset = (current_byte - 0x3c) * 16 + (next_byte & 0xf)
length = (next_byte >> 4) + 2

Note that, unlike in typical length-offset encodings, length cannot be greater than offset.
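The byte-by-byte rules above translate fairly directly into code. The following Python sketch is one possible reading of them, not the official implementation; the names `OLD_LUT` and `decompress_old` are hypothetical, and `code` is assumed to hold the bytes starting at 0x4300.

```python
# Lookup table for bytes 0x01-0x3b, in the order listed above (0x01 is newline)
OLD_LUT = "\n 0123456789abcdefghijklmnopqrstuvwxyz!#%(){}[]<>+=/*:;.,~_"


def decompress_old(code: bytes) -> bytes:
    """Decompress old-format code following the rules described above."""
    if code[:4] != b":c:\x00":
        raise ValueError("not old-format compressed code")
    length = int.from_bytes(code[4:6], "big")  # decompressed length, MSB first
    out = bytearray()
    pos = 8  # compressed data starts after the 8-byte header
    while len(out) < length:
        current = code[pos]
        pos += 1
        if current == 0x00:
            # Copy the next byte directly to the output
            out.append(code[pos])
            pos += 1
        elif current <= 0x3B:
            # Emit a character from the lookup table
            out.append(ord(OLD_LUT[current - 1]))
        else:
            # Copy `count` bytes from `offset` bytes back in the output
            nxt = code[pos]
            pos += 1
            offset = (current - 0x3C) * 16 + (nxt & 0xF)
            count = (nxt >> 4) + 2
            start = len(out) - offset
            for i in range(count):
                out.append(out[start + i])
    return bytes(out[:length])


# "abc" from the table, a copy of 3 bytes from 3 back, then a literal "A"
sample = b":c:\x00\x00\x07\x00\x00" + bytes([0x0D, 0x0E, 0x0F, 0x3C, 0x13, 0x00, 0x41])
```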

Version ID

Byte 0x8000 encodes a version ID. This appears to have changed over multiple versions of PICO-8, but the file format has not changed.

Bytes 0x8001-0x8004 appear to encode a minor version or build number as a 32-bit integer, most significant byte first. The value first became non-zero partway through version 8 and has increased ever since (example values are 67959 and 70776).
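Reading these five bytes from the decoded cart data could look like the Python sketch below. The function name is hypothetical, `cart` is assumed to hold the full decoded byte stream starting at 0x0000, and the version ID value 29 in the example is made up for illustration.

```python
def read_version(cart: bytes):
    """Return (version_id, build_number) from bytes 0x8000-0x8004."""
    version_id = cart[0x8000]
    # 32-bit integer, most significant byte first
    build = int.from_bytes(cart[0x8001:0x8005], "big")
    return version_id, build


# Build a dummy cart image with a made-up version ID and an example build number
dummy = bytearray(0x8005)
dummy[0x8000] = 29
dummy[0x8001:0x8005] = (70776).to_bytes(4, "big")
info = read_version(bytes(dummy))
```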

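References

  • Official C code released by Lexaloffle that supports the compression format: https://github.com/dansanderson/lexaloffle
  • Picotool, a Python library that can read files in this format: https://github.com/dansanderson/picotool
  • Forum post by asterick describing the code compression format: http://www.lexaloffle.com/bbs/?tid=2400
  • Lempel–Ziv–Welch compression: https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch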