IFF Format overview

The following is a quick description of the flib library. It implements a generic structured file access mechanism based on a generalized IFF format. This is what we currently use for images and textures.

Kernel

File type independence

The primary goal of the flib library is to present all file accesses in a homogeneous manner. Disk files, pipes, memory segments, and so on are all logically represented as files and are manipulated through the same set of functions. The flib kernel is composed of eight functions: FLopen, FLreopen, FLclose, FLread, FLwrite, FLseek, FLtell and FLflush. These low level IO routines can be used instead of the libc routines (open, fopen, read, fread, write, fwrite...).

Another advantage is the removal of certain restrictions imposed by a file’s opening mode. For example, it becomes possible to write to a pipe file opened for reading (such as "pipe:cat file").

FLopen uses naming conventions to identify the type of logical file you wish to handle, so it is no longer necessary to use different calls for open, popen, fopen, socket, and so on. The currently recognized names are:

File name	Description
name	ordinary disk file
name.Z	compressed file
mmap:name	memory mapped file
pipe:cmd [args]	standard input (output) of cmd
fd:#	file descriptor number #, 0,1,2 for stdin, stdout, stderr
stdin, stdout, stderr	aliases for descriptors 0,1 and 2
host:name	file on remote host
user@host:name	file on remote host accessed via user account
mem:addr	memory segment at address addr
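The naming conventions in the table can be sketched as a small classifier. This is only an illustration of the idea, not flib's own parsing code: the NameKind enum and the classify_name function are hypothetical names, and the order in which the rules are tried is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical kinds of logical file, one per row of the table. */
typedef enum {
    NAME_DISK, NAME_COMPRESSED, NAME_MMAP, NAME_PIPE, NAME_FD,
    NAME_STD, NAME_REMOTE, NAME_MEM
} NameKind;

static int has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Classify a logical file name (assumed rule order, not flib's). */
NameKind classify_name(const char *name)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);

    /* Explicit prefixes are tried first so that "pipe:..." is never
       mistaken for a "host:name" remote file. */
    if (has_prefix(name, "mmap:")) return NAME_MMAP;
    if (has_prefix(name, "pipe:")) return NAME_PIPE;
    if (has_prefix(name, "fd:"))   return NAME_FD;
    if (has_prefix(name, "mem:"))  return NAME_MEM;

    if (strcmp(name, "stdin") == 0 || strcmp(name, "stdout") == 0 ||
        strcmp(name, "stderr") == 0)
        return NAME_STD;

    /* Any remaining colon is read as host:name or user@host:name. */
    if (strchr(name, ':') != NULL) return NAME_REMOTE;

    /* A trailing ".Z" marks a compressed disk file. */
    if (len > 2 && strcmp(name + len - 2, ".Z") == 0)
        return NAME_COMPRESSED;

    return NAME_DISK;
}
```

With these rules, "pipe:cat file" is classified as a pipe and "user@host:name" as a remote file, while a plain "name" falls through to an ordinary disk file.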

Certain limitations exist depending on the exact nature of the opened file object (e.g. can’t FLseek on a pipe).

Files are buffered when this makes sense. Transfers can also be accelerated by minimizing the number of memory copies: FLbgnread, FLendread, FLbgnwrite and FLendwrite give you direct access to the library’s read/write buffers. These are particularly efficient for memory mapped files.

Format independence

File access libraries based on flib can gain an extra degree of independence from the format of the data a file contains by using the FLfilter function, which lets you pass a file through an external filter before reading or writing it.

Control

Error handling is similar to that offered by the standard C IO functions and system calls. The state variable flerror is modified when an error occurs, and several functions provide access to this value: FLerror, FLseterror, FLperror, FLstrerror, FLoserror, and FLsetoserror. The set of errors that are handled is a superset of the Linux standard errors (errno, strerror, h_errno and hstrerror if supported by the system).

IO functions are implemented to avoid cascading errors. However, it is strongly suggested that you do not attempt to continue reading/writing when an error occurs.

Certain parameters of the library can be modified by calling FLconfig: creation of temporary files, mapping, automatic compression/decompression.

FLsetpath and FLaddpath allow you to define and augment the path used to resolve file names for read access.

FLbuildpath and FLfreepath construct and destroy paths that can be activated with the FLswitchpath call, which optimizes frequent path changes. FLsetreorder can also be called to optimize path traversal.

Structured files

Flib implements a set of rules for file structure derived from IFF. The structure is based on the use of tags to identify blocks of data called chunks or structures of chunks called groups. Each tag is made up of four characters and is immediately followed by the size of the chunk or group that it describes coded on 4 bytes. This structure is the same as in the IFF (Interchange File Format), with a few extensions. All data is written in big endian format, except for tags, which are handled as pseudo character strings. (Byte swapping is handled at compile time).

Block size data allows the parser to skip information it does not recognize.
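The header layout described above (a four-character tag followed by a 4-byte big endian size) can be sketched in portable C as follows. The ChunkHeader type and the put_header and get_header functions are hypothetical helpers for illustration; they are not part of the flib API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative chunk header: a 4-character tag plus a 4-byte size. */
typedef struct {
    char     tag[5];   /* four tag characters plus a terminating NUL */
    uint32_t size;     /* size of the data that follows the header */
} ChunkHeader;

/* Write tag and size into out[0..7]; the size is stored big endian. */
void put_header(unsigned char out[8], const char *tag, uint32_t size)
{
    assert(strlen(tag) == 4);
    memcpy(out, tag, 4);                  /* tags are copied byte for byte */
    out[4] = (unsigned char)(size >> 24);
    out[5] = (unsigned char)(size >> 16);
    out[6] = (unsigned char)(size >> 8);
    out[7] = (unsigned char)size;
}

/* Decode the 8 header bytes at in[0..7]. */
ChunkHeader get_header(const unsigned char in[8])
{
    ChunkHeader h;
    memcpy(h.tag, in, 4);
    h.tag[4] = '\0';
    h.size = ((uint32_t)in[4] << 24) | ((uint32_t)in[5] << 16) |
             ((uint32_t)in[6] << 8)  |  (uint32_t)in[7];
    return h;
}
```

Assembling the size from individual bytes keeps the sketch independent of the host byte order. A parser that does not recognize the decoded tag can advance past the header and the number of data bytes given by the size field.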

There are two types of tags: tags that define the file structure (i.e. groups) and tags that contain data.

Groups

Four tags are used to arrange blocks into groups: FORM, CAT, LIST, and PROP. The first four characters following the size are used to identify the type of the group.

FORM defines the beginning of a data block, in a way similar to a C struct.

FORM 38 TEXT

	CHAR 6 "Times"

	CHAR 12 "Hello World"

EOF

is similar to

struct Text {

	char *f;

	char *c;

} t = { "Times", "Hello World" };

The size of the group (38) is equal to the size of the data it contains (6 plus 12) plus the size of the headers (4 for TEXT, 8 for CHAR 6 and 8 for CHAR 12) giving, in this case, 6+12+4+8+8 = 38.
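The same arithmetic can be written as a small helper. This is a sketch under the rule just stated (4 bytes for the group type, 8 bytes of header per chunk); group_size and the two constants are hypothetical names, and padding is ignored because both data sizes in the example are even.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HEADER_BYTES = 8, TYPE_BYTES = 4 };

/* Size field of a group holding n chunks with the given data sizes. */
uint32_t group_size(const uint32_t *chunk_sizes, size_t n)
{
    uint32_t total = TYPE_BYTES;              /* the group type, e.g. TEXT */
    size_t i;

    assert(chunk_sizes != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += HEADER_BYTES + chunk_sizes[i];   /* tag + size + data */
    return total;
}
```

Called with the data sizes 6 and 12 (each string length plus its terminating NUL), the helper returns 38. A nested group would be counted like any other entry: its own size plus its 8-byte header.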

As in C structures you can nest groups as in the following example:

FORM 74 TEXT

	FORM 42 FONT

		CHAR 6 "Times"

		LONG 4 <12>

		LONG 4 <0>

	CHAR 12 "Hello World"

EOF

or in C terms:

struct Text {

	struct Font {

		char *n;

		int s;

		int d;

	} f;

	char *string;

} t = { { "Times", 12, 0 }, "Hello World" };

This example does not show it, but a chunk is not restricted to a single data type: it may contain the equivalent of a complete C structure. The role of the FORM tag is to separate independent blocks of data that can be handled separately and to specify the meaning of each sub-unit. In the example above, the CHAR chunk in the FONT FORM does not mean the same thing as the CHAR chunk in the TEXT FORM. The FORM tag is used to determine how you interpret an ordered set of data types.

CAT defines a concatenation of independent objects with no order relation between them. Two typical uses of CATs are libraries of objects (pictures in the first example below) and clipboards (second example).

CAT 3632 PICT

	FORM 1234 PICT ...

	FORM 2378 PICT ...

EOF

CAT 2130 CLIP

	FORM 1234 PICT ...

	FORM 876 DRAW ...

EOF

Searching through a structured file is generally greatly accelerated, even in a CAT that has no order amongst its members, through the knowledge of the size of every group or chunk specified in the header.
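As an illustration of why the size fields speed up searching, the sketch below walks the children of a group held in memory and jumps from header to header without touching the data. The find_child function is a hypothetical helper, not a flib call; it assumes plain IFF 2-byte alignment with the pad byte not counted in the size field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return the byte offset of the index-th child carrying the given tag,
   or -1 if there is none. body/len describe the bytes that follow the
   group type. */
long find_child(const unsigned char *body, size_t len,
                const char *tag, int index)
{
    size_t pos = 0;

    assert(strlen(tag) == 4);
    while (pos + 8 <= len) {
        uint32_t size = read_be32(body + pos + 4);

        if (size > len - pos - 8)
            return -1;                        /* truncated or corrupt data */
        if (memcmp(body + pos, tag, 4) == 0) {
            if (index == 0)
                return (long)pos;
            index--;
        }
        /* Skip the header, the data and the pad byte of odd-sized data. */
        pos += 8 + (size_t)size + (size & 1);
    }
    return -1;
}
```

Only eight bytes per child are inspected, so locating a member of a large CAT costs one header read per preceding member rather than a scan of all the data.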

LIST defines an ordered set of objects (FORM data blocks) and, along with PROP, is used to group objects with similar properties, avoiding redundancy. For example, a sequence of equal-sized images can share a single header.

One image would have a structure like:

FORM .... PICT

	IHDR 32 [image size info]

	BODY ... [image data]

EOF

then a sequence of like-sized images could be done as follows, sharing the common header information:

LIST ... ANIM

	PROP 44 PICT

		IHDR 32

	FORM ... PICT

		BODY ....

	FORM ... PICT

		BODY ....

	FORM ... PICT

		BODY ....

EOF

The information specified in a PROP construct applies until the end of the LIST. It can be redefined locally in a FORM the same way local C variables can: in the above example, the common IHDR is valid in all PICTs that don’t include an IHDR block of their own.
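The scoping rule for PROP can be sketched as a simple lookup: a FORM uses its own IHDR if it has one and falls back to the shared one otherwise. The ImageHeader and PictForm types and the effective_ihdr function are hypothetical in-memory stand-ins chosen for this illustration; they do not reflect how flib stores parsed data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory view of an IHDR chunk. */
typedef struct {
    int width;
    int height;
} ImageHeader;

/* Hypothetical parsed PICT FORM: local_ihdr is NULL when the FORM
   carries no IHDR chunk of its own. */
typedef struct {
    const ImageHeader *local_ihdr;
} PictForm;

/* Pick the header that applies to one FORM inside a LIST. */
const ImageHeader *effective_ihdr(const PictForm *form,
                                  const ImageHeader *prop_ihdr)
{
    assert(form != NULL);
    /* A local IHDR shadows the PROP default, like a local C variable
       shadowing one from an enclosing scope. */
    return form->local_ihdr != NULL ? form->local_ihdr : prop_ihdr;
}
```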

Data blocks

Data blocks are defined by:

[tag] [size] [data]

Example: an image could have the following structure:

FORM 12304 IMAG

	IHDR 200 ... picture header, size, maps ...

	LINE 800 ... data from line 1 ...

	LINE 800 ... data from line 2 ...

...

for a library:

CAT 64200 IMAG

	FORM 12304 IMAG

		IHDR 200

...

	FORM 12304 IMAG

...

and for a sequence,

LIST 64200 IMAG

	PROP 212 IMAG

		IHDR 200		... Common header ...

	FORM 12394 IMAG

...

	FORM 12304 IMAG

		IHDR 200		... Local redefinition ...

...

Alignment considerations

IFF blocks align to 2-byte boundaries. The size specified in the header does not take the padding into account. Current machines typically align their memory on 4- or 8-byte boundaries. Flib uses eight extra tags to let you specify alignment information: four align to 4-byte boundaries (FOR4, CAT4, LIS4 and PRO4) and four align to 8-byte boundaries (FOR8, CAT8, LIS8 and PRO8).

Data blocks inherit the alignment of the group that contains them, as do any sub-groups. It is therefore illegal to create a group aligned to 2-byte boundaries inside a group aligned to 4-byte boundaries; the reverse, however, is perfectly valid.
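The padding rule can be sketched with two small helpers: one rounds a data size up to the alignment in force, the other derives that alignment from a group tag. Both padded_size and group_alignment are hypothetical names used for illustration, not flib functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes that `size` bytes of data occupy once padded to `align`. */
uint32_t padded_size(uint32_t size, uint32_t align)
{
    assert(align == 2 || align == 4 || align == 8);
    /* Rounding up with a mask works because align is a power of two. */
    return (size + align - 1) & ~(align - 1);
}

/* Alignment implied by a 4-character group tag: FOR4, CAT4, LIS4 and
   PRO4 give 4; FOR8, CAT8, LIS8 and PRO8 give 8; anything else is
   treated as plain IFF and gives 2. */
uint32_t group_alignment(const char *tag)
{
    static const char *const stems[] = { "FOR", "CAT", "LIS", "PRO" };
    size_t i;

    assert(tag != NULL && strlen(tag) == 4);
    for (i = 0; i < sizeof stems / sizeof stems[0]; i++) {
        if (memcmp(tag, stems[i], 3) == 0) {
            if (tag[3] == '4') return 4;
            if (tag[3] == '8') return 8;
        }
    }
    return 2;
}
```

Since the size field excludes padding, a reader would advance by the padded size while handing only the unpadded number of bytes to the caller.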

Extensions

One of the major constraints of IFF is that you have to know the size of a group or a chunk before writing it to a file. If you want to change the information a block contains, you have to be able to modify the header to reflect changes in the size of the structure. This poses no problem for seekable files (memory or disk files) but does pose problems for other types of files. Rather than create intermediate temporary files, flib implements a mechanism allowing you to say that you don’t know the size of the block you are working on. Since negative block sizes are meaningless, two special values are set aside for this purpose: FL_szFile, indicating that the size will be written in later, once the entire group has been written, and FL_szFifo, indicating that the size will not be written because the file is not seekable. A special zero-sized block (GEND) is used to indicate the end of the structure.
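A reader facing a group whose size was never written has to find its end by looking for the GEND marker. The sketch below shows one way this could work on data held in memory. It is an assumption-laden illustration: unsized_group_length is a hypothetical helper, the children are assumed to carry known sizes with 2-byte alignment, and the actual values of FL_szFile and FL_szFifo are deliberately not used because they are not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Number of bytes taken by the children of a group written without a
   size, found by walking chunk headers until the zero-sized GEND block.
   buf/len describe the bytes that follow the group type.
   Returns -1 if no GEND is found inside buf. */
long unsized_group_length(const unsigned char *buf, size_t len)
{
    size_t pos = 0;

    assert(buf != NULL);
    while (pos + 8 <= len) {
        uint32_t size = read_be32(buf + pos + 4);

        if (memcmp(buf + pos, "GEND", 4) == 0 && size == 0)
            return (long)pos;                 /* end of the structure */
        if (size > len - pos - 8)
            return -1;                        /* truncated or corrupt data */
        pos += 8 + (size_t)size + (size & 1); /* next child header */
    }
    return -1;
}
```

A nested group that is itself unsized would need a recursive walk, which this sketch leaves out.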

Functions

Blocks can be read and written using calls to FLgetchunk and FLputchunk. For more direct control, you can call FLbgnget and FLbgnput to open a block. FLget and FLput supply services equivalent to FLread and FLwrite within a block. After the appropriate number of FLget or FLput calls, you close the block using FLendget or FLendput.

Groups are handled using FLbgnrgroup, FLbgnwgroup, FLendrgroup and FLendwgroup. Flib also implements a generic parser, FLparse, that can scan a file and check its consistency, as well as install callbacks for each step of the parse (start of group, end of group).