BYTEA vs OID (Large Objects)

Top Previous Next

In PostgreSQL there are two ways for storing Binary Large OBjects (BLOBs). First, is to use bytea type. And the second, is to use ability of Large Objects:

BYTEA type

BYTEA type or binary string is very similar to simple character strings, like varchar and text. However, they are distinguished by two characteristics:

First, binary strings specifically allow storing octets of value zero and other "non-printable" octets (usually, octets outside the range 32 to 126). Character strings disallow zero octets, and also disallow any other octet values and sequences of octet values that are invalid according to the database's selected character set encoding.

Second, operations on binary strings process the actual bytes, whereas the processing of character strings depends on locale settings. In short, binary strings are appropriate for storing data that the programmer thinks of as "raw bytes", whereas character strings are appropriate for storing text.

Similitude means that BYTEA values will be included to the result data, which may cost efficiency. In other words, PostgresDAC will load all data from server locally and then user will be able to work with it.

Developer must also remember sending BYTEA values to server that octets of certain values must be escaped. In general, to escape an octet, it is converted into the three-digit octal number equivalent of its decimal octet value, and preceded by two backslashes, e.g. '\\xxx'. Thus in the worst case the size of the data sent through network may be increased a lot.

OID type (Large Objects)

PostgreSQL has a large object facility, which provides stream-style access to user data that is stored in a special large-object structure. Streaming access is useful when working with data values that are too large to manipulate conveniently as a whole.

All large objects are placed in a single system table called pg_largeobject. PostgreSQL also supports a storage system called "TOAST" that automatically stores values larger than a single database page into a secondary storage area per table. This makes the large object facility partially obsolete. One remaining advantage of the large object facility is that it allows values up to 2 GB in size, whereas TOASTed fields can be at most 1 GB. Also, large objects can be randomly modified using a read/write API that is more efficient than performing such operations using TOAST.

The large object implementation breaks large objects up into "chunks" and stores the chunks in rows in the database. A B-tree index guarantees fast searches for the correct chunk number when doing random access reads and writes.

It's possible that pg_largeobject table will hold some object data, but it might not be used anywhere in database. For now there are no build-in methods to remove this ghosts, as set forth in PostgreSQL help. But there is a thought that this may be done by VACUUM command. However, this functionality is not implemented yet.

Comparative table

Characteristic	BYTEA	OID
Max. allowed space	1 GB	2 GB
Data access	As a whole	Stream-style
Storage	In defined table	In pg_largeobject system table
Data manipulation	Using SQL and escaping sequnces	Only within transaction block by special functions
Loading	Preload	On demand