PostgreSQL: How the Rows are stored Physically using ctid?

PostgreSQL stores table data in fixed-size units called pages (also known as blocks).

The size of each data page is 8 kB by default.
A table is stored as a heap file made up of these pages. Each page begins with a header that holds information such as a checksum, the start of free space, and the end of free space.

After the header, each page contains an array of line pointers. Each line pointer is an (offset, length) pair that points to an actual item (a row version) stored in the same page.
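The page layout described above can be sketched with a toy model. The Python below is only an illustration, not PostgreSQL's implementation: the ToyPage class and its method names are hypothetical, alignment padding and tuple headers are ignored, and the 24-byte header and 4-byte line pointer sizes assume a default build.

```python
PAGE_SIZE = 8192    # default PostgreSQL page size in bytes
HEADER_SIZE = 24    # page header size; used here only as the starting offset
POINTER_SIZE = 4    # space taken by one line pointer


class ToyPage:
    """Hypothetical, simplified page: line pointers grow from the front, items from the back."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.lower = HEADER_SIZE   # start of free space
        self.upper = PAGE_SIZE     # end of free space
        self.line_pointers = []    # list of (offset, length) pairs

    def add_item(self, item):
        """Store item bytes at the end of free space; return its 1-based slot index."""
        if self.upper - self.lower < len(item) + POINTER_SIZE:
            raise ValueError("not enough free space in page")
        self.upper -= len(item)
        self.data[self.upper:self.upper + len(item)] = item
        self.line_pointers.append((self.upper, len(item)))
        self.lower += POINTER_SIZE
        return len(self.line_pointers)

    def get_item(self, index):
        """Follow the line pointer at the given 1-based index to the item bytes."""
        offset, length = self.line_pointers[index - 1]
        return bytes(self.data[offset:offset + length])


page = ToyPage()
first = page.add_item(b"alice")
second = page.add_item(b"bob")
print(first, second, page.line_pointers, page.get_item(first))
```

The free space is whatever remains between lower and upper, which is why the header only needs to record those two positions.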

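PostgreSQL treats stored row data as immutable: an existing row version is never modified in place. These immutable row versions are called "tuples" in PostgreSQL.
Each tuple is uniquely identified within its table by its ctid. A ctid represents the physical location of a tuple as a pair (block number, tuple index within the block), for example (0,1).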
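To make the (block number, tuple index) idea concrete, here is a small sketch that splits a ctid as PostgreSQL prints it and computes where the page starts. The function names are hypothetical, the 8192-byte block size assumes a default build, and the byte offset ignores the fact that large tables are split across several segment files.

```python
BLOCK_SIZE = 8192  # assumed default block size in bytes


def parse_ctid(text):
    """Split a ctid printed as '(block,index)' into two integers."""
    block, index = text.strip("()").split(",")
    return int(block), int(index)


def page_byte_offset(block_number, block_size=BLOCK_SIZE):
    """Byte position where the given block starts, ignoring segment files."""
    return block_number * block_size


block, index = parse_ctid("(12,7)")
print(block, index, page_byte_offset(block))
```

The tuple index is the position of the tuple's line pointer inside that page, not a byte offset.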

Because of the MVCC architecture, a single logical row can exist as several tuples at once, each with its own ctid, which can sometimes degrade the write performance of PostgreSQL.

What is Multi Version Concurrency Control (MVCC)

For example, when we update a row, PostgreSQL does not overwrite it. It inserts a new version of the row and marks the old version as expired; once no transaction can still see the old version, it becomes a dead row.

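In general, this operation requires a new ctid for the new row version, new index entries that point to it, and the eventual removal of the old version and its index entries.
This design avoids many locking problems, because readers and writers do not block each other, but it makes writes more expensive.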
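The update behaviour can be illustrated with a toy model. This is a simplified sketch, not how PostgreSQL is implemented: the ToyHeap class is hypothetical, it models a single page, and the dead flag stands in for the transaction visibility information that PostgreSQL actually stores on each tuple.

```python
class ToyHeap:
    """Hypothetical single-page heap where an update never overwrites a row version."""

    def __init__(self):
        self.block = 0    # this toy heap has only one page
        self.slots = []   # each slot holds one row version

    def insert(self, data):
        """Append a new row version and return its ctid as (block, index)."""
        self.slots.append({"data": data, "dead": False})
        return (self.block, len(self.slots))

    def update(self, ctid, new_data):
        """Mark the old version dead and insert the new version at a new ctid."""
        _, index = ctid
        self.slots[index - 1]["dead"] = True
        return self.insert(new_data)


heap = ToyHeap()
old_ctid = heap.insert({"id": 1, "name": "old"})
new_ctid = heap.update(old_ctid, {"id": 1, "name": "new"})
print(old_ctid, new_ctid)
```

After the update, both versions still occupy space in the page; only the newer one is live.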

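ctid is a system column that PostgreSQL defines implicitly on every table.
It is not returned by SELECT *, but any user can inspect it by naming it explicitly in the select list.

For example, a query of the form SELECT ctid, * FROM some_table; (where some_table is a placeholder name) returns each row's ctid alongside its regular columns.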

Running VACUUM, either manually or through autovacuum, is required to maintain the overall performance of PostgreSQL: it removes the dead rows (old row versions) generated by the MVCC architecture and makes their space available for reuse.
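As a rough illustration of what removing dead rows means, the sketch below frees dead slots in a toy page and lets later inserts reuse them. The function names and the list-of-slots representation are hypothetical, and the real VACUUM does considerably more, such as cleaning up index entries.

```python
def vacuum(slots):
    """Free every dead slot in place and return how many were reclaimed."""
    reclaimed = 0
    for i, slot in enumerate(slots):
        if slot is not None and slot["dead"]:
            slots[i] = None   # the slot becomes reusable
            reclaimed += 1
    return reclaimed


def insert(slots, data):
    """Reuse the first free slot if there is one, else append; return the 1-based index."""
    for i, slot in enumerate(slots):
        if slot is None:
            slots[i] = {"data": data, "dead": False}
            return i + 1
    slots.append({"data": data, "dead": False})
    return len(slots)


slots = [{"data": "v1", "dead": True}, {"data": "v2", "dead": False}]
freed = vacuum(slots)
reused_index = insert(slots, "v3")
appended_index = insert(slots, "v4")
print(freed, reused_index, appended_index)
```

Note that the live version keeps its slot in this model; only the space held by dead versions is recycled.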

PostgreSQL: Short note on VACUUM, VACUUM FULL and ANALYZE

Anvesh Patel
