Database Research & Development

  • Home
  • NoSQL
    • NoSQL
    • Cassandra
  • Databases
    • Database Theory
    • Database Designing
    • SQL Server Coding Standards
    • SQL Server
    • PostgreSQL
    • MySQL
    • Greenplum
    • Linux
  • Interviews
    • SQL Server Interviews
    • MySQL Interviews
    • SQL Puzzles
  • DBA Scripts
    • SQL Server DBA Scripts
    • PostgreSQL DBA Scripts
    • MySQL DBA Scripts
    • Greenplum DBA Scripts
  • Home
  • Blog Archives !
  • (: Laugh@dbrnd :)
  • Contact Me !
sqlserverinterviews
Home 2016 December PostgreSQL: CLUSTER – Improve Index Performance (No default cluster index)

PostgreSQL: CLUSTER – Improve Index Performance (No default cluster index)

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

I have received a few emails related to PostgreSQL default cluster index and I have received from the developers of Microsoft SQL Server.
If you are Microsoft SQL Server developer, you know that Primary key of the Table is working as Cluster Index and resides in the same data page where data actually stored.

In PostgreSQL, There is no concept like: Table Primary Key means default Cluster Index of that table.
In PostgreSQL, we have one CLUSTER command which is similar to Cluster Index.

Once you create your table primary key or any other Index, you can execute the CLUSTER command by specifying that Index name to achieve the physical order of the Table Data.

CLUSTER instructs PostgreSQL to cluster the table specified by table_name based on the index specified by index_name. The index must already have been defined on table_name.

A clustering on an index forces the physical ordering of the data and you can have only one clustered index per table and should carefully pick which index you will use to cluster.

When a table is clustered, it is physically reordered based on the index information. Clustering is a one-time operation: when the table is subsequently updated, the changes are not clustered. That is, no attempt is made to store new or updated rows according to their index order.

If you wish, you can periodically recluster by issuing the CLUSTER command again and again.

A executing of CLUSTER command requires an ACCESS EXCLUSIVE lock on the table because It reorders your data. You can also get some free space by issuing CLUSTER because It rearranges the data in a particular order.

When you execute the CLUSTER command, It creates a temporary copy of the table therefore, you need free space on disk at least equal to the original table size.

When an index scan is used, a temporary copy of the table is created that contains the table data in the index order. Temporary copies of each index on the table are created as well. Therefore, you need free space on disk at least equal to the sum of the table size and the index sizes.

It is advisable to set maintenance_work_mem to a reasonably large value before clustering.

Once you are done with your CLUSTER, you should execute the ANALYZE command on that table because planner records statistics about the ordering of tables.

Syntax of Cluster:
First time you must execute CLUSTER using the Index Name.

1
CLUSTER table_name USING index_name;

Cluster the table:
Once you have executed CLUSTER with Index, next time you should execute only CLUSTER TABLE because It knows that which index already defined as CLUSTER.

1
CLUSTER table_name;

Cluster all tables of database:

1
CLUSTER;

Dec 2, 2016Anvesh Patel
Database Theory: What is difference between MySQL and PostgreSQL?Database News: Now Uber switched from PostgreSQL to MySQL
Anvesh Patel
Anvesh Patel

Database Engineer

December 2, 2016 PostgreSQLAnvesh Patel, CLUSTER, Cluster Index, database, database research and development, dbrnd, explicit lock, fragmentation, Index Performance, Order of data, plpgsql, Postgres Query, postgresql, PostgreSQL Administrator, PostgreSQL Error, PostgreSQL Programming, PostgreSQL Tips and Tricks, Primary key
About Me!

I'm Anvesh Patel, a Database Engineer certified by Oracle and IBM. I'm working as a Database Architect, Database Optimizer, Database Administrator, Database Developer. Providing the best articles and solutions for different problems in the best manner through my blogs is my passion. I have more than six years of experience with various RDBMS products like MSSQL Server, PostgreSQL, MySQL, Greenplum and currently learning and doing research on BIGData and NoSQL technology. -- Hyderabad, India.

About DBRND !

dbrnd

This is a personal blog (www.dbrnd.com).

Any views or opinions represented in this blog are personal and belong solely to the blog owner and do not represent those of people, institutions or organizations that the owner may or may not be associated with in professional or personal capacity, unless explicitly stated.

Feel free to challenge me, disagree with me, or tell me I’m completely nuts in the comments section of each blog entry, but I reserve the right to delete any comment for any reason whatsoever (abusive, profane, rude, or anonymous comments) - so keep it polite.

The content of this website is protected by copyright. No portion of this website may be copied or replicated in any form without the written consent of the website owner.

Recent Comments !
  • Anvesh Patel { Sure will do... } – May 27, 12:43 PM
  • Anvesh Patel { Great... } – May 27, 12:41 PM
  • Anvesh Patel { Great... } – May 27, 12:39 PM
  • Anvesh Patel { Great... } – May 27, 12:36 PM
  • Anvesh Patel { Great... } – May 27, 12:28 PM
  • Anvesh Patel { Great... } – May 27, 12:27 PM
  • Anvesh Patel { Great... } – May 27, 12:16 PM
  • Older »
Follow Me !
  • facebook
  • linkedin
  • twitter
  • youtube
  • google
  • flickr
© 2015 – 2019 All rights reserved. Database Research & Development (dbrnd.com)
Posting....