Database Research & Development

  • Home
  • NoSQL
    • NoSQL
    • Cassandra
  • Databases
    • Database Theory
    • Database Designing
    • SQL Server Coding Standards
    • SQL Server
    • PostgreSQL
    • MySQL
    • Greenplum
    • Linux
  • Interviews
    • SQL Server Interviews
    • MySQL Interviews
    • SQL Puzzles
  • DBA Scripts
    • SQL Server DBA Scripts
    • PostgreSQL DBA Scripts
    • MySQL DBA Scripts
    • Greenplum DBA Scripts
  • Home
  • Blog Archives !
  • (: Laugh@dbrnd :)
  • Contact Me !
sqlserverinterviews
Home 2018 February Greenplum: Script to find Table Fragmentation and total wasted space

Greenplum: Script to find Table Fragmentation and total wasted space

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

I recently started work with Greenplum Distributed RDBMS. Internally, it is using PostgreSQL 8.2 which is an ancient version of PostgreSQL.

I shared a couple of articles on finding the fragmentation in PostgreSQL, but none of my scripts worked for Greenplum PostgreSQL 8.2.

I have Posted one question on StackOverflow. Please check this link.

Finally, I found a script for finding the fragmentation of Greenplum tables which is very important for Greenplum DBA because Greenplum is a based on MVCC architecture which generates dead rows at every UPDATE / DELETE.

What is Multi Version Concurrency Control (MVCC)

You can use below script for finding the wasted space of the tables and can execute VACUUM ANALYZE on those tables.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
select
current_database() as dbname
, relid
, schemaname
, tablename
, round(bloat,1) as bloat_ratio
, pg_size_pretty(expbytes) as expected_size
, pg_size_pretty(relbytes) as relation_size
, pg_size_pretty(wastedbytes) as wasted_space
, round(1.0 * relbytes / nullif((sum(relbytes) over ())::bigint,0),4) as relative_size
, round((bloat-1) * (1.0 * relbytes / nullif((sum(relbytes) over ())::bigint, 0)), 8) as concern
from
(
SELECT
relid
, schemaname
, tablename
, CASE WHEN otta=0 THEN
0.0
ELSE
sml.relpages/otta::numeric
END AS bloat
, reltuples::bigint
, relpages::bigint
, otta
, (bs*otta)::bigint as expbytes
, CASE WHEN relpages < otta THEN 0 ELSE
(bs*(sml.relpages-otta))::bigint
END AS wastedbytes
, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages
, (bs*relpages)::bigint as relbytes
FROM
(
SELECT
schemaname
, tablename
, cc.oid as relid
, cc.reltuples
, cc.relpages
, bs
, CEIL((cc.reltuples*((datahdr+ma-(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta
FROM
(
SELECT
ma
, bs
, schemaname
, tablename
, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr
, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM
(
SELECT
schemaname
, tablename
, hdr
, ma
, bs
, SUM((1-s.null_frac)*avg_width) AS datawidth
, MAX(s.null_frac) AS maxfracsum
, hdr+(1+(count(case when s.null_frac<>0 then 1 end))/8) as nullhdr
FROM
pg_stats s
cross join
(
SELECT
current_setting('block_size')::numeric AS bs
, CASE WHEN substring(version(),12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr
, CASE WHEN version() ~ 'mingw32' THEN 8 ELSE 4 END AS ma
) AS constants
GROUP BY schemaname, tablename, hdr, ma, bs
) AS foo
) AS rs
JOIN pg_class cc
ON
cc.relname = rs.tablename
and cc.relkind = 'r'
JOIN pg_namespace nn
ON
cc.relnamespace = nn.oid
AND nn.nspname = rs.schemaname
) AS sml
) wrapper
where wastedbytes > 2*1024*1024 and bloat >= 1.4order by (bloat-1) * (1.0 * relbytes / nullif((sum(relbytes) over ())::bigint, 0)) desc;

Taken reference from this article

Feb 6, 2018Anvesh Patel
Greenplum: gp_bloat_diag to get the list of pending VACUUM of TablesSQL Server: sp_spaceused for accurate disk space information of Database
Comments: 1
  1. Badal Singh Dudy
    October 11, 2019 at 7:57 am

    Hello Anvesh Patel,

    Its a good Article its too informative but i have one question how to i identified when i need to rebuilt the Index in Postgres or it will manage by Vaccum Analyze process?

    ReplyCancel

Leave a Reply to Badal Singh Dudy Cancel reply

CAPTCHA
Refresh

*

Anvesh Patel
Anvesh Patel

Database Engineer

February 6, 2018 1 Comment GreenplumAnvesh Patel, bloat, database, database research and development, dbrnd, dead tuples, Greenplum, Greenplum Administrator, Greenplum Error, Greenplum Monitoring, Greenplum Performance Tuning, Greenplum Programming, Greenplum Query, Greenplum Tips and Tricks, pg_size_pretty, PostgreSQL 8.2, table fragmentation, wasted space
About Me!

I'm Anvesh Patel, a Database Engineer certified by Oracle and IBM. I'm working as a Database Architect, Database Optimizer, Database Administrator, Database Developer. Providing the best articles and solutions for different problems in the best manner through my blogs is my passion. I have more than six years of experience with various RDBMS products like MSSQL Server, PostgreSQL, MySQL, Greenplum and currently learning and doing research on BIGData and NoSQL technology. -- Hyderabad, India.

About DBRND !

dbrnd

This is a personal blog (www.dbrnd.com).

Any views or opinions represented in this blog are personal and belong solely to the blog owner and do not represent those of people, institutions or organizations that the owner may or may not be associated with in professional or personal capacity, unless explicitly stated.

Feel free to challenge me, disagree with me, or tell me I’m completely nuts in the comments section of each blog entry, but I reserve the right to delete any comment for any reason whatsoever (abusive, profane, rude, or anonymous comments) - so keep it polite.

The content of this website is protected by copyright. No portion of this website may be copied or replicated in any form without the written consent of the website owner.

Recent Comments !
  • Anvesh Patel { Sure will do... } – May 27, 12:43 PM
  • Anvesh Patel { Great... } – May 27, 12:41 PM
  • Anvesh Patel { Great... } – May 27, 12:39 PM
  • Anvesh Patel { Great... } – May 27, 12:36 PM
  • Anvesh Patel { Great... } – May 27, 12:28 PM
  • Anvesh Patel { Great... } – May 27, 12:27 PM
  • Anvesh Patel { Great... } – May 27, 12:16 PM
  • Older »
Follow Me !
  • facebook
  • linkedin
  • twitter
  • youtube
  • google
  • flickr
© 2015 – 2019 All rights reserved. Database Research & Development (dbrnd.com)
Posting....