PostgreSQL 9.6: Powerful Parallel Query Processing

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

A most awaited feature, The Powerful Parallel Sequential Query Scan feature introduced in PostgreSQL 9.6.

You can easily set the Parallel Sequential parameter’s value and can execute your queries 10 times faster.
When we are talking about Parallel Sequential Scanning, in background multiple workers or CPU threads are responsible for executing one single query.

In the PostgreSQL 9.6, You can change the Process Workers parameter value which is default 8. (Using max_worker_processes parameter)

Important Note:
If you want to increase Parallel Process Worker parameter value, You should make sure about your total CPU usage because More Parallel Process Worker required More CPU.
Once you change the parameter value, You must restart the PostgreSQL service.

Database Theory: What is Parallel Query Processing (Parallel Database System)?

The details about new Parallel Sequential Parameters:

max_worker_processes: Total number of background workers limited by this parameter. (Default: 8)
max_parallel_workers_per_gather: Total number of workers that can assist a sequential scan.
parallel_setup_cost: used to estimate the cost of instantiate a worker.
min_parallel_relation_size: You can define the minimum size of the relation and only those relations will consider for additional workers.
parallel_tuple_cost: used to estimate the cost of transferring a tuple from one worker to another.

Below is a full demonstration of this:

Prepared different result set by changing the max_parallel_workers_per_gather parameter value and once you change the value of this parameter, please do not forget to restart PostgreSQL Service. (You can find this parameter in postgresql.conf)

For each execution result, You should compare the value of “Execution time”.’

First create one sample table:

CREATE TABLE tbl_ItemTransactions

(

TranID SERIAL

,TransactionDate TIMESTAMPTZ

,TransactionName TEXT

);

Insert Millions of data to test the performance of BRIN Index:

INSERT INTO tbl_ItemTransactions

(TransactionDate, TransactionName)

SELECT x, 'dbrnd'

FROM generate_series('2014-01-01 00:00:00'::timestamptz, '2016-08-01 00:00:00'::timestamptz,'2 seconds'::interval) a(x);

Check the total size of table:

1 2	SELECT pg_size_pretty(pg_total_relation_size('tbl_ItemTransactions')) AS TableSize; -- total size "2028 MB"

Check the execution plan of below query, where max_parallel_workers_per_gather = 0:

1	EXPLAIN ANALYZE SELECT COUNT(1) FROM tbl_ItemTransactions;

Result:

"Aggregate (cost=768697.65..768697.66 rows=1 width=8)

(actual time=59863.028..59863.029 rows=1 loops=1)"

" -> Seq Scan on tbl_itemtransactions

(cost=0.00..666853.32 rows=40737732 width=0)

(actual time=0.825..30647.470 rows=40737601 loops=1)"

"Planning time: 0.079 ms"

"Execution time: 59863.095 ms"

Check the execution plan of below query, where max_parallel_workers_per_gather = 2:

1	EXPLAIN ANALYZE SELECT COUNT(1) FROM tbl_ItemTransactions;

Result:

"Finalize Aggregate (cost=472651.90..472651.91 rows=1 width=8)

(actual time=21129.508..21129.511 rows=1 loops=1)"

" -> Gather (cost=472651.69..472651.90 rows=2 width=8)

(actual time=21129.213..21129.492 rows=3 loops=1)"

" Workers Planned: 2"

" Workers Launched: 2"

" -> Partial Aggregate (cost=471651.69..471651.70 rows=1 width=8)

(actual time=21102.175..21102.176 rows=1 loops=3)"

" -> Parallel Seq Scan on tbl_itemtransactions

(cost=0.00..429216.55 rows=16974055 width=0)

(actual time=0.048..10929.919 rows=13579200 loops=3)"

"Planning time: 0.052 ms"

"Execution time: 21132.005 ms"

Check the execution plan of below query, where max_parallel_workers_per_gather = 4:

1	EXPLAIN ANALYZE SELECT COUNT(1) FROM tbl_ItemTransactions;

Result:

"Finalize Aggregate (cost=387781.83..387781.84 rows=1 width=8)

(actual time=16212.360..16212.361 rows=1 loops=1)"

" -> Gather (cost=387781.41..387781.82 rows=4 width=8)

(actual time=16211.487..16212.347 rows=5 loops=1)"

" Workers Planned: 4"

" Workers Launched: 4"

" -> Partial Aggregate (cost=386781.41..386781.42 rows=1 width=8)

(actual time=16159.409..16159.409 rows=1 loops=5)"

" -> Parallel Seq Scan on tbl_itemtransactions

(cost=0.00..361320.33 rows=10184433 width=0)

(actual time=0.043..9301.194 rows=8147520 loops=5)"

"Planning time: 0.050 ms"

"Execution time: 13516.982 ms"

Check the execution plan of below query, where max_parallel_workers_per_gather = 6:

1	EXPLAIN ANALYZE SELECT COUNT(1) FROM tbl_ItemTransactions;

Result:

"Finalize Aggregate (cost=345346.90..345346.91 rows=1 width=8)

(actual time=16514.376..16514.377 rows=1 loops=1)"

" -> Gather (cost=345346.27..345346.88 rows=6 width=8)

(actual time=16511.341..16514.351 rows=7 loops=1)"

" Workers Planned: 6"

" Workers Launched: 6"

" -> Partial Aggregate (cost=344346.27..344346.28 rows=1 width=8)

(actual time=16358.386..16358.387 rows=1 loops=7)"

" -> Parallel Seq Scan on tbl_itemtransactions

(cost=0.00..327372.22 rows=6789622 width=0)

(actual time=0.055..9956.567 rows=5819657 loops=7)"

"Planning time: 0.034 ms"

"Execution time: 10288.810 ms"

Nov 21, 2016Anvesh Patel

PostgreSQL 9.6: Powerful Parallel Query Processing – Full Performance Test

A most awaited feature, The Powerful Parallel Sequential Query Scan feature introduced in PostgreSQL 9.6.

Anvesh Patel

About Me!

About DBRND !

Recent Comments !

Follow Me !