Database Theory: What is Parallel Query Processing (Parallel Database System)?

Over the past decade, a great deal of analysis and research has gone into Parallel Database Systems and Parallel Query Processing.

The success of a Parallel Database System depends on the relational database model and on good CPU and disk performance. Without good CPU and disk performance, Parallel Query Processing should not be used.

In general,
When we execute a SQL query, the database creates an execution plan and submits it to the internal query executor. The query executor collects all the execution plans and executes them in sequence.
For example, if one SQL query requires 100,000,000 manipulations, it takes a long time to execute on a system with a single CPU thread.

In Parallel Query Processing,
More than one worker process runs in the background, and the workers share responsibility for the same task or for multiple tasks.

For example, if 4 worker processes execute one big SQL query that requires 100,000,000 manipulations, the whole query can complete up to four times faster than with a single worker process or a single CPU thread. In practice the gain is usually somewhat smaller, because splitting the work and combining the partial results adds overhead.
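The idea of dividing one large job among 4 workers can be sketched roughly as follows. This is only an illustration of the split, process, and combine structure, not how any particular database implements it. The names `partial_sum` and `parallel_sum` are hypothetical, and a simple sum stands in for the query's "manipulations". The sketch uses threads for portability; in CPython, threads do not speed up CPU-bound work, so a real test of the speedup would need separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one worker: aggregate only its own slice of the rows."""
    return sum(chunk)


def parallel_sum(rows, workers=4):
    """Split rows into up to `workers` contiguous chunks, aggregate each
    chunk in its own worker, then combine the partial results."""
    # Ceiling division so that every row lands in exactly one chunk.
    size = max(1, (len(rows) + workers - 1) // workers)
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # The final combine step runs in a single worker.
    return sum(partials), len(chunks)


rows = list(range(1, 1_000_001))
total, n_chunks = parallel_sum(rows, workers=4)
```

The final combine step is one reason the ideal "four times faster" figure is an upper bound: that part of the work is not divided among the workers.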

Basically, there are three main types of Parallel Database System architecture.

Shared Disk:

This approach, in which the same disk is shared among the different background worker processes, is not very popular.

Shared Nothing:

With a shared nothing system, each processor owns a portion of the database and only that portion may be directly accessed or manipulated by that processor. A two phase commit protocol is required to coordinate a transaction commit which involves multiple nodes.
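A two-phase commit across several nodes might look roughly like the sketch below. It is a simplified model under stated assumptions: the `Participant` class and the `two_phase_commit` function are hypothetical names, and the logging, timeouts, and crash recovery that a real protocol needs are left out.

```python
class Participant:
    """One node that owns a portion of the database (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote yes only if the local changes can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2, success path.
        self.state = "committed"

    def rollback(self):
        # Phase 2, failure path.
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every node votes yes."""
    votes = [p.prepare() for p in participants]  # Phase 1: collect all votes
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


ok_nodes = [Participant("node1"), Participant("node2")]
committed = two_phase_commit(ok_nodes)

mixed_nodes = [Participant("node1"), Participant("node2", can_commit=False)]
aborted = two_phase_commit(mixed_nodes)
```

The point of the two phases is that no node commits until every node has confirmed it is able to, so the transaction ends up either committed on all nodes or rolled back on all nodes.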

A shared nothing system is based on the concept of declustering. Declustering a relation involves distributing its tuples among multiple nodes according to some distribution criteria such as applying a hash function to the key attribute of each tuple.
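Hash-based declustering can be illustrated with a short sketch. The sample `employees` relation, the node count of 3, and the function names `node_for` and `decluster` are all assumptions made for illustration; CRC32 is used here only because it is deterministic, and real systems choose their own hash functions.

```python
import zlib
from collections import defaultdict


def node_for(key, num_nodes):
    """Map a key attribute to a node number with a deterministic hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def decluster(tuples, key_index, num_nodes):
    """Distribute tuples among nodes by hashing the key attribute."""
    placement = defaultdict(list)
    for t in tuples:
        placement[node_for(t[key_index], num_nodes)].append(t)
    return dict(placement)


# Hypothetical relation: (employee_id, name), declustered on employee_id.
employees = [(101, "Asha"), (102, "Bram"), (103, "Chen"),
             (104, "Dina"), (105, "Eli")]
placement = decluster(employees, key_index=0, num_nodes=3)
```

Because the same key always hashes to the same node, a lookup by key can be routed directly to the one node that owns the tuple instead of being sent to every node.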

In the context of query processing, the main advantage of a shared nothing system is its scalability.
One of the best examples of this architecture is the Greenplum relational database.

Shared Everything:

In a shared everything system, main memory, in addition to disks, is also shared across all the processors, making system management and load balancing much easier.

First, there are no communication delays because messages are exchanged through the shared memory, and synchronization can be accomplished by cheap, low level mechanisms.

Second, load balancing is much easier because the operating system can automatically allocate the next ready process to the first available processor.
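The load-balancing behaviour described above, where the next ready piece of work goes to the first available processor, can be imitated with a shared in-memory queue. This is a minimal sketch, assuming threads stand in for processors and squaring a number stands in for real work; `run_shared_queue` is a hypothetical name.

```python
import queue
import threading


def run_shared_queue(tasks, num_workers=4):
    """Workers pull tasks from one shared queue, so whichever worker is
    free next takes the next task; no task is assigned in advance."""
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    results = []
    handled_by = {}
    lock = threading.Lock()  # cheap synchronization through shared memory

    def worker(worker_id):
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            value = task * task  # stand-in for real query work
            with lock:
                results.append(value)
                handled_by[task] = worker_id

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results), handled_by


results, handled_by = run_shared_queue(list(range(20)), num_workers=4)
```

Which worker handles which task varies from run to run, which is exactly the point: the assignment follows availability rather than a fixed plan.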

Tomorrow, you can read an article about a performance test of Parallel Query Processing using the PostgreSQL 9.6 database system.


Anvesh M. Patel.

2 Comments on "Database Theory: What is Parallel Query Processing (Parallel Database System)?"

Santhosha
2 days 2 hours ago

Nice article. What is the root cause of ORA-12801 and ORA-01652 (unable to extend temp segment) when processing parallel queries?
