Database Research & Development

  • Home
  • NoSQL
    • NoSQL
    • Cassandra
  • Databases
    • Database Theory
    • Database Designing
    • SQL Server Coding Standards
    • SQL Server
    • PostgreSQL
    • MySQL
    • Greenplum
    • Linux
  • Interviews
    • SQL Server Interviews
    • MySQL Interviews
    • SQL Puzzles
  • DBA Scripts
    • SQL Server DBA Scripts
    • PostgreSQL DBA Scripts
    • MySQL DBA Scripts
    • Greenplum DBA Scripts
  • Home
  • Blog Archives !
  • (: Laugh@dbrnd :)
  • Contact Me !
sqlserverinterviews
Home 2018 February Greenplum: How to Troubleshoot running Database Backup Process

Greenplum: How to Troubleshoot running Database Backup Process

Database Backup

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

Greenplum is based on MPP (massively parallel processing) architecture. It built on shared nothing mode parallel processing in which all child segments own separate resources.

Database Theory: What is Parallel Query Processing (Parallel Database System)?

When you are taking database backup in Greenplum, it executes a backup process in different stages and executes in parallel.

For GP DBA, this is very important for knowing that in which machine or segment backup process is running. They may have few other questions like Is backup process completed for a master segment?, If it is completed for a master segment, in which child segment it is running currently.

You can access this article for troubleshooting Greenplum running database backup process.

You can start Greenplum full database backup using below command:

1
gpcrondump -x database_name -u /home/gpadmin/dbrnd_backup/

It first takes a backup of the master node (all metadata or pg_catalog), while during this time you cannot perform any DDL or any other operation on Greenplum database server like CREATE TABLE, DROP TABLE, TRUNCATE TABLE.
It takes only a few minutes to complete the backup of a master node.

How can you check that, Is Greenplum database backup process running on master node?

The first step is to check if the message “Starting Dump process” is on the screen
After this message, you can find a subsequent message “Releasing pg_class lock” that means backup process completed for master segment (meta data backup).

You can measure the time difference between “Starting Dump process” and “Releasing pg_class lock” that is the total time taken for master segment backup.

If you are unable to find “Releasing pg_class lock” on the screen and find only “Starting Dump process” for a long time, there is 99% problem with a master segment and problem like database lock, network issue, disk failure.

How can we check that, Is Greenplum database backup process running on child nodes?

After taken a master node back up, a gp_dump command will be sent to all the segments to start a copy of the data using COPY command.

We can check the running processes in child nodes and can find COPY command and db_dumps directory in the process list. You can get the list of Linux running processes using a command: ps -ef.
Below is a sample process message:

1
postgres: port 40002, gpadmin database_name [local] con808 [local] cmd880 COPY

Important Note:

Please do not kill long running Greenplum backup from only master node because it may continue the backup process on child segments. Your one wrong kill command can fail your whole Greenplum cluster.


If you guys know how to kill long running backup in Greenplum, please share full detail with me because I am also looking for the solution for killing backup process at all child segments.

Feb 3, 2018Anvesh Patel
Greenplum: Script to find blocked and deadlock by other blocking queriesGreenplum: gp_stats_missing to get the list of pending ANALYZE of Tables

Leave a Reply Cancel reply

CAPTCHA
Refresh

*

Anvesh Patel

Database Engineer

ImageFebruary 3, 2018 GreenplumAnvesh Patel, backup, database, database research and development, dbrnd, gpcrondump, Greenplum, Greenplum Administrator, Greenplum Error, Greenplum Monitoring, Greenplum Performance Tuning, Greenplum Programming, Greenplum Query, Greenplum Tips and Tricks, parallel backup, PostgreSQL 8.2, Releasing pg_class lock, Starting Dump process
About Me!

I'm Anvesh Patel, a Database Engineer certified by Oracle and IBM. I'm working as a Database Architect, Database Optimizer, Database Administrator, Database Developer. Providing the best articles and solutions for different problems in the best manner through my blogs is my passion. I have more than six years of experience with various RDBMS products like MSSQL Server, PostgreSQL, MySQL, Greenplum and currently learning and doing research on BIGData and NoSQL technology. -- Hyderabad, India.

About DBRND !

dbrnd

This is a personal blog (www.dbrnd.com).

Any views or opinions represented in this blog are personal and belong solely to the blog owner and do not represent those of people, institutions or organizations that the owner may or may not be associated with in professional or personal capacity, unless explicitly stated.

Feel free to challenge me, disagree with me, or tell me I’m completely nuts in the comments section of each blog entry, but I reserve the right to delete any comment for any reason whatsoever (abusive, profane, rude, or anonymous comments) - so keep it polite.

The content of this website is protected by copyright. No portion of this website may be copied or replicated in any form without the written consent of the website owner.

Recent Comments !
  • Anvesh Patel { Sure will do... } – May 27, 12:43 PM
  • Anvesh Patel { Great... } – May 27, 12:41 PM
  • Anvesh Patel { Great... } – May 27, 12:39 PM
  • Anvesh Patel { Great... } – May 27, 12:36 PM
  • Anvesh Patel { Great... } – May 27, 12:28 PM
  • Anvesh Patel { Great... } – May 27, 12:27 PM
  • Anvesh Patel { Great... } – May 27, 12:16 PM
  • Older »
Follow Me !
  • facebook
  • linkedin
  • twitter
  • youtube
  • google
  • flickr
© 2015 – 2019 All rights reserved. Database Research & Development (dbrnd.com)
Posting....