Database Design: The Truth about Archive Table and Soft Delete of Historical Data

I have found lots of discussion and questions-answers, which one is a better from Archive Table and Soft Delete.
Today in our company, This same discussion also happened during one of our requirement analysis.
I have marked that this is a very basic topic, but still database professional has lots of discussion on this.

We all know the importance of data and most of the company and the client always prefer to store old or deleted historical data.
But out immediate next question would be: whether to store in Archive Table or marked as Soft Deleted.

Archive vs Delete
In this post, I am going to show all truth about: whether to go with Archive Table or Soft Deletion.

” The truth about, Marked as Soft Deleted “

Most of the time, This would be a first choice by Database Architecture to mark data as Soft Deleted.
This is not a bad idea to avoid important data loss and even not need to go with prior approval to hard deletion.
If any invalid deletion is happening, then it’s absolutely good because it’s very easy to recover.

Generally, Soft Deletion is achieved by adding one extra IsDeleted bit flag or Deleted_Date field column. You can add this extra column on those tables in which you need to perform soft deletion.

Once you decide to go with Soft Deletion then you should revoke all DELETE privileges from all the database account.

You don’t have to worry about cascading a delete through various other tables in the database that reference the row you are deleting.

The Soft Deletions comes with a cost, though, because you need to update your queries and indexes to be able to exclude the deleted rows. Each and every query, You have to write one filter condition to check IsSoftDeleted flag. This is also hard to manage in a big system because once you forget to put this filter, then some unwanted data also select by query which is not actually require.

Just Imagine that your table has only 10 active row and 100000 soft deleted rows and still you should manage this deleted data because of soft deletion and will create performance issues. This soft deleted data also create, extra load on your file groups and rebuilding of the indexes also take a long time to complete.

One more problem of Soft Deletion: Take an example of unique key constraint, once you marked some unique data as soft deleted and same data need to insert in the new format then soft deletion will fail because the data is already exist in a table.

” The truth about, Stored into Archive Table “

We understand the importance of data so we have another very good approach to move all deleted data into a different Archive Table.

You can create same different replica table and all deleted data should be moved into this table from the original table.If you choose this approach then you do not require to write extra condition in WHERE caluse to filter deleted data. You can easily move all archive data into another filegroup which will reduce the load of primary filegroup. The size of the index will be smaller and performance of the table will improve with a smaller table size.

The disadvantage of Archive Table is management of different Archive Tables, Just Imagine that you have total 400 tables in your system and the data of all this table require to achieve then you should create 400*2=800 tables to manage just archive data.

Once you moved archive data from the original table to archive table and later you need some data recovery from the archive table to the original table, then you will face problem of physical order of data.

You also require to write a different index management and table management script for this all archive tables because as a database professional we should also take care data size and its performance.

” The Conclusion “
As per this theory, both concepts have some advantage and disadvantage, but I suggest to go with Archive Table for historical data because Soft Deletion will create a performance issue for running Application. The Soft deletion also we can easily manage by creating different views, but again views has some it’s own limitations.

Again this is my view and perspective, you can go with any approch base on your system requirements.

Anvesh Patel

Leave a Reply

1 Comment on "Database Design: The Truth about Archive Table and Soft Delete of Historical Data"

Notify of
Sort by:   newest | oldest | most voted
Michael Hunt

or just use Stretch database in sql server 2016, no need to delete, no need to change queries/indexes, no need to worry about tables getting too big