Database Research & Development

  • Home
  • NoSQL
    • NoSQL
    • Cassandra
  • Databases
    • Database Theory
    • Database Designing
    • SQL Server Coding Standards
    • SQL Server
    • PostgreSQL
    • MySQL
    • Greenplum
    • Linux
  • Interviews
    • SQL Server Interviews
    • MySQL Interviews
    • SQL Puzzles
  • DBA Scripts
    • SQL Server DBA Scripts
    • PostgreSQL DBA Scripts
    • MySQL DBA Scripts
    • Greenplum DBA Scripts
  • Home
  • Blog Archives !
  • (: Laugh@dbrnd :)
  • Contact Me !
sqlserverinterviews
Home 2015 May Effective storage of Email Address into Database System

Effective storage of Email Address into Database System

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

Problem :

Email Address is a ubiquitous field and column for every application or system. As per the application size and scope database system stores lots of user Email Addresses.

E.g. 10,00,000 email address is stored in a database system and it is also increasing day by day.
Usually, any database user saves all this email address into one column. Most probably there is one column name is EmailAddress Varchar(250) and user store email address like ” myprofile@yahoo.co.in ” into this column. – Great and usually 70% to 75% user doing this.

Once database size and number of users are going to increase day by day at that time, this storage will not perform as per expectation and decrease the performance of the database.

How this will decrease the performance of the database?

  • As we assumed that email address stored into one column like ” myprofile@yahoo.co.in”.
  • Now, we require to find that, how many total numbers of users from only “Yahoo” domain?
  • How many total number of users from only “India Yahoo”?
  • Require to set some additional offer only for “Gmail” domain.

Friends, now think about above questions.

In my career path I have also stored email address into one column, but gradually I came to know that this is a bad idea because now a day Email address is not only for user name or only for storage purpose.
This is very important for promotional and marketing activity.

When you have stored in one column and let’s imagine, you want to find only ” Gmail ” domain out of 10,0,00,000 email address.

As a database developer, you can think that you require one string operation on all stored email address in which you will find the part which is after @”. But this is not a solution where string searching and parsing is required, which slows down the performance of relevant select statement.

Solution :
The solution is simple,

Just divide your Email address into two part or three part.

E.g.

  • Single Part : ” myprofile@yahoo.co.in “.
  • Two Part : ” myprofile” and ” yahoo.co.in “.
  • Three Part : “myprofile” and ” yahoo ” and ” co.in “.

Now, I am going to explain this using one MySQL Demo.
UserEmailRelation

I have created above two table into MySQL. I have divided Email Address into three part.
Below is a sample data.
tbl_User table Data:
User Mastertbl_Email table Data:
Email Master

Examine the first email address:

User Anvesh has anvesh@gmail.com, Roy has roy@yahoo.co.in
As you can find the EmailDomainID reference column in, a User master table which is indicating that which email address is associated with User.
Now a thing is going to easy:
How ?

If I require finding how many users with Gmail, I need to find the only EmailDomainID=1.

You can also apply easy extension filter like Yahoo from co.in and .com.
This will give you 80% to 85% better performance than to store Email into one column.

Also, This will reduce the size of data, E.g, if we store ” Gmail ” keyword 10,00,000 times into the database, then it consume more space than to store only integer ” 1 “.

Yes, I also agree that, during every insert, you have to split email address for storing purpose, but this is a good instead of parsing string to count only domain name.

This structure is quite complex to manage at every insert and every email existence check, but it gives better performance.

Below is simple string operation.

Full Email Address is : anvesh@yahoo.co.in

Now parse only ” anvesh ” from anvesh@yahoo.co.in

1
2
3
4
SELECT SUBSTRING
('anvesh@yahoo.CO.IN',1,
(POSITION('@' IN 'anvesh@yahoo.CO.IN') -1)
)

Now parse only ” yahoo ” from anvesh@yahoo.co.in

1
2
3
4
5
6
7
8
SELECT SUBSTRING(
SUBSTRING('anvesh@yahoo.CO.IN'
,(POSITION('@' IN 'anvesh@yahoo.CO.IN') + 1))
,1
,POSITION('.' IN SUBSTRING('anvesh@yahoo.CO.IN'
,(POSITION('@' IN 'anvesh@yahoo.CO.IN')))
)-2
)

Now parse only ” co.in ” from anvesh@yahoo.co.in

1
2
3
4
SELECT SUBSTRING((SUBSTRING('anvesh@yahoo.CO.IN'
,(POSITION('@' IN 'anvesh@yahoo.CO.IN') + 1)))
,POSITION('.' IN SUBSTRING('anvesh@yahoo.CO.IN'
,(POSITION('@' IN 'anvesh@yahoo.CO.IN') + 1)))+1)

The above code is required whenever you insert the email address into the database because from the application end we receive one full email address, but for effective storage, you have to parse like above code.

You can also suggest three different input parameters from the application end where above all codes can shift to at application level.

Click here to find details about string functions of MySQL.

May 19, 2015Anvesh Patel
MySQL String FunctionsHow to reset postgres user password in windows ?
Comments: 7
  1. Fas
    August 20, 2015 at 10:28 am

    Nice Article in detail… keep it up…

    • Anvesh Patel
      Anvesh Patel
      August 20, 2015 at 11:06 am

      Thanks ! I have noticed that you were reading this blog more than 30 minutes. Readers, like you inspire me to write more quality content.

  2. Bharath
    November 2, 2015 at 3:31 am

    I liked this concept, which is as per boy codds rule. This concept is useful for reporting servers or warehouses. But if iam using login for authentucation of my website, imagine pain of splitting it and storing it. Also the pain of joins while authenticating user.

  3. chaitanya
    November 8, 2015 at 5:21 pm

    This solution will not fit in OLTP world. If you have thousands on users connecting to server and trying to join email domain table, it leads to lock contention.

    • Anvesh Patel
      Anvesh Patel
      November 17, 2015 at 12:17 pm

      Yes, you are right, but we can use this column only for report purpose. you can also store full email in one column so every time not require to join domain table. I just show one way to store email address in normal form.

      • Nidhin
        January 6, 2016 at 5:46 am

        “”This will give you 80% to 85% better performance than to store Email into one column.

        Also, This will reduce the size of data, E.g, if we store ” Gmail ” keyword 10,00,000 times into the database, then it consume more space than to store only integer ” 1 “.””

        So basically its not reducing any thing, U increasing the column count for the tables just for research, As anyway u have to store the whole email id against the user to avoid all the Joins for login purposes. I like the concept but in real life implemtation its one extra table and column

        • Anvesh Patel
          Anvesh Patel
          January 6, 2016 at 6:01 am

          Right Nidhin, management is quite difficult but when we are fetching for different type of report, it performs better.

Anvesh Patel
Anvesh Patel

Database Engineer

May 19, 2015 Database DesigningAnvesh Patel, database, Database Design, database research and development, Email Address, MySQL, MySQL Database Administrator, MySQL Query, Storage, String Functions
About Me!

I'm Anvesh Patel, a Database Engineer certified by Oracle and IBM. I'm working as a Database Architect, Database Optimizer, Database Administrator, Database Developer. Providing the best articles and solutions for different problems in the best manner through my blogs is my passion. I have more than six years of experience with various RDBMS products like MSSQL Server, PostgreSQL, MySQL, Greenplum and currently learning and doing research on BIGData and NoSQL technology. -- Hyderabad, India.

About DBRND !

dbrnd

This is a personal blog (www.dbrnd.com).

Any views or opinions represented in this blog are personal and belong solely to the blog owner and do not represent those of people, institutions or organizations that the owner may or may not be associated with in professional or personal capacity, unless explicitly stated.

Feel free to challenge me, disagree with me, or tell me I’m completely nuts in the comments section of each blog entry, but I reserve the right to delete any comment for any reason whatsoever (abusive, profane, rude, or anonymous comments) - so keep it polite.

The content of this website is protected by copyright. No portion of this website may be copied or replicated in any form without the written consent of the website owner.

Recent Comments !
  • Anvesh Patel { Sure will do... } – May 27, 12:43 PM
  • Anvesh Patel { Great... } – May 27, 12:41 PM
  • Anvesh Patel { Great... } – May 27, 12:39 PM
  • Anvesh Patel { Great... } – May 27, 12:36 PM
  • Anvesh Patel { Great... } – May 27, 12:28 PM
  • Anvesh Patel { Great... } – May 27, 12:27 PM
  • Anvesh Patel { Great... } – May 27, 12:16 PM
  • Older »
Follow Me !
  • facebook
  • linkedin
  • twitter
  • youtube
  • google
  • flickr
© 2015 – 2019 All rights reserved. Database Research & Development (dbrnd.com)
Posting....