0

Combining Data Sets

Posted by Danielle Smith on 11:17 in , ,
Occasionally, you may want to combine data sets in order to bring back more relevant information. This blog post will explain what UNION, UNION ALL, EXCEPT and INTERSECT clauses are and how you would use them in your queries.

For future tutorials, I will be using a different database set up in order to bring back more data as the one I created in the first instance was just to show in simple terms how basic SELECT and CREATE statements work. I will post results to queries below however if you want to replicate what I am doing, please leave a comment below and I will send scripts to replicate the database.

UNION


The UNION clause combines data sets between 2 or more queries (though the same field types in the same order will need to be returned) however duplicates are not displayed.






When you execute the query above, it will bring back customer details from 2 different customer tables found in 2 different databases and then display them together. Obviously, the customer tables should have a similar set up (i.e. the CustomerID should be either a GUID or int but not a combination of the 2, otherwise an error will occur). As UNION doesn't display duplicates, it makes it very useful to merge databases where there is a chance that data could be replicated (and not wanted!) Here is an example of the data set brought back once this query has been run:












UNION ALL


The UNION ALL clause combines data sets between 2 or more queries (though the same field types in the same order will need to be returned) including duplicates.


When you execute the query above, it performs a count showing the number of times a customer has been duplicated within the system. This may not be particularly useful in most day to day database queries however it demonstrates how they work, particularly with aggregates. As you can also see above, it is also possible to nest SELECT statements so that additional clauses can be added to the results set. usually this would be in the form of a WHERE clause however in this instance, I am using a GROUP BY aggregate - something which I will come onto in a future blog post (keep your eyes peeled!). Here is an example of the data set brought back once this query has been run:













EXCEPT


The EXCEPT clause provides a subset of information. EXCEPT returns all rows that exist in the table to the left of the operator that do not have matching rows in the table to the right.






When you execute the query above, it will bring back all names of assets where they do not have an associated link in the AssetCity table. This means that they will not be linked to a location.

Here is an example of the data set brought back once this query has been run:













INTERSECT 

The INTERSECT clauses provides a subset of information. INTERSECT returns all rows that match in both queries. 






When you execute the query above, it will bring back the opposite of the EXCEPT clause and will return all names of assets that do belong to a location. It could be potentially useful when bringing back assets that are related to a particular location.

Here is an example of the data set brought back once this query has been run:





0 Comments

Post a Comment

Please post any feedback or comments here...

Copyright © 2009 SQL Genius - Personal Development of a Junior All rights reserved. Theme by Laptop Geek. | Bloggerized by FalconHive.