Joining Tables with UNION
Earlier I mentioned that you can use the UNION keyword to join tables in ways that standard linking doesn’t allow, but I didn’t go into the details of how to do it. Although this book isn’t meant to teach you about writing SQL statements (there are already many books for that), I’m always surprised at how many people don’t know about UNION and its benefits. As a reader of my book, I certainly don’t want you to be one of those people “in the dark” about something so useful and yet so simple. It has certainly gotten me out of some tough reporting situations! So let’s take a short break from pure Crystal Reports talk and see if I can enlighten you about the benefits of using the UNION keyword.
When you typically join two tables together, you do so using a field that is common between the two tables. Each record in the tables is analyzed and when the data in the fields match, those two records are joined together to make a new record. The new record has data from both tables in it. The UNION keyword is unique because it takes the data from one table and appends it to the end of the data from the other table. Unlike linking tables, which joins two records into one, UNION lists one record after the other. For example, if Table A has 25 records and Table B has 10 records, their UNION would create a new table with 35 records in it.
I find using UNION to be helpful in two situations. The first being when you have a table that archives past data on a periodic basis. The current table would have current year data and the archive table would have all the historical data in it. If you were to create a report to print the entire sales history for a particular customer, it would be tricky since the data is in two different tables. You could create a main report that prints the data from the current table and then create a subreport that pulls data from the archive table. Of course, the subreport would have to be formatted exactly the same as the main report so that the user doesn’t notice that you are using two different reports for one list. This also makes it harder to maintain the report. A much better way is to use a UNION and it gives you just a single table that lists all the records in it. You can have one report without having to resort to using subreports.
The other situation I find UNION to be helpful in is when combining two similar tables into a single list. I frequently see databases that have tables with similar information in them but for whatever reason they are not in the same table and the field names are different. Sometimes this is due to poor database design and sometimes it’s due to the way the application evolved and requirements changed over time. Like it or not, you have to make the best of it. Since the two tables have similar information, you frequently get requests to print reports that combine the two tables together.
Consider an example where a company buys out another company. You need to print a list of all the products that both companies have so you can look for a possible overlap. Rather than printing out two separate reports and comparing them side by side, you can use a UNION to combine the two tables and build one large report with all the products listed together.
Now that you know how the UNION keyword can help you, let’s look at the three requirements for using it.
- You need to have at least two tables (but more are acceptable as well).
- The SELECT statements must have the same number of fields listed.
- The fields must have matching data types and be in the same order.
Needing to have two or more tables is a given (otherwise there wouldn’t be anything to combine). Getting the SELECT statements correct is critical. If the first SELECT statement lists three fields, for example a number, a string, and a date, then the second SELECT statement must also list three fields of the same data type. You can’t have a different number of fields or change their order and expect the database to figure out what you want to do. It will return an error.
For an example, look at the previous Figure 11-24 that shows the Add Command dialog box. It uses a UNION to join two Orders tables together. Here is the code repeated for you.
The first SELECT statement pulls data from the current Orders table and the second SELECT statement pulls data from the historical ORDERS_ARCHIVE table. Notice that both SELECT statements use three fields and they are of the same data type. In this example, they also happen to be the exact same fields. But that isn’t always the case and it certainly isn’t a requirement. As long as the data types match, then you’ll be fine. This SQL statement also uses a parameter to select the data for just one customer. In this example, the WHERE conditions are the same, but this isn’t a requirement of the UNION. The WHERE clause for each SELECT statement can be completely different. As long as the SELECT statement follows rules #2 and #3 above, then that is all that is necessary.
As one more final example, let’s look how the UNION statement can build a virtual table that could be used to link to other tables. If you have some temporary data you want to insert into your report and you don’t want to create a physical table in the database to store it, you can use the UNION statement to create it virtually.
In this example, we are building a list of the months in the year. The first field is the month number and the second field is the month name. By using the UNION keyword we are joining all the individual SELECT statements into one big virtual table that contains one record for every month of the year.
It is also important to note that since we are creating raw data that doesn’t come from an existing table, we need to define the field names manually. In the first SELECT statement, I use the AS keyword to define an alias for each field name. This tells the database server that the first column is called MonthNum and the second column is called MonthName.
Okay, you’re probably tired of reading about the UNION keyword and ready to get back on topic. Let’s look at more advanced uses of the command object and see how your report can modify live data on the database server as the report runs.