Working with multiple databases is not very often required in application development. Yet, their need cannot be ruled out in some business critical situations. The idea is undoubtedly messy and needs meticulous planning in delivering one. The pitfall is obvious; there are numerous database vendors, each claiming to be best in own their way and, in view of outsmarting each other, there is no architectural uniformity. So, if we still insist on our need to unify database multiplicity and somehow get our work done, the only hope remains in SQL. SQL is the language where most, if not all, vendors agree as a means of communication, talking to the database. The article focuses on the problem and possible techniques to get around these complexities in some situations.
Complexity of Variety
This variety is "…spice of life" [sic] can be quite annoying at times. Look at our needs: simple files were not good enough, we got relational databases. Later, we thought relational databases are too procedural; because we live in an era of object-oriented programming languages, we can't we have an object oriented database. Voilà! PostgreSQL popped up. You see, security is very vital; Oracle said gotcha! Oracle database. Our last wish (pray it will be the last, unfortunately not so) is for a document database; MongoDB emerges. You may list more in this manner but the point is that each of them has wonderful uses, matured through the years of service. But, there is very poor option to work along with their infrastructural polarity. If this is the situation, how easy do you think it would be to get a database truce? The respite is that most of them agreed to speak SQL, a standard language for database communication. Even though this language is also not protected from vendor-specific enhancements, its core syntax has a consensus and marked reliability. Therefore, it seems reasonable to cling to this feature and find some way in reaching a solution. By the way, there is actually a JDBC driver available that can ease lot of pain in bringing out a virtual interface with a set of APIs which help in collaborating theses stubborn databases. One of them is UnityJDBC. It is commercially available; a trial version can be freely downloaded.
Planning for the Problem Domain
Each database provides an external interface, called a database driver, which provides a window to peep programmatically into the so-called database forté. (For a simple JDBC tutorial, refer to Manipulating a database with JDBC). There are different type of database drivers available, more specifically JDBC driver defined by their type such as type 1, 2, 3 and 4. For more information on different types of JDBC driver, refer to JDBC Application Design Consideration. Let's stick to type 4 drivers for the sake of our discussion here. Type 4 drivers are database vendor-specific and mostly used in commercial applications. So, to approach the solution of working with multiple databases what, we can do is this:
- Write our very own driver by assembling different type 4 drivers of the database and create a unified approach of a virtual driver.
- Or, take a rather raw approach of communication with the help of SQL.
Either way, it's a forced approach to bring some sort of conformity. Creating a virtual driver is not an easy way, although once created, it can save a lot future effort. Due to further layering, performance is a major concern in this approach, despite that it's a right engineering approach to leverage reusability. Communicating through SQL is affordable and simple to implement if one is ready to deal with complex queries.
When we talk of working with multiple databases, we may have to deal with the following situations of communication:
- Between databases of the same vendor
- Between databases of different vendors
In the case of dealing with databases of the same vendor, we can prefix a table's name with the database name. For example, you can use MyDatabase.product_tbl while creating joins between tables contained in different databases. Other parts of the query remains the same. Similar rules apply while querying databases of different vendors. However, the scenario becomes a little complex in the case of dealing with client-server architecture. The probable scenarios are:
- Multiple Databases on One Server Instance
- Multiple Databases on Multiple Servers from the Same Database Vendor
- Multiple Databases on Multiple Servers from Different Vendors
Dr. Ramon Lawrence has written an excellent article focusing this issue. Refer to How to Query Multiple Database… for more information on these scenarios.
One thing's for sure: Connecting multiple databases in an application and making it work without any glitches requires a lot of effort. The setbacks can be complexity, bugs, performance issues, version control, and above all,, creating a uniform design of the application. If you need to create a simple data migration tool, it's not that difficult, but having to create an application that serves multifaceted needs that too deal with multiple databases, creating layered architecture is not that easy. I think that layering, by far, is a major concern for ongoing technological advancement. On one hand, it helps in segmenting complexities into strata to have simplified implementation; on the other hand, with every layering pot holes full of bugs open up in the system. Also, sometimes I feel are we not patching things up in the name of layering. I really have mixed feelings with no fixed answer in this variety of solution.