The North Carolina Department of Transportation (NCDOT) faced many data synchronization challenges typical of mobile computing today.
The North Carolina Department of Transportation (NCDOT) faced many data synchronization challenges typical of mobile computing today. This case study provides an overview of the problem encountered by NC DOT Mobile Inspector application (the need for timely exchange of information within bridge inspection teams and with the central office) as well as the solution (installing SQL Anywhere Studio MobiLink technology to implement database synchronization on this application).
Before I traveled to Raleigh, North Carolina in the spring of 2004, I never gave much thought to bridges and overpasses. I didn't think about the need to regularly inspect bridges for safety, and I certainly didn't know about "Bridge Inspector" as a job title. For all I knew, those men in the hard hats and safety vests were cleaning up roadkill.
That changed pretty quickly when I started working on the Mobile Inspector application. There are about 19,000 bridges and culverts in North Carolina that need to be inspected on a regular basis, and there are about 60 Field Inspectors doing that, including four teams who work underwater. Those inspectors are very busy taking notes and photos in the field, before returning to their offices to prepare inspection reports that are used to determine which bridges will be okay for a while longer and which ones need work now.
I joined the project as a consultant with responsibility for the design and implementation of the MobiLink synchronization setup. The decision to use SQL Anywhere and MobiLink had already been made before I arrived, and except for a few schema changes required to support synchronization, the database design was complete and application development was well underway.
Overview of the NC DOT Application
The Mobile Inspector application is used to gather measurements and observations made by the inspectors, as well as to store digital photographs. It is designed to push data entry from the office out into the field, to save costs as well as satisfy a federal government mandate to capture bridge inspection data electronically. To make this possible, it features voice control, handwriting recognition, and a custom shorthand tool.
Figure 1: Shane Dunlow, Field Inspector, records flange measurements in the Mobile Inspector application on a tablet PC. Photo courtesy Jeff Odom.
Each inspector uses a tablet PC running Windows 2000 in the field. If this were a traditional two-level, client-server arrangement, or even a two-level distributed hierarchy, all the tablet PCs would communicate or synchronize directly with the central database. However, for security reasons, connections to the central database are available only at the office. That means data must be copied to the Team Leader's tablet PC before each field trip begins and copied back when the inspection report is complete. Data must also be copied back and forth between Team Leader and Team Member PCs, sometimes several times on a large bridge; those operations are performed on site using wireless connections.
Figure 2 shows how the computers involved in the Mobile Inspector application are arranged in three levels, with the central database server at the top, the Team Leader tablet PCs in the middle, and the Team Member PCs at the bottom. The Team Leaders communicate with the central database via wireline LANs, and with the Team Members via wireless LANs. Synchronization server software runs on both the central database computer and on the Team Leader PCs, and those servers communicate with synchronization client software on the Team Leader and Team Member PCs respectively.
Click here for a larger image.
Figure 2: Three-Level Distributed Hierarchy
The application itself uses a relational database management system to store data on the tablet PC, including the digital photographs. This was done for all the usual reasons folks use an RDBMS: it offers data integrity and recoverability as well as great power and simplicity. That's the good news. The bad news is that moving from a single central database to a three-level distributed hierarchy involving dozens of remote databases is a difficult task. Especially when you can't send a DBA out with each inspection team, no matter how nice the weather is on the Outer Banks of North Carolina, the synchronization process must be reliable even when wireless connections aren't.
The database setup at NC DOT uses Oracle 9i as the central database, Adaptive Server Anywhere (ASA) for the remote databases on the tablet PCs, and MobiLink synchronization software to handle both Oracle-to-ASA and ASA-to-ASA transfers. ASA and MobiLink both ship as part of SQL Anywhere Studio Version 9 from iAnywhere Solutions.
MobiLink software consists of client and server components. The remote client starts a synchronization session by connecting to a local database, gathering all the rows changed since the previous synchronization and uploading those rows to the MobiLink server. The server then applies the uploaded changes to the central database, selects rows to be downloaded, and sends those rows to the client. Finally, the MobiLink client applies the downloaded changes to the local database and sends an acknowledgment back to the server.
Failure handling is an important part of synchronization in any environment, but especially in a mobile environment where connections are frequently lost or accidentally dropped. This issue was well understood before the project began, and it was one of the reasons MobiLink was selected. With MobiLink, failures are handled at the level of upload and download database transactions. If an upload fails, all the uploaded changes are rolled back and the whole process stops. Recovery consists of running the synchronization process again; the upload stream is reconstructed from scratch and the process repeated as if it had never been attempted.
Failure during the download stage is handled in a similar fashion: the download is rolled back and the recovery technique is the same: "try again." This time, however, only the download is repeated because the previous upload worked and was committed. All of this is automatic with MobiLink, requiring no administrative effort or intervention.
Importance of Implementing Business Rules
One of MobiLink's main strengths is the ability to tailor the synchronization process by exploiting its event-driven architecture. You can provide a custom script to specify the processing that should happen for any particular event. MobiLink provides an extremely fine level of control; in a synchronization involving 100 tables, for example, literally thousands of different events are available. Of course, most events don't need scripts, the default action is okay; in the case of NC DOT, custom scripts were written for the following events:
||Purpose of Script|
||for each table, specify how each uploaded insert is handled|
||for each table, specify how each uploaded update is handled|
||for each table, specify how each uploaded delete is handled |
||for each table, record how many rows were uploaded|
||determine the set of tables and rows are to be downloaded|
||for each table, select the rows to be downloaded|
||for each table, select the primary keys for rows to be deleted from the remote database|
||record which set of tables and rows don't need to be downloaded again until they change|
||for each table, record how many rows were downloaded |
||special handling for specific errors that may occur|
Sophisticated applications often impose complex business rules on the synchronization process, and the Mobile Inspector is no exception. MobiLink scripts are used for business rules as straightforward as "only District Supervisors get rows in District Supervisor Inspection table" as well as selection rules that are more complex. For example, the following download_cursor script selects Inspection rows that (1) are assigned to particular inspection team, and (2a) have not already been downloaded or (2b) have been updated since the last time they were downloaded. The SELECT shown here makes use of the BRI_RECD_SYNC_IDS table to identify rows assigned to this team; an earlier, much longer, begin_download script dynamically populates this table so it is available for use in scripts like this one:
( SELECT *
= TO_NUMBER ( USERENV ( SESSIONID ) )
= BRI_TMP_SYNC_VRBLS.GLOBAL_DB_ID -- for this remote
= BRI_INSPCTN.STRCTR_ID -- matches on structure id
= BRI_INSPCTN.INSPCTN_ID -- matches on inspection id
AND ( BRI_RECD_SYNC_IDS.FRST_DOWNLOAD_TS
= '1900-01-01' -- not yet been downloaded
>= ? ) ) -- updated since the last synch
MobiLink scripts are invoked by the MobiLink server, but they are actually executed by the central or "consolidated" database server, passing rows back and forth to the MobiLink server. At NC DOT, two sets of MobiLink servers and two sets of scripts are used to implement the three-level database architecture. At the top level, scripts, written in Oracle PL/SQL, are run by the central Oracle 9i server; the example shown above is one of those scripts.. Rows are passed to the MobiLink server running at the central site, and from there, to a Team Leader tablet PC in an inspection office.
Out in the field, another MobiLink server runs on each Team Leader's tablet PC. It invokes SQL scripts in the ASA database server running on the same PC; those rows are passed to the MobiLink server, and from there, to Team Member PCs via wireless links. In this setup, the Team Leader's database acts as a remote database when synchronizing with Oracle, and as a consolidated database when synchronizing with a Team Member.
Challenges of Distributed Data
Distributed databases that are not in constant contact with one another cannot offer any mechanism for system-wide locking. In particular, there's nothing to stop you from inserting two rows with the same primary key on two different remote databases, thus causing a primary key conflict when those rows are uploaded to the consolidated database.
One solution is to use a composite primary key consisting of an ordinary autoincrement column plus a "database identifier" to solve the problem of autoincrement values overlapping for different databases. Multi-column primary keys irritate application developers by making foreign key definitions overly complex, so globally unique identifiers (GUIDs) are sometimes considered as a single-column alternative. GUIDs are 128-bit integers with values like '6F9619FF-8B86-D011-B42D-00C04FC964FF'; they're unique, all right, but they make for fat database indexes, and the values are awkward to deal with when debugging. Yet another common solution is to use key pools; that's where groups of unique key values are generated by the central server and downloaded for later use by individual remote databases. Key pools do work, but they're difficult to implement and administer properly, especially in a multi-level hierarchy. Plus, they generate extra network traffic.
ASA offers a more elegant solution to the problem of globally unique primary keys: a partitioned autoincrement facility via the GLOBAL AUTOINCREMENT column default. Each database is assigned a unique database identifier 1, 2, 3 and so on, and that value is automatically used to determine which partition or range of values is used to assign primary key values on that database.
This is very important to the Mobile Inspector application because new rows are frequently inserted in many tables on each remote database, and they all have automatically assigned or "artificial" primary keys. The following example shows the BRI_SKTCH table that is used to hold freehand sketches in a BLOB column. The partition size is set to 2,000,000, which means each inspector can create that many sketches before the partition is exhausted. The 32-bit signed integer primary key can be as large as 2,147,483,647; divide that by 2,000,000 and you get 1,073, the maximum number of databases that can use this mechanism without any primary key conflicts. That works for the Mobile Inspector; for other, larger applications, unsigned 64-bit integers are available.
CREATE TABLE BRI_SKTCH (
SKTCH_ID INTEGER NOT NULL DEFAULT GLOBAL
AUTOINCREMENT ( 2000000 ),
STRCTR_ID INTEGER NOT NULL,
SKTCH_TTL VARCHAR ( 45 ) NOT NULL,
SKTCH_DES VARCHAR ( 130 ) NULL,
FILE_NM VARCHAR ( 32 ) NULL,
DEL_FLG_IND BIT NOT NULL DEFAULT 0,
LST_MDFD_TS TIMESTAMP NOT NULL DEFAULT TIMESTAMP,
SKTCH_LOB LONG BINARY NULL,
SKTCH_NBR INTEGER NOT NULL,
SKTCH_SZ_AMT INTEGER NULL,
PRIMARY KEY ( SKTCH_ID ) );
Another challenge of distributed data is to reduce the amount of network traffic required to keep the databases in synch. Before the advent of specialized synchronization software, application code had to be written to determine what data needed to be shipped back and forth. Sometimes this code transmitted too much data in an effort to make sure nothing was missed. Other times, the transmission time was just as long as it searched for differences that needed to be reconciled.
The MobiLink client reduces network traffic by reading the remote database transaction log to determine what must be uploaded. Only those rows that have been inserted, updated, or deleted since the last synchronization are sent. Also, if a row is modified more than once between synchronizations, only the final version is uploaded.
The MobiLink server doesn't have the same advantage when building the download stream. Because it works with different databases such as Oracle, SQL Server, and DB2, MobiLink doesn't have access to the transaction log. If you don't want to download an entire table every time, you must code an appropriate WHERE clause in the download_cursor script.
That's where another ASA feature comes into play: a timestamp column defined with the special DEFAULT TIMESTAMP attribute will be updated automatically every time the row is updated, as well as when the row is inserted. The earlier CREATE TABLE shows how each table can have a LST_MDFD_TS column that is kept up to date without having to code a trigger. The SELECT statement shown before that contains the predicate LST_MDFD_TS >= ? to limit the download result set to only those rows that have changed since the last synch. The MobiLink server automatically substitutes the "last synchronization timestamp" value for the "?" placeholder, making it unnecessary to write any custom code to keep track of when each remote database is synchronized.
My involvement with the Mobile Inspector project began in late March 2004 and continued full-time through April, with shorter visits in May and June and again in September.
During April, the SQL Anywhere schema changes were implemented, including the introduction of the GLOBAL AUTOINCREMENT primary keys, the addition of the "LST_MDFD_TS" columns, and the new tables required to support the business rules affecting synchronization. The MobiLink scripts were developed and tested during this period, using ASA as the consolidated database, and then the Oracle database was created.
In May and June, the Oracle PL/SQL version of the MobiLink scripts was created, and full three-level Oracle-ASA-ASA testing was performed.
In the meantime, development proceeded on the application, requiring another visit in September to revise the MobiLink scripts and perform extensive testing prior to rollout.
Ever since leaving North Carolin,a I've started noticing roadside crews, wondering whether they're doing an inspection. Maybe someday, I'll be able to tell by whether or not they're using mobile computers. The final shipment of tablet PCs has arrived at NC DOT, a real indication of success in a cost-conscious jurisdiction. If an application like Mobile Inspector works there, it can work anywhere.
My experience on the Mobile Inspector application reinforced two lessons learned over the years: First, it's easy to underestimate the complexity of a synchronization setup, especially when it is an integral part of a large business application. Second, synchronization at the level of relational database data simplifies the setup because it separates the synchronization logic from the application code, reduces complex data structures to simple rows and columns, and exploits basic database features like transaction commit and column default values.
About the Author
Breck Carter is principal consultant at RisingRoad Professional Services, providing consulting and support for SQL Anywhere databases and MobiLink synchronization with Oracle, DB2, SQL Server, ASE, and ASA. Breck is author of SQL Anywhere Studio 9 Developer's Guide from Wordware Publishing. He can be reached at email@example.com.