INET Conferences

Integrating Front-End Web Systems with Back-End Systems

Mitchell COHEN <macohen@us.ibm.com>
IBM T.J. Watson Research Center
USA

Abstract

This paper presents the implications of different storage and replication techniques for the key business objects being used by both front-end and back-end systems simultaneously. Much work has been done on data synchronization and replication in the database area, as thoroughly discussed by Bernstein, Hadzilacos, and Goodman [1]. This paper will concentrate on the implications that the different data synchronization and replication paradigms have on common business objects needed for both a merchant server running on the Web and internal business systems. First, the need for integrating front-end and back-end systems from a business-object standpoint is established. Second, five paradigms of data storage and replication are defined. Finally, what each of the paradigms means for the different business objects is discussed.

The need for integrating front-end and back-end systems
Choosing data storage and synchronization
Five basic paradigms
Analysis of the business objects
- Implications of the paradigm attributes on the business objects
Paradigm recommendations for the business objects
Reference

The need for integrating front-end and back-end systems

The Electronic Commerce Revolution has created many new computing needs, among them back-end integration. One way to conduct business on the Internet is to have a front-end system running on a Web server that handles the interaction with the outside world. An example of such a system is a Web merchant server enabling customers (and potential customers) to browse and search product catalogs and descriptions in order to price and purchase items.

Internal business processes such as tracking inventory, processing orders, and accounting are already being handled by back-end systems for many companies. A front-end system requires much of the data already stored in an existing back-end system (for example, product availability) while at the same time, new data coming in from the front-end system (such as new orders) must be propagated to the back-end system for internal business processing. How can we connect the front-end systems to the information already in the back-end systems and have the two work together coherently, efficiently, and quickly? What should be considered for the different types of business objects?

Figure 1. Where replication and storage strategies fit within a front-end and back-end

Choosing data storage and synchronization

There are several different basic approaches to storing data when a front-end system (such as a Web merchant server) works in conjunction with a back-end system (such as an Enterprise Resource Planning system). Different methods can be chosen for the different object types. For example, it may be best to have the customer information fully synchronized and replicated while only the back-end stores inventory information.

The different data storage and replication paradigms are based on two basic considerations:

- Whether to store the data in only one of the front-end or back-end systems or in both of the systems
- At what level or interval to synchronize the data

There are many factors affecting the decisions. For each data type, one needs to consider desired or needed response times for data reads at each end of the integration as well as response times for data insertions or updates.

A requirement for having the front-end up during back-end downtime plays a role in the paradigm decision-making process. Many companies bring down their back-ends during nonbusiness hours. However, the Web eliminates the notion of "business hours" when dealing with consumers or possibly even other businesses. The ability for the front-end to run when the back-end is down is called front-end independence. In certain situations, perhaps for some smaller companies or in some business-to-business environments, it may be acceptable if the front-end is not fully functional periodically. In all the paradigms, the back-end has no reliance on the front-end being up, a key integration issue as companies do not want their normal (i.e., existing) business to suffer due to Web presence.

Simplicity of solution from development, maintenance, and cost perspectives is also a factor. Of course, for some data, part of the decision is predetermined as particular back-end systems require specific data items to be stored in and accessible directly from their systems. The same holds for many front-end systems.

Some of the paradigms require user exits or triggers in one or both of the ends. For instance, in order for a change in a back-end system to be synchronously applied in a front-end system, the back-end system needs to inform the front-end of the change via a call to the front-end (or some middleware sitting between the two ends). To make this call, the back-end must be enabled to make external procedure calls inside the business processes at the correct parts of the process. Many front-end and back-end packages come with this capability, and homegrown software can be enhanced to make these calls. Additionally, many of the front-ends and back-ends are based on databases that allow triggers.

For an asynchronous or batch update that needs to be sent from one end to the other, no additional processing is needed by the sender as long as an external process is able to poll (that is, check at a regular interval) for such updates. Some back-ends allow users to check for changes since a particular date and time.
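The polling approach described above can be sketched as follows. This is a minimal illustration in Python, not a description of any particular product: the `Change` record, the `fetch_since` query, and the `apply` hook are hypothetical names standing in for whatever "changes since a date and time" facility the sending end offers and whatever update call the receiving end accepts.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class Change:
    object_type: str      # e.g. "price" or "inventory" (illustrative values)
    key: str              # identifier of the changed object
    changed_at: datetime  # when the sending end recorded the change


class Poller:
    """External process that pulls changes from one end and applies them to the other."""

    def __init__(self,
                 fetch_since: Callable[[datetime], List[Change]],
                 apply: Callable[[Change], None],
                 start: datetime):
        self.fetch_since = fetch_since  # hypothetical "changes after T" query on the sender
        self.apply = apply              # hypothetical hook that updates the receiver
        self.checkpoint = start         # timestamp of the last change already forwarded

    def poll_once(self) -> int:
        """Forward every change recorded after the checkpoint; return how many were sent."""
        changes = sorted(self.fetch_since(self.checkpoint), key=lambda c: c.changed_at)
        for change in changes:
            self.apply(change)
            # Advance only after a successful apply, so a failed change is retried next time.
            self.checkpoint = change.changed_at
        return len(changes)
```

A scheduler would call `poll_once` at the chosen interval. The sketch assumes the sender returns only changes strictly later than the checkpoint and that timestamps are distinct; a sender that can record two changes with the same timestamp would need a sequence number instead.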

Regardless of the storage scenario choice and implementation, for each type of object a decision must be made as to where updates of the object type are allowed. For instance, perhaps it makes sense for new customers to be added to both front-ends and back-ends while keeping pricing as only updateable via the back-end. Finer update granularity may be needed. Perhaps a customer address can be updated from the front-end, but only the back-end would be allowed to update the customer's credit terms.
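One way to record such update decisions is a small policy table consulted before any change is accepted. The sketch below is only an illustration in Python; the attribute names and the policy entries are assumptions chosen to mirror the customer address, credit terms, and pricing examples above.

```python
FRONT_END, BACK_END = "front-end", "back-end"

# (object type, attribute) -> ends allowed to update it.
# "*" covers every attribute of the object type. Entries are illustrative only.
UPDATE_POLICY = {
    ("customer", "address"): {FRONT_END, BACK_END},
    ("customer", "credit_terms"): {BACK_END},
    ("price", "*"): {BACK_END},
}


def may_update(end, object_type, attribute):
    """Return True if `end` is allowed to update this attribute of the object type."""
    allowed = UPDATE_POLICY.get((object_type, attribute))
    if allowed is None:
        # Fall back to the object-wide rule; anything unlisted is denied.
        allowed = UPDATE_POLICY.get((object_type, "*"), set())
    return end in allowed
```

Denying anything that is not listed is a design choice made here for safety; a deployment could equally default to back-end-only updates.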

Five basic paradigms

Combining the storage and update decisions gives us several paradigms. For ease of discussion, the different scenarios of storage locations and replication timing will be described in five different paradigms. While in reality a hybrid of two or more of these paradigms can be used for each object type, only the chosen paradigms will be defined and analyzed.

Synchronous replication

In synchronous replication, the data is stored in both the front-end and back-end systems with immediate replication. Any data changes (inserts, updates, deletes) made at either end are propagated to the other end as they occur, meaning that data at both ends are always consistent. To ensure consistency, changes at one end should wait for successful propagation to the other end. On the other hand, for data reads there is no need to go to the other end; that is, there is quick read access on both ends.

With this paradigm, careful thought must be given to handling data changes at one end, say end X, when the other end, say end Y, cannot be communicated with. If it is known that end Y is not running, data changes at end X can be logged and processed at Y when Y is started up. If end X is unable to communicate with end Y and it is unknown whether or not Y is up and running, there are two potential solutions:

- Allow data changes at end X, logging them for reconciliation when communication between X and Y is restored; there are many different algorithms for reconciling such potentially conflicting data changes.
- Disallow data changes at one of the ends, eliminating the need for any kind of reconciliation. Because the two ends are unable to communicate, one of the ends (the lower-priority one, typically the front-end) needs to know in advance that it will not allow data changes during noncommunication periods.

Depending on the intricacy of this reconciliation, implementation of this paradigm can be complicated and quite involved.
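The write path of synchronous replication, including the two ways of handling a noncommunication period, might be sketched as below. This is a simplified Python illustration under stated assumptions: `send_to_peer` is a hypothetical call that blocks until the other end confirms the change, `PeerUnavailable` is a hypothetical error it raises, and the reconciliation of conflicting changes is deliberately left out.

```python
class PeerUnavailable(Exception):
    """Raised by the (hypothetical) transport when the other end cannot be reached."""


class SyncReplicatedStore:
    """One end of a synchronously replicated pair."""

    def __init__(self, send_to_peer, allow_offline_changes):
        self.data = {}
        self.pending = []                   # changes logged while the peer was unreachable
        self.send_to_peer = send_to_peer    # blocks until the peer has applied the change
        self.allow_offline_changes = allow_offline_changes

    def write(self, key, value):
        try:
            self.replay_pending()           # keep logged changes ahead of the new one
            self.send_to_peer(key, value)   # wait for successful propagation
        except PeerUnavailable:
            if not self.allow_offline_changes:
                raise                       # second option: disallow the change
            self.pending.append((key, value))  # first option: log for later
        self.data[key] = value

    def read(self, key):
        return self.data[key]               # reads never leave the local end

    def replay_pending(self):
        """Forward logged changes in order once communication is restored."""
        while self.pending:
            key, value = self.pending[0]
            self.send_to_peer(key, value)
            self.pending.pop(0)
```

A lower-priority end, typically the front-end, would be constructed with `allow_offline_changes=False` under the second option. Under the first option, `replay_pending` only forwards the log; detecting and resolving conflicts with changes made at the other end is the complicated part and is not shown.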

Periodic replication

In periodic replication, the data is stored in both the front-end and back-end systems, but data changes are propagated periodically in batch. As with the noncommunication periods in synchronous replication, additions, modifications, and removals of a business object should occur at one end only between batches. Otherwise, the changes made at the two ends again need to be reconciled.

Because each end stores data locally, both data access and changes are quick because there is no need to access a remote system, and each end is fully functional regardless of the status of the other end. However, there will be periods of data inconsistencies with the potential existence of outdated data.
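A single replication period can be sketched as one batch exchange in each direction. The Python function below is an illustrative assumption rather than a prescribed algorithm: each end is modeled as a dictionary, each end is assumed to track the set of keys it changed since the last batch, and deletions are not handled.

```python
def periodic_sync(front, back, front_changed, back_changed):
    """Propagate one batch in each direction.

    `front` and `back` are the two local data stores (dicts); `front_changed`
    and `back_changed` are the sets of keys each end modified since the last
    batch. Returns the keys changed at both ends, which need reconciliation.
    """
    conflicts = front_changed & back_changed
    for key in front_changed - conflicts:
        back[key] = front[key]      # front-end change wins: only that end touched it
    for key in back_changed - conflicts:
        front[key] = back[key]      # back-end change wins: only that end touched it
    return conflicts
```

Keys returned as conflicts are left untouched at both ends, which is where a reconciliation policy (for example, "back-end wins") would be applied.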

Real-time access and update

With real-time access and update, only the back-end stores a copy of the data. The front-end accesses and changes data by going directly to the back-end on the fly (via a back-end API, ODBC, or some other similar mechanism). While it is possible to have it set up in the opposite manner, with the back-end accessing data stored at the front-end, this is atypical as most of today's existing back-end systems use internal data stores lacking flexibility, and hence, back-ends accessing front-end data will not be considered.

Without replication, accessed data is always current and there are no consistency issues. The two main disadvantages here are that a front-end relying on a back-end's data will not be functional (or at least fully functional) when the back-end is not operating or communicating, and data retrieval and modification may be time-consuming on the front-end as they are done remotely. Of course, data update on the back-end is quicker than in the above-mentioned replication paradigms because there is no replicating.

Real-time access/batch update

With real-time access/batch update, again only the back-end stores a copy of the data and the front-end accesses the data by going to the back-end. Data changes are done locally at the front-end and sent to the back-end periodically in batch. There is no need to wait for the back-end on these changes, but the back-end does not always have a current view of the data as changes may be lingering on the front-end side. With intelligent processing, the front-end can get an up-to-date view of the data by going to the back-end and factoring in its unforwarded data changes. So, the data changes at the front-end are quick, but data accesses are not. Additionally, the front-end cannot fully function during back-end downtime as it will be unable to access the data, but front-end functionality requiring only data changes (and not reads) will continue to be available.
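The "intelligent processing" mentioned above, where the front-end factors its unforwarded changes into what it reads from the back-end, can be illustrated with a short Python sketch. The names `backend_read` and `backend_write_batch` are hypothetical stand-ins for the back-end interface, and the sketch assumes each change overwrites a whole value, so the most recent local change simply supersedes the back-end copy.

```python
class BatchUpdateFrontEnd:
    """Front-end under real-time access/batch update (illustrative sketch)."""

    def __init__(self, backend_read, backend_write_batch):
        self.backend_read = backend_read                # remote read, possibly slow
        self.backend_write_batch = backend_write_batch  # accepts a dict of changes
        self.unforwarded = {}                           # key -> latest local change not yet sent

    def update(self, key, value):
        # Quick: the change is only recorded locally.
        self.unforwarded[key] = value

    def read(self, key):
        # Up-to-date view: a pending local change takes precedence over the back-end copy.
        if key in self.unforwarded:
            return self.unforwarded[key]
        return self.backend_read(key)

    def flush(self):
        """Send all pending changes to the back-end in one batch."""
        batch = dict(self.unforwarded)
        self.backend_write_batch(batch)
        self.unforwarded.clear()
        return len(batch)
```

If the back-end is down, `update` still succeeds while any `read` of a key without a pending change fails, which matches the partial front-end functionality described above.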

Disparate

The disparate paradigm, the least useful in integrating the two ends as it is actually not integration at all, has both the front-end and back-end systems keeping their own copies of the data without any synchronization. The lack of any data consistency limits the usefulness of this option, but it deserves mention as a distinct paradigm that can be chosen at least for certain business objects.

There is one situation where the disparate paradigm can be useful. Some companies choose to separate the Web customers and their existing sales methods (telephone, mail, store). For them, while the same products (at least some of them) need to be accessible at both ends, customers may be separated. All the customer information for shoppers on the Web is stored at the front-end only. The back-end may have one generic place holder for a Web customer. To get the correct ship-to address, the information is passed from the front-end to the back-end as part of the order. Some back-ends allow an order to contain a "ship to" different from that which appears with the customer data.

Although generally impractical, this paradigm is not without advantages as there is no need to wait at either end for data accesses and changes or for the other system to be up, and it has merit when used as part of a hybrid method. One example of such a hybrid method deals with inventory. During periods of high inventory levels, we can use the disparate paradigm with inventory stored without any synchronization at both ends; when either end recognizes inventory becoming low, it warns the other end to switch over to a synchronous replication mode -- the rationale being that when inventory is high, we do not need to check inventory availability, affording a quick response. Availability needs to be questioned only when the inventory is low.

Summary of Paradigms

| Attribute | Synchronous Replication | Periodic Replication | Real-Time Access and Update | Real-Time Access / Batch Update | Disparate |
|---|---|---|---|---|---|
| current, consistent information | Y | N | Y | N | N |
| quick retrieval | Y | Y | N | N | Y (may want an initial population) |
| quick update | N | Y | N | Y (update not reflected in front-end) | Y |
| front-end independence | Y (need logging when back-end is down) | Y | N | N | Y |
| implementation complexity | complicated | somewhat complicated | easy | fair | simple |
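For readers who want to use the summary programmatically, the Y/N attributes can be encoded as data and filtered by the requirements of a given business object. The Python sketch below is an illustration only; the attribute keys are names chosen here, and the cell notes and the complexity row are omitted.

```python
# Y/N attributes of each paradigm, transcribed from the summary of paradigms.
PARADIGMS = {
    "synchronous replication": {
        "consistent": True, "quick_retrieval": True, "quick_update": False, "independent": True},
    "periodic replication": {
        "consistent": False, "quick_retrieval": True, "quick_update": True, "independent": True},
    "real-time access and update": {
        "consistent": True, "quick_retrieval": False, "quick_update": False, "independent": False},
    "real-time access/batch update": {
        "consistent": False, "quick_retrieval": False, "quick_update": True, "independent": False},
    "disparate": {
        "consistent": False, "quick_retrieval": True, "quick_update": True, "independent": True},
}


def candidates(*required):
    """Return the paradigms that offer every attribute named in `required`."""
    return [name for name, attrs in PARADIGMS.items()
            if all(attrs[attribute] for attribute in required)]
```

For example, `candidates("consistent", "independent")` narrows the choice to synchronous replication, while `candidates("quick_retrieval", "quick_update")` leaves periodic replication and the disparate paradigm.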

Analysis of the business objects

There are many business objects that can be analyzed. The main front-end application being investigated here is a merchant Web server. Customer, Catalog Item, Price, Inventory, Order, and Payment typify the business objects needed for Web shopping that are also stored in typical internal business process back-ends. Two types of users will be considered: external customers on the front-end and internal employees using a system for internal business processes on the back-end.

Implications of the paradigm attributes on the business objects

Customer

A Web merchant server front-end allows new customers to register and may allow for customer data to be imported from a back-end. The customer object includes obvious data such as name and address, but may also include profile information used to tailor the marketing strategies used to sell to the customer. There are two different patterns for access of customer data by the front-end (not including a customer userID and password required at each login):

- The customer information is accessed only during the creation and modification of registration information. The back-end handles pricing, including tax and freight, as well as order processing details such as shipping.
- Some of the merchant server functions access and rely on the customer information.

The first option requires no customer information to be permanently stored at the front-end. Only the second option will be considered, as it enables more functionality for a merchant server such as specialized marketing to the customers based on their information.

up-to-date, consistent information: It is generally not critical for a back-end to have the latest front-end update immediately. A customer can accept entering a change via the Web and there being a lag before a customer service representative sees the change. On the other hand, it is critical for the front-end to have the latest back-end changes, as inconsistencies can cause pricing, shipping, and other problems. For example, a customer may wind up with a different classification (for credit, pricing, etc.) and be treated incorrectly via the front-end. Many of these problems can be rectified on the front-end by verifying the data at the back-end during shopping. Also, depending on the strategies chosen for the inventory and price objects, ordering, order processing, and shipping at the front-end may already be making back-end requests.

quick retrieval: Quick retrieval is critical for both ends, but if a front-end can be made to read customer data while a customer logs on (before beginning shopping on the front-end), the response time restrictions are less severe.

quick update: Update response time is critical for both ends, but if the forwarding of an update by the front-end can be done asynchronously without waiting for a response, it is no longer critical. For asynchronous update to be allowed, a means for warning a customer of a failed update, such as e-mail or a warning via the Web interface during subsequent shopping, is necessary.

front-end independence: Customer data independence at the front-end is critical for those merchant servers using the information throughout their Web shopping setup.

Catalog item

Catalog item information is the center of a business. A catalog item gives the internal identifier for the product, product descriptions and attributes, and a price or its associated pricing object for more complicated pricing mechanisms.

current, consistent information: Some businesses have few changes to their product catalog, and others make bulk changes infrequently. Consistency does not burden these types of businesses. On the other hand, many businesses constantly change their product catalog and require these changes to be immediately available, making consistency highly critical.

Some merchant servers supply catalog tools which aid the server administrators in designing Web pages based on the catalog. Some tools include categorization, browsing paths, and attribute assignment. These tools add an extra level of indirection where back-end catalog changes are passed to the catalog tool. Then, when the administrator finishes using the tool to design how the changes will affect the Web pages, the administrator publishes the pages to the actual merchant server. When these steps are being taken, up-to-date information is unimportant, as changes made at the back-end do not reach the front-end until the catalog tool is used.

quick retrieval: The majority of data accesses by a merchant server are of catalog items. Shopping, made up of product searches and catalog browsing, makes up the bulk of these accesses and requires quick retrieval. Back-ends such as Enterprise Resource Planning systems handle warehousing, inventory tracking, and purchasing, relying heavily on the product catalog and quick access.

quick update: Businesses which infrequently update their product catalog do not require quick updates. Businesses with constantly varying catalogs may, but these updates typically occur only at the back-end and it usually suffices to do them periodically.

front-end independence: A Web merchant server needs the catalog items to have any significant functionality. Dependency on the back-end for this data is unacceptable unless front-end downtime is tolerable.

Inventory

A front-end views inventory only for availability. The back-end may have a much more elaborate view of inventory such as how much is at a particular warehouse, expected incoming and outgoing shipments, etc. The only inventory relevant to integration is product as opposed to materials used for product creation.

current, consistent information: During periods of high inventory levels for a product, having up-to-date inventory levels is unnecessary as there is no doubt about availability. The back-end will always want current inventory levels as they play an important part in many business decisions and processes.

quick retrieval: On the front-end, access time is crucial when the following apply:
- Inventory is being checked for availability (during low-level periods).
- Shopping is real-time. This is the case with most merchant servers. The alternative is offline availability or order completion responses (via e-mail) to shoppers, a generally unacceptable solution.

quick update: Quick updates become imperative during low inventory levels and at other inventory-sensitive times of internal business processes such as inventory replenishment.

front-end independence: Independence is preferable but not mandatory. During back-end downtime, the front-end can warn users that an order may later be rejected, with notice sent via e-mail, and that the status can be checked at a later date (for those who do not have e-mail).

Price

In most business-to-business environments, as well as some consumer environments, prices can be very complicated and become business objects of their own (as opposed to just being a single-value attribute of a catalog item).

current, consistent information: Current prices are needed when there are customers who purchase through the Web and some other means such as phone order. These customers should not be presented two sets of prices unless intended. The nature of some businesses also requires up-to-date prices.

quick retrieval: Web shopping necessitates quick price retrieval. Cases exist where the back-end has a complicated pricing mechanism that is not mimicked in the front-end and the back-end does not provide quick response. One way to deal with this is to store "list prices" in the front-end, making clear to shoppers that their actual price will be accessible later, possibly e-mailed to them.

quick update: When pricing is done in the back-end, updates are not applicable as prices cannot be altered from the front-end. When pricing is replicated in the front-end and pricing changes are allowed at the front-end, the quickness needed for updates depends on the business.

front-end independence: No choice is available here. Having "list prices" in the front-end gives it independence. Complicated pricing methods in the back-end that cannot be mimicked in the front-end remove the possibility of independence.

Order

The three actions performed on an order are price, place, and check status. An order consists of the purchaser and the items and quantities as well as other additional information such as ship-to addresses. Shoppers order from the front-end and the back-end. The back-end needs to know about all orders, while a front-end needs to know only about its own orders unless customers are allowed to check the order status of non-Web orders via the Web. Typically, merchant servers need not know about any orders placed in the back-end. However, other front-ends (besides merchant servers) and possibly a merchant server may allow order status checks on back-end orders.

current, consistent information: Pricing and order placement involve existing order information only when incomplete orders are allowed to be saved. Otherwise, all the order information is new, and the notion of current information does not exist. For order status, it is generally unimportant to have the most recent information immediately.

quick retrieval: For order pricing, quick retrieval is imperative as you do not want to lose the customer's attention and potentially a sale. For order status, choosing the correct paradigm will make it easy to provide quick retrieval, but it often is not necessary. In many business-to-business situations, an e-mailed order status response is the norm.

quick update: During order placement, the confirmation that the order has made it into the system (that is, initial update of the order status to show that the order has been accepted) needs to be immediate. The rest of the order status changes do not have this requirement.

front-end independence: For order status, when delayed response via e-mail is acceptable, independence is unnecessary. When order status requires an online response, independence is necessary. For order placement, independence is desirable, but there usually is already a strong dependence on the back-end for pricing. In the cases where the list price is shown to the user during the ordering (instead of going to the back-end) or the price is stored in the front-end, independence becomes quite useful.

Payment

Similar to orders, payments are generally only tracked in a merchant server when their corresponding orders were placed there. Consumers pay at the time of purchase on the merchant server using a credit card or some standard payment protocol such as Secure Electronic Transaction (SET). In the business-to-business environment, payments are usually made in traditional manners and entered in the back-end internally. Computerized standard payment schemes will change how payments are made here, too.

current, consistent information: Consumers pay for their front-end purchases during the purchasing process, so the information is up-to-date. For businesses receiving credit for their purchases, the immediacy of payment information needed directly relates to the desired level of up-to-date information that these businesses get when they query the payment status of an order on the Web. Typically, a lag is quite acceptable.

quick retrieval: Typically, quick retrieval of payment information is not critical at the front-end. If the merchant server allows a buyer to check on past payments, e-mailing the payment information to the buyer suffices. Of course, if quickly available, then browser display is even better.

quick update: Payment updates need not be quick. The front-end will have current information on payments made through the front-end. Payment information does need to be propagated to the back-end for internal processes, such as accounting, in a somewhat timely fashion but not immediately.

front-end independence: Independence comes automatically as front-end purchases come through the front-end. If the back-end is down, propagation of the payments can wait without having any effect on the front-end's ability to perform.

Paradigm recommendations for the business objects

Customer

Because customer data is vital in so many parts of the front-end, including marketing, product display, and even shopper log-in, independence and, therefore, replication are needed. Changes made in the front-end need not be seen in the back-end immediately, but those made in the back-end that affect pricing or terms given need to appear quickly in the front-end. So, if periodic replication is used, this period needs to be short in the back-end to front-end direction.

Catalog item

When the front-end has a tool to design Web pages and shopping scenarios based on the catalog, periodic replication with a relatively long period is the clear choice. Having the data at both the front-end and back-end is needed, but the absolute latest changes are not. An exception to this recommendation arises when there is no catalog tool and the business requires constantly up-to-date information. In this case synchronous replication is generally appropriate, with real-time access being acceptable for back-ends with quick response times and high availability.

Inventory

When inventory is above some danger level (where the largest typical order of an item can be filled), periodic replication fits well. The period of replication need not be a fixed time interval. Back-ends can send updates to front-ends when levels get low, while front-ends update back-ends at fixed item order counts. Most merchant server systems display availability to shoppers on the Web browser, so periods of low inventory (below the danger level) necessitate synchronized replication. Back-ends control the switching between synchronized and periodic replication by comparing the inventory level with the danger level at each inventory change they process (including updates sent from front-ends).
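The back-end's control of the switch can be sketched as a check performed at every inventory change. The Python class below is a minimal illustration under stated assumptions: the per-item `danger_levels` mapping and the `notify_front_end` callback are hypothetical, and a level exactly equal to the danger level is treated as still safe, a boundary the text leaves open.

```python
SYNCHRONOUS, PERIODIC = "synchronous", "periodic"


class BackEndInventory:
    """Back-end side of the switch between periodic and synchronized replication."""

    def __init__(self, levels, danger_levels, notify_front_end):
        self.levels = dict(levels)                 # item -> units on hand
        self.danger_levels = danger_levels         # item -> danger level (hypothetical setting)
        self.notify_front_end = notify_front_end   # hypothetical callback: (item, mode)

    def mode(self, item):
        # Assumption: a level exactly at the danger level still counts as safe.
        if self.levels[item] < self.danger_levels[item]:
            return SYNCHRONOUS
        return PERIODIC

    def apply_change(self, item, delta):
        """Apply one inventory change and tell the front-end if the mode flips."""
        before = self.mode(item)
        self.levels[item] += delta
        after = self.mode(item)
        if after != before:
            self.notify_front_end(item, after)
        return after
```

Changes forwarded by the front-end would pass through `apply_change` like any other, so a large Web order can itself trigger the move to synchronized replication.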

Price

Enterprise Resource Planning systems provide the ability to create extremely complicated pricing schemes based on a number of factors: the product, the customer, the quantity, the payment method, and many others. When such complicated methods are being used, it is unreasonable to attempt to duplicate this logic in a merchant server. In this case, real-time access of pricing in the back-end must be used by the front-end. The merchant server will not need to send updates, as pricing changes will have to be made in the ERP system. On the other hand, when pricing is straightforward, based on just the product or just the product and a customer classification, then replication makes sense (in a fashion similar to that described in Catalog Item).

Order

For integration with a back-end that provides quick response time to order status checks, a form of real-time access and update works well. Incoming orders are fed to the back-end during order placement at the front-end. Order status requests can go to the back-end. On the other hand, when back-ends cannot be relied on for quick turnaround, replication needs to be used -- synchronous when highly accurate order status checks are required, periodic when not.

Payment

For merchant servers handling consumer-type purchases (via credit card or a protocol such as SET), payments are typically replicated in a synchronous fashion during the order processing. The payment data on the front-end is a subset of that on the back-end, representing only the payments for front-end orders. In business-to-business environments where credit is used, periodic replication makes sense. All payments for businesses allowed access to the merchant server would need to be replicated. Payment data need not be immediate, allowing for the replication to be periodic.

Reference

[1] Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman, Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1987.
