For the mobile computing environment, binding the cyberworld to the real world is an important service. We have developed a geographical location information (GLI) system that binds the logical location of network entities to their geographical location information. Privacy control in the GLI system is important. In this paper, we propose a privacy control scheme for our GLI system. The scheme provides the following capabilities:
We are implementing a prototype GLI system with a privacy control mechanism. A preliminary evaluation of the prototype system is also given.
With the advance of computing and internetworking technology, the public today has a fairly wide knowledge of the Internet, and the mobile computing environment, including wireless communications and some navigation systems, has become popular. With the omnipresence of fixed and mobile computers on the Internet, their physical spatial location information is valuable for many kinds of computing applications on the Internet.
We have developed the GLI system to make that location information available on the Internet[1]. The GLI system provides a way to map the GLI of entities and identifiers on the Internet. Now, we are using the system in the InternetCAR Project[2] to manage the location information of moving cars. However, the current GLI system does not have any privacy control mechanism.
Many people do not like the prospect of having their every move tracked. This concern is legitimate, given that somebody's activities can often be inferred from where that person is. Hence, mechanisms for protecting privacy are necessary. After the experiments from the InternetCAR Project, we recognized that privacy control in the GLI system is quite important. Moreover, the integrity of the GLI is also essential.
Thus, we concluded that GLI needs to be protected against unauthorized disclosure and modification. In this paper, after we outline our GLI system, we consider the security requirements of the GLI system. Then we propose a concrete privacy-protecting security model for the system. We have implemented this model over a prototype of the GLI system. A brief description of our prototype implementation is also given.
Considering the intrinsic requirements for location service on the Internet (e.g., global scalability, reciprocity, and serviceability), we have proposed the GLI system to bind a logical location of network entities to the GLI.
In this section, we describe the structure of our GLI system and how to register the data and provide the location information to the client.
To access the entities, such as fixed and mobile computers and noncomputer entities with network connectivity, in a practical way, obtaining the GLI of an entity connected to the Internet is necessary. Put another way, the correspondence of spatial location information and cyberidentifiers in the Internet is required for such different deployment environments. Thus, we have developed the GLI system to bind a logical location of network entities to the GLI. This system was composed of the following three elements in the first prototype:
The GLI parameters are shown by the following table.
Table 1: GLI Parameters
Parameter | Value |
---|---|
Location | Latitude-longitude-altitude |
Velocity | North-east-up |
Device | GPS, etc. |
Datum | WGS-84, OSGB-36, NAD-27, etc. |
Data type | C/A, P, RDGPS, Kinematic, etc. |
Time | Time of fix |
Basic GLI parameters include location, velocity, device type, datum, data type, and time. Location is expressed as latitude, longitude, and altitude coordinates. Velocity is expressed as a combination of speeds in three directions: north, east, and up. Device represents the type of device used to determine the location. Datum identifies the geodetic reference system and is used to translate a GLI into one based on the local area. Data type represents the order of accuracy. Time represents the time when the GLI was fixed.
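The parameters in Table 1 can be pictured as a single record. The following is a minimal sketch only: the class name, field names, and units (degrees, meters, seconds since the epoch) are assumptions made for illustration and are not taken from the prototype.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GLI:
    """One GLI fix, mirroring Table 1. Field names and units are hypothetical."""
    location: Tuple[float, float, float]  # latitude, longitude (degrees), altitude (m); units assumed
    velocity: Tuple[float, float, float]  # north, east, and up components; units assumed
    device: str                           # e.g. "GPS"
    datum: str                            # e.g. "WGS-84"
    data_type: str                        # e.g. "C/A"; indicates the order of accuracy
    time: float                           # time of fix; seconds since the epoch assumed


# Example fix with made-up values.
fix = GLI(
    location=(34.73, 135.73, 100.0),
    velocity=(0.0, 12.5, 0.0),
    device="GPS",
    datum="WGS-84",
    data_type="C/A",
    time=946684800.0,
)
```

Making the record immutable reflects that a fix describes one moment in time; a new fix would be a new record rather than an update of an old one.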
With respect to scalability, in the second prototype system, we have separated the servers into area servers and home servers; the area server gathers the data from the entities that exist in the area the server covers, and the home server maintains the personal data of each entity.
This system makes it possible to look up the identifier of an entity on the Internet based on the physical location information and to search the location of the entity based on the identifiers. In other words, a client can look up the entity identifiers by specifying location as a key (the query "who is there?") and vice versa (the query "where are you?"). The upshot is that our system can allow the users to search for various entities in the real world through the Internet and to obtain up-to-date location information about the specified entities in real time. The current system uses an IP address or FQDN (fully qualified domain name) as the identifier in the database of the server.
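The two query directions can be illustrated with a small in-memory sketch. It is not the prototype's server code: the class `LocationRegistry`, its method names, and the rectangular search area are assumptions chosen for brevity, and the identifiers are example values.

```python
from typing import Dict, List, Optional, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees; an assumed representation


class LocationRegistry:
    """Hypothetical in-memory stand-in for a GLI server database."""

    def __init__(self) -> None:
        self._latest: Dict[str, Position] = {}

    def register(self, identifier: str, position: Position) -> None:
        # Keep only the most recent position for each identifier.
        self._latest[identifier] = position

    def where_are_you(self, identifier: str) -> Optional[Position]:
        # Identifier -> location; None if the identifier is unknown.
        return self._latest.get(identifier)

    def who_is_there(self, south: float, west: float,
                     north: float, east: float) -> List[str]:
        # Location -> identifiers inside a simple latitude/longitude box.
        # (Boxes crossing the 180-degree meridian are not handled here.)
        return sorted(
            ident
            for ident, (lat, lon) in self._latest.items()
            if south <= lat <= north and west <= lon <= east
        )


registry = LocationRegistry()
registry.register("car1.example.org", (35.68, 139.77))  # FQDN as identifier
registry.register("192.0.2.7", (34.69, 135.50))         # IP address as identifier
```

Here `registry.where_are_you("192.0.2.7")` answers "where are you?", and `registry.who_is_there(35.0, 139.0, 36.0, 140.0)` answers "who is there?" for the given box.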
The effectiveness of the GLI system was shown through experiments in the InternetCAR Project[2]. The system is independent of machine architecture and has been tested on BSD/OS, SunOS, and NEWS-OS.
The current GLI system is composed of four elements: home server, area server, agent, and client. In this section, we describe the role and the behavior of each element.
Some GLI information is valuable and can be made available to everybody on the Internet. For example, the geographical location and speed of entities can be converted into traffic jam or parking lot information. At the same time, our system also holds some private information, such as the user name of the owner of each entity. The private information must be protected, but the current GLI system does not have a privacy control mechanism.
This section discusses topics related to secrecy and integrity in the GLI system: public or private information, the identifiers on this system, how to protect the personal data, and communication among system elements. We are especially concerned with finding a way to balance the need for privacy control against the need to maintain processing speeds and other measures of performance.
We basically define the following data stored in our system as public: geographical location (latitude, longitude, altitude), movement of entities (direction, speed), and several attributes of entities (switch position of lights, outside temperature, and so on). Such public information is useful for everybody on the Internet. For example, the geographical location and speed of entities can be converted into traffic jam and parking information.
However, many people do not like the idea of having their moves tracked. Thus, private information (e.g., the user name of the owner of each entity) should not be available to everybody. In our scheme, we control access to such private information. The GLI servers store encrypted private information, but they cannot decrypt the information. Servers can access these data only with the added access control lists (ACLs). Decryption of private information is the responsibility of receiver clients. By this method, we can keep the private information secure.
As stated above, we used the IP address or FQDN as identifiers on the current system. However, not only raw identifiers like these but also any identifier that can be guessed is quite insecure. For example, if you frequently sent your GLI with an unprotected identifier, an intruder could easily and successfully impersonate you. Needless to say, identifiers should be issued only to their legitimate owners, and only after those owners have been authenticated.
This system requires that identifiers meet the following requirements:
Leaving private information unprotected, for everybody to see, is clearly undesirable. Processing the information is a simple solution. But when should the information be processed? In this section, we consider the three places for data to be processed:
We should determine when communication takes place, that is, the protocol, taking disconnected operation into account.
We have to be concerned with the behavior of each component on the system, especially the communication and data protection components. The best systems will work in the following ways:
Unfortunately, it is impossible to make a system that satisfies these requirements with average computers in a practical way. Thus, as a result of some discussion, we propose the following realistic security mechanism for our GLI system.
We propose the following privacy control model on the GLI system to make it practical with some encryption protocols. The process of data encryption is a common way to protect information and to warrant its integrity. Moreover, the combination of some encryption protocols can allow applications to authenticate users if they have applicable information such as keys or if they can generate the adequate data using the information and tools that they have.
First, we describe some terms that are employed in our GLI system:
In the following sections, we explain the privacy control scheme on the latest GLI system using these terms.
We discussed the need for protected identifiers on this system to run it on the Internet. Because of that, we cannot continue to use the IP address or FQDN as identifiers. Thus, the pseudo ID, which cannot be guessed from the public information, has been introduced to the latest GLI system prototype.
What we need for our system is a unique identifier that appears meaningless to unauthorized users. Such an identifier, a pseudo ID, is generated on the area server from a string that includes t (time) and some area server information. The pseudo ID and the real identifier are bound and managed on home servers.
The two possible methods for generating identifiers were discussed: (1) heuristic hash algorithms such as SHA[4][5] or HMAC[6], and (2) encryption chaining mode. We concluded that it was best to use the hash function for generating the identifiers on this prototype.
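The hash-based option can be sketched as follows. This is an illustration under stated assumptions, not the prototype's routine: SHA-256 from Python's `hashlib` stands in for the SHA family mentioned above, the function name `make_pseudo_id` is hypothetical, and the random nonce is an added assumption so that the result cannot be recomputed from the time and area server information alone.

```python
import hashlib
import os


def make_pseudo_id(t: float, area_server_info: str, nonce: bytes) -> str:
    """Hash a string containing t and area server information into a pseudo ID.

    The nonce is an assumption of this sketch, not part of the paper's description.
    """
    material = f"{t!r}|{area_server_info}|".encode("utf-8") + nonce
    return hashlib.sha256(material).hexdigest()


# Example: the area server draws a fresh nonce for each pseudo ID it issues.
example_nonce = os.urandom(16)
example_pid = make_pseudo_id(946684800.0, "as1.example.org", example_nonce)
```

Because the output depends on every input, changing only the time yields an unrelated-looking pseudo ID, which is the property the scheme relies on.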
An ideal security mechanism for our GLI system would ensure that protected identifiers cannot be matched to GLI information. It is even better to have a feature that renders the GLI meaningless to intruders (legitimate users, of course, can still obtain the correct GLI). To meet these requirements, we use cryptographic techniques to process a GLI.
In this section, we outline the state of data on the area server and the home server. On area servers we store some data (GLI, t, pseudo ID, and E[HS]). E[HS] is home server information encrypted with a shared secret key. On home servers we maintain another data set (pseudo ID, AS, t, and E[ID]+ACL), where AS is the area server and E[ID] is the encrypted real identifier. Since neither of these servers has a decryption routine (i.e., the servers do not recognize what they store), they cannot communicate with each other. Even if intruders succeed in capturing all of the data on these servers, they do not have the secret key.
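The two data sets can be written down as record types. This is a sketch of the layout only: the class and field names are hypothetical, the ACL is modeled as a set of strings purely for illustration, and the encrypted fields are shown as opaque placeholder bytes because the servers never interpret them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class AreaRecord:
    """What an area server stores: (GLI, t, pseudo ID, E[HS])."""
    gli: Tuple[float, float, float]  # public location data (simplified to one triple)
    t: float
    pseudo_id: str
    enc_hs: bytes                    # E[HS]: opaque to the server


@dataclass(frozen=True)
class HomeRecord:
    """What a home server stores: (pseudo ID, AS, t, E[ID]+ACL)."""
    pseudo_id: str
    area_server: str                 # AS that generated the pseudo ID
    t: float
    enc_id: bytes                    # E[ID]: opaque to the server
    acl: FrozenSet[str]              # ACL representation is an assumption


# Placeholder values; the byte strings stand in for ciphertext.
area = AreaRecord((35.68, 139.77, 40.0), 946684800.0, "9f2c", b"\x8a\x11\x42")
home = HomeRecord("9f2c", "as1.example.org", 946684800.0, b"\x07\xee\x5b",
                  frozenset({"group-a"}))
```

The only field the two records have in common is the pseudo ID, so linking a public GLI to a real identifier requires both the home server record and the key needed to decrypt E[ID].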
We have to give careful consideration to balancing this logical fine-grained security mechanism against realistic performance on the Internet. As a result of this consideration, we decided to use symmetric cryptography to protect information and to guarantee its integrity on this prototype, and asymmetric cryptography for authentication.
In this section, we assume that agents have the secret key, which is shared by a group of agents for encryption/decryption and is distributed in advance. We call the shared secret key the "group key." The group consists of a hierarchy like IP address domains. A home server manages the agents of one or more groups that share the same secret key.
There is some doubt whether the shared secret key could be distributed safely to every member of a huge group on the Internet. Yet we expect that key distribution would work well for small groups. That is why the group consists of a hierarchy; that is, a huge group would be divided into several small groups.
As stated above, the GLI system is composed of four elements: the area server, the home server, the agent, and the client.
The area server needs a routine for generating a pseudo ID, that is, a hash function. The data managed by area servers are the pseudo ID together with the encrypted home server information of the agent, the GLI, and the time. Public location information is maintained by this server, and the client obtains valuable information, such as traffic jam information, from the query "who is there?"
Home servers manage the following data: time, the encrypted real identifier with ACL, a corresponding pseudo ID of the agent, and the latest area server that generated the pseudo ID. This server maintains the latest pseudo ID and area server information of each agent, thus answering the query "where are you?" from legal clients.
Neither server should have a decryption routine.
The agent needs a routine to encrypt the home server information and identifiers. Once they are encrypted, they are reused unchanged until the key is altered. If the agent belongs to only one group, there is no doubt about the key selection. Key selection for an agent that belongs to multiple groups is under investigation.
The data that are sent to servers are the following:
If the shared key is modified or the group membership changes, then, depending on the speed of key distribution, an agent that should no longer have access might still use a shared key. This is why we decided to attach the ACL to the data encrypted with the shared key.
The client requires both a decryption routine and an encryption routine to request the location information of a specified person (the query "where are you?"). The client sends requests to the applicable servers. If the client obtains the encrypted data successfully, it decrypts the data and displays the result. Thus, the client can obtain all the public information provided by all agents but cannot see the private information of an agent that belongs to a different group.
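The effect of the group key on what a client can read can be sketched as follows. The cipher here is a stand-in so that the example runs with the Python standard library alone: a SHA-256 counter-mode keystream combined with an HMAC tag. It is not the cipher used in the prototype and should not be used to protect real data; the function names `seal` and `open_sealed` and the key values are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional

NONCE_LEN = 16
TAG_LEN = 32


def _subkey(group_key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC keys from the group key.
    return hashlib.sha256(label + group_key).digest()


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream: SHA-256 over key, nonce, and a block counter.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(group_key: bytes, plaintext: bytes) -> bytes:
    """Agent side: encrypt private data with the group key and add an integrity tag."""
    nonce = os.urandom(NONCE_LEN)
    body = _xor(plaintext, _keystream(_subkey(group_key, b"enc"), nonce, len(plaintext)))
    tag = hmac.new(_subkey(group_key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def open_sealed(group_key: bytes, blob: bytes) -> Optional[bytes]:
    """Client side: return the plaintext, or None if the key or data is wrong."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        return None
    nonce = blob[:NONCE_LEN]
    tag = blob[NONCE_LEN:NONCE_LEN + TAG_LEN]
    body = blob[NONCE_LEN + TAG_LEN:]
    expected = hmac.new(_subkey(group_key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return _xor(body, _keystream(_subkey(group_key, b"enc"), nonce, len(body)))


group_a_key = b"example group A key"
group_b_key = b"example group B key"
stored = seal(group_a_key, b"owner: alice")  # what a server would hold
```

A client holding `group_a_key` recovers the owner name with `open_sealed(group_a_key, stored)`, while a client from another group gets `None` for the same stored bytes. The servers hold only `stored` and no key.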
Here is a flowchart of GLI registration from agents to servers.
The query "who is there?" would be treated in the following way:
The query "where are you?" would be treated in the following way:
Our proposed model, that is, the latest prototype, can be used to address the requirements mentioned above. The next section gives a brief description of our prototype implementation.
We implemented a GLI system prototype with a privacy control mechanism as described in section 4. We used some functions from OpenSSL-0.9.4[7] EVP library to encrypt the data or to make a pseudo ID.
Our proposed model can be used to address the requirements of privacy control as mentioned above. This prototype allowed us to verify the consistency of our location system and its security model.
We will continue to research the following issues and discuss improving the GLI system. We want to develop a more precise information security scheme considering the balance between theory and practice.
Pseudo IDs should not be fixed forever, as mentioned above. We will consider using one-time pseudo IDs or setting a certain period, such as a day or a week, for pseudo ID expiration.
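One way to picture period-based expiration is to treat a pseudo ID as valid only within the period in which it was issued. The sketch below is an assumption about how such a rule might look, not a design the authors have adopted; the function names and the choice of fixed, epoch-aligned periods are hypothetical.

```python
DAY = 24 * 60 * 60   # seconds
WEEK = 7 * DAY


def period_index(t: float, period: float = DAY) -> int:
    """Number of whole periods elapsed since the epoch at time t."""
    return int(t // period)


def is_current(issued_at: float, now: float, period: float = DAY) -> bool:
    """A pseudo ID is treated as valid only in the period in which it was issued."""
    return period_index(issued_at, period) == period_index(now, period)
```

With a one-day period, `is_current(10.0, 20.0)` holds, whereas a pseudo ID issued at `10.0` is no longer current at `DAY + 1.0`. Passing `period=WEEK` gives the weekly variant mentioned above.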
We expect to solve this problem using an authentication technique such as a digital signature.
We are now discussing solving the key distribution problem using KDC (Key Distribution Center), CA (Certificate Authority), and KPS (Key Predistribution System)[9] or IDKMS (ID-based Key Management System)[10].
When an agent tries to register the data in two servers, the area server and the home server, using this privacy control scheme, wire tapping might be possible. To be brief, the intruder might obtain some pseudo ID and source address pairs of the agent from IP packets. This would not always be a grave issue since the source address of the agent can be varied because of its mobility. However, in the interest of even more secure system architecture, we are investigating the use of IPsec as a transport of data registration in addition to the pseudo ID validity.
Location information is valuable to many kinds of computing and networking applications on the Internet. However, there are fears that the inclusion of personal information, such as the name of a mobile computer's owner, could lead to new security risks, namely, invasion of privacy.
In this paper, we have discussed the security requirements faced by location information services on the Internet, especially our GLI system. To enhance our GLI system's privacy control scheme, we considered several points, such as the kind of information, the disconnected operation, and the use of an encryption protocol.
We discussed a concrete privacy control model in the GLI system and concluded with a brief description of our prototype implementation, along with a discussion of related work.
The authors thank the members of Minato Lab, ITC, Nara Institute of Science and Technology, especially Ms. Mika Ito for providing support. We also thank the members of the WIDE Project, especially the rover working group.