Technology Assisted Research Methodologies

Marianne J. D’Onofrio
Management Information Systems
860-832-3297

Introduction

Technology has been used to supplement researcher activities for many years.
Specifically, technology has been instrumental in providing new and
innovative ways to collect and process data. Data collection techniques
have changed over time with a focus on enhancing statistical validity
and improving on limited generalizability, while new processing methods
have provided better and more sophisticated ways to analyze data. Both
have attempted to increase efficiency and reduce costs. In this paper,
these data collection techniques and processing methods are referred to
collectively as technology assisted research methodologies (TARM).

TARM includes 1) data collection methods (TARM-c) and 2) data processing
methods (TARM-p). This paper
provides a historical perspective on TARM-c and offers insight
that should be carried forward as TARM-c is implemented using the
Internet.

We use TARM-c to describe the data collection methods that have developed
over a lengthy period of time. These methods are described in the timeline
that follows. TARM-c is a superset of methods that contains traditional
statistical sampling methods as well as ad hoc data collection methods.
A framework for TARM-c is shown in Figure 1.

[Figure 1: A framework for TARM-c]

Technology
assisted ad hoc data collection (TARM-ca) is a term we use to describe
data collection activities that do not have the statistical validity
or reliability of data collected using statistical sampling techniques.
TARM-ca can be thought of as selective non-sampling: data are collected
for a specific purpose, but without the rigor associated with developing
a sampling frame, so the result is usually a convenience or other
non-random sample. Much of the data collected over the Internet is of
this type. An example of Internet TARM-ca is a web survey that is linked
to your organization's home page and invites participation by all who are
interested. The respondents are a non-random sample of the people who
visit your home page, so the results by definition have limited
generalizability.

We refer to technology assisted data collection that employs the rigor
of statistical sampling as TARM-cs.
An example would be using statistical sampling methods to create a
sampling frame and select a sample from it, then directing the selected
members to a secure website for survey completion. You would then be able
to generalize the results to your target population.

These are two dramatically different approaches within TARM. In one,
TARM-ca, you potentially get large quantities of data from people who are
interested in your website, but you do not know much about them. You are
not assured that they represent the underlying population.
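
The difference between the two approaches can be illustrated with a small
simulation. The sketch below is not drawn from any study; the population,
the 1-5 score, and the response probabilities are all hypothetical
assumptions chosen only to show how self-selection can distort an estimate
while a sample drawn from a sampling frame does not.

```python
import random
import statistics


def build_population(rng, size=10_000):
    """Hypothetical target population: one score (1-5) per person."""
    return [rng.randint(1, 5) for _ in range(size)]


def convenience_sample(rng, population):
    """TARM-ca style: whoever visits the site and responds is included.

    Assumption for illustration only: the chance of responding rises with
    the score (10% for a score of 1, up to 50% for a score of 5).
    """
    return [score for score in population if rng.random() < score / 10]


def frame_sample(rng, population, n=500):
    """TARM-cs style: simple random sample from a complete sampling frame."""
    return rng.sample(population, n)


rng = random.Random(42)  # fixed seed so the run is repeatable
population = build_population(rng)
ad_hoc = convenience_sample(rng, population)
sampled = frame_sample(rng, population)

print(f"population mean:        {statistics.mean(population):.2f}")
print(f"convenience (TARM-ca):  {statistics.mean(ad_hoc):.2f}  n={len(ad_hoc)}")
print(f"frame sample (TARM-cs): {statistics.mean(sampled):.2f}  n={len(sampled)}")
```

Under these assumptions the self-selected mean drifts upward because likely
responders are over-represented, however many responses arrive, while the
much smaller frame-based sample stays close to the population value.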
In TARM-cs your resultant data represents the underlying population
from which you drew the sample, and you are thus allowed to generalize
the results to that population.

Timeline

Today, it is incumbent on us to examine the Internet for implementations
of TARM. Both TARM-ca and TARM-cs can be performed using
the Internet, but it is helpful to reflect on how technology has been
used to collect data over the last century.
There are valuable lessons to learn from earlier implementations
that can be carried forward to Internet TARM.

Sampling theory can be traced to the late nineteenth century; however,
basic statistical techniques for probability sampling were first proposed
by J. Neyman (Neyman 1934) in his seminal work “On the Two Different
Aspects of the Representative Method: The Method of Stratified Sampling
and the Method of Purposive Selection.” Ensuing years gave us statistical
theory for many new sampling methods, including cluster sampling,
multistage sampling, replicated sampling, and systematic sampling. During
the years in which these theories were developed, acceptable sampling
methods were galvanized by statisticians’ attempts to correctly predict
presidential election victors. In the 1970s social scientists were
confronted with new challenges; namely, to collect more complex data via
survey samples and to perform more complex statistical analyses (Frankel
and Frankel 1987).

Today, researchers are faced with new challenges: Does the Internet make
data collection more cost-effective and efficient? If so, can we collect
even more and better data and perform more complex studies? However,
before we answer these questions, we should understand how technology has
been used to collect data. The major milestones of computer assisted
survey information collection are delineated in Table 1. The insights
gained from these activities should be carried forward to TARM.

[Table 1: Major milestones of computer assisted survey information collection]

Table 1 shows a progression from operator-run technology assisted
interviewing in the 1970s to computer self-administered questionnaires in
the late 1980s and early 1990s. In the early 1990s the possibilities for
using technology for data collection grew exponentially as the Internet
became ubiquitous.
The years up to the mid-1980s were the operator years; then, in the
mid-1980s, the notion of self-administered surveys was introduced. It
should be noted that all of these techniques involved the creation of a
valid sampling frame. In the 1990s the Internet came into fashion, with
all that it portends for data collection. However, without the 1980s
shift in values from technology assisted operator-run surveys to
technology assisted self-administered surveys, the interest in using the
Internet to collect data might not have developed.

Conclusion

The Internet has made it easier than ever for untrained individuals
to collect massive amounts of data.
For instance, if you create a website that is fortunate enough
to generate large amounts of traffic, you have a ready-made sample.
It may seem self-evident that you could retrospectively treat
all individuals who visit your website as your population, but there
are many issues that this logic does not address. Specifically,
statistical validity, reliability, and generalizability need to be
considered.

We have presented a model of TARM that incorporates data collection and
data processing, in which the data collection subset (TARM-c) includes
two types: TARM-ca (ad hoc data collection) and TARM-cs (sampling frame
based data collection). TARM-c incorporates all of the technologies
in Computer Assisted Survey Information Collection (CASIC), including
Computer Assisted Telephone Interviewing (CATI), Computer Assisted
Personal Interviewing (CAPI), and Computerized Self-administered
Questionnaires (CSAQ), as well as the host of Internet-based data
collection technologies.
However, TARM-c adds the dimension of ad hoc versus sampled data
collection. Given how easy it is to collect data over the Internet, it is
incumbent on us to carefully examine and understand how we employ TARM.
This is especially important to organizations when business decisions
are made using the data generated through an Internet-based survey.

References

Couper, M. and W. Nicholls, Eds. (1998).
The history and development of Computer Assisted Survey Information
Collection Methods. Computer Assisted Survey Information Collection,
John Wiley & Sons, Inc.

Frankel, M. R. and L. R. Frankel (1987). "Fifty years of survey sampling
in the United States." Public Opinion Quarterly 51(4): S127-S138.

Neyman, J. (1934). "On the two different aspects of the representative
method: the method of stratified sampling and the method of purposive
selection." Journal of the Royal Statistical Society 97: 558-606.