Presented at the 38th Annual Conference of the International Military Testing Association (IMTA), 12 - 14 November
1996, Gunter Sheraton Hotel, San Antonio, Texas; co-hosted by the Air Force Personnel Center, Armstrong
Laboratory/Human Resources Directorate, and the Air Force Occupational Measurement Squadron.
Validation of a Procedure for Clustering Expressions of Frequency and Amount
William J. Phalen
Institute for Job & Occupational Analysis
Robert M. Yadrick
Technical Training Research Division
Armstrong Laboratory Human Resources Directorate
Walter G. Albert
Manpower and Personnel Research Division
Armstrong Laboratory Human Resources Directorate
This paper is the third in a series of reports concerned with the scaling of words and phrases
expressing qualitative levels of "frequency" and "amount." The initial paper by Yadrick, et al.
(1993) replicated a study by Bass, Cascio, and O'Connor (1974) which used a magnitude
estimation procedure to scale 39 expressions of frequency and 44 expressions of amount. This
study, however, used Air Force basic trainees to provide the estimates. The resultant scaling of
expressions was quite similar to that of Bass, et al., even though the rater populations were quite
different. This suggested that there are commonly shared perceptions regarding the weightiness
of such expressions, independent of the samples which provided the estimates. Nine expressions
of amount were selected to constitute an equal interval scale, based on their magnitude-estimated
weights. These were tested in 40 computer-administered occupational surveys to 160 cases and
were found to produce more valid and reliable results than the traditional nine-point relative time
spent scale, as reported in Albert, et al. (1994).
The next study by Yadrick, et al. (1994) described the application of a new, univariate procedure
for clustering the weighted expressions into groups of equivalent (or synonymous) expressions. It
appeared to the researchers that the expressions within each group were sufficiently synonymous
with one another that a single expression might be picked from each group to represent an equal
interval scale with the optimal number of points. It remained to be determined whether the
divisions or cut points suggested by the groups in the cluster solution had captured the
"psychologically real" levels of a frequency or amount scale embedded in the ordered lists of
magnitude-estimated expressions. Therefore, a validation study was undertaken to determine the
degree of correspondence between mathematically defined cluster groups and perceptually
defined, or psychologically real, groups of expressions. This paper will report on the development
and application of the criterion measure used to validate the univariate clustering procedure.
The Clustering Procedure
A special measure of "equivalence" was designed in the previous study to describe the similarity
of expressions and to cluster them into homogeneous groups along a single dimension
("frequency" or "amount"). Since the dimensional values for each expression were derived by a
ratio measurement technique (magnitude estimation), equivalence was computed as a ratio-based
measure with exponential magnification of ratio differences to accentuate dividing lines between
nonequivalent sets of equivalent expressions. In the basic equation, the pairwise ratios are
converted to logarithms and the logarithms are summed algebraically so that positive logs (ratios
> 1.0) and negative logs (ratios < 1.0) representing equal ratio differences from log 1.0 = 0 cancel
each other (compensatory effect). Thus, two raters who disagree as to which of two expressions
is greater or less will negate each other's estimates. This feature is contrary to standard similarity
or overlap measures, which treat all differences as representing dissimilarity (noncompensatory
effect). A detailed description of the "equivalence" equation with example computations can be
obtained from the senior author.
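The compensatory effect described above can be illustrated with a minimal sketch. This is not the authors' equivalence equation, which is not reproduced in this paper; it omits the exponential magnification step and shows only how signed logarithms of ratios cancel, whereas absolute differences do not. The function names and the two rater ratios are hypothetical.

```python
import math

def signed_log_sum(ratios):
    """Algebraic sum of log ratios: equal and opposite judgments cancel."""
    return sum(math.log(r) for r in ratios)

def absolute_log_sum(ratios):
    """Sum of absolute log ratios: every difference counts as dissimilarity."""
    return sum(abs(math.log(r)) for r in ratios)

# Hypothetical data: rater A judges expression X at 1.25 times expression Y,
# while rater B judges X at 0.8 times Y (the reciprocal ratio).
ratios = [1.25, 0.8]

compensatory = signed_log_sum(ratios)        # log(1.25) + log(0.8) = log(1.0)
noncompensatory = absolute_log_sum(ratios)   # 2 * log(1.25)

print(f"signed sum:   {compensatory:.6f}")
print(f"absolute sum: {noncompensatory:.6f}")
```

Under the signed sum the two raters negate each other, so the pair of expressions is treated as equivalent on balance; under the absolute sum the same disagreement registers as dissimilarity.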
In this study, the equivalence equation was applied first to determine the equivalence of the top
two expressions in the list, i.e., expressions "1" and "2" on the lefthand side of Table 1 for
"frequency" and likewise on the lefthand side of Table 2 for "amount". If the equivalence value
exceeded 80.00%, then the equivalence of expressions "1" and "3," then "1" and "4," etc. was
computed, with expression "1" repeatedly being used as the target expression, until the
equivalence value fell below 80.00%. At this point, the set of expressions falling within a
minimum linkage of 80.00% was selected as the first group of expressions representing the
highest level of the scale. Thus, in Table 1 (lefthand side) expressions "1" through "3" formed the
first group of equivalent frequency expressions with a minimum equivalence (between expressions
"1" and "3") of 83.81%. The next target was expression "4." Expressions "4" and "5" were
compared, "4" and "6," etc. until "4" and "10" yielded the lowest acceptable equivalence of
80.66%. The remaining groups were formed in a similar manner, resulting in 12 "frequency"
groups (or scale levels) and 13 "amount" groups (or scale levels). The minimum equivalence
values are reported to the right of each group.
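The grouping steps above can be sketched as follows. The sketch assumes an ordered list of weights and a caller-supplied equivalence function that returns a percentage. The function `ratio_equivalence` is a hypothetical stand-in (the smaller weight as a percentage of the larger), not the authors' equivalence equation, and the weights are invented for illustration. A value of exactly 80.00% is treated here as acceptable, which is also an assumption.

```python
def ratio_equivalence(a, b):
    """Hypothetical stand-in: smaller weight as a percentage of the larger."""
    larger = max(a, b)
    if larger == 0:
        return 100.0  # two zero weights are treated as fully equivalent
    return 100.0 * min(a, b) / larger

def cluster_by_target(weights, equivalence, threshold=80.0):
    """Group an ordered list by comparing each item with the current target.

    The first item of each group is the target. Items are added while their
    equivalence with the target stays at or above the threshold; the first
    item that falls below it becomes the next target.
    """
    groups = []
    start = 0
    while start < len(weights):
        end = start + 1
        while end < len(weights) and equivalence(weights[start], weights[end]) >= threshold:
            end += 1
        groups.append(weights[start:end])
        start = end
    return groups

# Hypothetical weights, ordered from high to low.
weights = [100, 95, 85, 70, 60, 57, 30]
groups = cluster_by_target(weights, ratio_equivalence)
print(groups)
```

Because every comparison is made against the target rather than against the preceding item, the minimum equivalence within a group is the value between its first and last members, which is the figure reported beside each group in the tables.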
Column 1 of Tables 1 and 2 also shows the magnitude-estimated weights for each expression, as
derived by Bass, et al. Although the groups of expressions generated by this clustering
procedure seemed to be very reasonable in the judgment of the authors, it could not be assumed
that the groupings represented psychologically real divisions without comparing them against a
criterion of psychologically derived groups of expressions.
A survey was developed that contained the ordered lists of frequency and amount expressions
(with their weights deleted) in four different versions that presented frequency expressions first,
followed by amount expressions, or vice versa, and presented the expressions in high to low order
("Always" to "Never," and "All" to "None"), or vice versa. The instructions asked respondents
to follow a procedure analogous to the clustering procedure, but
substituting their own psychological or perceptual estimates of equivalence in place of our
mathematical calculations of equivalence. More specifically, each respondent was asked to begin
with expression "1" on the list, which was already circled, as the first target expression, and to
compare expression "2" with it. If expression "2" appeared to be pretty much equivalent in
meaning to expression "1," the respondent would proceed to compare expression "3" to
expression "1," then "4" to "1" and so forth, until reaching an expression that did not appear to be
reasonably synonymous or equivalent to expression "1." The respondent would then circle this
expression as the next target expression against which to compare the expressions following it.
This procedure would continue until the entire list was evaluated. The result would be a set of
psychologically derived groups of expressions based on the perceptions of that respondent. In a
completed survey, each group of expressions could be identified as beginning with a circled
expression and ending with the expression immediately preceding the next circled expression.
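As a minimal sketch of how a completed survey could be decoded, the function below turns the positions of the circled expressions into groups, each running from a circled expression to the one immediately preceding the next circled expression. The function name, the 1-based numbering, and the example response are assumptions made for illustration.

```python
def groups_from_circles(n_expressions, circled):
    """Convert circled (target) positions into (first, last) group bounds.

    Positions are 1-based. Expression 1 is always a target because it is
    pre-circled on the survey form.
    """
    starts = sorted(set(circled) | {1})
    if starts[-1] > n_expressions:
        raise ValueError("circled position lies beyond the end of the list")
    groups = []
    for i, start in enumerate(starts):
        # A group ends just before the next circled expression,
        # or at the end of the list for the final group.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else n_expressions
        groups.append((start, end))
    return groups

# Hypothetical response: a 10-expression list with 1, 4, and 8 circled.
example = groups_from_circles(10, [1, 4, 8])
print(example)
```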
A sample of 42 respondents, consisting of behavioral scientists and clerical workers at the
Armstrong Laboratory and at three contractor offices, provided responses. Approximately equal
numbers of each version of the survey were completed to provide the desired counterbalancing.
The groups identified by the 42 respondents were consolidated into a matrix whose rows and
columns indicated the number of times each expression was selected as a "beginning" or "ending"
expression, respectively. Evaluation of the row and column totals made it possible to select the
set of groups which provided the best overall fit of the individual respondent data. The resultant
sets of psychologically defined groups based on the perceptual judgments of 42 respondents are
shown in the righthand portion of Table 1 for the "frequency" expressions and in the righthand
portion of Table 2 for the "amount" expressions. The minimum equivalence values for the
psychologically defined groups are also reported as additional points of comparison with the
mathematically defined groups.
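The consolidation step can be sketched as a simple tally. The sketch assumes one reading of the matrix: rows are beginning expressions, columns are ending expressions, and each cell counts the respondents who defined that particular group. The rule used to select the best-fitting set of groups from the totals is not specified here, so the sketch stops at the row and column totals. All names and the three example responses are hypothetical.

```python
from collections import Counter

def tally_groups(respondent_groups):
    """Count how often each (first, last) group and each boundary was chosen.

    respondent_groups holds one list of (first, last) pairs per respondent.
    Returns the cell counts, the row totals (times an expression began a
    group), and the column totals (times an expression ended a group).
    """
    cells = Counter()
    for groups in respondent_groups:
        for first, last in groups:
            cells[(first, last)] += 1

    begin_totals = Counter()
    end_totals = Counter()
    for (first, last), count in cells.items():
        begin_totals[first] += count
        end_totals[last] += count
    return cells, begin_totals, end_totals

# Hypothetical responses from three respondents on a 6-expression list.
responses = [
    [(1, 2), (3, 6)],
    [(1, 2), (3, 4), (5, 6)],
    [(1, 3), (4, 6)],
]
cells, begin_totals, end_totals = tally_groups(responses)
print(begin_totals)
print(end_totals)
```

Expressions with high row totals are candidates for group beginnings, and expressions with high column totals are candidates for group endings.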
Table 1 shows a fairly high degree of correspondence between the mathematical and
psychological clusterings of expressions of frequency. Three groups are identical, and three
additional groups share one boundary in common. Although the mathematical clustering defined
12 groups vs. 8 groups for the psychological clustering, three of the mathematical groups are
single expressions that abut sharp changes in the magnitude-estimated weights.
It might be argued that the expressions constituting the singleton groups should be dropped,
especially "seldom," which is clearly out of place. This would reduce the number of
mathematically defined groups to nine, while the psychologically defined groups would remain at
eight. This would also increase the correspondence between the two clusterings. A major point
of difference involves expressions "28" through "35" (if "seldom" is eliminated). The
mathematical clustering separated this set of expressions into three groups, while the
psychological clustering considered the set to be one group. In this case, the psychological
grouping makes more sense. However, the minimum equivalence for this
group (expressions "28" through "35") is only 0.19%, reflecting the extremely
large ratio difference between a weight of 4.72 for "very seldom" and a weight of 0.33 for
"seldom." If "seldom" is dropped, the minimum equivalence for the group would jump to 48.84%,
which is still low. Overall, the correspondence between the mathematical and the psychological
clustering is reasonably good, especially considering some of the questionable weighting of
As is evident from examining Table 2, the correspondence between the mathematical and the
psychological clustering is not as clear for expressions of amount. The greatest difference is at
the very top, where the first four expressions in the psychological clustering constitute four
singleton groups. It appears that the respondents felt that "all" was significantly more inclusive
than "an exhaustive amount." However, "almost entirely" is certainly not "exhaustive" and so
had to be separated out. Then comes "completely," which seems suspiciously like "all" (if
adverbs can be like adjectives) and so "completely" had to be separated from "almost entirely."
The problem encountered here is one of context. The raters who made the magnitude estimates
rated each expression separately, without seeing how the expressions would ultimately be ordered
when listed together, whereas the respondents in our study had no choice but to follow the order
in which the expressions were listed when they defined the group boundaries for equivalent
expressions. Again, the mathematical clustering defined more groups in the mid- and low-ranges
than the psychological clustering. The psychological clustering seems a bit stretched in putting "a
lot" in the same group as "a moderate amount." It is harder to quibble with the psychological
group that includes expressions "34" through "43."
In the mathematical clustering, the singleton group containing "a limited amount" should be
dropped. It is another one of those fuzzy expressions that is hard to define clearly, since all
amounts other than "all" are "limited" amounts. Where the mathematical and psychological
groups do not correspond, it would appear that the mathematical grouping at the upper end of the
scaled list is superior, but the psychological grouping at the lower end is superior.
It would appear that there is sufficient evidence that the mathematical clustering procedure used
in this study does a reasonably good job of clustering expressions of frequency and amount into
groups of equivalent expressions representing psychologically real levels of frequency and
amount. As time permits, a more precise validation study is planned to perform a statistical test
of correspondence between the mathematical and the psychological clustering solutions. This test
will consist of a t-test between the mean correlations (as represented by Fisher Z's) of the 42
respondents' individual groupings with the mathematically defined groups and the psychologically
defined groups. Our hypothesis is that the mean correlations for the mathematical groups will be
lower, but not significantly lower, than the mean correlations for the psychological groups.
Perhaps, too, the study will be replicated after removing all ambiguous and controversially
weighted expressions from the two lists.
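The planned test could be sketched as follows, under stated assumptions. Each respondent is assumed to yield two correlations, one with the mathematically defined groups and one with the psychologically defined groups; the way those correlations are computed is not specified here and is taken as given. Because the same respondents supply both values, the sketch uses a paired t statistic, which is an assumption about the design. The correlations shown are invented for illustration and are not study data.

```python
import math
import statistics

def fisher_z(r):
    """Fisher's r-to-Z transformation, equal to the inverse hyperbolic tangent."""
    return math.atanh(r)

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical correlations for five respondents (the study would have 42).
r_math = [0.70, 0.65, 0.80, 0.72, 0.60]
r_psych = [0.75, 0.70, 0.78, 0.80, 0.66]

z_math = [fisher_z(r) for r in r_math]
z_psych = [fisher_z(r) for r in r_psych]

t_stat, df = paired_t(z_math, z_psych)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A negative t statistic indicates lower mean correlations for the mathematical groups; the hypothesis stated above corresponds to that statistic failing to reach significance.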
Albert, W.G., Phalen, W.J., Selander, D.M., Dittmar, M.J., Tucker, D.L., Hand, D.K., Weissmuller, J.J., & Rouse, I.F.
(1994, October). Large-scale laboratory test of occupational survey software and scaling procedures. In the
symposium, Bennett, W. Jr., Chair, Training needs assessment and occupational measurement: Advances from
recent research. Proceedings of the 36th Annual Conference of the International Military Testing Association.
Rotterdam, The Netherlands: European Members of the IMTA.
Bass, B.M., Cascio, W.F., & O'Connor, E.J. (1974). Magnitude estimations of expressions of frequency and amount.
Journal of Applied Psychology, 59, 313-320.
Yadrick, R.M., Phalen, W.J., Albert, W.G., Dittmar, M.J., Weissmuller, J.J., & Hand, D.K. (1994, October). Clustering
of magnitude estimations of frequency and amount of time. Proceedings of the 36th Annual Conference of the
International Military Testing Association. Rotterdam, The Netherlands: European Members of the IMTA.
Yadrick, R.M., Phalen, W.J., Albert, W.G., Dittmar, M.J., Weissmuller, J.J., & Hand, D.K. (1993, November).
Magnitude estimations of frequency and amount of time. Paper presented at the 35th Annual Conference of the
International Military Testing Association. Williamsburg, VA: U.S. Coast Guard.