The Development of a General Measure of Performance
Travis C. Tubre, Winfred Arthur, Jr., Don S. Paul
Department of Psychology, Texas A&M University
Winston Bennett, Jr.
Human Resources Directorate, Armstrong Laboratory, Brooks AFB
The U.S. military has invested considerable resources in developing approaches to measuring job performance. These approaches have typically been expensive to develop and time-consuming to administer. In addition, considerable information about specific job content is often required to develop performance measures using these approaches. This paper describes recent research activities related to the development of a general measure of performance based on recent conceptualizations of the structure of performance which assert that aspects of performance generalize across different jobs. One appealing aspect of such models lies in the ability to develop approaches to measuring and predicting performance which are useful across a broad range of jobs. To date, these models have generally been examined at the conceptual level only and have rarely been empirically tested. The present paper describes the development of a core set of items which could be used to (1) empirically test various latent factor models of performance and (2) form the basis for a measure that could be used to obtain general job performance criterion data for a variety of uses (e.g., test validation, program evaluation). Results from the work to date and plans for future activities are highlighted and discussed.
Background
Job performance is, perhaps, the most important construct in industrial and organizational (I/O) psychology and human resource management (HRM). However, despite its importance, relatively little is known about the latent structure of performance. Indeed, many authors (e.g., Campbell, 1990a, 1990c) have noted that of the parameters in the classic prediction model, performance has been the most ignored. As noted by Viswesvaran (1993), very few efforts have been directed toward developing generalizable models of performance. It has typically been assumed that what constitutes performance differs from job to job. As a result, researchers have used countless measures as indicators of performance.
More recently, however, researchers (e.g., Borman & Motowidlo, 1993; Campbell, 1990a; Campbell, McCloy, Oppler, & Sager, 1993) have developed theories of job performance which posit that some latent performance dimensions generalize across a broad range of jobs. For instance, Campbell (1990a) asserts that core task proficiency, demonstrating effort, and the maintenance of personal discipline are components of every job. Models that posit the existence of core sets of performance dimensions which exist across a broad range of jobs are appealing for a number of reasons. First, as noted by Campbell (1990c), theory building is becoming an increasingly important component of research in I/O psychology. Since job performance is arguably the most important construct in our domain, a more complete understanding of its structure is a necessity (Viswesvaran, 1993). Second, if substantiated, such models could provide the basis for developing approaches to measuring and predicting performance which are useful across a variety of jobs. The latter proposition is a primary focus of the present study.
More specifically, the first step in the present study is to empirically test competing conceptualizations of the latent structure of performance. To date, these models have generally been examined at the conceptual level only and have rarely been empirically tested (Campbell, 1990a; Campbell et al., 1993). This paper describes the development of and conceptual basis for an item pool which will be used for this purpose. If performance components which generalize across jobs can be identified, the item pool could also form the basis for a measure that could be used to obtain general job performance criterion data for a variety of uses (e.g., test validation, program evaluation) across a broad range of jobs. Such an instrument could have tremendous utility by reducing the resource demands associated with gathering criterion data.
The Latent Structure of Performance
Viswesvaran (1993) provides an excellent comprehensive review of historical developments in the conceptualization of job performance. As he notes, the literature examining the structure of job performance is fragmented and incomplete. Early conceptualizations (e.g., Brogden & Taylor, 1950) focused largely on the economic value of individual behaviors to the organization. With the emergence of the literature on expectancy theory, many researchers began to focus on measures that reflected the effort expenditure and productivity of workers (Viswesvaran, 1993). In the 1970s and 1980s research on prosocial and organizational citizenship behaviors proliferated (e.g., Bateman & Organ, 1983; Smith, Organ, & Near, 1983). This resulted in the introduction of a variety of criterion measures such as teamwork and altruism. Finally, in recent years, the impact of counterproductive behavior in the workplace has been studied extensively (e.g., Collins, 1996; Ones, Viswesvaran, & Schmidt, 1993). This literature has yielded a number of criterion measures related to honesty and integrity in the workplace.
Campbell (1990a; Campbell et al., 1993) provided one of the first large-scale attempts to integrate the numerous dimensions of performance into a comprehensive model. According to Campbell, the latent structure of job performance can be modeled using the following eight general factors: (1) job-specific task proficiency, (2) non-job-specific task proficiency, (3) written and oral communication, (4) demonstrating effort, (5) maintaining personal discipline, (6) facilitating peer and team performance, (7) supervision/leadership, and (8) management/administration. According to Campbell (1990a; Campbell et al., 1993), these eight factors represent the highest-order factors that can be useful for describing performance in every job in the occupational domain, although some factors may not be relevant for all jobs. As mentioned previously, he contends that core task proficiency, demonstrating effort, and maintaining personal discipline are important components of performance in every job. While this model represents one of the most comprehensive treatments of the latent structure of job performance currently available, it has rarely been empirically tested. In fact, Campbell et al. (1993, p. 49) admit that direct evidence in support of the model is sparse. In response, they call for future construct validation efforts to test the adequacy of the eight-factor model.
Similarly, Borman and Motowidlo (1993) outlined the conceptual basis for expanding the criterion domain beyond task performance to include elements of contextual performance. Drawing from the literature on organizational citizenship behavior (Barnard, 1938; Smith et al., 1983), prosocial organizational behavior (Brief & Motowidlo, 1986; Organ, 1988), and findings from Project A (Campbell, 1990b), Borman and Motowidlo (1993) described the structure of contextual performance (Viswesvaran, 1993). Within this framework, contextual performance is defined as behaviors that support the broad organizational, social, and psychological environment of the organization in contrast to behaviors that support the organization's technical core (Borman & Motowidlo, 1993). Contextual performance is further distinguished from task performance in that it is typically more discretionary as opposed to role prescribed. The authors describe five categories of contextual performance as follows: (1) volunteering to carry out task activities that are not formally part of the job, (2) persisting with extra enthusiasm when necessary, (3) helping and cooperating with others, (4) following organizational rules and procedures, and (5) endorsing, supporting, and defending organizational objectives.
As with Campbell's (1990a) model of performance, much remains to be accomplished with regard to providing empirical evidence for the adequacy of the task versus contextual performance distinctions. However, the model proposed by Borman and Motowidlo (1993) has recently received empirical support. Motowidlo and Van Scotter (1994) demonstrated that task and contextual performance contributed independently to overall performance in a sample of 421 U.S. Air Force mechanics. Further, their findings suggested that job experience was more highly correlated with task performance than with contextual performance, and personality variables (e.g., dependability) were more predictive of contextual performance than of task performance. These findings are logically consistent with Borman and Motowidlo's (1993) description of task and contextual performance dimensions. That is, within their framework, variation in task performance is posited to reflect individual differences in the proficiency with which task activities are carried out. Thus, individual differences in the knowledge, skills, and abilities associated with a given task should be more predictive of task performance than personality characteristics. Additionally, experience and training performance should be more highly correlated with task performance (Motowidlo & Van Scotter, 1994). Conversely, behaviors such as cooperation, persistence, and compliance would likely be more strongly related to personality variables than to experience, training performance, or ability.
While somewhat similar in their treatment of the criterion domain, neither the model proposed by Campbell (1990a) nor that proposed by Borman and Motowidlo (1993) fully examines the possibility of a general performance factor at the highest level of a hierarchical structure. In fact, as noted previously, Campbell (1990a) explicitly argues that his eight factors describe the highest-order latent variables that can usefully describe performance. In contrast, a model proposed by Viswesvaran (1993) posits the existence of a strong general performance factor which explains substantial variation in virtually all measures of job performance that have appeared in the literature.
Using meta-analytic techniques, Viswesvaran (1993) cumulated studies reporting correlations between various measures of job performance. Next, he grouped the large number of measures into 25 conceptually distinct categories (e.g., quality of performance, communication skills, compliance and acceptance of authority). Based on an extensive literature review, he identified five themes which captured the vast number of performance measures utilized in the literature and sorted the 25 measures into these groups. The groups he utilized are as follows: (1) productivity, (2) conscientiousness, (3) interpersonal skills, (4) withdrawal (e.g., absenteeism, turnover), and (5) measures of overall job performance. Finally, he tested the adequacy of a three-level hierarchical model of job performance with a general performance factor at the highest level, the five group factors at the second level, and the 25 categories of performance measurements at the lowest level. His results indicated a positive manifold of true score correlations among the 25 performance dimensions (Viswesvaran, 1993). In addition, the three-level hierarchical model provided a better fit to the data than a two-level hierarchical model in which the 25 dimensions were posited to load on a general factor.
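The notion of a positive manifold can be made concrete with a short sketch. The Python fragment below is illustrative only: the function name has_positive_manifold and the 4 x 4 matrix are assumptions introduced here, and the values are invented rather than taken from Viswesvaran (1993). It simply checks whether every correlation between distinct measures is greater than zero, which is the pattern a general factor would be expected to produce.

```python
from itertools import combinations


def has_positive_manifold(corr):
    """Return True if every off-diagonal correlation is strictly positive.

    `corr` is assumed to be a square, symmetric correlation matrix given as
    a list of lists, so only the upper triangle is inspected.
    """
    n = len(corr)
    return all(corr[i][j] > 0 for i, j in combinations(range(n), 2))


# Hypothetical correlations among four performance measures
# (illustrative values only, not results from any study).
example = [
    [1.00, 0.45, 0.30, 0.25],
    [0.45, 1.00, 0.35, 0.20],
    [0.30, 0.35, 1.00, 0.40],
    [0.25, 0.20, 0.40, 1.00],
]

print(has_positive_manifold(example))
```

A positive manifold is consistent with a general factor but does not by itself establish one; comparing the fit of competing hierarchical models, as described above, is still required.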
The review of the literature on the factor structure of performance provided in this paper indicates that no clear consensus exists concerning the structure of the criterion domain. However, models such as those provided by Campbell (1990a), Borman and Motowidlo (1993), and Viswesvaran (1993) represent a much needed foundation in the development of comprehensive theories of work performance. The next logical step in this process is to empirically test the adequacy of these competing conceptualizations. It is this process which is the central focus of the present study.
Methodology
Testing competing theories concerning the latent structure of job performance necessarily requires the development of an item pool which is representative of the various performance dimensions presented in the competing theories. The development of this item pool represents the initial step in the present study. Based on an extensive review of such sources as published articles, books, and a variety of instruments designed to measure various dimensions of performance, a 125-item scale was constructed to capture all of the generalizable dimensions of performance specified in the previously mentioned theories. To ensure accuracy, each stage in the item development process involved the collaborative efforts of two graduate students, a senior faculty member at a large research institution, and a senior research scientist from the U.S. Air Force. Approximately 500 items describing performance in a broad variety of jobs were extracted and modified from sources such as those listed above. Next, each item was examined within the framework of the competing models identified previously. This was done to ensure that every dimension presented in the models was represented by a subset of items. Next, items that were vague or unclear were removed from the pool. Following this, items whose content was extremely similar to other items were removed from the pool. This process was repeated several times to reach consensus among the participants in the process.
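The coverage requirement described above, that every dimension be represented by a subset of items, can be sketched as a simple bookkeeping check. The fragment below is a hypothetical illustration and not the procedure actually used, which relied on the judgment of the four participants. The function uncovered_dimensions, the item wordings, and the dimension assignments are all assumptions introduced for this example. The list uses seven of Campbell's (1990a) eight factors because the general scale excludes job-specific task proficiency.

```python
# Seven of Campbell's (1990a) eight factors; job-specific task proficiency
# is left out because the general scale excludes job-specific items.
GENERAL_FACTORS = [
    "non-job-specific task proficiency",
    "written and oral communication",
    "demonstrating effort",
    "maintaining personal discipline",
    "facilitating peer and team performance",
    "supervision/leadership",
    "management/administration",
]


def uncovered_dimensions(item_dimensions, required):
    """Return the required dimensions that no item in the pool represents.

    `item_dimensions` maps each item's text to the list of dimensions the
    raters assigned it to. The result preserves the order of `required`.
    """
    covered = set()
    for dims in item_dimensions.values():
        covered.update(dims)
    return [d for d in required if d not in covered]


# Hypothetical items and assignments (not drawn from the actual item pool).
pool = {
    "Keeps working on difficult tasks until they are finished": ["demonstrating effort"],
    "Writes clear and accurate reports": ["written and oral communication"],
    "Follows established rules and procedures": ["maintaining personal discipline"],
}

missing = uncovered_dimensions(pool, GENERAL_FACTORS)
print(missing)
```

Any dimension reported as missing would signal that further items need to be written or retained before the pool is reduced.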
The next stage of research proposed in the present paper is to empirically test the previously presented theories using the recently developed scale. This process will involve the administration of an adaptive computer-based version of the scale to a large number of U.S. Air Force personnel across a broad range of job categories. Incumbents, in selected occupations, will be asked to rate each of the scale items according to the extent to which each item would be an appropriate measure of performance in their jobs. In addition, data will be collected from supervisors who will rate the extent to which each of the scale items would be an appropriate measure of performance for their subordinates. It should be noted that the scale does not include items representing job-specific task performance. Rather, for each occupational classification included in the sample, a number of items dealing with job-specific task proficiency will be added to the scale. These items will be drawn from existing performance measurement instruments and job analysis data. In cases where this type of data is nonexistent or outdated, job analytic techniques will be employed to gather data that will be used for constructing items which measure job-specific task performance.
Following the data collection stage, factor analytic techniques will be utilized to assess the adequacy of various models describing the latent structure of job performance. This analytic process will also be used for the purpose of scale refinement. In addition, the nature of the data set will facilitate the testing of hypotheses concerning the stability of the factor structure of performance across different levels of occupations and incumbent characteristics (e.g., sex). Further, administering versions of the scale which incorporate or omit job-specific content will allow for the testing of various research questions concerning the relative impact associated with the inclusion or exclusion of job-specific content. To increase the generalizability of our findings, we will attempt to incorporate a large number of occupational classes that are analogous to jobs in the civilian sector. In addition, data will be collected from incumbents in jobs with varying degrees of technical content.
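One way to examine the stability of a factor structure across groups is to compare the loadings obtained in each group. The sketch below computes Tucker's congruence coefficient, a commonly used index of factor similarity. It is offered only as an assumption about how such a comparison might be made, since the analytic method has not yet been specified. The function name congruence and the loading values are hypothetical.

```python
import math


def congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two factor-loading vectors.

    Computed as sum(a*b) / sqrt(sum(a^2) * sum(b^2)). Values near 1 suggest
    the same factor was recovered in both groups.
    """
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(loadings_a, loadings_b))
    denominator = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    if denominator == 0:
        raise ValueError("loading vectors must not be all zeros")
    return numerator / denominator


# Hypothetical loadings of five items on one factor in two subgroups
# (illustrative values only).
group_one = [0.70, 0.65, 0.60, 0.55, 0.50]
group_two = [0.68, 0.70, 0.58, 0.50, 0.52]

print(round(congruence(group_one, group_two), 3))
```

Multi-group confirmatory factor analysis would be a more formal alternative; the coefficient above is a descriptive first look at whether loadings line up across subgroups.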
It is important to note, however, that the goals of the present study extend beyond simply conducting empirical tests of competing theories of job performance. If our initial analyses indicate that various dimensions of performance are generalizable across a broad range of jobs, the scale items could form the basis for a measure that could be used to obtain general job performance criterion data for a variety of uses (e.g., test validation, program evaluation). Depending on the nature of the criterion data of interest, the scale could be modified to include job-specific content. The U.S. military has invested considerable resources in developing and validating approaches to measuring individual and workgroup performance which are largely based on specific job content. However, these approaches have typically been expensive to develop and time-consuming to administer. Thus, a general measure of performance which could be modified to obtain general criterion data across a broad range of jobs would be of tremendous utility. The development of such an instrument is the long-term goal of our efforts thus far. However, before moving on to the next stage, questions concerning the latent structure of performance must be addressed.
As many authors (e.g., Arthur & Bennett, 1995; Viswesvaran, 1993) have noted, the existing literature on the criterion domain is fragmented and incomplete. This study represents an attempt to clarify some of the confusion concerning what constitutes successful or unsuccessful job performance. The theories presented by Campbell (1990a), Borman and Motowidlo (1993), and Viswesvaran (1993) are the groundwork upon which the present study will build. Our understanding of the structure and composition of job performance lags behind our understanding of predictors and outcomes of successful job performance. For this reason, efforts directed at identifying generalizable dimensions are particularly valuable. As noted by Viswesvaran (1993):
Developing theories of job performance for each task (or even job) will hinder the development of a general theoretical understanding of the construct of job performance. As the content generality of the dimensions increases, the value of the dimensions in developing prediction instruments and theories of work-behavior increases (p. 64).
Further, empirically identifying dimensions of performance with generalizable content represents the first step in the development of instruments that could be used to obtain general criterion data across a broad range of jobs. Such instruments could substantially reduce the resource demands of criterion measurement and expand the criterion domain to include elements of contextual performance which may be overlooked in more traditional job-specific approaches to criterion measurement.
References
Arthur, W., Jr., & Bennett, W. R., Jr. (1995). The international assignee: The relative importance of factors perceived to contribute to success. Personnel Psychology, 48, 99-114.
Barnard, C. (1938). The functions of the executive. Cambridge, MA: Harvard University Press.
Borman, W. C., & Motowidlo, S. J. (1993). Expanding the criterion domain to include elements of contextual performance. In N. Schmitt & W. C. Borman (Eds.), Personnel selection in organizations (pp. 71-98). San Francisco, CA: Jossey-Bass.
Brief, A. P., & Motowidlo, S. J. (1986). Prosocial organizational behaviors. Academy of Management Review, 11, 710-725.
Brogden, H. E., & Taylor, E. K. (1950). The dollar criterion: Applying the cost accounting concept to criterion construction. Personnel Psychology, 3, 133-154.
Campbell, J. P. (1990a). Modeling the performance prediction problem in industrial and organizational psychology. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 1, pp. 687-732). Palo Alto, CA: Consulting Psychologists Press.
Campbell, J. P. (1990b). An overview of the Army selection and classification project. Personnel Psychology, 43, 231-239.
Campbell, J. P. (1990c). The role of theory in industrial and organizational psychology. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 1, pp. 39-73). Palo Alto, CA: Consulting Psychologists Press.
Campbell, J. P., McCloy, R. A., Oppler, S. H., & Sager, C. E. (1993). A theory of performance. In N. Schmitt & W. C. Borman (Eds.), Personnel selection in organizations (pp. 35-70). San Francisco, CA: Jossey-Bass.
Collins, J. M. (1996). A narrative and empirical evaluation of the socialization trait as a predominant predictor of productive and counterproductive work behavior. Manuscript submitted for publication.
Motowidlo, S. J., & Van Scotter, J. R. (1994). Evidence that task performance should be distinguished from contextual performance. Journal of Applied Psychology, 79, 475-480.
Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Meta-analysis of integrity test validities. Journal of Applied Psychology, 78, 679-703.
Organ, D. W. (1988). Organizational citizenship behavior: The good soldier syndrome. Lexington, MA: Lexington Books.
Smith, C. A., Organ, D. W., & Near, J. P. (1983). Organizational citizenship behavior: Its nature and antecedents. Journal of Applied Psychology, 68, 653-663.
Viswesvaran, C. (1993). Modeling job performance: Is there a general factor? Unpublished doctoral dissertation, University of Iowa.