Jimmy L. Mitchell, Institute for Job & Occupational Analysis,
San Antonio, TX
Johnny J. Weissmuller, Metrica, Inc., San Antonio, TX
Winston Bennett, Jr., Air Force Research Laboratory, Mesa, AZ
Abstract
As part of the research and development of prototype internet survey software capable of handling typical occupational analysis (OA) task lists, research was conducted to explore the possible impacts of the new survey form (internet) on data quality and reliability (Stanton, 1998). The objectives of this research were to find new ways to enhance data collection efficiency, as well as improve data quality. Such improved data collection methods have considerable promise for use in training evaluation, occupational analysis, and as a way to expedite data for emerging decision support systems (DSSs) used in making critical manpower, personnel, and training (MPT) decisions (Bennett, Sego, Teachout & Phalen, 1994; Mitchell, Yadrick, & Bennett, 1993).
Since a Behavioral Scientist (AFS 3BSX1b) survey was recently completed by the Air Force Occupational Measurement Squadron using disk-based surveying technology (OASurv), this officer specialty was selected for an experimental study to test the equivalency of Internet surveying. The experiment had all respondents rate major duty areas in terms of which duties they perform and then presented the tasks to be rated either in inventory order (the traditional way) or in descending order of rated importance (omitting those rated as not performed). The latter presentation order should control for the effects of fatigue on long surveys (i.e., those with many tasks) and should ensure that all important duties are rated first. Those rated last would be the tasks that few people perform or that involve trivial amounts of job time (Weissmuller & Mitchell, 1998). Theoretically, this tailored survey should yield more reliable information and should take considerably less time to complete, thus minimizing the amount of job time invested in survey data collection. Such an approach was used successfully in the recent U.S. Army Enlisted Common Soldiers Task survey to cope with a 900-item task list administered to about 20,000 soldiers.
Experimental Design
Key questions to be answered in this experiment were as follows:
Will Presentation Order Impact Group Job Description?
Less Time to Complete Survey?
Can the experiment be done quickly using an Internet Survey?
To address these issues, the GenSurv version of the Behavioral Scientist survey was developed to include a random assignment of cases to two different survey forms, one in traditional "Lock Step" inventory order, and one based on descending order of rated importance of duties in the present job (i.e., "how much is this duty a part of your job?"). In both forms, the incumbent was asked to rate the tasks in terms of relative time spent (RTS), which is the usual scale used in the Air Force occupational survey program. Additionally, some questions were included at the end of the normal survey to assess the attitudes of survey respondents toward the survey in terms of what percentage of their present job is covered by the tasks they rated and how long it took to complete the survey. Provision was also made for survey takers to write in any comments they wanted about the survey and the experimental process.
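The two presentation orders can be illustrated with a minimal Python sketch. This is not the GenSurv implementation, which the paper does not describe; the function names, the form labels, and the sample duties and ratings are all hypothetical. It assumes that the inventory-order form shows every task of every duty, and that the duty-importance form drops duties rated 0 and presents the rest from highest to lowest rating.

```python
import random

def assign_form(rng):
    """Randomly assign a respondent to one of the two survey forms (labels are hypothetical)."""
    return rng.choice(["T1_inventory_order", "T2_duty_importance_order"])

def presentation_order(form, duty_ratings, tasks_by_duty):
    """Return task ids in the order they would be shown to the respondent.

    duty_ratings: duty id -> importance rating (0 means "not performed").
    tasks_by_duty: duty id -> list of task ids, stored in inventory order.
    """
    duties = list(tasks_by_duty)  # dicts preserve insertion (inventory) order
    if form == "T2_duty_importance_order":
        # Omit duties rated 0, then sort the rest by descending rating.
        # sorted() is stable, so tied duties keep their inventory order.
        performed = [d for d in duties if duty_ratings.get(d, 0) > 0]
        duties = sorted(performed, key=lambda d: duty_ratings[d], reverse=True)
    return [task for d in duties for task in tasks_by_duty[d]]

# Hypothetical three-duty inventory and one respondent's duty ratings.
tasks_by_duty = {"A": ["A1", "A2"], "B": ["B1"], "C": ["C1", "C2"]}
duty_ratings = {"A": 2, "B": 0, "C": 5}

form = assign_form(random.Random(1999))
print(form, presentation_order(form, duty_ratings, tasks_by_duty))
```

Under these assumptions, a respondent on the inventory-order form sees all five tasks, while the same ratings on the duty-importance form yield only the tasks of duties C and A, in that order.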
Sample
Since this was the first operational data collection using GenSurv, we felt that we should use the study as a pilot test of the software. Thus, the members of the Air Force Occupational Measurement Squadron were ideal subjects for the experiment; they are now the largest single concentration of Behavioral Scientists (military and civilian) in the Air Force. A side benefit would be to let AFOMS members become familiar with the software, which might be used in future Air Force internet surveys. Thus, we included provision for them to record any suggestions they might have about what should be in military internet surveys.
An initial version of the Behavioral Scientist internet survey was developed in early February 1999, and a number of scientists were invited to assess and use the survey and to provide feedback and suggestions for its possible operational use. Many of their suggestions were incorporated in the final version of the survey, which was posted to the web on April 15th, 1999; AFOMS was then notified that it was ready to begin data collection. All AFOMS personnel were given a website address and password, and asked to complete the survey as quickly as possible. After some initial difficulty gaining access via a webpage that explained the purpose of the survey (our IJOA website was down the first weekend), survey administration proceeded quickly and was terminated about May 15 to permit data analysis to begin. The final sample included 52 military and civilian Behavioral Scientists, most from AFOMS. Thirty-one completed the survey in inventory order (Treatment 1, T1) and 21 completed it in duty importance order (T2). The difference in these numbers reflects some additional cases (10 or more) in which individuals had begun the survey but not completed it, and a few who obviously were just playing with the software. The final sample did include some individuals in one-of-a-kind jobs as well as a few supervisors not involved in technical work.
Data Analysis
One analysis using the total sample examined responses to questions about the time it took to complete the survey. In one question, survey respondents were asked their degree of agreement with the statement that the "Time was reasonable" to complete the survey.
Table 1. Time Reasonable Analysis (7-point scale)
A second question asked, "What percent of your job is covered by the tasks in this inventory?" Results were as shown below:
Table 2. Percent of Job Covered
Analysis of Group Job Descriptions
For this analysis, only people with a job title of Occupational Analyst were included; this makes for a more equitable analysis of the equivalency of job descriptions. The group sizes were: T1 (Inventory Order) n = 10 cases, and T2 (Duty Importance) n = 8 cases. The results of this analysis were as follows:
Table 3. Group Comparisons
Group Comparison Statistics T1 T2
Note that while the T2 (Duty Importance) group rated more tasks on the average and had more Core Tasks, the two groups share 38 tasks and each has a high overlap with the core tasks of the other group. Thus, those who took the shorter survey in descending order of duty importance actually rated more tasks (even though they did not deal with the tasks of the duty areas rated 0), while those who saw every task of every duty rated fewer tasks as requiring some time in their job.
A review of the actual job descriptions for the two groups revealed that all of the top 30 tasks in both groups were the same, though there was minor variance in the order in which tasks were displayed (descending order of RTS), as might be expected in groups this small. Our conclusion from this analysis was that the job descriptions were equivalent. To quantify the degree of agreement within each group, atCODAP GRPRELs were run on both groups. Results are shown below:
Table 4. GRPREL Results
Statistic T1 T2
r1,1 .2731 .3621
rk,k .6200 .6958
r30,30 .9185 .9445
The average interrater agreement (r1,1) is well above the normal expected level of .20, but with groups of this size, the extrapolated level of agreement (rk,k) cannot reach the optimum level of .90. With samples of this size, where the objective is to compare two groups, a more reasonable extrapolation is to calculate the value that would result if both groups were n = 30 (r30,30). These calculated values both exceed the optimum level of .90. Values for the T2 (rated importance) group are consistently higher, even though that group is smaller. This may be the result of one occupational analyst in T1 also performing supervisory tasks.
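The extrapolation to 30 raters can be checked with a short Python sketch. The paper does not say which formula GRPREL applies; the sketch assumes the standard Spearman-Brown prophecy formula, r_kk = k * r11 / (1 + (k - 1) * r11), which reproduces the two reported r30,30 values from the two reported r1,1 values. The function name is ours, not a GRPREL routine. The rk,k row is not reproduced here, because the source does not state what value of k GRPREL used for each group.

```python
def spearman_brown(r11, k):
    """Projected reliability of the mean of k raters, given single-rater agreement r11.

    Assumption: standard Spearman-Brown prophecy formula; the source does not
    confirm that GRPREL computes its extrapolation this way.
    """
    return k * r11 / (1 + (k - 1) * r11)

# r1,1 values as reported in Table 4 (first and second data columns).
reported_r11 = {"first column": 0.2731, "second column": 0.3621}

for label, r11 in reported_r11.items():
    print(label, round(spearman_brown(r11, 30), 4))
# prints: first column 0.9185
#         second column 0.9445
```

The formula also shows why small groups cannot reach .90: with r1,1 near .27, roughly 24 raters are needed before the projected value crosses that level.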
Actual Time for Survey
In the initial analysis, participants' perceptions of the reasonableness of time to complete the survey were examined, but it is also possible to examine the server log and determine how long each person actually spent on the system doing the survey. Again, the issue is to compare the two groups and assess their difference. Results of the analysis are shown below:
Inventory Order (T1) 39.00 39.0 12.35
Rated Duty Order (T2) 35.12 36.5 14.36
The group difference, and the potential for time savings, would undoubtedly be greater for specialties with much longer task lists, which is where the "fatigue" issue usually arises. To assess such potential time savings, this study needs to be repeated with one or two specialties that have much longer task lists.
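For a rough sense of scale, the size of the observed difference can be worked out in a few lines of Python. The column headings of the timing table did not survive, so this sketch assumes that the first numeric column for each group is the mean completion time; that reading, the unstated time unit, and the function name are all assumptions rather than facts from the source.

```python
def relative_saving(baseline, tailored):
    """Fraction of the baseline time saved by the tailored administration."""
    return (baseline - tailored) / baseline

# Assumption: first numeric column of the timing table is the group mean.
t1_mean = 39.00  # Inventory Order (T1)
t2_mean = 35.12  # Rated Duty Order (T2)

saving = t1_mean - t2_mean
print(f"difference: {saving:.2f}")
print(f"relative saving: {100 * relative_saving(t1_mean, t2_mean):.1f}%")
# prints: difference: 3.88
#         relative saving: 9.9%
```

Given the group sizes and the spread suggested by the third column, a difference of this size is best read as a trend rather than an established effect.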
Write In Comments
As noted earlier, provision was made for survey respondents to comment on the survey and make suggestions on how to improve the internet survey process. Table 6 summarizes analysis of these comments.
Table 6. Write-In Comments
Generally positive - most like the idea of web surveys
Several suggested a need to see the tasks comprising a duty before they would be comfortable rating
Overall, the comments were highly positive about the use of the internet to conduct occupational analysis surveys. Several individuals were very enthusiastic about the possibility of the Air Force (and military services in general) collecting needed information in this way. A number of AFOMS occupational analysts found the idea of rating the importance of duty areas to be an unfamiliar activity or, for those who received the Inventory Order administration, could not understand its purpose. One suggestion, made by several respondents, was that "look up" tables of the tasks comprising a duty should be available to ensure better understanding of the sometimes generalized duty titles. Respondents also did not hesitate to point out minor errors, particularly misspelled words or obsolete acronyms.
Conclusions
This project was a successful employment of the new GenSurv internet survey technology to collect experimental data quickly (one month administration) in an attempt to answer some questions about how surveys might better be administered. Answers to such questions were not as definitive or complete as we might like, but a trend of saving time with the tailored administration approach was visible. Findings tended to be in the expected direction, and were sufficient to raise the expectation that more substantial savings of job incumbent time to take surveys, reduction in project data collection time, and improvement of data quality are possible, particularly with longer, more complex task lists. We do believe that the data support our expectation that ordered presentation (by duty importance) is a reasonable approach to combat rater fatigue with long lists. As with most research projects, we believe that additional research is needed to pursue these issues, and hope to have the opportunity for such research during on-going GenSurv projects. We anticipate at least four operational GenSurv military projects will be on-line during the coming year.
Perhaps the most dramatic result of this project is that the entire survey process began in February with a draft inventory and results are available to report in this late May forum - a four-month process. Data collection began on 15 April and closed 15 May - just one month. These timelines document some of the major strengths of GenSurv and the Internet: data can be collected quickly and efficiently while eliminating paper-and-pencil forms and computer diskettes as well as mailing costs (and time). While GenSurv is still a prototype system, it is clearly ready for operational field surveying where studies need very rapid response time and where data are needed very quickly.
References
Mitchell, J.L., Yadrick, R.M., & Bennett, W.R. (1993). Estimating training requirements from job and training models. Military Psychology, 5(1), 1-20.
Stanton, J.M. (1998). An empirical assessment of data collection using the internet. Personnel Psychology, 51, 709-725.
Weissmuller, J.J., & Mitchell, J.L. (1998, October). Self-prioritizing inventory administration to maximize validity. In the Symposium, J.L. Mitchell & J.S. Tartell, co-chairs, Evaluating Innovations in Training Assessment & Occupational Modeling Technology, Proceedings of the 40th Annual Conference of the International Military Testing Association. Pensacola Beach, FL: Naval Education and Training Professional Development and Technology Center (available at http://www.internationalmta.org ).