Opportunities and Challenges for Evaluating Distributed Learning Environments
Winston Bennett, Jr.
Robert M. Yadrick
Technical Training Research Division
Armstrong Laboratory Human Resources Directorate
Brooks AFB TX USA
Theresa McNelly
Department of Psychology
Texas A&M University
College Station, TX USA
INTRODUCTION
Over the past few years, the United States Air Force has identified a number of advanced technologies for automating the development and delivery of technical training. These include computer-based instruction (CBI), video teleconferencing, virtual reality, intelligent tutoring, and distance learning. Most of these technologies have been applied within the training community as part of in-residence technical training. Distance learning offers the advantage of providing technical training at the individual trainee's home base instead of requiring travel to formal training at a schoolhouse. One of the most pervasive challenges facing educators and trainers is assessing the effectiveness of training conducted using these technologies. This paper discusses our plans for evaluating two current applications, along with the practical problems and constraints involved. The first application uses distance learning to pre-train individuals before they arrive at a formal training site. In the second, a journeyman Medical Technician Career Development Course will be made available as electronic documents on the AFMSA Server for Internet access.
MENTOR 2010 EVALUATION
The MENTOR 2010 system, developed by TRW for the USAF School of Aerospace Medicine in collaboration with the Human Systems Program Office, converts some 100 hours of initial platform instruction in a Flight Nurse/Medical Technician course to 100 hours of interactive multimedia courseware on essentially a 1:1 basis. AL/HRTD is preparing to conduct a systematic comparison of the platform and interactive versions by collecting data on approximately 500 students to be trained over the next 12 months. This effort offers R&D opportunities in cost/benefit analysis as well as in-training and field evaluation of the effects of using interactive courseware rather than platform training. Ideally, the objectives of the evaluation effort would be to:
This is what we consider necessary for a thorough and rigorous evaluation of the system, and this is what we are proposing to do. Unfortunately, it may be unrealistic to expect that we can carry out this program. For example, indications are that it is impractical to assume that a sizeable number of reservists will be willing, or able even if willing, to spend 100 hours at their reserve sites on training that they are not presently required to take. When the system comes online, this training time will be required in lieu of 100 hours of classroom instruction, but we are required to conduct an evaluation before the system comes online. That is, the system will be delivered to the Air Force for final examination, testing, and approval several months before it is used for actual instruction. Nor does it appear likely that we can have some students receive the first 100 hours of instruction from the unapproved test version of MENTOR 2010 after traveling to the school to attend the course, which would be an acceptable substitute for the ideal design.
Table 1 shows the alternative evaluation designs. We would like to compare the "Ideal Group" column with the "Schoolhouse Group" column. The "Ideal Group" column indicates that students would receive the full 100 hours of CBT at their reserve duty sites, followed by a test of their domain knowledge and questionnaires concerning their perception of the effectiveness of the CBT lessons, ease of learning to use the system, general satisfaction with the training provided by the system, and the like. They would not retake the 100 initial classroom hours of instruction, so that final course outcomes could be compared equitably. In addition, we would use information about their background (e.g., education, civilian profession, years in their civilian profession, years in their Air Force career field, etc.) as covariates in evaluating interim and final course test grades and other available measures, such as washback rates for particular course topic areas, when comparing students who had or had not received the CBT.
Table 1. Basic Design of Planned and Ideal Evaluations
| PROBABLE MENTOR 2010 GROUP | SCHOOLHOUSE GROUP | IDEAL GROUP |
|---|---|---|
| CBT IN FIELD | | |
| Test knowledge and attitude | | |
| SCHOOL COURSE -- | SCHOOL COURSE -- | CBT IN FIELD |
| Test for incremental knowledge and attitude | Test knowledge and attitude | |
| BALANCE OF SCHOOL COURSE | BALANCE OF SCHOOL COURSE | BALANCE OF SCHOOL COURSE |
| Interim knowledge and other performance measures | Interim knowledge and other performance measures | Interim knowledge and other performance measures |
| Final test | Final test | Final test |
| Self confidence ratings | Self confidence ratings | Self confidence ratings |
| Field Performance Supervisor/Peer ratings | Field Performance Supervisor/Peer ratings | Field Performance Supervisor/Peer ratings |
The "Schoolhouse Group" will presumably receive the same treatment regardless of whether we eventually compare it to the "Ideal Group" or the "Probable MENTOR 2010 Group." We would simply arrange to test the groups at particular times (if tests were not already planned at those times), most importantly after they had completed the first 100 hours of instruction that make up MENTOR 2010. Attitude measures would concentrate on the adequacy and quality of instruction, while the same field performance measures described for the Ideal Group would be collected.
The "Probable MENTOR 2010 Group" column reflects the minimum evaluation that we are prepared to undertake. The "CBT IN FIELD" entry at the top of the column does not necessarily mean that all students would receive the entire 100 hours of instruction before coming to the schoolhouse, but rather that students in this group would receive as much of the CBT as possible. In addition, the amount of instruction would probably vary considerably among students. It is also unlikely that we would be able to prevent these students from attending all the classroom instruction, including that corresponding to MENTOR 2010 modules that they have already completed.
Finally, we plan to (or would plan to, in the case of the Ideal Group) follow up the end-of-course evaluation with assessments of later on-the-job performance, using peer and supervisor questionnaires and graduates' self-reports of perceived job competency in the specific skill areas trained in the course. It is possible that measures of actual performance in field training exercises would also be available later.
The problems with the probable evaluation are, of course, obvious, and the minimal evaluation is far from complete and rigorous. Our goals would therefore be quite limited. Nevertheless, we would hope to be able to ascertain whether those students who receive CBT show initial post-CBT test scores comparable to, worse than, or better than regular classroom students for the particular knowledge and skills taught in the modules they studied via CBT. In addition, we might be able to identify increments in knowledge and performance in these same specific areas once they had received both CBT and classroom training. Finally, we would probably be able to gather important information regarding the perceived completeness and effectiveness of the system's instruction, compared to the corresponding classroom instruction, for those modules that each student completed, as well as information on the usability of the system.
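As an illustration only, the comparison of CBT and classroom students with a background variable as a covariate could be sketched as an ordinary least-squares fit of test score on a group indicator plus one covariate. This is not the analysis plan of the evaluation itself: the function names, the single covariate (years in the career field), and the demonstration data below are all hypothetical, and a real analysis would use a statistics package, several covariates, and significance tests.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def adjusted_group_effect(
    scores: Sequence[float],
    received_cbt: Sequence[int],
    covariate: Sequence[float],
) -> Tuple[float, float, float]:
    """Fit score = b0 + b1 * received_cbt + b2 * covariate by least squares.

    Returns (b0, b1, b2); b1 is the CBT-vs-classroom difference in score
    after adjusting for the covariate.
    """
    rows = [[1.0, float(g), float(x)] for g, x in zip(received_cbt, covariate)]
    k = 3
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(k)]
    b0, b1, b2 = solve_linear_system(xtx, xty)
    return b0, b1, b2


# Hypothetical demonstration data (not from the study):
# 1 = received CBT, 0 = classroom only; covariate = years in career field.
DEMO_RECEIVED_CBT = [0, 1, 0, 1, 0, 1]
DEMO_YEARS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
DEMO_SCORES = [62.0, 69.0, 66.0, 73.0, 70.0, 77.0]

if __name__ == "__main__":
    b0, b1, b2 = adjusted_group_effect(DEMO_SCORES, DEMO_RECEIVED_CBT, DEMO_YEARS)
    print(f"adjusted CBT effect on score: {b1:.2f}")
```

In this sketch, the coefficient on the group indicator is the quantity of interest: a value near zero would correspond to "comparable" post-CBT scores, while clearly negative or positive values would correspond to "worse" or "better" scores once background differences are taken into account.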
MEDICAL TECHNICAL TRAINING NEEDS ASSESSMENT AND EVALUATION
The goal of this effort is to develop, implement, and validate a process-oriented approach to needs assessment, instructional design and delivery, and evaluation of distance training. There are three major steps in this process. The first step involves needs assessment, while the second involves the design, development, and delivery of interactive courseware (ICW), along with the collection of data on the time and cost to produce the ICW. All volumes of the 5-level 4N0X1 CDC (that is, the journeyman Medical Technician Career Development Course) will be made available as electronic documents on the AFMSA Server for Internet access. We also plan to use the Server and the electronic CDC to try out various approaches to, and levels of, instruction.
In addition, we are examining medical training and logistic support initiatives underway in the Air Force (AF), Department of Defense, and private industry to determine what training content and delivery methods are presently being used for specific content. Examples of targets of opportunity that we are aware of include a neurophysiological interactive training course developed at Sheppard AFB; MERLIN, developed at the Jackson Foundation and used extensively for triage training; MMT&E, developed at the University of Texas at Arlington and used for practice and evaluation of performance in mobility exercises in an automated medical battlefield arena; and the Joint-Service telemedicine program for surgical procedures.
The third step, training evaluation, is the focus of the present paper. The evaluation step involves developing and fielding measures that are tied to the learning objectives of the courseware. Specifically, we plan to develop measures of: (a) learning in training; (b) attitudes toward interactive courseware, motivation to train, perceptions of the quality of training, ease of use of the training, and ease of access to the training using available computer equipment in the schoolhouse, at field locations, and potentially at home; and (c) knowledge, skill, and actual situation-based job performance. Measures will be developed, pilot tested, refined, and implemented in field administrations of the interactive courseware. Although the demonstration effort concludes in February 1997, we expect to continue throughout the remainder of 1997 to assess knowledge and skill retention and decay based on a variety of variables and real-world constraints on the amount of skill practice (and related exercises) trainees receive after training. We hope to use the longer-term performance information to develop guidelines for refresher training content and intervals for medical technicians. For this purpose we are testing a number of innovative approaches to field performance measurement, including measuring the links between a variety of criteria for assessing the overall impact of interactive courseware for (a) primary skills training and (b) refresher training and proficiency training.
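To make the link between retention data and refresher intervals concrete, the following is a minimal sketch under a strong assumption that is ours for illustration, not part of the effort described above: that knowledge test scores decay exponentially with time since training. The function names, the proficiency threshold, and the sample scores are all hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_exponential_decay(
    months: Sequence[float], scores: Sequence[float]
) -> Tuple[float, float]:
    """Fit score(t) = s0 * exp(-k * t) by least squares on ln(score).

    Returns (s0, k). All scores must be positive.
    """
    n = len(months)
    logs = [math.log(s) for s in scores]
    mean_t = sum(months) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in months)
    if sxx == 0:
        raise ValueError("need at least two distinct measurement times")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(months, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


def months_until_threshold(s0: float, k: float, threshold: float) -> float:
    """Time at which the fitted curve falls to the threshold score."""
    if threshold >= s0:
        return 0.0  # already at or below the threshold at end of training
    if k <= 0:
        return math.inf  # no measurable decay under this model
    return math.log(s0 / threshold) / k


# Hypothetical follow-up scores (percent correct) at months after training.
DEMO_MONTHS = [0.0, 3.0, 6.0, 12.0]
DEMO_SCORES = [88.0, 80.0, 71.0, 58.0]

if __name__ == "__main__":
    s0, k = fit_exponential_decay(DEMO_MONTHS, DEMO_SCORES)
    interval = months_until_threshold(s0, k, threshold=70.0)
    print(f"suggested refresher interval: {interval:.1f} months")
```

Under this assumed model, the suggested refresher interval is simply the time at which the fitted curve crosses a chosen proficiency threshold. Other decay models, or separate curves for groups with different amounts of post-training practice, could be substituted without changing the overall approach.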
We will use the demonstration as an opportunity to examine the extent to which existing hardware, software, and network capabilities are able to support exportable training over the Internet and via CD-ROM to AF users. We hope to demonstrate that a basic infrastructure to support such training is in place and to provide recommendations and specifications for leveraging the infrastructure for near term training development and delivery, and for enhancements and increased bandwidth requirements for longer term sustainability.
We also expect to conduct demonstrations and field studies in support of the AF Reserve and Guard community and to demonstrate potential training benefits to the Active Duty force as well. The process we have developed for this effort should serve as an exemplar for future training development initiatives.