How to measure learning

Deputy Vice-Chancellor Belinda Probert

First published in The Australian on 25 August 2010.

The new Tertiary Education Quality and Standards Agency will not be fully operational until 2012.

Understandably, universities want to make sure it will focus adequate attention on the risky bits of the industry while not strangling it with red tape.

But perhaps more significant in the longer run will be the way it implements one of the most radical recommendations from the Bradley review, namely that universities report on direct measures of learning outcomes.

Earlier attempts to measure the quality of university teaching relied on indicators that had little research-based validity, leading to rankings that were uniformly rejected by the sector.

Six months ago the Bradley-inspired discussion paper on performance indicators from the Department of Education, Employment and Workplace Relations proposed that measures of cognitive learning outcomes would ideally include discipline-specific measures as well as measures of the higher-order generic skills, such as communication and problem-solving, so valued by employers.

As recently suggested by Richard James, from the Centre for the Study of Higher Education at the University of Melbourne, the public has a right to know not just whether groups of graduates met a threshold standard but also whether their skills were rated good or excellent (HES, July 7).

The difficulty with his seemingly sensible suggestion is that there is almost no data on what students are actually learning.

Even the toughest accreditation criteria focus on inputs such as hours in class, words written, content covered, credit points earned and the status of teachers.
Tools such as the Course Experience Questionnaire and the increasingly popular Australasian Survey of Student Engagement provide data that can be used to good effect by academics with a serious interest in pedagogy. None of these measures learning, however.

Nearly every Australian university proclaims a set of graduate attributes that includes communication, problem-solving and teamwork.

But none defines the standards to be achieved or the method by which they will be assessed, despite pilgrimages to Alverno, the tiny private US college that knows how to do this.

And it would probably be unwise to hold our collective breath until the Organisation for Economic Co-operation and Development completes its Assessment of Higher Education Learning Outcomes feasibility study.

Does the absence of agreed measures and standards mean TEQSA should abandon this key Bradley recommendation and resort to input measures of the kind used to allocate the Learning and Teaching Performance Fund, together with some kind of graduate skills test?

If we agree with Bradley that learning is what we should be measuring, then what we have called Design for Learning at La Trobe University may be of help. Like most universities we have agreed on six graduate capabilities that all undergraduate programs should develop. But we also have agreed they will be defined in appropriate discipline or field-specific terms and be assessed against agreed standards of student achievement.

To develop these explicit standards of achievement, academic staff in each faculty are looking at real examples of student work, to define not just the standards but the indicators, measures and procedures for gathering and evaluating evidence of student learning. This is relatively straightforward for writing or quantitative reasoning, but it is not so easy when it comes to problem solving or teamwork, which may look rather different for physiotherapists and engineers.

We are not asking for spurious degrees of fine judgment (is this worth 64 or 65 marks?), but for robust definitions that allow an evidence-based, university-wide judgment that the student has produced work that is good enough, better than good enough or not yet good enough.

If we expect students to demonstrate these capabilities at graduation, then we also have a responsibility to show where, in any particular course of study, they are introduced, developed, assessed and evaluated.

Most such capabilities require development across several years and are not skills that can be picked up in a single subject. Nor is there any point telling students they are not good enough if you cannot show them where and when they will have the opportunity to improve their capabilities.

For these reasons we need to be able to assess and provide feedback early in the course (in a cornerstone experience), again somewhere towards the middle, and at the end, in a capstone experience.

It would be a lost opportunity and a backward step if TEQSA concludes that measuring student learning is too difficult and resorts to the suggested generic graduate skills assessment test, which measures little of value about what students have learned, or relies on students' assessments of their generic skills as captured in the CEQ. Students' assessments of their own capabilities are no substitute for the skilled independent assessment, against explicit standards, of academic staff.

Would it not be better if TEQSA gave universities the opportunity to develop explicit, not minimum, standards for student learning, defined through their chosen institutional graduate capabilities?

Such a first step also would provide the foundation for setting measurable targets for improving this learning, and would support the government's goal of encouraging diversity of institutional mission by requiring explicitness not only of purpose but also of standards.

Having defined and mapped where La Trobe's capabilities are developed and assessed across the curriculum, we expect to be able to set targets for improvement, such as increasing the percentage of our graduates who meet the better-than-good-enough standard by improving the design of particular programs of study.

Or we may plan to raise the bar for what constitutes good enough by evaluating and revising parts of the curriculum.

In a diversified sector the standards chosen will vary from university to university but, once developed, the potential for benchmarking is obvious.