The unscience of evaluation
Evaluation is notoriously underdone in the corporate sector.
And who can blame us?
With ever-increasing pressure bearing down on L&D professionals to put out the next big fire, it’s no wonder we don’t have time to scratch ourselves before shifting our attention to something new – let alone measure what has already been and gone.
Alas, today’s working environment favours activity over outcome.
I’m not suggesting that evaluation is never done. Obviously some organisations do it more often than others, even if they don’t do it often enough.
However, a secondary concern I have with evaluation goes beyond the question of quantity: it’s a matter of quality.
As a scientist – yes, it’s true! – I’ve seen some dodgy pseudoscience in my time. From political gamesmanship to biased TV and clueless newspaper reports, our world is bombarded with insidious half-truths and false conclusions.
The trained eye recognises the flaws (sometimes) but of course, most people are not science grads. They can fall for the con surprisingly easily.
The workplace is no exception. However, I don’t see it as employees trying to fool their colleagues with creative number crunching, so much as those employees unwittingly fooling themselves.
If a tree falls in the forest
The big challenge I see with evaluating learning in the workplace is how to demonstrate causality – i.e. the link between cause and effect.
Suppose a special training program is implemented to improve an organisation’s flagging culture metric. When the employee engagement survey is run again later, the metric goes up.
Congratulations to the L&D team for a job well done, right?
What actually caused the metric to go up? Sure, it could have been the training, or it could have been something else. Perhaps a raft of unhappy campers left the organisation and were replaced by eager beavers. Perhaps the CEO approved a special bonus to all staff. Perhaps the company opened an onsite crèche. Or perhaps it was a combination of factors.
If a tree falls in the forest and nobody hears it, did it make a sound? Well, if a few hundred employees undertook training but nobody measured its effect, did it make a difference?
Without a proper experimental design, the answer remains unclear.
Evaluation by design
To determine with some level of confidence whether a particular training activity was effective, the following eight factors must be considered…
1. Isolation - The effect of the training in a particular situation must be isolated from all other factors in that situation. Then, the metric attributed to the staff who undertook the training can be compared to the metric attributed to the staff who did not undertake the training.
In other words, everything except participation in the training program must be more-or-less the same between the two groups.
2. Placebo - It’s well known in the pharmaceutical industry that patients in a clinical trial who are given a sugar pill rather than the drug being tested sometimes get better. The power of the mind can be so strong that, despite the pill having no medicinal qualities whatsoever, the patient believes they are doing something effective and so their body responds in kind.
As far as I’m aware, this fact has never been applied to the evaluation of corporate training. If it were, the group of employees who were not undertaking the special training would still need to leave their desks and sit in the classroom for three 4-hour stints over three weeks.
Because it might not be the content that makes the difference! It could be escaping the emails and phone calls and constant interruptions. It could be the opportunity to network with colleagues and have a good ol’ chat. It might be seizing the moment to think and reflect. Or it could simply be an appreciation of being trained in something, anything.
3. Randomisation - Putting the actuaries through the training and then comparing their culture metric to everyone else’s sounds like a great idea, but it will skew the results. Sure, the stats will give you an insight into how the actuaries are feeling, but they won’t be representative of the whole organisation.
Maybe the actuaries have a range of perks and a great boss; or conversely, maybe they’ve just gone through a restructure and a bunch of their mates were made redundant. To minimise these effects, staff from different teams in the organisation should be randomly assigned to the training program. That way, any localised factors will be evened out across the board.
4. Sample size - A handful of people (even if they’re randomised) cannot be expected to represent an organisation of hundreds or thousands. So testing five or six employees is unlikely to produce useful results.
5. Validity - Calculating a few averages and generating a bar graph is a sure-fire way to go down the rabbit hole. When comparing numbers, statistically valid methods such as Analysis of Variance are required to tell whether a difference is significant or just noise.
6. Replication - Even if you were to demonstrate a significant effect of the training for one group, that doesn’t guarantee the same effect for the next group. You need to do the test more than once to establish a pattern and negate the suspicion of a one-off.
7. Subsets - Variations among subsets of the population may exist. For example, the parents of young children might feel aggrieved for some reason, or older employees might feel like they’re being ignored. So it’s important to analyse subsets to see if any clusters exist.
8. Time and space - Just because you demonstrated the positive effect of the training program on culture in the Sydney office doesn’t mean it will have the same effect in New York or Tokyo. Nor does it mean it will have the same effect in Sydney next year.
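To make a couple of these factors a bit more concrete, here is a rough sketch in Python of what randomisation (factor 3) and a validity check (factor 5) might look like. Treat it as an illustration only: the function names, the staff IDs and the scores are all made up for the example, and I have swapped in a simple permutation test rather than Analysis of Variance so that it runs on the standard library alone. A real evaluation would deserve a proper stats package and someone who knows how to drive it.

```python
import random
from statistics import mean


def assign_groups(staff_ids, n_trained, seed=0):
    """Randomly split staff into a trained group and a comparison group.

    Hypothetical helper: the seed only makes the shuffle repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(staff_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_trained], shuffled[n_trained:]


def permutation_p_value(trained, control, n_resamples=10_000, seed=0):
    """Estimate a one-sided p-value for 'trained scores are higher'.

    Repeatedly relabels the pooled scores at random and counts how often
    the difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(trained) - mean(control)
    pooled = list(trained) + list(control)
    k = len(trained)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up example: 16 staff drawn from across the organisation.
staff = [f"emp{i:02d}" for i in range(16)]
trained_ids, control_ids = assign_groups(staff, n_trained=8, seed=42)

# Made-up engagement scores collected after the training period.
trained_scores = [7, 8, 6, 9, 7, 8, 8, 7]
control_scores = [6, 7, 6, 7, 5, 8, 6, 6]

p = permutation_p_value(trained_scores, control_scores)
print("Trained group:", sorted(trained_ids))
print("Estimated p-value:", round(p, 4))
```

A small p-value suggests the gap between the two groups is unlikely to be a fluke of who ended up where. It says nothing about isolation, placebo or replication, so the other factors still need their own attention.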
Don’t get me wrong: I’m not suggesting you need a PhD to evaluate your training activity. On the contrary, I believe that any evaluation – however informal – is better than none.
What I am saying, though, is that a little bit of know-how goes a long way towards making your results more meaningful.
For organisations that are serious about training outcomes, I go so far as to propose employing a Training Evaluation Officer – someone who is charged not only with getting evaluation done, but with getting it done right.