As promised, here is the second installment of Dr. Premack’s brilliant findings. I will let him do the “talking” and tell us a bit about some of the experiments he conducted that led us to the rule of learning: Engaging in a MORE PROBABLE behavior will reinforce a LESS PROBABLE behavior… read on:
“Reward and punishment however are not opposites. They are, except for sign, equivalent! And they both contrast with freedom. This is a new point of view, a microblockbuster. It says that reward and punishment are produced by two simple rules. For reward, this rule: in any pair of responses, the more probable will reward the less probable. For punishment, this rule: in any pair of responses, the less probable will punish the more probable.
We test this new view of reward by first giving the individual free access to water, food, a mate, music etc., items to which the person or animal is likely to respond - and we record the length of time (the duration) the individual spends with each item… The amount of time (the duration) which an individual spends with each item turns out to be the ideal measure for making comparisons of this kind. Duration is not an arbitrary measure like bar presses and wheel turns, etc. It is a universal measure and is applicable to all behaviors.
Here is a simple example of the test. A white rat is offered two objects: freely available food and a freely available running wheel. The rat is hungry. It spends more time eating than it does running. According to the first rule then: Eating should reward running. To test the rule, we remove the food. The rat remains in the running wheel. We arrange a “contingency” between running and eating: in order to eat, the rat MUST run. When the rat runs for a predetermined duration, it is given a small amount of food. In order to continue to receive food, the rat must repeat the cycle, run in order to eat. If the rule is correct, the rat should increase its running.
Exactly by how much will depend on the requirements of the contingency. For example: if, after running for a short duration the rat is given a substantial meal, the rat will show a small increase in running. The increase will be smaller than if it had been either: (a) required to run for a very long duration before eating, and/or (b) was given a very small meal each time it ran.”
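For readers who like to see a rule written out, here is a minimal sketch of the reward rule quoted above, assuming we have already recorded how long the animal spends with each item under free access. The function name `reward_pairs` and the durations are hypothetical, made up for illustration; they are not data from Dr. Premack’s experiments.

```python
from itertools import combinations


def reward_pairs(baseline_seconds):
    """Apply the reward rule to free-access durations.

    baseline_seconds maps each response to the time (in seconds) the
    individual spent on it when every item was freely available.
    Returns (rewarding_response, rewarded_response) pairs: in any pair,
    the response with the longer duration is treated as the more
    probable one, and it rewards the less probable one.
    """
    pairs = []
    for a, b in combinations(baseline_seconds, 2):
        if baseline_seconds[a] == baseline_seconds[b]:
            continue  # equal durations: this sketch makes no prediction
        if baseline_seconds[a] > baseline_seconds[b]:
            pairs.append((a, b))
        else:
            pairs.append((b, a))
    return pairs


# Hypothetical numbers for the hungry rat: more time eating than running.
hungry_rat = {"eating": 600, "running": 180}
print(reward_pairs(hungry_rat))
```

With these made-up durations, eating comes out as the response that should reward running, which is the contingency arranged in the example. The punishment rule reads the same comparison in the other direction.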
Dr. Premack goes on to explain why this works the way it does…
“When held in the grip of a contingency, every individual faces a competition between two responses, one response more probable than the other. In the above example, the rat is being asked: How much MORE ARE YOU WILLING TO RUN in order to maintain your normal amount of eating? Alternatively, how much LESS ARE YOU WILLING TO EAT in order to maintain your normal amount of running? The rat has two choices: either to increase the less probable response (running); or reduce the more probable one (eating).
Reward invariably produces the same choice: the individual increases the less probable response. Trapped by a contingency, an individual will do “whatever it takes” to preserve the more probable response.” From Essays: Reward and Punishment versus Freedom, in: http://www.psych.upenn.edu/~premack/About.html
His findings are relevant not only in the laboratory, where there is more control over the environment; what is really fascinating is that in the “real world” of training we can use his principle with success! In order to do this, we must first be clear about what it is that the animal wants at a given moment (the most probable behavior). If we pay attention, we will notice that our dog’s desire to engage in particular behaviors does not remain constant. It varies. Say, for example, that I take Deuce sheepherding, which we do twice a week. For a working herding dog in the presence of sheep, herding is a (very) highly probable activity. Now, after Deuce has had the opportunity to engage with the sheep and is tired and hot, he would much rather take a break in the shade and catch his breath (and get some water) than continue to herd his sheep, at least for a while. So the desire for sheepherding has “dropped” and has been replaced by a more probable behavior: lying in the shade and having some water.
Another example: A dog that loves to chase after wildlife would, at some point on a given day, much rather get some food (if really hungry) or a drink of water (if thirsty) than chase after wildlife. Knowing this is a big part of understanding how to implement a sound Premack-based training plan. In other words, it dictates how to “arrange” the pairing of (a) the most probable behavior and (b) the least probable behavior. It also raises a question: how can I best modify the environment so that the contingencies play out as planned?
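To make the “arranging” a little more concrete, here is a rough sketch of how one might pick the reinforcer for a given moment, assuming we can put a rough number on how likely the dog is to choose each activity right now. The function `pick_reinforcer`, the activity names, and every number below are my own hypothetical illustrations, not measurements.

```python
def pick_reinforcer(current_likelihood, target):
    """Pick the activity to offer as a reward for `target` right now.

    current_likelihood maps each activity (including the target) to a
    rough estimate of how likely the dog is to choose it at this moment.
    Returns the most likely other activity if it is more probable than
    the target behavior; otherwise returns None, because under the rule
    nothing on the list would reward the target.
    """
    others = {k: v for k, v in current_likelihood.items() if k != target}
    if not others:
        return None
    best = max(others, key=others.get)
    if others[best] > current_likelihood[target]:
        return best
    return None


# Hypothetical estimates for Deuce at two different moments.
fresh_at_the_field = {"herding": 0.8, "resting in shade": 0.1, "recall": 0.3}
tired_and_hot = {"herding": 0.2, "resting in shade": 0.7, "recall": 0.3}

print(pick_reinforcer(fresh_at_the_field, "recall"))
print(pick_reinforcer(tired_and_hot, "recall"))
```

The same target behavior gets paired with a different reward depending on the dog’s state, which is why the estimates need to be revisited throughout a session.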
The concept of environment in training is very broad. By environment I mean anything in the surroundings that might influence the behavior (and motivation) of the dog. It includes any kind of distraction, any kind of reinforcement, placement of reinforcement, timing of reinforcement, and satiation levels (how hungry, tired, or thirsty the dog is when the training is taking place), just to name a few.
Management of the environment is such an important requirement when training that it is in essence an integral part of any behavior modification plan. To remind me of the importance of management when writing training plans, I have in my office a quote (I guess this is “quote week”) from Dr. Susan Friedman, a brilliant behavior analyst and currently a faculty member in the Department of Psychology at Utah State University.
Dr. Friedman has a popular course called Living and Learning with Animals, which I was lucky to take last fall. See here: http://www.behaviorworks.org
The quote pinned on my office bulletin board is called Best Practices, a guide that reminds us of the obligation to engage in humane practices in the application of the behavioral sciences. Here is the quote:
- Control the environment not the animal.
- Control antecedents to make the right behavior easy.
- Control consequences to make the right behavior worth performing.
- The individual must be an operator of significant environmental events.
- Resilience is a function of the ratio of empowerment to submission.
Wow!
In other words, it is our responsibility to treat the animal that we are training and engaging with, with respect for his/her ability to learn from his/her own environment. In addition, we must also give the animal some "agency" so that the dog has the ability to make decisions (to be an operator) instead of being a passive recipient of our will.
Just imagine how different learning would be for all animals being trained if more people engaged in training/teaching animals (or teaching people, for that matter) subscribed to Dr. Friedman’s Best Practices guide! No more coercion! No more pain and subjugation, but instead a “conversation” between trainer and learner. This approach to teaching is not only more humane; it has also been demonstrated countless times to work (there are studies spanning decades to back this claim), producing results in the “real” world of training.