Operant Conditioning (OC) – Rory Miller


There are a bunch of numbers running around: that it takes 300-500 reps to instill a new motor skill, 3000-5000 if you are replacing an old skill.  That’s training.  How many reps did you need to learn not to touch a hot stove?  Once.  And you pulled your hand away with perfect body mechanics.  That’s the difference between conditioning and training.

We are not talking about physical fitness conditioning.  That’s important too, but it isn’t learning.  We are talking about Operant Conditioning.

The principles of Operant Conditioning are simple and a Behavioral Psychologist will argue that this is the model for all learning.

Stimulus-Response-Reward/Punishment

Something happens.  That’s the stimulus.  Then you do something.  That’s the response.  What you did either makes things better (Reward) or worse (Punishment).

There are some rules and some warnings about how Operant Conditioning (OC) works. First of all, done properly, OC is a very fast method for learning simple things.  It is much harder to condition a complex series of actions, and you cannot condition if the stimulus must be interpreted.

The stimulus should be unambiguous.  If you have to figure out that this stylized movement is supposed to be a punch, you have to think.  If you have to think, your neocortex is engaged and you are training, not conditioning. This isn’t a problem for many aspects of self-defense training.  A knife coming at your belly is not ambiguous.  Getting hit in the face is not nuanced.

Do not over coach.  Correct and explain to the minimum.  One of the beauties of OC is that the world does your teaching and the student cannot argue with the world.  If you stay at it, you will get good at ukemi or breakfalls because all the wrong ways hurt.  One of the dangers is that if you coach, the student will start to think instead of respond and your conditioning will magically turn into training and fail under pressure.

There is also a psychological condition called ‘learned helplessness’.  If anything a student does is corrected, or anything any person does is punished, they will do less of that response.  If, however, they are punished no matter what they do or corrected every time they act, the student is being conditioned that doing nothing is the best strategy.  Over-correcting your fighting students conditions them to be passive under stress.  That is the exact opposite of your goal.
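The drift toward passivity can be sketched as a toy model in Python. This is only an illustration under loud assumptions, not a psychological model: the action names, the starting weights, the 0.5 and 1.5 multipliers, and the rule that "do nothing" never draws a correction are all invented for the example.

```python
def simulate(feedback, rounds=10):
    """Toy model: return the action the student picks in each round."""
    # Hypothetical starting tendencies; every option is equally likely at first.
    weights = {"block": 1.0, "strike": 1.0, "move": 1.0, "do nothing": 1.0}
    history = []
    for _ in range(rounds):
        # Pick the strongest tendency; ties go to the earliest listed action.
        action = max(weights, key=weights.get)
        history.append(action)
        # Assumption: there is nothing to correct when the student does nothing.
        if action != "do nothing":
            weights[action] *= feedback(action)
    return history


def over_coach(action):
    """Every action gets corrected, which counts as punishment."""
    return 0.5


def encourage(action):
    """Any attempt gets a 'Nice one!'."""
    return 1.5


print(simulate(over_coach))
print(simulate(encourage))
```

Under these assumptions the over-coached student tries each action once, gets corrected each time, and then settles on doing nothing for the rest of the rounds. The encouraged student keeps acting.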

Any response that is even a slight improvement should be rewarded.  That’s as simple as saying, “Nice one!”

Reward and punishment are very specific words in OC training, as are the words ‘positive’ and ‘negative.’ Reward is anything that increases behavior.  It is almost always something that makes people feel good.  Reward is different for different people because we don’t all like the same things.  But even things as simple as a warm smile or saying, “Good job” are rewards.  And saying, “Good job but this is wrong, do it this way” is not only a punishment (all corrections are punishments) but sends a mixed signal and makes the deeper part of the student’s brain trust you less.

Positive and Negative are not value judgments but only indicate presence or absence. Reward and Punish are the value holders.

Positive Reward (PR): Something good happens. Puppy gets a treat.

Negative Reward (NR): Something bad DOESN’T happen. Puppy doesn’t get kicked.

Positive Punishment (PP): Something bad happens. Puppy gets kicked.

Negative Punishment (NP): Something good is withheld. Puppy gets ignored.

In a counter-assault entry, like Dracula’s Cape, the positive reward is knocking the bad guy down.  The negative reward is the bad guy’s attack missing.  Double reinforcement.

If you do the technique wrong, you will get hit (positive punishment) and the bad guy won’t go down (negative punishment).
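For anyone who finds a grid easier than four definitions, the labels above reduce to two yes/no questions: is the thing good or bad, and does it happen or not? Here is a minimal Python sketch of that grid. The names `Outcome` and `classify` are made up for illustration, and the example descriptions paraphrase the Dracula’s Cape consequences above.

```python
from enum import Enum


class Outcome(Enum):
    POSITIVE_REWARD = "PR"       # something good happens
    NEGATIVE_REWARD = "NR"       # something bad doesn't happen
    POSITIVE_PUNISHMENT = "PP"   # something bad happens
    NEGATIVE_PUNISHMENT = "NP"   # something good is withheld


def classify(thing_is_good: bool, thing_happens: bool) -> Outcome:
    """Name the quadrant for a consequence.

    thing_is_good: is the thing something the student wants?
    thing_happens: is it present (positive) or absent (negative)?
    """
    if thing_happens:
        return Outcome.POSITIVE_REWARD if thing_is_good else Outcome.POSITIVE_PUNISHMENT
    return Outcome.NEGATIVE_PUNISHMENT if thing_is_good else Outcome.NEGATIVE_REWARD


# The Dracula's Cape consequences, run through the grid.
examples = {
    "bad guy goes down": classify(thing_is_good=True, thing_happens=True),
    "bad guy's attack misses": classify(thing_is_good=False, thing_happens=False),
    "you get hit": classify(thing_is_good=False, thing_happens=True),
    "bad guy stays up": classify(thing_is_good=True, thing_happens=False),
}

for description, outcome in examples.items():
    print(f"{description}: {outcome.value}")
```

Note that ‘positive’ and ‘negative’ only pick the row of the grid (present or absent). Whether the result is a reward or a punishment depends on whether the thing itself is good or bad.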

Timing on reward is critical.  No criminal that I am aware of blames his prison time on his deeds.  He always blames it on the court case.  The punishment is being found guilty (it is a positive punishment because it introduces the knowledge of the intended punishment to come later) and that happens at the end of the court case but often months after the crime.  Just like you can’t punish a dog for a day-old stain on the carpet, it won’t do any good to tell your students they did a good job at the end of class.  Reward and punishment must be as immediate as possible, just like in the real world.

An example: Training Dracula’s Cape for Counter-Assault

Before we get into counter-assault, the foundation has been laid very carefully.  The students have already been exposed to the ambush as part of the context of violence and have had a discussion about Operant Conditioning as a training method. (Teaching).  They have worked with power generation, especially drop step, and structure. (Teaching, training, some conditioning and some play).

First we explain what Dracula’s Cape is and isn’t, what it is for, when and why it is necessary and what it does.  That is taught.

Then they practice a few rounds of dropping into the technique.  This is training and I feel free to correct here, but I am not correcting for whether they do it the way that I do it or how it looks, I am correcting for whether it will accomplish the mission.

Next, they partner up and test the structure of each other’s technique.  This gives some immediate feedback. It’s on the line between conditioning and training.

The first stage of OC is to have partners with kicking shields give a stimulus.  It is possible to train a specific response to a specific stimulus, but to condition for all possible attacks would take years.  The beauty of Dracula’s Cape or Tony Blauer’s SPEAR is that it works on a wide variety of attacks so you can condition it to a more general stimulus.  Instead of trying to condition four responses for a jab, a cross, a roundhouse punch and a kick (possibly eight responses, if your technique works differently for right or left attacks) you condition one response that works on the general stimulus of ‘an attack.’

This stage of training has a very definite uke and tori.  Uke exists to present the stimulus for tori to respond.  It’s not a mix or a contest and it is possible that one of the reasons few people condition properly is that it can be boring if that is all you do.

During this stage, I walk around and tell people they are doing a good job.  I only correct if there is something clearly wrong that the student won’t figure out by pain, and that is almost always phrased in a positive way.  Positive does not mean cheery, positive in the conditioning sense: Say, “Try this” rather than “Don’t do that” whenever possible. Also, I never correct if someone does a completely different technique that works.  This is about effectiveness, not preserving my technique.  Never punish success.

I will spend a lot of time correcting the attackers, however.  The kicking shields must be held tight, or the tori will not get the right feedback.  Uke must stand very close, the distance a criminal would stand to set up a sucker punch.  The attack must be realistic and aggressive.  There must be a pause between the attacks so that each stimulus is clear and the attacks must have a broken rhythm.

The last stage is to have the student face off with three attackers, front and to the sides. A fourth coach behind indicates which of the three attackers will charge.  This drill starts getting things up to reflex speed.  It also rewards a good drop step because if the drop step is executed, the technique even works on multiple simultaneous attackers.  Ideally, the person in the middle should rotate out on one really good technique so success is the last memory.

Training doesn’t come out in your first fights.  Conditioning will, good or bad.  The things you have conditioned will happen too fast for your decision making process, and in a life or death emergency, good conditioning is a life saver.

If your students have practiced pulling punches and been punished for lack of control when they make contact, they will pull their punches under stress. If your students have always relied on tape and gloves to keep their fists safe they will hold their hands as if they were taped, and that can be costly.

Most critically, if you are one of the asshole instructors who teaches your students to fight but needs to make an example of any student that tags you, congratulations.  You have just undone your own training.  You have taught a student to win and punished them for winning.  You have taught the deep part of the brain that losing is the safest course.  Your asshole ego has created a loser. Never, ever, ever punish someone for being successful at what you taught her.

The fourth element of training is play.  In my opinion, it is the most important.  Any force encounter is chaotic with a nearly infinite number of variables.  Your conscious mind can’t manage that and it is too complicated for simple conditioning.  The only way we know to help people handle chaos, other than surviving a metric fuck ton of experience, is to play.

This is how animals learn hunting.  This is how you learned everything you are really good at.  You can memorize thousands of words in a foreign language and know all the rules of grammar, but until you can go to the market and haggle and argue and flirt, you don’t really speak the language.

It is an effective learning method and we are wired to absorb and integrate knowledge this way.  How long does it take a kid to learn a video game?  Most kids are proficient in hours at most.  Because they run through the tutorial and then they play.  If we tried to make a video game master the way we try to train martial artists, how many years would it take to get to that level of proficiency, if at all?

Anything you teach, anything you train, will ingrain harder and be more available under stress, if you have made a good, hard physical game out of it.  This was Kano’s epiphany about judo.

Play has operant conditioning built in. There is immediate feedback on what works and doesn’t and realistic (sort of, more later) stimuli.  And this operant conditioning works in a framework of chaos and irrelevant stimuli and all the other things that make real encounters so difficult.  Life is an awful lot like play, if you want to look at it backwards.

One caveat, and this goes for designing play and for conditioning.  The game is the game.  It will never have a one-to-one correlation to reality, not unless the body count is the same. Getting good at sparring or getting good at push hands or getting good at rolling are all pieces.  You will get a visceral understanding of the principles by some of these games.  But if you believe the game is the reality, you have just willfully blinded yourself, and that blindness can be passed to your students.

The game or the conditioning is what it is, not necessarily what you meant it to be.  Case in point on the play side: I love grappling.  It is the best way to develop the skills to move a body.  But if I am in that position, I have one of three goals: to maneuver him so I can escape, to maneuver him so I can cuff him, or to maneuver him so I can incapacitate him.  In real life you never grapple to prove that you are better at grappling.

On the conditioning side, I watched a video of ‘woofing.’  The instructors were getting into students’ faces, yelling horrible abuse and threats.  The instructors thought they were inoculating against adrenaline, that the students were getting used to the emotional environment of an assault.  That may be true, but what the students were practicing was letting a bad guy invade their space, make threats and insult them (stimulus) while they did nothing (response). Thinking they were preparing, they were practicing passivity.
