Theory of Operant Conditioning

The Theory of Operant Conditioning is a fundamental concept in behavioral psychology, primarily developed by Burrhus Frederic Skinner (although its foundations stem from the work of Edward Thorndike and his “Law of Effect”). This theory explains how the consequences of a behavior influence the probability of that behavior being repeated in the future.

The central principle is that the organism learns to operate on its environment to produce or avoid certain consequences.

Fundamental Concepts

Operant conditioning is structured around the three-term contingency, also known as the A-B-C model:

Antecedent (A): The stimulus present immediately before the behavior occurs. It signals when a response is likely to be reinforced or punished. (Example: A green traffic light).
Behavior (B): The action emitted by the organism, also called the operant response. This is the behavior targeted for modification. (Example: Crossing the street).
Consequence (C): The event that immediately follows the behavior. It is what determines the probability of the behavior being repeated. (Example: Reaching the other side safely).
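As an illustration, the three terms can be recorded as a simple data structure. The following is a minimal Python sketch; the class name Contingency and its describe method are hypothetical names chosen for this example and are not part of the theory itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """One A-B-C observation (hypothetical record type for illustration)."""
    antecedent: str   # stimulus present immediately before the behavior
    behavior: str     # the operant response emitted by the organism
    consequence: str  # the event that immediately follows the behavior

    def describe(self) -> str:
        # Lay the three terms out in their temporal order.
        return f"A: {self.antecedent} | B: {self.behavior} | C: {self.consequence}"


# The traffic-light example from the list above.
crossing = Contingency(
    antecedent="green traffic light",
    behavior="crossing the street",
    consequence="reaching the other side safely",
)
print(crossing.describe())
```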

Mechanisms of Consequence: Reinforcement and Punishment

The main mechanisms for modifying behavior are reinforcement (which increases the probability of the behavior) and punishment (which decreases the probability of the behavior).

Mechanism	Effect on Behavior	Operation	Definition	Example
Positive Reinforcement	Increases the behavior	Adding a pleasant stimulus.	The behavior is followed by the presentation of a desired stimulus.	A child picks up their toys (behavior) and their mother gives them a piece of candy (adding a pleasant stimulus). The behavior is repeated.
Negative Reinforcement	Increases the behavior	Removing an unpleasant (aversive) stimulus.	The behavior results in the avoidance or elimination of an aversive stimulus. It is not punishment.	Someone takes an aspirin (behavior) and the headache (aversive stimulus) disappears (removing an unpleasant stimulus). The person will take an aspirin again when they have a headache.
Positive Punishment	Decreases the behavior	Adding an unpleasant (aversive) stimulus.	The behavior is followed by the presentation of an unwanted stimulus.	An employee arrives late (behavior) and the boss scolds them (adding an unpleasant stimulus). The probability of being late decreases.
Negative Punishment	Decreases the behavior	Removing a pleasant stimulus.	The behavior results in the removal of something positive.	A teenager uses their phone while eating (behavior) and their parents take the phone away for the rest of the day (removing a pleasant stimulus). The probability of using the phone at the table decreases.
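The table reduces to two independent questions: was a stimulus added or removed, and did the behavior increase or decrease? A minimal Python sketch of that two-by-two classification follows; the names Operation, Effect, and classify are assumptions made for this example.

```python
from enum import Enum


class Operation(Enum):
    ADD = "add"        # a stimulus is presented after the behavior
    REMOVE = "remove"  # a stimulus is taken away after the behavior


class Effect(Enum):
    INCREASES = "increases"  # the behavior becomes more likely
    DECREASES = "decreases"  # the behavior becomes less likely


def classify(operation: Operation, effect: Effect) -> str:
    """Name the mechanism from the operation and its effect on behavior."""
    # "Positive"/"negative" describe adding or removing, not good or bad.
    valence = "Positive" if operation is Operation.ADD else "Negative"
    # "Reinforcement"/"punishment" describe the effect on the behavior.
    kind = "Reinforcement" if effect is Effect.INCREASES else "Punishment"
    return f"{valence} {kind}"


# The aspirin example: the headache is removed and aspirin-taking increases.
print(classify(Operation.REMOVE, Effect.INCREASES))  # Negative Reinforcement
```

Keeping the two questions separate makes it harder to confuse negative reinforcement with punishment: the word "negative" refers only to the removal of a stimulus.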

Extinction

Extinction occurs when a previously reinforced behavior is no longer followed by its reinforcer. As a result, the frequency of that behavior gradually decreases, eventually to the point where it no longer occurs.

Example: A child cries to get a toy, and their mother always gives it to them. If the mother consistently stops giving them the toy when they cry, the crying (behavior) will decrease over time and eventually become extinct.
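The gradual decline can be pictured with a toy numerical model. The Python sketch below assumes, purely for illustration, that each unreinforced trial reduces the response rate by a fixed fraction. The function name extinction_curve, the decay value, and the multiplicative rule are all assumptions of this example, not empirical claims about how quickly real behavior extinguishes.

```python
def extinction_curve(initial_rate: float, decay: float, trials: int) -> list[float]:
    """Response rate after each unreinforced trial (toy multiplicative decay).

    Assumption: every trial without the reinforcer lowers the rate by the
    same fraction `decay` (0 < decay < 1). Real extinction is less regular.
    """
    rates = []
    rate = initial_rate
    for _ in range(trials):
        rate *= 1.0 - decay  # no reinforcer follows, so the rate drops
        rates.append(rate)
    return rates


# Hypothetical numbers: 10 responses per session, 20% drop per trial.
curve = extinction_curve(initial_rate=10.0, decay=0.2, trials=5)
print([round(r, 2) for r in curve])
```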

Additional Elements and Procedures

1. Discriminative Stimulus

A discriminative stimulus is the environmental cue that indicates that a particular response will be reinforced (or punished). It does not cause the behavior, but rather sets the occasion for the operant response to occur.

Example: An “Open” sign indicates that entering the store and requesting a product (behavior) will result in obtaining that product (reinforcement).

2. Generalization and Discrimination

Generalization: A behavior that has been reinforced in the presence of a discriminative stimulus is also emitted in the presence of similar stimuli.
Discrimination: The organism learns to respond only to the specific discriminative stimulus and not to other stimuli, even if they are similar.

3. Schedules of Reinforcement

These are the rules that determine when and how a reinforcer is administered after a response. There are two main types:

Continuous Reinforcement (CRF): The behavior is reinforced every time it occurs. This produces fast learning, but extinction is also fast if the reinforcement is stopped.
Intermittent or Partial Reinforcement (IRF): The behavior is reinforced only some of the time. Learning is slower, but the behavior is much more resistant to extinction. It is commonly divided into four basic schedules:
- Fixed Ratio (FR): The reinforcer is presented after a fixed number of responses. (Example: The worker is paid for every 10 units produced).
- Variable Ratio (VR): The reinforcer is presented after an unpredictable number of responses (the average is fixed). This produces very high and steady response rates. (Example: Slot machines).
- Fixed Interval (FI): The reinforcer is presented after the first response that occurs after a fixed period of time. (Example: Studying for an exam that is held every quarter).
- Variable Interval (VI): The reinforcer is presented after the first response that occurs after a variable period of time (the average is fixed). This produces moderate and steady response rates. (Example: Checking for emails or messages from a friend, which arrive at unpredictable times).
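The four intermittent schedules can be written as small rules that decide, for each response, whether a reinforcer is delivered. The following is a minimal Python sketch under stated assumptions: the class names, the respond method, and the use of uniform random draws for the variable schedules are choices made for this example; the text only requires that the average be fixed.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a random number of responses averaging mean_n."""
    def __init__(self, mean_n, rng=None):
        self.mean_n = mean_n
        self.rng = rng or random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1..2*mean_n-1, whose mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made once `interval` time has elapsed."""
    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval  # timing restarts here
            return True
        return False


class VariableInterval:
    """Reinforce the first response after a random wait averaging mean_interval."""
    def __init__(self, mean_interval, rng=None):
        self.mean_interval = mean_interval
        self.rng = rng or random.Random()
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: uniform on [0, 2*mean_interval], mean = mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


# The piecework example: paid for every 10 units produced.
fr = FixedRatio(10)
print(sum(fr.respond(t) for t in range(30)))  # 3 reinforcers in 30 responses
```

In the ratio schedules only the number of responses matters, so the time argument t is ignored; in the interval schedules only elapsed time matters, so responding faster does not bring the reinforcer sooner.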

Applications of Operant Conditioning

Operant conditioning has vast practical applications, including:

Behavior Therapy: Techniques such as the token economy (in which secondary reinforcers, such as points or tokens, are earned and later exchanged for other reinforcers) to modify behaviors in clinical or educational settings.
Education: Use of praise, grades, and feedback to encourage studying and participation.
Animal Training: The basis for teaching tricks or tasks to animals through positive reinforcement.
Parenting: Use of reinforcement techniques (instead of punishment) to encourage desirable behaviors in children.