Operant Conditioning: What It Is, How It Works, and Examples

Operant conditioning, developed by B.F. Skinner, is a learning process where behaviors are influenced by consequences. Positive reinforcement encourages a behavior by adding a reward, while negative reinforcement strengthens it by removing an unpleasant stimulus. Punishment, on the other hand, decreases a behavior by introducing a negative consequence or removing a positive one.

Key Takeaways

  • Operant conditioning is a learning process that modifies behavior through reinforcement and punishment.
  • Positive and negative reinforcement increase behavior, while positive and negative punishment decrease behavior.
  • Reinforcement schedules influence how quickly behaviors are learned and how resistant they are to extinction.
  • Skinner’s experiments, like the Skinner Box and superstitious pigeons, show how behavior can be shaped by even accidental rewards.
  • Token economies, reward systems, and punishment illustrate real-world applications of operant conditioning in classrooms, workplaces, and therapy.
  • Critics note that operant conditioning can overlook cognitive factors and ethical concerns, emphasizing the need for a balanced approach.

How It Works

Skinner is regarded as the father of Operant Conditioning, but his work was based on Thorndike’s (1898) Law of Effect.

According to this principle, behavior that is followed by pleasant consequences is likely to be repeated, and behavior followed by unpleasant consequences is less likely to be repeated.

Skinner introduced a new term into the Law of Effect – Reinforcement.

Behavior that is reinforced tends to be repeated (i.e., strengthened); behavior that is not reinforced tends to die out or be extinguished (i.e., weakened).

Skinner (1948) studied operant conditioning by conducting experiments with animals, which he placed in a "Skinner box" similar to Thorndike’s puzzle box.


A Skinner box, also known as an operant conditioning chamber, is a device used to objectively record an animal’s behavior in a compressed time frame. An animal can be rewarded or punished for engaging in certain behaviors, such as lever pressing (for rats) or key pecking (for pigeons).

Skinner identified three types of responses, or operants, that can follow behavior.

  • Neutral operants: Responses from the environment that neither increase nor decrease the probability of a behavior being repeated.
  • Reinforcers: Responses from the environment that increase the probability of a behavior being repeated. Reinforcers can be either positive or negative.
  • Punishers: Responses from the environment that decrease the likelihood of a behavior being repeated. Punishment weakens behavior.

We can all think of examples of how reinforcers and punishers have affected our behavior. As a child, you probably tried out a number of behaviors and learned from their consequences.

For example, when you were younger, if you tried smoking at school, and the chief consequence was that you got in with the crowd you always wanted to hang out with, you would have been positively reinforced (i.e., rewarded) and would be likely to repeat the behavior.

If, however, the main consequence was that you were caught, caned, suspended from school, and your parents became involved, you would most certainly have been punished, and you would consequently be much less likely to smoke now.

Positive Reinforcement

B. F. Skinner’s theory of operant conditioning describes positive reinforcement. In positive reinforcement, a response or behavior is strengthened by rewards, leading to the repetition of the desired behavior.

The reward is a reinforcing stimulus.

Primary reinforcers are stimuli that are naturally reinforcing because they are not learned and directly satisfy a need, such as food or water.

Secondary reinforcers are stimuli that are reinforcing through their association with a primary reinforcer, such as money or school grades.

They do not directly satisfy an innate need but provide the means to obtain a primary reinforcer, so a secondary reinforcer can be just as powerful a motivator as a primary reinforcer.

Skinner showed how positive reinforcement worked by placing a hungry rat in his Skinner box.

The box contained a lever on the side, and as the rat moved about the box, it would accidentally knock the lever. Immediately it did so, a food pellet would drop into a container next to the lever.

After being put in the box a few times, the rats quickly learned to go straight to the lever. The consequence of receiving food if they pressed the lever ensured that they would repeat the action again and again.
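
To make the mechanism concrete, here is a minimal Python sketch of reinforcement strengthening a response. The starting probability and learning rate are invented for illustration and are not taken from Skinner's data.

```python
import random

# A minimal, invented learning rule (not Skinner's data): each reinforced
# lever press makes the next press more likely.
press_probability = 0.05   # hypothetical starting chance of a spontaneous press
learning_rate = 0.2        # hypothetical strength of each reinforcement

for trial in range(1, 51):
    pressed = random.random() < press_probability
    if pressed:
        # Positive reinforcement: the press is followed by food, so the
        # behavior is strengthened (probability moves toward 1).
        press_probability += learning_rate * (1.0 - press_probability)
    if trial % 10 == 0:
        print(f"Trial {trial:2d}: press probability = {press_probability:.2f}")
```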

Positive reinforcement strengthens a behavior by providing a consequence an individual finds rewarding.

For example, if your teacher gives you £5 each time you complete your homework (i.e., a reward), you will be more likely to repeat this behavior in the future, thus strengthening the behavior of completing your homework.

The Premack principle is a form of positive reinforcement in operant conditioning.

It suggests using a preferred activity (high-probability behavior) as a reward for completing a less preferred one (low-probability behavior).

This method incentivizes the less desirable behavior by associating it with a desirable outcome, thus strengthening the less favored behavior.

The Premack Principle is sometimes called Grandma’s Rule – “First eat your vegetables, then you can have dessert” – and is considered one of the more effective and positive approaches to behavior modification.
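
Expressed as a rule (an analogy added here, not part of the original account), the Premack principle is simply a contingency that gates the preferred activity on completion of the less preferred one; the activity names below are placeholders.

```python
# A toy contingency rule in the spirit of "Grandma's Rule": the preferred
# (high-probability) activity only becomes available after the less preferred
# (low-probability) one is completed. Activity names are placeholders.
def premack_contingency(low_probability_done: bool,
                        preferred: str = "dessert",
                        required: str = "vegetables") -> str:
    if low_probability_done:
        return f"{required} finished, so {preferred} is now available (reinforcement delivered)."
    return f"{preferred} stays unavailable until {required} is finished."

print(premack_contingency(False))
print(premack_contingency(True))
```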

Dependency on Rewards

  • Continuous external rewards can lead to “reward dependency,” where individuals perform tasks solely for tangible incentives rather than personal satisfaction.
  • In settings like schools or workplaces, this reliance can diminish intrinsic motivation and cause behavior to wane once rewards are removed.

Intrinsic vs. Extrinsic Motivation

  • Excessive use of external reinforcers risks overshadowing internal drives, potentially undermining long-term engagement or creativity.
  • Balancing extrinsic rewards with support for learners’ own interests and personal goals can foster sustainable, self-driven behavior change.


Negative Reinforcement

Negative reinforcement is the termination of an unpleasant state following a response.

This is known as negative reinforcement because it is the removal of an adverse stimulus which is ‘rewarding’ to the animal or person.

Negative reinforcement strengthens behavior because it stops or removes an unpleasant experience.

For example, if you do not complete your homework, you give your teacher £5. You will complete your homework to avoid paying £5, thus strengthening the behavior of completing your homework.

Skinner showed how negative reinforcement worked by placing a rat in his Skinner box and then subjecting it to an unpleasant electric current which caused it some discomfort.

As the rat moved about the box it would accidentally knock the lever.

Immediately it did so, the electric current would be switched off. The rats quickly learned to go straight to the lever after being put in the box a few times.

The consequence of escaping the electric current ensured that they would repeat the action again and again.

In fact, Skinner even taught the rats to avoid the electric current by turning on a light just before the electric current came on.

The rats soon learned to press the lever when the light came on because they knew that this would stop the electric current from being switched on.

These two learned responses are known as Escape Learning and Avoidance Learning.

Punishment

Punishment is the opposite of reinforcement since it is designed to weaken or eliminate a response rather than increase it. It is an aversive event that decreases the behavior that it follows.

Like reinforcement, punishment can work either by directly applying an unpleasant stimulus like a shock after a response or by removing a potentially rewarding stimulus, for instance, deducting someone’s pocket money to punish undesirable behavior.

Note: It is not always easy to distinguish between punishment and negative reinforcement.

They are two distinct procedures that are easy to confuse: negative reinforcement increases the likelihood of a behavior by removing an aversive stimulus, whereas punishment decreases the likelihood of a behavior. They involve different types of consequences:

  • Negative Reinforcement: Removing an unpleasant or aversive stimulus once the desired behavior occurs (e.g., turning off a loud alarm when the correct action is performed). Increases the likelihood that the behavior will be repeated.
  • Punishment: Applying an unpleasant stimulus, or removing a pleasant one, immediately after a behavior to reduce or eliminate it (e.g., giving a reprimand or taking away a privilege). Decreases the likelihood that the behavior will be repeated.
  1. Positive Punishment:

    • Positive punishment involves adding an aversive stimulus or something unpleasant immediately following a behavior to decrease the likelihood of that behavior happening in the future.
    • It aims to weaken the target behavior by associating it with an undesirable consequence.
    • Example: A child receives a scolding (an aversive stimulus) from their parent immediately after hitting their sibling. This is intended to decrease the likelihood of the child hitting their sibling again.
  2. Negative Punishment:

    • Negative punishment involves removing a desirable stimulus or something rewarding immediately following a behavior to decrease the likelihood of that behavior happening in the future.
    • It aims to weaken the target behavior by taking away something the individual values or enjoys.
    • Example: A teenager loses their video game privileges (a desirable stimulus) for not completing their chores. This is intended to decrease the likelihood of the teenager neglecting their chores in the future.

There are many problems with using punishment, such as:

  • Behavior Is Suppressed, Not Forgotten: Once the punishment stops, the undesired behavior often reemerges.
  • Risk of Fear and Aggression: Punishment may generate fear, avoidance, or aggression, potentially normalizing aggression as a way to cope with problems.
  • Lack of Guidance: Punishment only shows what not to do and does not teach or reinforce the desired alternative behavior.

Examples of Operant Conditioning

1. Positive Reinforcement (Social Media ‘Likes’)

Users on social media platforms often post content in anticipation of receiving “likes,” comments, and shares. 

These notifications act as small rewards, encouraging users to continue posting and staying active on the platform.

2. Positive Reinforcement (Sports Coaching)

Suppose you are a coach and want your team to improve their passing accuracy in soccer. 

When the players execute accurate passes during training, you praise their technique.

This positive feedback encourages them to repeat the correct passing behavior.

3. Positive Reinforcement (Pet Training)

Training a cat to use a litter box can be achieved by giving it a treat each time it uses it correctly. The cat will associate the behavior with the reward and likely repeat it.

4. Negative Punishment (Teen Curfew)

If teenagers stay out past their curfew, their parents might take away their gaming console for a week.

Losing the console increases the likelihood they will respect curfew in the future to avoid losing something they value.

The scenario fits negative punishment because:

  • The parents are removing something pleasant (the gaming console)
  • This is done to decrease an undesired behavior (breaking curfew)
  • The teen is motivated to follow curfew in the future to avoid losing something they value

5. Negative Reinforcement (Teen Curfew)

If teenagers stay out past their curfew, their parents might stop nagging them (removing something unpleasant) when they come home on time (the desired behavior).

It would involve removing something unpleasant (nagging) to increase a desired behavior (respecting curfew).

The scenario fits negative reinforcement because:

  • Something unpleasant (nagging) is being removed
  • This removal occurs when the teen performs the desired behavior (respecting curfew)
  • The removal of the unpleasant stimulus increases the likelihood of the desired behavior continuing

6. Positive Punishment (to Discourage Tardiness)

A manager might require chronically late employees to complete extra paperwork or tasks each time they arrive past their start time.

This additional burden (the unpleasant stimulus) is introduced in direct response to lateness, aiming to discourage employees from repeating the behavior.

The scenario fits positive punishment because:

  1. Something unpleasant (extra paperwork or tasks) is being added or introduced
  2. This addition occurs as a direct consequence of an undesired behavior (arriving late)
  3. The purpose is to decrease the frequency of the undesired behavior (tardiness)

In behavioral psychology, positive punishment involves adding an unpleasant stimulus or experience following an unwanted behavior to make that behavior less likely to occur in the future.

The key components that make this positive punishment:

  • “Positive” in this context means addition (adding something)
  • “Punishment” means the goal is to decrease a behavior
  • The extra work serves as an aversive stimulus that employees will want to avoid

This differs from positive reinforcement (adding something pleasant to increase behavior), negative reinforcement (removing something unpleasant to increase behavior), or negative punishment (removing something pleasant to decrease behavior).

7. Ineffective Punishment (Child’s Eating Habits)

Your child refuses to finish their vegetables at dinner. You punish them by withholding dessert, but the child still refuses to eat vegetables next time.

The punishment seems ineffective in shaping the desired behavior.

Here’s why it’s ineffective:

  1. Lack of behavior change: Despite implementing a punishment (withholding dessert), the child continues to refuse eating vegetables in future meals.

    The key indicator of effectiveness in behavioral modification is whether the undesired behavior decreases over time.

  2. Punishment without learning: The child may be willing to accept the punishment (no dessert) rather than comply with eating vegetables. This suggests the aversiveness of eating vegetables outweighs the reward of dessert.
  3. Possible reinforcement: The child might actually be reinforced by the attention received during the power struggle over vegetables, or by successfully avoiding something they dislike (vegetables).
  4. Failure to address underlying issues: The punishment doesn’t address why the child refuses vegetables (taste preferences, texture sensitivities, desire for autonomy, etc.).

For punishment to be effective, it typically needs to be:

  • Immediate
  • Consistent
  • Proportionate
  • Paired with positive reinforcement of alternative behaviors
  • Accompanied by clear communication about desired behaviors

8. Premack Principle Application:

  • You could motivate your child to eat vegetables by offering an activity they love after they finish their meal.
  • For instance, for every vegetable eaten, they get an extra five minutes of video game time. They value video game time, which might encourage them to eat vegetables.

This is an example of the Premack Principle because:

  1. High-probability behavior as reinforcer: The Premack Principle states that a high-probability behavior (one that occurs frequently, like playing video games) can be used as a reinforcer for a low-probability behavior (one that occurs less frequently, like eating vegetables).

    In this case, video game playing (which the child naturally wants to do) is being used to reinforce vegetable eating (which the child is resistant to).

  2. Contingency relationship: The principle establishes a clear “first this, then that” relationship. The child must first complete the less preferred activity (eating vegetables) to gain access to the preferred activity (playing video games).

  3. Natural reinforcement: The reinforcer (video game time) is something the child already enjoys doing, making it a natural and meaningful reinforcement rather than an artificial one.

  4. Proportional reward: The example includes a specific ratio (each vegetable = 5 extra minutes of game time), creating a clear and proportional relationship between the behavior and the reward.

  5. Positive approach: Unlike punishment-based strategies, this approach focuses on encouraging desired behavior through positive means, which is typically more effective for long-term behavior change and avoids negative emotional associations with eating vegetables.

Other Premack Principle Examples:

  • A student who dislikes history but loves art might earn extra time in the art studio for each history chapter reviewed.
  • For every 10 minutes a person spends on household chores, they can spend 5 minutes on a favorite hobby.
  • For each successful day of healthy eating, an individual allows themselves a small piece of dark chocolate at the end of the day.
  • A child can choose between taking out the trash or washing the dishes. Giving them the choice makes them more likely to complete the chore willingly.

Skinner’s Pigeon Experiment

B.F. Skinner conducted several experiments with pigeons to demonstrate the principles of operant conditioning.

One of the most famous of these experiments is often colloquially referred to as “Superstition in the Pigeon.”

This experiment was conducted to explore the effects of non-contingent reinforcement on pigeons, leading to some fascinating observations that can be likened to human superstitions.

Non-contingent reinforcement (NCR) refers to a method in which rewards (or reinforcements) are delivered independently of the individual’s behavior. In other words, the reinforcement is given at set times or intervals, regardless of what the individual is doing.

The Experiment:

  1. Pigeons were brought to a state of hunger, reduced to 75% of their well-fed weight.
  2. They were placed in a cage with a food hopper that could be presented for five seconds at a time.
  3. Instead of the food being given as a result of any specific action by the pigeon, it was presented at regular intervals, regardless of the pigeon’s behavior.

Observation:

  1. Over time, Skinner observed that the pigeons began to associate whatever random action they were doing when food was delivered with the delivery of the food itself.
  2. This led the pigeons to repeat these actions, believing (in anthropomorphic terms) that their behavior was causing the food to appear.

Findings:

  1. In most cases, pigeons developed different “superstitious” behaviors or rituals. For instance, one pigeon would turn counter-clockwise between food presentations, while another would thrust its head into a cage corner.
  2. These behaviors did not appear until the food hopper was introduced and presented periodically.
  3. These behaviors were not initially related to the food delivery but became linked in the pigeon’s mind due to the coincidental timing of the food dispensing.
  4. The behaviors seemed to be associated with the environment, suggesting the pigeons were responding to certain aspects of their surroundings.
  5. The rate of reinforcement (how often the food was presented) played a significant role. Shorter intervals between food presentations led to more rapid and defined conditioning.
  6. Once a behavior was established, the interval between reinforcements could be increased without diminishing the behavior.
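
A rough simulation shows how such accidental pairings can build up: food is delivered on a timer, and whatever action happens to coincide with it is strengthened. The behaviors, interval, and update rule below are hypothetical choices for illustration, not Skinner's procedure.

```python
import random

# Hypothetical repertoire of pigeon actions with equal starting weights
# (the actions, feeding interval, and update rule are invented for illustration).
behaviors = ["turn counter-clockwise", "thrust head into corner", "bob head", "hop"]
weights = {b: 1.0 for b in behaviors}

FEED_INTERVAL = 15  # food arrives every 15 "ticks", regardless of behavior

for tick in range(1, 301):
    # The pigeon emits an action, chosen in proportion to its current weight.
    current = random.choices(behaviors, weights=[weights[b] for b in behaviors])[0]
    if tick % FEED_INTERVAL == 0:
        # Non-contingent reinforcement: the food was coming anyway, but the
        # action that happened to coincide with it is accidentally strengthened.
        weights[current] *= 1.5

print("Final behavior weights (one typically dominates, like a superstition):")
for behavior, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"  {behavior}: {weight:.2f}")
```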

Superstitious Behavior:

The pigeons began to act as if their behaviors had a direct effect on the presentation of food, even though there was no such connection.

This is likened to human superstitions, where rituals are believed to change outcomes, even if they have no real effect.

For example, a card player might have rituals to change their luck, or a bowler might make gestures believing they can influence a ball already in motion.

Conclusion:

This experiment demonstrates that behaviors can be conditioned even without a direct cause-and-effect relationship. Just like humans, pigeons can develop “superstitious” behaviors based on coincidental occurrences.

This study not only illuminates the intricacies of operant conditioning but also draws parallels between animal and human behaviors in the face of random reinforcements.

Schedules of Reinforcement

Imagine a rat in a “Skinner box.”

In operant conditioning, if no food pellet is delivered immediately after the lever is pressed, then after several attempts, the rat stops pressing the lever (how long would someone continue to go to work if their employer stopped paying them?). The behavior has been extinguished.

Behaviorists discovered that different patterns (or schedules) of reinforcement had different effects on the speed of learning and extinction. Ferster and Skinner (1957) devised different ways of delivering reinforcement and found that this had effects on:

1. The Response Rate – The rate at which the rat pressed the lever (i.e., how hard the rat worked).

2. The Extinction Rate – The rate at which lever pressing dies out (i.e., how soon the rat gave up).

How Reinforcement Schedules Work

Skinner found that variable-ratio reinforcement produces the slowest rate of extinction (i.e., people will continue repeating the behavior for the longest time without reinforcement).

The type of reinforcement with the quickest rate of extinction is continuous reinforcement.

(A) Continuous Reinforcement

An animal or human is positively reinforced every time a specific behavior occurs, e.g., every time the lever is pressed, a pellet is delivered; extinction is then tested by shutting food delivery off.

  • Response rate is SLOW
  • Extinction rate is FAST

(B) Fixed Ratio Reinforcement

Behavior is reinforced only after it occurs a specified number of times, e.g., one reinforcement is given after every fifth correct response. For example, a child receives a star for every five words spelled correctly.

  • Response rate is FAST
  • Extinction rate is MEDIUM

(C) Fixed Interval Reinforcement

One reinforcement is given after a fixed time interval, provided at least one correct response has been made.

An example is being paid by the hour. Another example: a pellet is delivered every 15 minutes (or half hour, hour, etc.), provided at least one lever press has been made during that interval, until food delivery is shut off to test extinction.

  • Response rate is MEDIUM
  • Extinction rate is MEDIUM

(D) Variable Ratio Reinforcement

Behavior is reinforced after an unpredictable number of responses, e.g., gambling or fishing.

  • Response rate is FAST
  • Extinction rate is SLOW (very hard to extinguish because of unpredictability)

(E) Variable Interval Reinforcement

Provided at least one correct response has been made, reinforcement is given after an unpredictable amount of time has passed, e.g., on average every 5 minutes.

An example is a self-employed person being paid at unpredictable times.

  • Response rate is FAST
  • Extinction rate is SLOW
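
To see why the extinction rates differ, the sketch below compares a continuous-reinforcement history with a variable-ratio history and then estimates persistence once reward stops. The "give up" rule is a deliberately simple assumption, not a model from Ferster and Skinner (1957).

```python
import random

def longest_dry_run(schedule, n_responses=500):
    """Longest run of unreinforced responses experienced during training.

    `schedule` maps the number of responses since the last reward to True
    (deliver reward) or False. Both schedules below are illustrative.
    """
    longest = since_reward = 0
    for _ in range(n_responses):
        since_reward += 1
        longest = max(longest, since_reward)
        if schedule(since_reward):
            since_reward = 0
    return longest

def responses_before_giving_up(dry_run, patience=3):
    # Toy assumption: the animal quits once it has gone 'patience' times longer
    # without reward than it ever did during training.
    return dry_run * patience

continuous = lambda n: True                         # every response rewarded
variable_ratio = lambda n: random.random() < 0.1    # roughly 1 in 10 responses rewarded

for name, schedule in [("continuous", continuous), ("variable ratio (~1 in 10)", variable_ratio)]:
    dry = longest_dry_run(schedule)
    print(f"{name:>25}: longest unrewarded run in training = {dry:3d}, "
          f"responses before extinction (toy estimate) ~ {responses_before_giving_up(dry)}")
```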

Applications In Psychology

1. Behavior Modification Therapy

Behavior modification is a set of therapeutic techniques based on operant conditioning (Skinner, 1938, 1953). The main principle comprises changing environmental events that are related to a person’s behavior.

For example, the reinforcement of desired behaviors and ignoring or punishing undesired ones.

This is not as simple as it sounds — always reinforcing desired behavior, for example, is basically bribery.

There are different types of positive reinforcement. Primary reinforcement is when a reward strengthens a behavior by itself.

Secondary reinforcement is when something strengthens a behavior because it leads to a primary reinforcer.

Examples of behavior modification therapy include token economy and behavior shaping.

Token Economy

Token economy is a system in which targeted behaviors are reinforced with tokens (secondary reinforcers) and later exchanged for rewards (primary reinforcers).

Tokens can be in the form of fake money, buttons, poker chips, stickers, etc. While the rewards can range anywhere from snacks to privileges or activities.

For example, teachers use token economy at primary school by giving young children stickers to reward good behavior.

Token economy has been found to be very effective in managing psychiatric patients. However, the patients can become over-reliant on the tokens, making it difficult for them to adjust to society once they leave prison, hospital, etc.

Staff implementing a token economy program have a lot of power. It is important that staff do not favor or ignore certain individuals if the program is to work.

Therefore, staff need to be trained to give tokens fairly and consistently even when there are shift changes such as in prisons or in a psychiatric hospital.
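
In programming terms (an analogy, not part of the original description), a token economy works like a small ledger: target behaviors earn tokens, and tokens are later exchanged for back-up rewards. The behaviors, token values, and prices below are invented for illustration.

```python
# An illustrative token-economy "ledger". The behaviors, token values, and
# reward prices are hypothetical examples, not a clinical protocol.
EARN_RULES = {"completed homework": 2, "tidied room": 1, "helped a peer": 3}
REWARD_MENU = {"sticker": 1, "extra playtime": 5, "choose the class game": 8}

class TokenEconomy:
    def __init__(self):
        self.balance = 0

    def reinforce(self, behavior: str) -> int:
        """Award tokens (secondary reinforcers) immediately after a target behavior."""
        tokens = EARN_RULES.get(behavior, 0)
        self.balance += tokens
        return tokens

    def exchange(self, reward: str) -> bool:
        """Swap tokens for a back-up reward if the balance covers its price."""
        price = REWARD_MENU[reward]
        if self.balance >= price:
            self.balance -= price
            return True
        return False

ledger = TokenEconomy()
ledger.reinforce("completed homework")
ledger.reinforce("helped a peer")
print("Tokens earned:", ledger.balance)                        # 5
print("Exchange granted:", ledger.exchange("extra playtime"))  # True
```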

Behavior Shaping

A further important contribution made by Skinner (1951) is the notion of behavior shaping through successive approximation.

Skinner argues that the principles of operant conditioning can be used to produce extremely complex behavior if rewards and punishments are delivered in such a way as to encourage an organism to move closer and closer to the desired behavior each time.

In shaping, the form of an existing response is gradually changed across successive trials towards a desired target behavior by rewarding closer and closer approximations of it.

To do this, the conditions (or contingencies) required to receive the reward should shift each time the organism moves a step closer to the desired behavior.

According to Skinner, most animal and human behavior (including language) can be explained as a product of this type of successive approximation.
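
The logic of shaping can be sketched as a loop that reinforces any response meeting the current criterion and then tightens the criterion toward the target. All numbers below (starting level, noise, step sizes) are invented purely to illustrate successive approximation.

```python
import random

# Illustrative shaping loop: reward responses that meet the current criterion,
# then raise the criterion toward the target. All values are hypothetical.
target = 50.0          # the desired final behavior (e.g., a full lever press)
performance = 5.0      # the organism's current typical response
criterion = 5.0        # start by rewarding rough approximations

for trial in range(1, 61):
    response = performance + random.uniform(-3, 3)   # natural trial-to-trial variability
    if response >= criterion:
        # Reinforce the approximation: typical performance shifts toward the rewarded response.
        performance += 0.6 * (response - performance) + 1.0
        # Tighten the contingency: the next reward requires a slightly closer approximation.
        criterion = min(target, performance + 1.0)
    if trial % 15 == 0:
        print(f"Trial {trial:2d}: criterion={criterion:5.1f}, typical performance={performance:5.1f}")
```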

2. Educational Applications

In the conventional learning situation, operant conditioning applies largely to issues of class and student management, rather than to learning content. It is very relevant to shaping skill performance.

A simple way to shape behavior is to provide feedback on learner performance, e.g., compliments, approval, encouragement, and affirmation.

A variable-ratio schedule produces the highest response rate for students learning a new task, whereby initial reinforcement (e.g., praise) occurs at frequent intervals, and as performance improves reinforcement occurs less frequently, until eventually only exceptional outcomes are reinforced.

For example, if a teacher wanted to encourage students to answer questions in class they should praise them for every attempt (regardless of whether their answer is correct).

Gradually the teacher will only praise the students when their answer is correct, and over time only exceptional answers will be praised.

Unwanted behaviors, such as tardiness and dominating class discussion, can be extinguished through being ignored by the teacher (rather than being reinforced by having attention drawn to them).

This is not an easy task, as the teacher may appear insincere if he/she thinks too much about the way to behave.

Knowledge of success is also important as it motivates future learning. However, it is important to vary the type of reinforcement given so that the behavior is maintained.

Operant Conditioning vs. Classical Conditioning

Learning Type

While both types of conditioning involve learning, classical conditioning is passive (automatic response to stimuli), while operant conditioning is active (behavior is influenced by consequences).

  • Classical conditioning links an involuntary response with a stimulus. It happens passively on the part of the learner, without rewards or punishments. An example is a dog salivating at the sound of a bell associated with food.
  • Operant conditioning connects voluntary behavior with a consequence. Operant conditioning requires the learner to actively participate and perform some type of action to be rewarded or punished. It’s active, with the learner’s behavior influenced by rewards or punishments. An example is a dog sitting on command to get a treat.

Learning Process

Classical conditioning involves learning through associating stimuli resulting in involuntary responses, while operant conditioning focuses on learning through consequences, shaping voluntary behaviors.

1. Learning by Association (Classical Conditioning):

In learning by association, a person (or animal) learns to associate two stimuli, causing a behavior change. A neutral stimulus is paired with an unconditioned stimulus that naturally triggers a response.

Over time, the person responds to the neutral stimulus as if it were the unconditioned stimulus, even when presented alone. The response is involuntary and automatic.

An example is a dog salivating (response) at the sound of a bell (neutral stimulus) after it has been repeatedly paired with food (unconditioned stimulus).

2. Learning by Consequences (Operant Conditioning):

In learning by consequences, behavior is learned based on its outcomes or consequences. The learner is active, and the response is voluntary.

Behavior followed by pleasant consequences (rewards) is more likely to be repeated, while behavior followed by unpleasant consequences (punishments) is less likely to be repeated.

For instance, if a child gets praised (pleasant consequence) for cleaning their room (behavior), they’re more likely to clean their room in the future.

Conversely, if they get scolded (unpleasant consequence) for not doing their homework, they’re more likely to complete it next time to avoid the scolding.

Timing of Stimulus & Response

The timing of the response relative to the stimulus differs between classical and operant conditioning:

1. Classical Conditioning (response after the stimulus):

In this form of conditioning, the response occurs after the stimulus.

The behavior (response) is determined by what precedes it (stimulus). 

For example, in Pavlov’s classic experiment, the dogs started to salivate (response) after they heard the bell (stimulus) because they associated it with food.

2. Operant Conditioning (response before the stimulus):

In this form of conditioning, the response generally occurs before the consequence (which acts as the stimulus for future behavior).

The behavior is influenced by what follows it (the consequence) and, with experience, by the anticipation of that consequence.

It is a more active form of learning, where behaviors are reinforced or punished, thus influencing their likelihood of repetition.

For example, a child might behave well (behavior) in anticipation of a reward (consequence), or avoid a certain behavior to prevent a potential punishment.

Summary

Looking at Skinner’s classic studies on pigeons’ and rats’ behavior, we can identify some of the major assumptions of the behaviorist approach.

• Psychology should be seen as a science, to be studied in a scientific manner. Skinner’s study of behavior in rats was conducted under carefully controlled laboratory conditions.

• Behaviorism is primarily concerned with observable behavior, as opposed to internal events like thinking and emotion. Note that Skinner did not say that the rats learned to press a lever because they wanted food. He instead concentrated on describing the easily observed behavior that the rats acquired.

• The major influence on human behavior is learning from our environment. In the Skinner study, because food followed a particular behavior the rats learned to repeat that behavior, e.g., operant conditioning.

• There is little difference between the learning that takes place in humans and that in other animals. Therefore research (e.g., operant conditioning) can be carried out on animals (Rats / Pigeons) as well as on humans. Skinner proposed that the way humans learn behavior is much the same as the way the rats learned to press a lever.

So, if your layperson’s idea of psychology has always been of people in laboratories wearing white coats and watching hapless rats try to negotiate mazes to get to their dinner, then you are probably thinking of behavioral psychology.

Behaviorism and its offshoots tend to be among the most scientific of the psychological perspectives. The emphasis of behavioral psychology is on how we learn to behave in certain ways.

We are all constantly learning new behaviors and how to modify our existing behavior.

Behavioral psychology is the psychological approach that focuses on how this learning takes place.

Critical Evaluation

While operant conditioning explains a broad range of learned behaviors, it does not fully capture internal thought processes, inherent predispositions, or the social aspects of learning.

Below are key criticisms emphasizing these cognitive considerations:

1. Insight and “Aha!” Moments

Kohler (1924) conducted experiments with chimpanzees in which the animals solved problems, such as retrieving food placed out of reach, in sudden bursts of insight rather than through incremental trial-and-error reinforcement.

In one example, a chimp stacked boxes or combined tools to reach a banana hanging overhead, seemingly grasping the solution spontaneously rather than randomly stumbling upon it.

These observations suggest that learning can involve cognitive restructuring and problem-solving, elements that a purely operant framework does not fully address.

2. Observational Learning

Social Learning Theory (Bandura, 1977) posits that individuals can learn new behaviors simply by watching others, without experiencing direct reinforcement or punishment themselves.

For instance, in Bandura’s famous Bobo Doll experiment, children who watched adults behave aggressively toward a doll later imitated those actions.

Such modeling underscores cognitive factors like attention, memory, and motivation—all of which go beyond the scope of traditional operant conditioning.

By focusing on observation and imitation, Bandura’s theory reveals a more social dimension to how we learn, emphasizing that reinforcement is not the only driver of behavioral change.

3. Linguistic and Innate Structures

Noam Chomsky famously challenged behaviorist explanations of language acquisition by arguing that humans possess innate linguistic capacities—often referred to as a “language acquisition device.”

This critique highlights that children around the world rapidly master grammar and syntax in ways that can’t be solely explained by reinforcement.

Instead, the presence of universal grammar rules suggests a built-in framework guiding language development.

Such arguments point to the possibility that operant conditioning alone cannot account for the speed and complexity of acquiring language, signaling that inherited cognitive structures play a crucial role.

4. Overjustification Effect

In situations where individuals are repeatedly rewarded for engaging in tasks they might already find inherently enjoyable, they can experience the overjustification effect, leading to a decline in intrinsic motivation (Lepper, 1983).

For example, a child who loves drawing may draw less often once they start receiving a reward (such as candy or praise) for doing so—once the reward is removed, the child’s motivation to draw can diminish significantly.

This indicates that while extrinsic reinforcers can shape behavior in the short term, they sometimes undermine the internal satisfaction or meaning that originally drove the activity.

5. Animal-to-Human Extrapolation

Because much of operant conditioning research involves animal subjects—rats, pigeons, and other species—critics caution against generalizing these findings directly to humans.

Human cognition is shaped by complex language, self-awareness, culture, and nuanced emotional processes, which might not be captured in animal experiments.

While basic principles of reinforcement and punishment can indeed transfer across species, the depth and variability of human thought suggest that our behaviors may not always align with the straightforward conditioning patterns observed in lab animals.

Practical Applications

Operant conditioning helps explain a wide range of behaviors, including learning, addiction, and language acquisition, and it can be applied in settings such as classrooms, prisons, and psychiatric hospitals (e.g., token economies).

Researchers have also developed innovative ways to use operant conditioning for improving health and changing habits:

  • Stroke Rehabilitation (Kumar et al., 2019)
    By integrating virtual reality (VR), patients were rewarded with stars for shifting weight onto their weakened limb, reinforcing greater use of that limb during recovery.

  • Smoking Cessation (Dallery et al., 2017)
    Participants earned vouchers exchangeable for goods and services by reducing cigarette use, achieving higher rates of long-term abstinence through consistent reinforcement.

  • Exercise and Healthy Eating (Michie et al., 2009)
    Repeated reinforcement encourages the formation of habits; for example, someone might earn TV time for every 10 minutes of exercise or allow themselves a small treat for sticking to a nutritious meal plan.

  • Gamified Habit Tracking (Eckerstorfer et al., 2019)
    Apps like Habitica award points or virtual rewards for completing real-life tasks, effectively reinforcing positive behaviors.

  • ADHD and OCD Management (Rosén et al., 2018; Twohig et al., 2018)
    Rewarding focus and attention in children with ADHD can improve concentration, while reinforcing patients with OCD for resisting compulsions can reduce obsessive behaviors.

Ethical Implications

While operant conditioning is widely used in education, therapy, business, and technology, its ethical implications raise important concerns.

Critics argue that certain applications, particularly those involving punishment, animal research, and extrinsic reward, may have unintended negative consequences. 

1. The Use of Punishment: Fear, Aggression, and Emotional Harm

Punishment is often used to reduce undesirable behaviors, but it can lead to unintended emotional and psychological consequences.

Positive punishment (e.g., scolding, fines, physical punishment) can generate fear and anxiety rather than teaching constructive alternatives.

Similarly, negative punishment (e.g., loss of privileges) may create resentment rather than behavioral improvement.

  • Example: In a workplace, fining employees for late arrivals may increase stress rather than improve punctuality.
  • Potential Consequences: Punished behaviors may only be suppressed rather than eliminated, resurfacing in different contexts. Additionally, punishment can increase aggression, particularly if individuals learn that aggressive responses are a means of exerting control.
  • Ethical Debate: Some psychologists argue that punishment should be minimized or replaced with positive reinforcement to encourage desired behaviors instead of merely discouraging negative ones.

2. Ethical Concerns in Animal Research

Much of the foundational research on operant conditioning, particularly by B.F. Skinner, was conducted on animals, such as rats and pigeons in controlled lab environments. This raises two key ethical concerns:

  1. Treatment of Animals – Some experiments involved food deprivation, electric shocks, and confinement, raising concerns about the ethical treatment of research animals.
  2. Extrapolation to Humans – Critics argue that findings from animal studies cannot always be directly applied to human behavior, given the complexity of human cognition, emotions, and social influences.
  • Example: Skinner’s use of electric shocks in avoidance learning experiments has been criticized for causing distress to animals.
  • Ethical Debate: Should research that involves distressing procedures for animals be justified by potential human benefits? Today, research guidelines enforce stricter ethical standards, but past studies remain controversial.

3. Manipulation and Loss of Autonomy Through Rewards

A major ethical concern with operant conditioning is the potential for behavioral manipulation, particularly when individuals are unaware that they are being conditioned.

  • Extrinsic vs. Intrinsic Motivation – Over-reliance on external rewards (money, grades, promotions) can reduce intrinsic motivation, known as the overjustification effect. When rewards are removed, individuals may lose interest in previously enjoyable activities.

  • Unethical Use in Business & Technology – Companies use operant conditioning in advertising, workplace incentives, and social media engagement loops to subtly influence behavior.

  • Examples:

    • Workplace Manipulation: Employers may use performance-based bonuses to encourage overworking without considering employee well-being.
    • Social Media Addiction: Platforms like Instagram and TikTok use intermittent reinforcement (random “likes” and notifications) to keep users engaged, creating compulsive usage patterns.
    • Consumer Behavior: Loyalty programs (e.g., rewards points, discounts) reinforce purchasing habits, sometimes encouraging unnecessary spending.
  • Ethical Debate: When does conditioning cross the line into coercion? Are companies and institutions ethically responsible for ensuring that reinforcement does not lead to addiction, burnout, or financial exploitation?

4. The Ethics of Using Operant Conditioning in Education & Therapy

In settings where operant conditioning is used to shape behavior—such as classrooms, therapy, and mental health treatment—ethical considerations become particularly important.

  • Token Economies in Schools & Psychiatric Hospitals: Reward-based behavior systems can be effective, but some argue that they oversimplify human motivation and may not foster genuine personal growth.
  • Discipline Methods: Should children and patients be conditioned to behave in ways that primarily serve institutional convenience rather than their own well-being?
  • Long-Term Effects: While reinforcement strategies can improve behavior in the short term, reliance on external rewards may not foster self-discipline or critical thinking skills.

Conclusion: Balancing Effectiveness with Ethics

While operant conditioning is a powerful tool for shaping behavior, it must be applied with caution.

Minimizing harm, promoting autonomy, and ensuring that reinforcement strategies align with ethical principles are essential when implementing operant conditioning in real-world settings.

Encouraging transparency, informed consent, and a balance between extrinsic and intrinsic motivation can help ensure ethical applications of behavior modification techniques.

References

  • Bandura, A. (1977). Social learning theory. Englewood Cliffs, NJ: Prentice Hall.
  • Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT Press.
  • Dallery, J., Meredith, S., & Glenn, I. M. (2017). A deposit contract method to deliver abstinence reinforcement for cigarette smoking. Journal of Applied Behavior Analysis, 50(2), 234–248.
  • Eckerstorfer, L., Tanzer, N. K., Vogrincic-Haselbacher, C., Kedia, G., Brohmer, H., Dinslaken, I., & Corbasson, R. (2019). Key elements of mHealth interventions to successfully increase physical activity: Meta-regression. JMIR mHealth and uHealth, 7(11), e12100.
  • Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
  • Kohler, W. (1924). The mentality of apes. London: Routledge & Kegan Paul.
  • Kumar, D., Sinha, N., Dutta, A., & Lahiri, U. (2019). Virtual reality-based balance training system augmented with operant conditioning paradigm. Biomedical Engineering Online, 18(1), 1-23.
  • Lepper, M. R. (1983). Extrinsic reward and intrinsic motivation: Implications for the classroom. In J. M. Levine and M. C. Wang (Eds.), Teacher and student perceptions: Implications for learning (pp. 281–317). Hillsdale, NJ: Erlbaum.
  • Michie, S., Abraham, C., Whittington, C., McAteer, J., & Gupta, S. (2009). Effective techniques in healthy eating and physical activity interventions: A meta-regression. Health Psychology, 28(6), 690–701.
  • Rosén, E., Westerlund, J., Rolseth, V., Johnson R. M., Viken Fusen, A., Årmann, E., Ommundsen, R., Lunde, L.-K., Ulleberg, P., Daae Zachrisson, H., & Jahnsen, H. (2018). Effects of QbTest-guided ADHD treatment: A randomized controlled trial. European Child & Adolescent Psychiatry, 27(4), 447–459.
  • Schunk, D. (2016). Learning theories: An educational perspective. Pearson.
  • Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century.
  • Skinner, B. F. (1948). ‘Superstition’ in the pigeon. Journal of Experimental Psychology, 38, 168-172.
  • Skinner, B. F. (1951). How to teach animals. Freeman.
  • Skinner, B. F. (1953). Science and human behavior. Macmillan.
  • Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. Psychological Monographs: General and Applied, 2(4), i-109.
  • Twohig, M. P., Whittal, M. L., Cox, J. M., & Gunter, R. (2010). An initial investigation into the processes of change in ACT, CT, and ERP for OCD. International Journal of Behavioral Consultation and Therapy, 6(2), 67–83.
  • Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20, 158–177.

Citation

McLeod, S. (2025, March 17). Operant conditioning: What it is, how it works, and examples. Simply Psychology. https://www.simplypsychology.org/operant-conditioning.html

Olivia Guy-Evans, MSc

BSc (Hons) Psychology, MSc Psychology of Education

Associate Editor for Simply Psychology

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.
