A Guide to Overcoming Plateaus

The cognitive science behind growth ruts, and how to break through to the next level

Jun 11, 2024

The map is not the territory
- Alfred Korzybski

Have you plateaued in your profession, skill, or hobby? Have you noticed that you’ve stopped growing, and aren’t making progress like you used to, even though you continue to put in effort and time? Do you feel like you’ve hit some kind of a ceiling to your competence? Do you find yourself in a creative rut, treading the same ground over and over again?

Don’t worry, you’re not alone. This happens to be the default state for most of us. We spend much of our lives toiling away in Sisyphean plateaus, struggling to make progress. As an inveterate dabbler with way too many interests and hobbies for my own good, I keep getting stuck in multiple plateaus, and find myself wishing I had a magic formula to make progress again.

The good news is: it is possible to fast-forward your way (somewhat) through plateaus. In this essay we will explore plateaus from a few different perspectives (cognitive science, psychology, science of expertise, machine learning & AI). With these lenses and intuitions in hand, we will compile a large number of tricks, hacks, and techniques to avoid and break out of plateaus.

You can jump straight to the list of 70+ tricks, but getting some intuition for why they work might help you modify or design your own tricks for your particular circumstances.

In the next few sections we will explore what plateaus are and we will talk about the nature of expertise and understanding - both legible and illegible. We will look at the “illusion of explanatory depth”, examine the nature of bottlenecks, and find analogies in machine learning & AI. I even reminisce about a course I taught decades ago that is partly the inspiration for this piece. Then we get to all the tricks. This is a self-contained piece but it may also help to read my previous essay on Effortless Mastery before or after this.

What is a Plateau?

Mastering any field turns out to be an unending series of hills and plateaus. Hills are where practice/effort leads to improvement in performance. Plateaus are where practice/effort results in little to no improvement. In fact, not only do you stop making progress when you’re in a plateau, you also get the feeling that you’re becoming worse. This is because improvements in your skill lag behind your “taste” or the ability to discern different and higher levels of skill. This false perception destroys any remaining motivation to continue your practice.

This is why most people end up quitting when they reach one of the many plateaus on their journey to expertise.

Credit: Marc Dalessio captures the unending sequence of hills, plateaus, and dips in this chart of his own journey as a painter

Seeing Plateaus Everywhere

Why do plateaus exist? What aspects of our cognition force our learning to always follow a sequence of steep hills followed by plateaus? How do some people overcome plateaus more easily than others? And where else can we see this pattern?

It turns out once you start looking for plateaus, you find them everywhere.

There seems to be some kind of natural law of change/progress across domains where long periods of stasis are interrupted by rapid bursts of innovation and change. This is not only true at the level of individual growth and mastery, but it also seems to be a staple of how change happens in cultures and societies, in science, technology, and the arts, and in entire fields and disciplines.

Take for example the confluence of events and conditions that led to the explosive flourishing of the arts in Renaissance Florence after centuries of stagnation, or the Scientific Revolution of the 16th & 17th centuries. The literary and artistic upheaval in Paris of the Années Folles, the birth of Rock & Roll or any major genre of music, and the genesis of pretty much everything of significance around us. They all showed up in the form of fast-paced change after long periods of stagnation.

We see this phenomenon even in evolutionary biology - in the “Punctuated Equilibrium” of the evolution of species. It turns out species don’t evolve gradually and continuously over time. Instead, species tend to remain relatively stable for long periods of time, with little or no evolutionary change (equilibrium), punctuated by brief periods of rapid evolutionary change.

This funny bizarre image is GPT-4’s hallucination of the evolution of species (the prompt asked for a textbook-style illustration of punctuated equilibrium)

There are various theories exploring why this might be the case across domains - from Kuhn’s Paradigm Shifts to Gould’s Punctuated Equilibria. From the concept of Emergence in Complex Systems to the Disruptive Innovation seen in Christensen’s notion of the Innovator’s Dilemma.

While these theories aim to explain specific examples (for instance, Kuhn focuses on Scientific Revolutions), there seems to be something more fundamental and structural going on. Anywhere you see progress and change in natural or complex systems - the basic shape of it seems to be familiar - the dreaded plateau is ever present and rarely is improvement continuous and gradual.

Perhaps one day we will have a grand unified theory of progress that explains all plateaus at a basic structural or paradigmatic level, but our job in this essay is to focus on plateaus in individual growth.

Some Intuitions about Plateaus

A model is a lie that helps you see the truth.
— Howard Skipper

The basic intuition behind plateaus is simple: there is a bottleneck in your learning process preventing further growth. The straightforward solution is to address the bottleneck directly. But… if only things were that simple. More often than not, it’s not easy to figure out where/what the bottleneck is.

Even the psychological factors causing plateaus ultimately boil down to a lack of progress. These psychological reasons might include things like limiting beliefs, complacency, boredom, insufficient challenge, fear of failure, perfectionism, burnout, lack of recognition, lack of meaning, lack of accountability, etc. But if you think about it, most of these are countered by sufficient motivation.

How do you get motivated? Recall that intrinsic motivation has 3 key drivers: Purpose, Autonomy, and Growth. We assume that you’ve got purpose and autonomy covered, otherwise you wouldn’t have started on your journey to mastery (of course, sometimes you have to revisit your “why?” or purpose). The main reason most people lose their motivation is that they stop progressing - that is, they lose growth, the third pillar. Growth of the right kind: i.e., not too hard, but also not so easy that it gets boring. So, ultimately, making progress again is the best way to get your mojo back.

The vast majority of plateau-hopping tricks turn out to be techniques to unblock the brain’s learning process by introducing different ways for the brain to connect the dots. Most of these tricks are about varying and changing things up: whether it is modifying the practice routine, the learning content itself, or the learning modality. They may require leveraging other senses, changing the feedback system, changing the environment, taking adequate rest, and so on.

It seems obvious that if the same old practice routine stopped working, then maybe we should change it up. The question then becomes, why did it stop working, and what should we change?

A Personal Story and an A-ha Moment

Over twenty years ago, when I was a graduate student doing research in AI and machine learning, I found a job as a part-time faculty at my university’s Evening College. Due to some lucky serendipity I got the opportunity to design and teach an experimental course called “Creative Problem Solving”.

Loosely inspired by George Polya’s seminal work, “How to Solve It”, the course tried to provide students returning to college after a long gap with a list of tools, tricks, and heuristics to tackle the kind of problems they might encounter in college-level math courses.

My students ranged in ages from late-teen college kids to senior retirees in their 60s and 70s. Over a few iterations of the course, I watched them using the techniques I taught them to tackle and solve unfamiliar problems. I used this real-world feedback to modify and add to the list of tricks, ending up with about two dozen tricks and cognitive hacks.

In teaching them, I learned what kinds of cognitive aids and tricks could turn the blank stare of non-comprehension into the glimmer of hope that in front of them was a solvable problem. What kinds of tricks unblocked their thinking and helped them make the leaps it took to understanding, and then maybe even to solving the problem.

This direct hands-on experience gave me fresh ideas and perspectives for my own grad school research (which involved creating AI algorithms that could learn from data). I started to see analogies and found new intuitions for how machines could learn and solve problems. In a serendipitous way I had found the perfect cross-training I needed for my AI journey. Incidentally, cross-training in a related field is one of the best ways to avoid or fast-forward through plateaus.

One of my most interesting observations was the simple and universal power of reframing and restating problems in more familiar terms and using your own words to explain things to yourself. Also, reframing problems using contradictions and exaggerations, finding symmetries, extremities, opposites, and analogies. Restating the problem in the form of stories, explaining it to someone else using their vocabulary, etc. And how leveraging other senses, with their own distinctive learning and thinking modalities, had the ability to unblock System 2 & System 1 cognitive processes in the brain.

For example, I saw first hand how powerful it was to simply ask my students to draw a picture representing the problem. Simple line diagrams and stick figures worked magic across ages and experience levels.

The Power of Drawing a Picture

The picture drawing trick was especially interesting because everyone had a different way to visualize the same problem but it almost always unlocked something in their thought process. Even if they couldn’t solve it, they had a better handle, a better grasp on what the problem meant, moving them significantly closer to the solution.

“Just start drawing. Simply put pen to paper and sketch out your understanding of the problem, however simplistic you think it is. And then do it again and again,” I would tell them. And sure enough, after a few tries it would begin to click.

Drawing pictures engages a different part of the brain, the visual system, which possesses different “algorithms” for pattern recognition, understanding, and problem solving, with its own built-in biases, approaches, and foundational puzzle pieces. It’s not just about “looking” at a pre-drawn picture; the superpower is in the very act of drawing. Drawing not only unlocks a fresh perspective in other parts of the brain but, like writing, is a terrific and underutilized tool for thinking - a computational surface for imagineering, for connecting the dots in new and different ways.

Leonardo da Vinci, breaking down complex ideas until they are simple: Da Vinci’s drawing of his idea for a hygrometer in one of his journals

Leonardo Da Vinci, for instance, made an estimated 28,000 drawings in his little notebooks to help him think through, understand, design, and invent in fields ranging from painting and art to architecture, engineering design, biology, war machines, philosophy, and so much more. You can literally see his mental gears spinning as he sketched out his ideas. Or, more accurately, the magic of sketching helped shape his ideas and make his conceptual breakthroughs.

Or consider the drawings and pictorial graphics in Dostoevsky’s notebooks and manuscripts. They seemed to have helped him ideate and think about his characters and settings. Perhaps doodling served as a kind of inner creative monologue for Dostoevsky as he worked on such novels as Crime and Punishment, The Idiot, The Demons, and The Karamazov Brothers.

Similarly, Feynman diagrams are likely the most powerful example of how the act of drawing pictures can vastly simplify the understanding of extremely complex topics. Before Feynman, the mathematical expressions required to describe sub-atomic particle interactions in physics were incredibly arcane, complex, and difficult to interpret.

Feynman's diagrams offered a new way to represent these interactions graphically, making it significantly easier for physicists to understand and manipulate the underlying mathematical expressions. They not only led to many a-ha moments of understanding for students and physicists, but also created new intuitions that led to new Nobel Prize-winning discoveries.

Perspective Shifts Lead to Creative Leaps

The term I coined in my mind was “Perspective Shifts” because most of the tricks I devised or taught for Creative Problem Solving were about forcing your brain to look at things from different perspectives. And to do that required varying things - the input data, the frame of reference, the thinking modality, the sensory modality, the analogy, etc. We often do this instinctively, but systematizing it into a set of tricks or hacks allows us to use this power more consistently.

As you may have guessed, my experience compiling and teaching the list of tricks for Creative Problem Solving more than two decades ago is partly the inspiration for this piece, and for the list of tricks at the end of this piece.

But first, let’s examine what expertise is, the nature of bottlenecks, legible and illegible understanding, and take a few other fun detours.

How do Bottlenecks Occur?

Expertise as Deeper & Broader Understanding

You can think of learning a new concept or skill as building a mental representation (or model) of it. The brain learns to represent a concept by connecting the new concept/skill to existing, older concepts, modifying existing concepts to fit with (and therefore create) the new concept. All of understanding is a set of connections/relationships between concepts.

In fact, through synaptic plasticity, the neural network in the brain literally gets rewired as it is exposed to, and learns, new things. The dense synaptic connections between neurons associated with the concepts get strengthened or weakened. New synaptic connections get created and others get pruned.

Over long periods of time, parts of the brain can even grow physically larger to accommodate all these new connections (often at the expense of other parts of the brain). For example, MRI studies of London taxicab drivers show that their posterior hippocampi (the area of the brain responsible for spatial memory and navigation) become significantly larger over the course of their 3-4 year training, presumably to accommodate the knowledge necessary to navigate the labyrinthine tangle of 25,000 streets that cross each other within a 10km radius of Charing Cross station.

Illustration: aerial view of the labyrinthine jumble of streets around Charing Cross in London

Gaining expertise is (by definition) the process of acquiring better, richer, deeper, and broader understandings (or models) of the required concepts and skills.

An understanding of something is richer or better or deeper to the extent that it is useful - i.e., predictive of reality, of the unseen future, thereby giving us the ability to manipulate reality according to our goals. An understanding of something is a mental model of that thing, and so that classic saying about models can be rephrased to apply to understanding as well: all understandings are wrong, but some are useful.

Legible and Illegible Understandings

Understanding isn’t just an analytical or intellectual thing. A lot of our understanding hides beneath the covers, in the unconscious brain, and is encoded in our automatic habits, in our muscle memory, our “gut-feelings”, emotions, intuitions, instincts, and pattern-matching.

Borrowing a pair of terms from James Scott, our mental models or understandings can either be “legible” or they can be “illegible”. For example, if you learn the physics of projectile motion from a textbook then you have a legible understanding of it. You can use a pen and paper with this understanding to calculate the trajectory of a basketball in flight and precisely predict where it will land.
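To make the legible version concrete, here is a minimal sketch of that pen-and-paper calculation in Python. It assumes idealized projectile motion (no air resistance, no spin), and the function name and the sample numbers are my own illustrative choices, not anything from a particular textbook.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def landing_distance(speed, angle_deg, release_height, target_height):
    """Horizontal distance at which the ball descends to target_height.

    Idealized projectile motion: ignores air resistance and spin.
    Returns None if the ball never gets as high as target_height.
    """
    angle = math.radians(angle_deg)
    vx = speed * math.cos(angle)  # horizontal velocity, constant in flight
    vy = speed * math.sin(angle)  # initial vertical velocity

    # Height over time: release_height + vy*t - 0.5*G*t^2 = target_height.
    # Take the later root of this quadratic, i.e. the ball on its way down.
    discriminant = vy**2 - 2 * G * (target_height - release_height)
    if discriminant < 0:
        return None
    t = (vy + math.sqrt(discriminant)) / G
    return vx * t


# Hypothetical shot: released at 2.0 m, aimed at a 3.05 m high hoop.
distance = landing_distance(speed=7.5, angle_deg=50,
                            release_height=2.0, target_height=3.05)
```

Every quantity here is explicit and inspectable, which is exactly what makes this kind of understanding legible.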

On the other hand, if you practice throwing a basketball and are able to land it in the hoop consistently, then the hand-eye coordination and neuromuscular control you’ve acquired in the process is an example of an illegible understanding. You cannot explain exactly how you’re able to land the ball in the basket. You just do it, without conscious, analytical thought.

Simply by adjusting the synaptic connections between neurons in your brain and associated nervous systems, you have implicitly learned to model the kinematics and dynamics of flight. The connections between abstract concepts - how to grasp a ball, how to move and accelerate your arm, how to jump, how far from you is the hoop as deduced through stereo vision, etc - as modeled by the underlying neural connections give you the sensory and muscular coordination to throw a ball accurately. You have learned a useful model of reality - as encoded in the synaptic connections of the neurons - sufficient for you to hit the target.

Illustration: a stick figure throwing a basketball into a hoop

But your brain’s model of reality (and its connection to motor control) is nowhere near the model that a pro athlete, say Steph Curry, has encoded in his synapses through thousands of hours of deliberate and active learning practice. And even Steph Curry’s unconscious mental representation is nowhere close to base reality.

Everything we seem to do on auto-pilot - from juggling to swimming - also relies on useful yet hidden models of reality. That is, unconscious understandings or illegible understandings.

Intuitions and instincts are also examples of illegible understandings. Almost all concepts & understandings have a legible side as well as an illegible side. These are related to, but not the same as, System 1 (Fast) and System 2 (Slow) thinking. They are related in the sense that System 1, because it is unconscious, likely utilizes illegible understandings to drive its cognition, whereas the reflective, deliberative System 2 utilizes legible understandings to do its thinking and extrapolations.

Experts, like everyone else, seamlessly use a mix of both legible and illegible understandings of the elements of their fields in practice. Note that, as we saw in our previous essay, the “effortlessness” in Effortless Mastery comes from pushing a lot of the System 2 thinking down the brain stack to the automatic System 1. That is, by connecting the legible and illegible understandings of the world so that you don’t have to spend a lot of effort in thinking. It just comes naturally and effortlessly and automatically.

So in this framework, “muscle memory” is just an illegible understanding/model of the world - connecting muscular control in the presence of sensory feedback with motion through space and time. If you can play a piece of music on the guitar, it is due to an illegible understanding of how the fine-motor movements in your fingers (mediated by your senses) translate into sounds you expect to hear. If you can consistently return a serve in tennis, it is because of the illegible understanding you possess of the visuomotor coordination and muscular control, and how it relates to the legible intent of where you want to place the ball.

Illegible understanding is part of the reason why it’s so hard to find the bottleneck that prevents you from reaching the ‘a-ha’ moment. The bulk of your understanding (aka model of reality) is hidden from your conscious awareness.

In this essay I will use the terms understanding (legible or otherwise), models, and representations interchangeably, almost to a fault, just to make this point.

Bottlenecks are Limitations to Gaining Understanding

Bottlenecks in learning (of the kind that cause plateaus) occur because the new concepts/understandings, whether legible or illegible, aren’t quite ready to fit nicely into the existing tapestry of concepts and associations. This can be due to a number of reasons. If concepts depend on each other, there may be a missing link, or the foundational concepts may be weak/incorrect to begin with.

The most extreme version of this is where a critical and foundational skill is missing. This happens a lot in childhood where we’re still building foundational understandings which stack on top of each other.

For example, Jean Piaget's Three Mountain Task is a classic experiment in developmental psychology that assesses a child's ability to understand and consider perspectives different from their own. In this experiment you show a child a model of three mountains of different heights - say from left to right the mountains are short, medium, and tall - and ask the child to draw the mountains. Now place a doll on the opposite side of the mountains and ask the child how the doll might draw the mountains from its perspective.

Children under 4 typically struggle with this task because they lack the foundational understanding that others may see the world differently due to their spatial viewpoint. Children between 4 and 7 understand that the doll might see the mountains differently but they might still struggle to actually imagine what the doll might see or to draw the mountains from the doll’s perspective. Only after the age of 7 are children consistently able to accurately describe the doll's viewpoint, demonstrating the development of spatial perspective-taking skills.

This is an extreme form of a plateau where the understanding required to draw from the doll’s perspective is bottlenecked by severe gaps in the child’s foundational understanding of the world, beginning with their lack of “theory of mind”. No amount of practicing drawing, not even 10,000 hours, is sufficient to fill that gap without the missing link - i.e., the understanding that others might see things differently from a different view point.

Thankfully for adults, most bottlenecks are neither this severe nor this foundational.

The Lazy, Satisficing Brain

Bottlenecks occur because of foundational and other gaps as we saw with the 3 Mountain Task, but also because the brain tries to cheat all the time. It tries to “satisfice” and find cheap and dirty understandings that produce instant gratification but may not generalize fully. These cheap, low quality models/understandings come in handy in the moment, but when you want to get to the next level, the conceptual language or framework they provide proves inadequate and sometimes leads you in the wrong direction.

This is because understanding something deeply and broadly can be very expensive and effortful for the brain. It is expensive in terms of energy expenditure (the brain uses 20% of our energy despite being only 2% of our body weight). And it can be effortful because you often have to engage System 2 Thinking, that is, to pay focused attention during learning. The new model of reality we learn may also require us to rewire many old/inferior understandings. To restitch the fabric of connections that make up our tangled web of meaning takes effort and cuts into the body’s energy budgeting.

The tricks the brain uses to “satisfice” or find cheaper ways to do the job range from simply memorizing things to coming up with the simplest, dirtiest, and cheapest possible superficial understandings, even if they don’t explain the full picture, while avoiding and glossing over anything that’s not required in that moment.

Illustration: fluffy cumulus clouds scattered across a clear blue sky

For example, predicting rain when you see tall clouds in the sky is a type of useful, but shallow understanding of weather. For most people, this is sufficient. But it doesn’t always work. When you reach the next level of understanding, you may notice that despite the size of the clouds, it rarely rains when the tops of clouds are shaped like cauliflower florets.

And still you don’t know exactly why this happens. The next level of understanding is reached when you realize that floret tops are an indication that ice is not yet forming on the top of the clouds and that ice crystals on the higher levels are necessary for precipitation. Whereas, if the tops of the clouds lose their definition and become wispy, and if the bottoms begin to bulge or become ruffled, it means ice is forming at the highest levels, and the weather may deteriorate rapidly. But this still isn’t the final understanding of clouds. As you may have guessed, there is no final understanding.

In most cases, there is no additional utility in going any deeper and understanding the physics of cloud formation and precipitation. All the brain wants, if it can get away with it, is a simple thumb-rule: floret tops = no rain.

There are other reasons why the brain stays shallow - and those have to do with procedural reasons and psychological (e.g. motivational) reasons. Memories are consolidated and models are integrated during sleep and periods of rest. Sometimes, we just need to give the unconscious time to work.

We overestimate our understanding

Nearly two decades ago cognitive psychologist Rebecca Lawson gave about 200 people a simple, partially completed drawing of a bicycle and asked them to fill it in with the pedals, chain, brakes, and other functional elements. Most completions looked like this:

Image Credit: The Science of Cycology by Rebecca Lawson...

Despite bicycles being familiar everyday objects and most participants either having one or knowing how to ride one, Lawson observed, “It seems that many people have virtually no understanding of how bicycles work.” Moreover, the participants greatly overestimated their understanding of how bicycles work and felt highly confident in their sketches. This is a vivid instance of what psychologists Leonid Rozenblit and Frank Keil called the “Illusion of Explanatory Depth”. Most of our understandings are like this. And unless we challenge them, they will remain as such.

This is also what Piaget noticed and incorporated into his theory of Constructivism - that knowledge acquisition starts with shallow memorization. The way to create deeper understanding, Piaget found, was through active learning: structured and focused thinking about the topic, repeated exposure, and leveraging direct and consequential feedback - all of which, by the way, are the hallmarks of Deliberate Practice.

As we saw in our previous essay on Effortless Mastery, this pattern of going from shallow memorization to generalizable understanding also shows up in AI, specifically in deep artificial neural networks, in the phenomenon of grokking.

The Plurality and Non-Linearity of Understanding

While it seems like the process of understanding should ideally be a linear, structured path - from shallow to deep, and building on solid foundations - it doesn’t work like that. In reality much of understanding comes in non-linear leaps of imagination and logic. A lot more like the grokking we see in artificial neural networks.

Also, leaps of insight come through a plurality of methods and often without the solid foundations one might presume is required. As Feyerabend argues in his book, Against Method, some of the greatest insights and breakthroughs have come despite eschewing method.

During the height of the Scientific Revolution, the great Descartes, who strongly believed in rigorous foundations, theoretical grounding, and systematic approaches, became highly critical of the intuitive and empirical leaps, the non-methods, of his contemporary, the equally great or perhaps greater Galileo.

Descartes wrote, rather aghast: “It seems to me that Galileo suffers greatly from continual digression, and that he does not stop to explain all that is relevant at each point; which shows that he has not examined them in order; and that he has merely sought reasons for particular effects, without having considered… first causes… and thus that he has built without a foundation”

And yet, Galileo’s leaps have become mankind’s leaps. The true shape of the scientific revolution is that of creative and wild leaps of imagination and possibility.

Illustration: Galileo Galilei looking through a telescope in an early 17th-century observatory

No wonder that Albert Einstein, unlike Descartes, was critical of the traditional inductive model of science - the notion that scientific knowledge is built gradually through the accumulation of empirical observations and the generalization of laws from them. Instead, Einstein argued that major scientific breakthroughs often occur through leaps of intuition, imagination, and creative thinking, even from aesthetic considerations, rather than through a linear process of incremental improvement.

In his Autobiographical Notes, Einstein writes, “I really could have gotten a sound mathematical education. However, I worked most of the time in the physical laboratory, fascinated by the direct contact with experience.”

“Direct contact with experience” - remember this phrase as we get into the tricks later (there’s a reason why language learning apps like Duolingo only get you to a certain level before leveling off).

So while we speak of foundational gaps and structured deliberate practice, and conceptual bottlenecks, it is important to recognize that there are many ways to get around these bottlenecks and our brain has several thinking, reasoning, pattern-matching, and understanding modalities, very few of which are linear, rational, deductive processes.

Creative Plateaus vs Learning Plateaus

While we won’t get into Creative Plateaus in this essay, I see a lot of similarities in both why they occur and also how you can get out of creative ruts.

The “a-ha” moment of understanding, of connecting the dots, is at its core a creative moment. Creativity itself is the act of seeing and therefore making new connections between old concepts, thereby generating new ideas.

It’s a form of generating a novel model/understanding of reality resulting in the creation of artifacts (whether it is a scientific theory, a song, a painting, or a dance). It’s the ability to take the concept of gold and the concept of a mountain and write a song about a golden mountain, even if such a thing doesn’t exist.

Furthermore, the act of creativity is itself a journey of transformation. Each thing you create ends up changing you in some small way, by updating your mental model of reality to include the new thing you just created and the associated remaking of the fabric of meaning that it is now part of.

Plateaus in Machine Learning & AI

Speaking of the universality of plateaus, in machine learning too, AI models often get stuck in learning plateaus where they can stubbornly remain until something helps them break out and reach the next level of understanding (a more accurate model of reality).

Getting stuck in a “local minimum” is one example of a machine learning plateau. There are other plateau modes in AI, such as “overfitting”, which is akin to memorization or shallow understanding.

Even more interestingly, the techniques & principles used to get these AI models out of their plateaus are eerily similar to how we humans get out of plateaus.

Here’s a simplification of the basic idea behind one form of machine learning: Let’s take a deep ANN (Artificial Neural Network), which is basically layers of densely connected “artificial neurons”. The ANN very roughly simulates the neural structure of the brain. The “connections” between neurons, which are somewhat analogous to biological synapses, consist of simple numerical weights that simulate synaptic strength by giving lesser or greater importance to each connection. By simply adjusting the weights (i.e., synaptic connection strengths), the neural network can be made to do all kinds of things that seem intelligent. Such networks can learn to recognize cats in images, drive cars, compose music, write poems, predict disease, answer questions, and so on, all from seeing patterns in the data.

Now imagine hundreds of billions of neurons and hundreds of trillions of connections!

For example, you can train a neural network to recognize cats in pictures by simply showing it millions of labeled images, some of which have cats and some of which don’t, and adjusting its neural synapse weights appropriately whenever it makes an error until it can accurately identify which images contain cats and which don’t.

The most commonly used machine learning technique to adjust the synapse weights is called Gradient Descent. We literally follow a gradient down a mathematical hill made up of the errors in the ANN’s output (prediction of the label “cat”).

You can see this represented as hills and valleys and plateaus in the image below. Each point in the landscape is a model (or understanding) of cats, as represented by the collection of synapse weights making up the neural net. A hill is where the error in recognition of cats is large and a valley is where the error is small. By calculating the mathematical derivative to find the slope of the error at each learning instance, the algorithm knows how to adjust each weight. Thus slowly the neural network learns to model an understanding of what makes a cat a cat, and its error in recognizing cats is minimized.
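To make the downhill walk concrete, here is a minimal Python sketch of gradient descent on a single weight. The bowl-shaped error curve, the learning rate, and the function names are all assumptions invented for illustration; a real network adjusts a vast number of weights at once, but each update has the same basic shape.

```python
def error(w):
    # Toy error curve with one valley at w = 3. This is an assumed stand-in
    # for the network's recognition error as a function of a single weight.
    return (w - 3.0) ** 2


def slope(w):
    # Derivative of error(w): tells us which way is downhill and how steep.
    return 2.0 * (w - 3.0)


def gradient_descent(w, learning_rate=0.1, steps=100):
    for _ in range(steps):
        # Move the weight a small step against the slope, i.e. downhill.
        w = w - learning_rate * slope(w)
    return w


w_final = gradient_descent(w=-5.0)
print(w_final, error(w_final))
```

Starting far from the valley at w = -5, the weight creeps toward 3, where the error is smallest. Each step shrinks the remaining distance by a fixed fraction, which is why learning is fast at first and slows as the valley floor flattens out.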

Image Credit: https://medium.com/analytics-vidhya/journey-of-gradient-descent-from-local-to-global-c851eba3d367

The “global minimum” of errors is essentially the deepest valley. It is the best possible understanding of how a cat is represented in an image subject to the available data and the model architecture, etc. However, there are many local minima possible which are all the little dips and smaller valleys in the error landscape. The models representing these dips might make some accurate identifications, but they are dead-ends and traps: they are sub-optimal understandings of what cats are.
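A small sketch can show how the same downhill rule ends in different valleys depending on where it starts. The lopsided two-valley curve below is an invented stand-in for the error landscape (it is not derived from any real model), and the step size and names are assumptions.

```python
def error(x):
    # Invented landscape: a shallow dip near x = 0.96 (a local minimum)
    # and a deeper valley near x = -1.04 (the global minimum).
    return x**4 - 2 * x**2 + 0.3 * x


def slope(x):
    # Derivative of error(x).
    return 4 * x**3 - 4 * x + 0.3


def descend(x, learning_rate=0.01, steps=2000):
    for _ in range(steps):
        x -= learning_rate * slope(x)
    return x


from_right = descend(2.0)    # starts on the right-hand slope
from_left = descend(-2.0)    # starts on the left-hand slope
print(from_right, error(from_right))
print(from_left, error(from_left))
```

The run that starts on the right settles in the shallow dip and stays there, because every nearby step looks uphill. The run that starts on the left finds the deeper valley. Neither run "knows" which kind of valley it is in; from the inside, both simply look like the bottom.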

The Satisficing Algorithm

Just like the brain, the artificial neural network (or more accurately, the learning algorithm) is also prone to cheat and satisfice.

It tries to latch on to the simplest explanation (i.e., understanding of cats) that can help it reduce the error. For example, if, through some quirk in your data, several of the cat images had a visible label reading “cat” on them, it would simply learn to recognize the label instead of the cat itself. If most of the cats were of a particular color, say brown, it might simply cheat by looking for the color brown and call it a day. If a number of the non-cat images were dogs, and it’s easier to learn what a dog looks like, then it might learn to recognize dogs instead of cats and say that anything that isn’t a dog is a cat. Or if the majority of your cats are photographed sitting in a particular pose, then it might cheat and look only for that pose. And so on. Of course, these examples are a bit of a simplification of what local minima are and conflate a few other concepts, but they serve the purpose for our analogy.
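The "brown means cat" cheat can be sketched in a few lines. Everything here is hypothetical: the two hand-picked features, the tiny datasets, and the crude "learner" that keeps whichever single feature scores best on the training data. Real networks learn from pixels rather than labeled features, so treat this as an illustration of the failure mode, not of how training works.

```python
# Each image is reduced to (is_brown, has_pointed_ears, is_cat).
train = [
    (True,  True,  True),
    (True,  True,  True),
    (True,  False, True),    # a brown cat with folded ears
    (False, False, False),
    (False, True,  False),   # a grey dog with pointed ears
    (False, False, False),
]
test = [
    (False, True,  True),    # a white cat
    (True,  False, False),   # a brown dog
]

FEATURES = {"is_brown": 0, "has_pointed_ears": 1}


def accuracy(feature_index, rows):
    # Rule under test: "it is a cat exactly when this feature is present".
    hits = sum(1 for row in rows if row[feature_index] == row[2])
    return hits / len(rows)


# The lazy learner keeps the single feature that best explains the training set.
best = max(FEATURES, key=lambda name: accuracy(FEATURES[name], train))
print(best, accuracy(FEATURES[best], train), accuracy(FEATURES[best], test))
```

Because every training cat happens to be brown, color explains the training set perfectly and the learner has no reason to look further. The rule then fails on both new animals, which is the shallow-understanding trap in miniature.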

Each of these cheats is a local, shallow valley in the error landscape. Imagine that in training, a ball (representing our current model/understanding in the landscape of all possible understandings of cats) is rolling down the error hill and getting trapped in a shallow pit, never reaching the deepest part of the landscape, which contains the best possible understanding of cats.

Illustration: a grid of nine cat photos, each showing a different cat in a distinct pose, in the style of an image-recognition training set.

Avoiding the trap of local minima requires us to do a few things. One of the things machine learning researchers do is add some randomness to the process, or even some mathematical “momentum” to the ball, so that it rolls past small dips but settles in deeper valleys. Momentum is just a mathematical device, but the analogy in human learning is to not be satisfied by shallow explanations and to keep persisting, by asking “why?” again and again.
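Here is a hedged sketch of the momentum idea on an invented two-valley error curve. The curve, the step size, the momentum factor of 0.9, and all names are assumptions chosen so the effect is visible. Momentum does not guarantee finding the deepest valley in general; with these particular settings it happens to carry the ball past the shallow dip.

```python
def slope(x):
    # Derivative of an invented error curve, x**4 - 2*x**2 + 0.3*x, which has
    # a shallow dip near x = 0.96 and a deeper valley near x = -1.04.
    return 4 * x**3 - 4 * x + 0.3


def plain_descent(x, learning_rate=0.01, steps=500):
    for _ in range(steps):
        x -= learning_rate * slope(x)
    return x


def momentum_descent(x, learning_rate=0.01, beta=0.9, steps=500):
    velocity = 0.0
    for _ in range(steps):
        # Keep most of the previous velocity, then add the new downhill push.
        velocity = beta * velocity - learning_rate * slope(x)
        x += velocity
    return x


stuck = plain_descent(2.0)
escaped = momentum_descent(2.0)
print(stuck, escaped)
```

Both runs start from the same point on the right-hand slope. The plain version stops in the shallow dip at roughly 0.96. The momentum version arrives with enough speed to roll over the small ridge, loses energy in the deeper valley, and settles at roughly -1.04.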

If you extend this analogy of a model landscape as a search space across different learning algorithms and subsets of data and features, then you will see the same pattern of shallow traps. And the way to avoid plateaus is through variation - more and different data distributions to learn from, different algorithms & models, different initial parameters, randomizing features and samples, and so on.

In fact, one of the most successful techniques in machine learning is to create an ensemble of models, each of them trained on a different, often randomized, set of features, data, and other parameters so that they all see different “perspectives” of reality. (The mixture-of-experts architecture is a related idea.) Combining these different “understandings” of reality gives the overall ensemble a more accurate prediction of reality. Even GPT-4 is widely reported, though not officially confirmed, to use a mixture-of-experts design.
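A toy majority vote shows why combining flawed perspectives can help. The three "models" below are hand-written single-cue rules (the same kind of cheats described earlier), and the six animals are made up so that each rule fails on different examples. This is an assumption-laden illustration of voting, not a trained ensemble.

```python
animals = [
    # (is_brown, is_sitting, has_pointed_ears, is_cat)
    (True,  True,  True,  True),
    (False, True,  True,  True),
    (True,  False, True,  True),
    (True,  False, False, False),
    (False, True,  False, False),
    (False, False, True,  False),
]


def by_color(animal):
    return animal[0]   # "brown things are cats"


def by_pose(animal):
    return animal[1]   # "sitting things are cats"


def by_ears(animal):
    return animal[2]   # "pointy-eared things are cats"


MODELS = [by_color, by_pose, by_ears]


def ensemble(animal):
    # Majority vote: call it a cat if at least two of the three rules agree.
    votes = sum(model(animal) for model in MODELS)
    return votes >= 2


def accuracy(predict):
    return sum(predict(a) == a[3] for a in animals) / len(animals)


for predict in MODELS + [ensemble]:
    print(predict.__name__, accuracy(predict))
```

Each rule on its own gets some animals wrong, but because their mistakes fall on different animals, the vote outperforms every individual rule on this toy set. The benefit depends on the models being wrong in different ways, which is why ensembles are usually trained on varied data and features.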

Our brain doesn’t use Gradient Descent to learn, and it seems to have multiple learning and modeling modalities or “algorithms” (such as the conscious ability to search a space of possibilities), so it is very different from the simple artificial neural network I described above. Even so, it’s striking that the way ANNs plateau, and the techniques we use to help them recover, are so similar to our own.

The Bag of Plateau Hopping Tricks

Illustration: a whimsical, patched 'bag of tricks', slightly open, with magical items peeking out.

OK, so finally, we come to our handy compilation of hacks, tricks, and heuristics to break out of plateaus. Most of these are staples of Deliberate Practice but skewed towards plateau hopping. Some are more applicable to certain types of skills/fields than others, but hopefully they will serve as inspiration for you to devise your own.

Pre-work (If you can identify the bottleneck)

Of course, if you can actually identify your bottleneck you should try to directly address it first - either by creating the appropriate drills for it, or seeking the appropriate course/material/teacher/coach. For example, if you’re playing a complex piece of music, breaking it down into smaller pieces or phrases, slowing it down, and creating drills to practice each section (or learning the associated scales) is a well known technique. If you’re learning the basics of machine learning and your bottleneck is a limited understanding of linear algebra, then directly address it by studying linear algebra to unlock your next level. Or if poor grip strength is preventing you from lifting heavier, you know you should add exercises to increase your grip strength first.

If you can’t identify the bottlenecks or weakest links, the following tricks might help. I’ve organized them into 3 major categories attacking each of the primary pillars of plateauing: 1. Tricks for Perspective Shifting, 2. Hacking our psychology (addresses lazy brain), and 3. Meta-learning hacks (addresses peculiarities of our learning modalities).

I. Tricks to Shift Your Perspective

The following tricks are all about helping the brain see the problem from different angles and shaking it out of its local minima.

Cross-Training: This is probably the single most effective trick at consistently providing perspective leaps. Take up a complementary yet different skill - like a taekwondo practitioner playing soccer, a programmer learning design, a musician improving their sense of groove by taking dance classes, a graphic designer or painter learning photography, a blues purist learning jazz idioms, a literary fiction writer experimenting with genre fiction or poetry, and so on.
Meditation: Starting a Mindfulness Practice helps in a number of ways: it improves the ability to sustain focus (critical for deliberate practice), helps with relaxation and reducing anxiety and stress, opens up mental space for creativity and new ideas and approaches to emerge, cultivates patience and persistence, improves mind-body connection, harnesses the power and hidden intelligence of your emotions and feelings, increases self-awareness, and so much more. Get into the habit of short mindfulness breaks between practice sessions.
Find your Next Teacher: Every teacher brings their own unique background, teaching style, and areas of expertise to the learning process, and sometimes a fresh voice or approach can be just what we need to jumpstart our progress and deepen our understanding. When we work with the same teacher for an extended period, we become accustomed to their way of explaining concepts, demonstrating techniques, or providing feedback, leading to complacency and stagnation. By seeking out a different teacher, we expose ourselves to new ways of thinking about and approaching our skill, which can challenge our assumptions, fill in gaps in our knowledge, and expand our repertoire of strategies and techniques. Finding a different teacher doesn't necessarily mean abandoning our current mentor or learning relationship altogether. Often, the ideal scenario is to work with multiple teachers concurrently or in rotation, allowing us to benefit from a diversity of perspectives and approaches.
Teach Someone Else: There’s nothing more humbling and perspective shifting than trying to teach what you think you know to a peer or someone else with a slightly different background or experience, especially if you’re teaching at the edge of your expertise. When you teach, you are forced to clarify your thoughts, organize your knowledge, and find ways to communicate complex ideas in a clear and accessible manner. It helps identify gaps in your own understanding, prompting you to research and learn more to fill those gaps. It also exposes you to different questions, perspectives, and challenges. It can also be motivating and rewarding, adding fuel to your journey. Writing out and publishing/blogging your ideas is a version of this.
Find a Younger Mentor (Reverse Mentoring): There’s a reason why older professors who surround themselves with younger, less experienced graduate students in their research labs have more breakthroughs. Partner with a younger or less experienced practitioner to exchange skills and knowledge, teaching them the fundamentals while seeing the world through their fresh, uncorrupted eyes and learning newer methods/technologies or fresh perspectives in return. Be open to them challenging your assumptions about the "right" way to do things.
Feedback/Perspective Diversity: Actively seek out alternative viewpoints and honest critiques from a diverse range of sources, including experts, novices, clients, or even skeptics, to gain a well-rounded understanding of your strengths and weaknesses.
Create Concrete/Toy Examples: Do this for every new concept you learn. Feynman was legendary at this, always making up toy examples from everyday life to make theoretical concepts more concrete and accessible. For example, to explain gravity and spacetime curvature, Feynman used the analogy of a rubber sheet. He would imagine placing objects on a stretched rubber sheet, causing it to curve and create a visual representation of how mass affects the curvature of spacetime.
Draw a Picture - Create a Visual Aid: The act of drawing out the concept (or pieces of it) in the form of a picture, or creating visual aids yourself, is an excellent way to force the brain to find ways around plateaus. Think about Leonardo da Vinci’s 28,000 sketches and drawings that helped him think about and innovate in all the various fields that interested him - from engineering and architecture to philosophy and painting. Or the drawings and pictorial graphics in Dostoevsky’s notebooks and manuscripts that helped him ideate and think about his characters and settings. Clearly, drawing and writing are incredible thinking tools and great for enabling leaps of imagination. Some examples: if you’re a beginning musician, draw out the circle of fifths or create a paper model of it and play with it. If you’re a writer blocked on writing a chapter or scene, try to sketch it out as a sequence of comic book panels with word balloon dialogue. Draw, draw, draw as many pictures as you can, however simple or terrible they end up being. Think of sketching as a gymnasium for your mind.
Multimodal Learning & Expression: Similar to drawing a picture, use other senses and learning modalities. Even simply turning off a sense helps you see differently (for example listen to music with your eyes closed, or practice with your non-dominant hand). Try listening to complex topics via audio-books rather than reading them. Try building a physical model to demonstrate a complex system or process. Watch movies with subtitles to learn a new language. Use a different form of expression: when we dance, sing, or draw, we often enter a state of flow or spontaneity that can bypass our conscious, analytical mind and allow us to access deeper insights or make more intuitive leaps.
Analogical Reasoning: Similar to the toy problems above, seek out analogies and parallels between the skill you're learning and other related domains, especially things you understand better.
Mental Rehearsal: Use visualization and mental rehearsal techniques to practice skills in your mind, reinforcing neural pathways and building confidence. Examples include visualizing yourself delivering a speech with clarity, confidence, and enthusiasm, or visualizing yourself executing the perfect golf swing, focusing on the details of your stance, grip, backswing, and follow-through.
Collaborative Teaching: Partner with a peer to co-teach a workshop or seminar on the skill, taking turns explaining concepts and demonstrating techniques to challenge each other and expand your repertoire.
Gain Direct Real-World Experience: There is a reason why most people who learn languages through apps like Duolingo get a sense of progress but never actually master the language. There is no substitute for the kind of real-time feedback and learning involved in direct experience, so find opportunities to apply your skill in a real-world context. Some examples: actually engage in conversations with native speakers (immersion), perform science experiments by hand (a la Einstein), play your music in front of an audience (even family and friends), learn programming by actually building an app, volunteer to design a logo for a local non-profit, etc.
Experiment with Unconventional Approaches: Try unorthodox or new-to-you techniques, approaches, tools, or technologies to see if they spark new progress. Example: a painter using sponges and leaves instead of brushes, a guitarist using an unconventional tuning, a novelist with writer’s block writing in a completely different genre or style forcing them to think outside their comfort zone and potentially sparking new ideas for their original project. Think about what Van Gogh did with his experimentation.
Thought Experiments: Conduct thought experiments by imagining hypothetical scenarios, such as "What if I had unlimited resources?" or "How would I approach this problem if I were an expert in a different field?".
Aesthetic Considerations: Explore the aesthetic dimensions of your skill or domain, such as the beauty, elegance, or harmony of a particular technique or solution, trying to transcend purely functional considerations. Emphasize form in different ways - minimalism, baroque ornamentation, eye candy, etc.
Environmental Variation: Regularly change your practice environment or context, such as training in different locations, conditions, times, or with diverse partners. For example, if you’re a writer who generally works from home, try to spend a day writing in a bustling café or a quiet park. If you use a certain IDE for programming, try a different editor. Changing your environment doesn't always require a dramatic shift – even small changes like rearranging your workspace, or exploring a different section of your local library can provide novel stimuli and shake up your routine.
Variable Practice and Routine: Introduce variability into your practice routine: change the order of exercises, use different equipment/instruments/tools, vary the intensity of practice between sessions, practice at different times of day, practice with your non-dominant hand or foot, practice when you’re fatigued, etc.
Empty Your Cup: Approach the skill with a beginner's mindset, letting go of preconceptions and assumptions to stay open to new ideas and feedback, even if you're an experienced practitioner. What if everything you believed about a practice was wrong and you had to start from scratch?
Constraint-Based Creativity: Impose creative constraints on your practice. For example, improvise using only 4 notes of the scale. Use only a limited color palette in your painting. Take pictures only in B&W. Use only short sentences in your writing. Try to make medical diagnoses without looking at a few of the clinical test results. Etc.
Cross-Industry Coaching: Seek advice from a successful professional in a related but different industry.
Problem Inversion: Reframe the problem or challenge you're facing by inverting the goal or constraints, such as aiming to minimize errors instead of maximizing performance, to gain new insights and approaches.
Environmental Priming: Change your practice environment to simulate real-world conditions or to introduce novel stimuli, like practicing a speech in a noisy café or studying in a museum surrounded by inspiring art.

II. Hacking your Psychology & the Lazy Brain

The tricks below try to address the second pillar - i.e., the loss of extrinsic and intrinsic motivation, limiting beliefs, boredom, insufficient challenge, complacency, fear of failure, perfectionism, burnout, lack of accountability, etc.

Growth Mindset Shift: Embrace plateaus as learning opportunities. Reframe them as periods of consolidation before significant jumps in skill. Append “yet” to “I can’t do this” so it becomes “I can’t do this yet”. Celebrate effort over outcomes, and relax and enjoy the journey. Look for techniques including mindfulness to silence your inner-critic and reframe negative self-talk.
Mindset Priming: Prime your mindset before practice sessions using techniques like affirmations, gratitude, or inspiration to cultivate a positive, growth-oriented attitude.
Revisit your “Why”: Revisiting the original reasons that made you want to master the skill can reignite your passion when progress feels stuck.
Reward yourself: celebrate achievements, no matter how small. For example: if you’re a photographer, treat yourself to a new lens or a photography book after completing a project.
Set Smaller & More Manageable Goals: Break your main goal into smaller, manageable tasks. For example, if improving at chess, focus on mastering specific openings one at a time.
Revisit your role models: Read biographies to see how your idols overcame similar challenges in their own journeys.
Find an accountability and celebration partner: Share your goals and progress with someone who will hold you accountable and who will force you to celebrate the small wins.
Progress Tracking: Break down the skill into micro-goals and create a visual progress tracker, like a habit tracker or skill tree. Use this to celebrate small wins.
Leverage Your Peak States: Identify where you naturally excel and channel that to conquer new challenges. Explore ways to leverage and amplify your unique strengths, talents, or peak states to create motivation and momentum.
Expectation Adjustment: Be open to modifying unrealistic goals or expectations that may be hindering your progress, adopting new, somewhat challenging objectives that align with your current skills and aspirations, providing a renewed sense of direction.
Expectation Experimentation: Deliberately shake up your expectations by setting audacious goals, trying new roles or positions, or exploring alternative domains or styles within your skill area.
Micro-Goal Setting: Break down skills into smaller, achievable goals to maintain a sense of progress.
Practice in Short Bursts: Set strict time limits with a timer for a short practice session (e.g., 15 minutes) and focus intensely on a specific aspect of the skill to maintain enthusiasm and prevent burnout. Don’t exceed the time limit even if you feel like continuing. Your mind is wary, don’t let it suspect you’re trying to trick it into forming a habit.
Skill Gamification: Design a game or challenge around the skill, with levels, rewards, and competition, to make practice more engaging and motivating. This works especially well with peers and friends. Be curious and treat learning like play.
Deliberate Discomfort: Embrace discomfort by regularly pushing yourself outside your comfort zone, setting challenges that stretch your abilities and build resilience for failure.
Identity Sculpting: Cultivate a strong identity around continuous learning and improvement, using self-affirmations and visualization to reinforce this mindset. It’s psychologically superior to say “I am the kind of musician who loves to practice every day” than to say “I have to practice every day to become a better musician”.
Set a performance deadline: Create a sense of urgency by committing to a showcase or demonstration of your skills.
Habit Stacking: Integrate short skill practice sessions into your daily routine by stacking them onto existing habits, like practicing guitar for 10 minutes after brushing your teeth or reviewing flashcards during your commute.

III. Meta-Learning Hacks

Most of these hacks are typical of deliberate practice and try to use what we know about the peculiarities of the brain’s learning systems.

Spaced Repetition and Scheduled Review: The brain’s memory and consolidation systems retain new concepts better when they are reviewed at spaced intervals (starting with short gaps and slowly increasing the time between sessions) than when they are crammed into a single sitting. Tools like Anki are great for assimilating large quantities of new knowledge using spaced repetition techniques. The actual sessions themselves can be short. So for example, instead of practicing a musical phrase 50 times in a single session, practice it 10 times each session but spread it out over 5 sessions throughout the day.
Feedback Loops & Self-Analysis: Establish tight feedback loops in your practice, using techniques like real-time data tracking or video analysis to get immediate, actionable insights. Record yourself, dissect your performance, and seek detailed critique from an instructor or mentor. This is also valuable in understanding potential causes of the plateau. Regularly record your practice sessions and review them to identify areas for improvement. Create specific drills for those areas that need improvement.
Mini Test Runs: Leverage real-world feedback on a small scale. Even great comedians regularly test their material in small clubs and use the feedback to refine their shows.
Deliberate Rest and Strategic Recovery: Plan regular rest and recovery periods into your practice schedule, using techniques like active rest, massage, power naps, yoga, or meditation to prevent burnout and optimize performance. Much of our memory consolidation and model integration happens during periods of rest. Make sure you get adequate sleep.
Take a Longer Break: Take a week or even a month off from time to time. You could use this time to get into your cross-training skill or just take your mind off things.
Walking Reflection: There is something magical about how walks and showers trigger hidden processes in the brain. Take a walk or a break in nature between practice sessions to reflect on your progress, generate new ideas, and return to practice with renewed focus and energy.
Deliberate Experimentation: Conduct targeted experiments in your practice sessions, testing different techniques, strategies, or equipment to identify areas for improvement and optimize your approach.
Calculated Risk-Taking: Embrace small, calculated risks in your practice or performance, such as trying a new technique or approach.
Deliberate Disruption: Intentionally step outside your comfort zone and disrupt your usual flow by introducing novelty, complexity, or unpredictability into your practice routine.
Targeted Feedback: Invest in a single coaching session with a top expert in the field, focusing on a specific aspect of the skill or challenge you're facing to get targeted feedback and guidance.
Feedback and Collaboration: Actively seek out alternative perspectives and honest critiques from others, leveraging their insights and expertise to identify blind spots and refine your approach.
Self-Generated Problem Sets: Use active learning by creating sets of problems, tasks, or challenges for you to solve or that demonstrate your ability or test your comprehension.
Try Immediate Recall: Read a section and then try to write down everything you recall on a blank piece of paper. Even better, try to draw it out, using diagrams, flowcharts, and other visual ways to represent what you read.
Notes as Questions: Instead of simply taking notes, rephrase the material as questions to be answered later. Example: instead of writing down that the Aeolian mode starts with the 6th degree of the major scale, write in your notes: “What mode starts with the 6th degree of the major scale?”
Expert Deconstruction: Choose a top performer in the field and break down their technique, studying their methods, decisions, and thought processes to extract key insights and strategies.
Micro-Summaries: Summarize key concepts or techniques of the skill in a series of tweets or short messages to distill your understanding and identify areas for further exploration.
Project-Based Learning: Always learn major new concepts by applying them in projects. For example, if you’re learning about databases, build a small CRUD app that leverages them.
Flash Presentations: Challenge yourself to explain a complex concept or technique in a 5-minute lecture or presentation, using clear examples and analogies to communicate efficiently and effectively.
Mental Contrasting: Visualize your ideal future performance and contrast it with your current reality to identify gaps and generate motivation for improvement.
Interleaved Practice: Alternate between different skills or sub-skills during practice sessions to improve your ability to switch tasks and apply knowledge in varied contexts.
Journey Mapping: Map out your ideal path to mastery of the skill, identifying key milestones, challenges, and resources along the way to create a personalized learning journey.
Metacognitive Reflection: Regularly reflect on your learning process and progress, identifying patterns, obstacles, and opportunities for optimization.
Failure Analysis: Treat failures and setbacks as valuable data points, conducting a thorough analysis of what went wrong, what you learned, and how you can apply those insights to improve future performance.
Collaborative Challenge: Partner with a peer or mentor to take on a challenging project or competition that pushes both of you to level up your skills and learn from each other.
Skill Deconstruction: Break down a complex skill into its core components and practice each component separately, gradually combining them to master the full skill.
Performance Simulation: Create realistic simulations of high-pressure performance situations, like mock competitions or presentations, to build confidence and identify areas for improvement.
Time-Blocking Techniques: Use time management techniques like the Pomodoro method, where you alternate focused work sessions with short breaks, to maintain motivation and avoid burnout during intensive practice.
Grease the Groove / Integrate Learning into Daily Life: Find ways to make the skill a natural part of your day, such as changing language settings on your devices to the language you’re learning, or hang a pull-up bar over a door and use it every time you pass through the door, or do a plank while waiting for the coffee to brew.
Overkill in Practice: Learn above the required skill level, or practice beyond perfect. For example, play a musical piece at 110% of the required tempo so that on stage it becomes super easy to play at regular tempo.
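As a small aside on the Spaced Repetition trick above, the idea of starting with short gaps and slowly lengthening them can be sketched as a schedule. The doubling rule, the one-day first gap, and the function name are assumptions for illustration only; tools like Anki adjust each interval based on how well you recalled the item rather than following a fixed formula.

```python
from datetime import date, timedelta


def review_dates(start, first_gap_days=1, growth=2, reviews=5):
    """Return review dates whose gaps grow by a fixed factor each time."""
    dates = []
    gap = first_gap_days
    current = start
    for _ in range(reviews):
        current = current + timedelta(days=gap)
        dates.append(current)
        gap *= growth   # hypothetical rule: double the wait after each review
    return dates


schedule = review_dates(date(2024, 6, 11))
for day in schedule:
    print(day.isoformat())
```

With these defaults the gaps are 1, 2, 4, 8, and 16 days, so five short reviews cover a month of retention. The same shape works at smaller scales too, such as minutes or hours within a single practice day.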

Whew. And that’s a wrap. Mastery is a Hero’s Journey and few of us are brave enough and lucky enough to be able to persist in the face of all the hurdles on the way. I hope this collection of tricks is useful in your journey, and wish you luck crossing those pesky plateaus - you will surely encounter many!

Shapes of Things
