With the fall of expert systems in the late 1980s, pessimism in AI ran rampant.

The state of AI research receded back into what’s dubbed the "AI Winter," a period from 1974 to 1993 in which pessimism in research led to pessimism in the press, ultimately ending in slashed funding and collapses in the industry, for which expert systems were but a brief respite.

So what happened?

Problems with Expert Systems

If you tune into AI discussions today, one of the problems you’ll find to be rampant is interpretability. Neural networks — currently the most hyped AI architecture — are a black box. Explaining how they arrive at their results is a feat in itself[1], and some of the most impressive recent papers in AI are those that find new methods to do so.

That’s not a problem with expert systems. Expert systems are a white box: they’re completely transparent. Every single rule in an expert system was specified by an expert — a human — who knew what they were doing.

That expert systems are a white box is definitely a pro.

But it’s also their greatest weakness.

That expert systems are a white box means that every single rule in them must be specified by a person.
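To make that concrete, here’s a toy sketch in Python of what a hand-written rule base looks like. The scenario, rule names, and conditions are invented purely for illustration and don’t come from any real system, but every line of this kind of logic has to come from a person:

```python
# A hypothetical fragment of an expert system's rule base.
# Every rule below is something a human expert had to write by hand;
# the scenario and conditions are invented for illustration only.

def diagnose(symptoms: set) -> str:
    # Rule 1: written by the expert after interviewing mechanics.
    if "engine_wont_start" in symptoms and "clicking_noise" in symptoms:
        return "dead battery"
    # Rule 2: added later, when the first rule misfired on some cases.
    if "engine_wont_start" in symptoms and "fuel_gauge_empty" in symptoms:
        return "out of fuel"
    # ...and so on, rule after rule, for every case anyone thought of.
    return "unknown: no rule matched"

print(diagnose({"engine_wont_start", "clicking_noise"}))  # -> dead battery
```

Every new edge case means another human-authored rule, and every new rule is another chance for rules to conflict.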

What does that mean for problems like computer vision? What kind of rules can tell you if you’re looking at a lemon or a tractor-trailer?

What does that mean for problems like language processing? What kind of rules do you need to explain that "the spirit is willing, but the flesh is weak" shouldn’t be translated into "the vodka is good, but the meat is rotten"?

And just because a system is transparent doesn’t mean that it’s easily understood. Source code is entirely transparent, with absolutely defined behaviours — and yet, software bugs are a fact of life.

Maintaining an expert system and fixing its bugs is at least as difficult as debugging ordinary software, and every time it needs an update, more rules will be added and more bugs introduced. By the early nineties, R1 had proved too expensive to maintain, as unusual inputs resulted in grotesque outputs.

Refining an expert system is simply the process of adding more rules. The more refined an expert system, the greater the proliferation of rules, the lower the maintainability, and the higher the cost of supporting it.

The fundamental problem with expert systems is their nature as a white box: every part of one is designed and implemented by a human. Expert systems may provide short-term gains, but as maintenance times lengthen, so do the costs.

The whole reason expert systems were built is that human labour is vastly more expensive than computation. And yet, building and maintaining an expert system requires extensive human labour. Expert systems are self-defeating.

A Program that Learns to Play Checkers

Since the beginning of code compilation, one little experiment had been trucking along, gliding under the radar of the general public.

In the late 1940s, IBM engineer Arthur Samuel began an interesting side project: writing a program that learns to play checkers. Samuel often tested his program on IBM 701 computers that were still on the production line.

After a while, it turned out that his program was able to do something remarkable: it could beat all but the best checkers players in the world without fail, something that an expert system could never achieve[2].

What’s interesting here is his approach. An expert system for playing checkers would be a program that plays checkers. But Samuel wrote a program that learns to play checkers.

Samuel called his approach by a term that we’re all very familiar with today: machine learning. His program played games of checkers against itself over and over, learning optimal strategies[3] until the best of them prevailed.

It took a while to find a way to structure his learning paradigm so that it could learn this well, and the project itself was eventually shut down after then-IBM-CEO Thomas J. Watson, Jr., felt that the public would feel threatened by the prospect of intelligent machines. But the success of machine learning planted a seed, one that began to sprout and grow explosively after the decline of the expert system in the nineties, and that would, arguably, become responsible for a bubble of its own.
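As a rough illustration of the general idea (self-play plus a tunable heuristic), here’s a toy sketch in Python. It is not Samuel’s actual program: the game (a simple race to 10), the features, and the update rule are all invented for illustration, and the real checkers player relied on lookahead search and a far more carefully engineered evaluation function.

```python
import random

# Toy sketch of self-play learning with a tunable heuristic.
# The game: players alternately add 1 or 2 to a running total;
# whoever reaches exactly 10 wins. Everything here is illustrative.

WEIGHTS = [0.0, 0.0, 0.0]  # one weight per hand-picked feature

def features(total):
    # Hand-picked features of a position: which residue class (mod 3)
    # the running total falls into, as a one-hot vector.
    return [1.0 if total % 3 == r else 0.0 for r in range(3)]

def score(total):
    # Linear heuristic: how promising the position looks for the player to move.
    return sum(w * f for w, f in zip(WEIGHTS, features(total)))

def choose_move(total, explore=0.1):
    # Greedy self-play: pick the move that leaves the opponent in the
    # worst-looking position, with a little randomness for exploration.
    moves = [m for m in (1, 2) if total + m <= 10]
    if random.random() < explore:
        return random.choice(moves)
    return min(moves, key=lambda m: score(total + m))

def play_and_learn(games=5000, rate=0.05):
    for _ in range(games):
        total, player, history = 0, 0, []
        while total < 10:
            history.append((player, total))
            total += choose_move(total)
            player = 1 - player
        winner = 1 - player  # whoever just moved onto 10 wins
        for mover, position in history:
            target = 1.0 if mover == winner else -1.0
            error = target - score(position)
            for i, f in enumerate(features(position)):
                WEIGHTS[i] += rate * error * f  # nudge the heuristic toward the outcome

play_and_learn()
print(WEIGHTS)  # the learned weight for each residue class of the total
```

The important part isn’t the particular game or features: nobody writes rules for good play. The program plays itself and adjusts its own heuristic based on how its games turn out.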

Something interesting emerges when you compare the definitions of intelligence and learning:

intelligence

n. the ability to acquire and apply knowledge and skills.

learning

ger. to acquire knowledge or skills.

Notice any similarities?

With that in mind, it’s time to leave behind expert systems in pursuit of making machines that can learn.


1. don’t worry, we’ll explain that in a later lecture
2. an expert system would perform at best as well as the best player in the world; considering the number of rules and strategies that could go into playing checkers, it would have horrible bugs; and considering that it’d be quite deterministic, it’d be quite easy to exploit
3. 'optimality' was measured by a heuristic of Samuel’s design; we’ll see how this becomes important later.