If you’ve found your way to this blog, I have no doubt you’re aware of Alan Turing. In 1950, the father of modern computer science created a test designed to determine a machine’s ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human. Turing described his test as “The Imitation Game” within his paper, “Computing Machinery and Intelligence”. The concept has since become more commonly known as the Turing test.
Turing’s idea was simple: at some point a machine may give such convincing, human-like answers that we cannot tell the difference between man and machine. For example, if the human tester asks, “How did you like the game yesterday?”, and the test subject answers, “Oh, you mean the basketball game? I wasn’t watching – I’m not really a basketball fan.”, the answer may be judged by most observers to be human-like.
This is in stark contrast to the answers provided by today’s intelligent assistants (Siri, Google Now, Cortana, Alexa and the like). For the purposes of this article, I asked one of these assistants that exact question: “How did you like the game yesterday?”. The result was an internet search topped with links to the TV show Game of Thrones. Clearly, this response isn’t human. Setting aside the fact that very few human responses to a question would consist exclusively of a web search, the machine is still not smart enough to grasp cultural and linguistic context and respond as a human assistant would.
Measuring your AI with the Turing test
There are several interpretations of the test, so we’ll concern ourselves only with the so-called “standard interpretation”. Player C, the assessor, must determine which player – A or B – is a machine and which is a human. The assessor can interact only through written questions and responses, eliminating visual and auditory cues that might otherwise aid the assessment.
The foundation of Turing’s proposal is that only a human can judge whether the intelligence of a machine is satisfactorily human-like. Because the observer remains “blind”, this is, in some respects, quite a scientific approach.
The bottom line is that the Turing test only assesses whether a machine has reached the level of human intelligence. The test is not designed to determine how far the machine has progressed towards that goal. Are we halfway there? The Turing test cannot tell you.
Failings of the Turing test
One reason that AI developers choose to ignore the Turing test is that it is practically impossible to pass. Theoretically, the test has no set time limit, and is failed the moment a machine reveals a single sign of not being human. Under these conditions, it is unlikely that any machine will ever pass the test.
Further, it is possible for machines to be detected not by being dumb, but by being too intelligent. There are many things that machines do better than humans, and this can be used to trick the AI into revealing itself. For example, if you ask someone to recite pi to fifty digits, only a human of rarefied genius would answer correctly and at speed. Therefore, to pass the Turing test, a machine would need to generate human errors.
The utility of the Turing test
For a subset of technology, we presume the ultimate goal is to beat the Turing test – that is, to create a machine so convincing in conversation that it is practically impossible to tell whether you are talking to an electronic device or a real person. Some occasionally claim to have already succeeded in this goal, but it is far from generally accepted that today’s machines can pass the Turing test. To see how far away we are, we need only look at the state-of-the-art intelligent assistants and the awkward, clearly non-human responses they often produce.
Nevertheless, the Turing test does make an impact on our society today, albeit more in the spheres of arts and philosophy than in AI development. In philosophy, the test is often part of the conversation around the ability of AI to become self-aware. When it comes to product development, however, philosophy has little to say about whether Cortana produces a better user experience than Siri.
Introducing Turing time
Given the impossibility of passing the Turing test in absolute terms, why not measure Turing time instead? This is the minimum amount of time it takes for a human to determine that the test subject is a machine. The longer the Turing time, the more progress the machine has made towards simulating human interaction.
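The definition can be sketched in code. What follows is a minimal, purely illustrative Python sketch (the names `Exchange`, `turing_time` and `detects_machine` are hypothetical, not part of any real benchmark): it models a blind session as a list of question–answer exchanges and returns the elapsed conversation time at which the judge first flags the subject as a machine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Exchange:
    """One question-and-answer turn in a blind text session."""
    question: str
    answer: str


def turing_time(session: List[Exchange],
                detects_machine: Callable[[Exchange], bool],
                seconds_per_exchange: float = 10.0) -> float:
    """Return the conversation time (in seconds) at which the judge
    first flags the subject as a machine; infinity if the judge never
    does, i.e. if the machine fully passes the test."""
    for i, exchange in enumerate(session, start=1):
        if detects_machine(exchange):
            return i * seconds_per_exchange
    return float("inf")
```

Note that under this toy model, a fully passed Turing test corresponds to an infinite Turing time, since the judge never flags the machine.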
Longer Turing time has some important practical implications. I personally find it frustrating when an intelligent assistant cannot understand me. For these products, a longer Turing time would translate directly into better user experience.
Another benefit is that we can add Turing time into the testing criteria for such products, along with speed and accuracy, to provide direct comparisons between Google Now, Cortana, Siri, Alexa, and any other intelligent assistant that reaches the market in future. Which of these has the highest Turing time? Which can hold the illusion of being a real person the longest?
Notably, one can define a direct relation between the Turing test and Turing time. The relationship is simple: a fully passed Turing test corresponds to an infinitely long Turing time. It means the machine cannot be distinguished from a human: not in one hour, not in a month, not in a million years.
How do intelligent assistants perform today?
In my personal experience (admittedly subjective), the Turing times of the intelligent assistants available today are far more likely to fall below one minute than surpass it.
If objective measurements of Turing times showed that the most intelligent assistants and chatbots averaged only 30 seconds before failing the Turing test, what would this tell us about the state of our technology? Perhaps that it leaves a lot to be desired. And what if these scores were less than 10 seconds?
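To make such a comparison concrete, here is a small, purely illustrative Python sketch (the assistant names and per-trial scores are made up): it averages Turing times over repeated blind trials, capping undetected (infinite) sessions at the session length so the mean stays finite.

```python
from statistics import mean


def average_turing_time(trials, cap=3600.0):
    """Average Turing time (seconds) over repeated blind trials.
    Sessions that ended undetected (infinite Turing time) are capped
    at the session length so the average remains finite."""
    return mean(min(t, cap) for t in trials)


# Hypothetical per-trial scores for two fictional assistants.
scores = {
    "assistant_a": average_turing_time([25.0, 40.0, 30.0]),
    "assistant_b": average_turing_time([8.0, 12.0, 5.0]),
}

# The "winner" is the assistant that holds the illusion the longest.
best = max(scores, key=scores.get)
```

The cap is a design choice: without it, a single undetected session would push the average to infinity and drown out every other trial.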
Ultimately, a Turing time as long as 100 years would be as good as infinite, as no individual could, within their lifetime, detect the machine-ness of the machine.
But even shorter Turing times would be completely satisfactory for 99% of applications. For all practical purposes, we need not aim for anything like 100 years of Turing time. A single year of Turing time, for example, would probably be sufficient for your smartphone. In fact, the odd, infrequent reminder (through a failed test) that fallible technology powers these services might actually be a good thing, making humans feel less obsolete by comparison.
Flawed as it is, the Turing test could have practical applications today through the measurement of Turing time. Primarily, advancing the Turing time may be a great driver for the development of intelligent assistants. Perhaps the spirit of competitive computing – the GHz battles between processor manufacturers, GFLOPS competitions among supercomputers and storage wars between hard disk makers – could be mirrored as intelligent assistant service providers fight for the longest Turing time. We would all reap the benefits of such a skirmish.
We should not forget that imitating a human is only part of what AI can potentially do. In fact, the competitive advantage of the majority of AI applications lies in the ability to perform at an inhuman level. Hampering that AI to make it more human-like would be counterproductive. Nevertheless, there are areas where imitating human responses is extremely important. We don’t want to be frustrated with our electronic assistants – we want to be understood (by our machines).
My hope is that measuring Turing time adds fuel to the fire of the competition for ever-improved AI. Turing time gives us not only a number to beat, but the rules by which to play. And there is no shortage of contenders.
Danko Nikolic is a brain and mind scientist, as well as an AI practitioner and visionary. His work as a senior data scientist at Teradata focuses on helping customers with AI and data science problems. In his free time, he continues working on closing the mind-body explanatory gap, and using that knowledge to improve machine learning and artificial intelligence.