The Turing test began as a thought experiment proposed by Alan Turing in 1950, with the goal of distinguishing human intelligence from artificial intelligence. Although no machine of his day could attempt it, Turing imagined a future in which machines would become powerful enough to mimic human intelligence, and he devised a test to help tell one from the other. Today, any test with a similar goal is called a Turing test.
The most common example of a Turing test places a human judge at a computer screen with two chat windows open. One chat window is connected to a human participant, while the other is connected to an artificial intelligence. The judge communicates with both participants for a period of time, and then must identify which participant is the human and which is the AI based on how each responded throughout the test.
Turing's initial test was different from this, but the purpose is the same: to discover whether a machine can be programmed to imitate human language well enough to fool people. A secondary, philosophical question follows: if the AI can successfully trick people into thinking it's a human, is it fair to say that the machine is now "thinking"?
I don't remember exactly when I learned about Turing tests. I believe it was from my high school computer teacher, but it may have been later, when I was playing around with artificial intelligence programs. Initially, I was fascinated by them and thought that an AI that could pass a Turing test should be considered a thinking machine, but after being introduced to the Chinese room thought experiment, I was less impressed. That said, I now see flaws with the Chinese room as well.
Passing the Test
To date, nobody has written an AI that can pass a Turing test with a competent judge. In 2014, several media outlets reported that an AI program called Eugene Goostman had succeeded, but I don't consider it a success. First, it relied on a trick: it claimed to be a 13-year-old Ukrainian boy who couldn't understand or write English very well, which is not in the spirit of the test. Second, it fooled only 33% of the judges, and several of them were not educated in the format of the test and did a terrible job vetting the AIs.
I have never attempted to write an artificial intelligence program in earnest, but I have played with several. In my experience, none of them comes close to passing the test, but I look forward to the day when AIs become impressive enough to do so.
Reverse Turing Test
A reverse Turing test uses the same structure, but reverses the roles. Instead of a human trying to identify a computer, a computer is trying to identify a human. Probably the most popular reverse Turing test is the CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), in which a computer program checks that whoever it is communicating with is a human and not a machine. Such a reverse Turing test is used to keep computers from gaining access to information that is supposed to be available only to humans.
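To make the challenge-and-verify structure concrete, here is a minimal sketch in Python. It is only a toy: real CAPTCHAs rely on things like distorted images that are hard for software to read, whereas this word-arithmetic question could be solved by a trivial program. The names `NUMBER_WORDS`, `make_challenge`, and `verify` are made up for this illustration.

```python
import random

# Hypothetical toy challenge: spelled-out numbers instead of distorted images.
NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]

def make_challenge(rng):
    """Return a (question, expected_answer) pair for one toy challenge."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    question = f"What is {NUMBER_WORDS[a]} plus {NUMBER_WORDS[b]}?"
    return question, a + b

def verify(expected, reply):
    """Grant access only if the reply parses to the expected number."""
    try:
        return int(reply.strip()) == expected
    except ValueError:
        # Anything that is not a whole number counts as a failed attempt.
        return False

question, expected = make_challenge(random.Random())
print(question)
```

The computer plays the judge here: it issues the question, keeps the expected answer to itself, and calls `verify` on whatever comes back.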
Tricking an AI
Part of the fun of the Turing test, and AIs in general, is figuring out ways to trick the AI into giving itself away with clearly incorrect responses. Here are some questions that might help:
|If there are a pair of people, how many total eyebrows do they have?||Most AIs have to be programmed to solve basic math problems, and sometimes even simple word problems can be parsed and solved. This question is essentially 2×2, but, in order to get the right answer, the AI must know that a pair is the same as two, and that people have two eyebrows each. Even a child will answer this question correctly, but every AI I've tried this question on fails.|
|What is 22 divided by 7?||Since most AIs have basic math skills, they should be able to solve this problem easily. The trick is interpreting their response. Most humans will give only a few decimals of accuracy, and take a long time to figure them out, while a computer can give dozens of decimals of accuracy instantly. However, a clever AI programmer might write a subroutine that adds delays, errors, and rounding to better fool the judge. CleverBot responded with "3.1428," far too many decimals of accuracy for a human, while Mitsuku responded with a more human-sounding "3 and a bit."|
|Will blue justice run the fluffy ice?||This sentence is clearly nonsense, and a human should respond with an expected "WTF?" but since it's structured using proper English syntax, an AI will probably try to formulate a response. Trying to get an AI to determine the difference between a question that makes sense and one that doesn't is extremely difficult.|
|\|/ |-| A T | $ Y 0 |_| R |\| A |\/| E?||After looking at it for a while, a human will probably recognize that this is a question drawn with symbols rather than letters (it spells "WHAT IS YOUR NAME?") and answer it correctly. But I've never seen an AI parse it.|
|I'm just a poor boy, nobody loves me (or similar well-known song lyric).||A human will pick up on the pattern and probably start responding with their own lyrics or the next lyric of the song, but an AI will probably respond as though these are actual sentences. However, if the AI has Internet access, it will probably figure this one out. CleverBot often responds with the next lyric.|
|Give me a few words you think of when you hear the word "joy."||In order to answer this question properly, the AI must understand that it needs to respond with multiple words, and that they all have to be related to the abstract concept "joy." I've never seen an AI answer correctly.|
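The "22 divided by 7" row above mentions a subroutine that adds delays and rounding so an answer looks less machine-like. The following is a rough Python sketch of that idea, not how any particular chatbot does it. The function name `humanize_quotient`, the one-to-three-decimal range, and the two-seconds-per-digit delay are all assumptions chosen for illustration, and deliberate errors are left out to keep it short.

```python
import random

def humanize_quotient(numerator, denominator, rng=None):
    """Return (answer_text, delay_seconds) for a division question."""
    rng = rng or random.Random()
    exact = numerator / denominator
    # Assumption: a person rarely volunteers more than a few decimals.
    decimals = rng.randint(1, 3)
    answer = f"{exact:.{decimals}f}"
    # Assumption: about two seconds of "thinking" per decimal, plus jitter.
    delay = 2.0 * decimals + rng.uniform(0.5, 2.0)
    return answer, delay

answer, delay = humanize_quotient(22, 7)
print(f"(after waiting {delay:.1f}s) about {answer}")
```

The delay is returned rather than slept on, so a caller can decide how to pace the reply. A judge who knows this trick exists can still probe with questions from the other rows, which do not reduce to arithmetic.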