I. What Artificial Intelligence is. 1. Introduction. 2. A Short Historical Account. – II. What Artificial Intelligence does. 1. The Basic Architecture of Artificial Intelligence Systems. 2. Expert Systems. 3. Games. 4. Mathematical Demonstrations and Programming Languages. 5. The Process of Learning. 6. Neural Networks. 7. Present Frontiers and New Targets. – III. What Artificial Intelligence cannot do. 1. Weak AI and Strong AI. 2. What Weak AI cannot do. 3. The Intrinsic Limitations of Mathematics and Logic. 4. Strong AI: Syntactic Elaboration and Semantic Content. – IV. Artificial Intelligence between Science and Conscience. 1. The Criticism of Reductionism. 2. What Artificial Intelligence must not do.
I. What Artificial Intelligence is
1. Introduction. “Artificial Intelligence,” from now on AI, is the set of studies and techniques aiming at the production of machines, electronic computers in particular, capable of solving problems and reproducing activities proper to human intelligence. The expression “Artificial Intelligence” is an evident oxymoron, since it attributes to the “artificial” what is most “natural,” namely the highest quality of human nature: intelligence. This oxymoron is in itself provocative, since many people seriously ask themselves whether a machine can actually be “intelligent” in the sense in which the word is applied to the human mind (see below, IV). On the other hand, definitions like this are too generic: they could equally describe the whole of computer science, or “automation techniques,” both of them disciplines which are not part of AI. In the next paragraphs (cf. below, n. 2) we shall outline a historical account and clarify the relations between these fields, but it is objectively difficult to reach precise definitions. On one side, the subject is developing at a very fast pace, so definitions delimiting its field of competence could exclude future developments which properly belong to it. On the other, it is at once science and technique, and a frontier discipline, a fascinating “multiple point” where different areas of knowledge come together: logic, computer science, psychology, neuroscience and philosophy. Therefore, instead of delimiting, it is better to list and describe its main characteristics and its main areas of application. Several attempts at more specific definitions have been made, but they disagree with one another and arrive at extremely different results. We can now take a look at some of them.
Russell and Norvig (1994) ask for two fundamental distinctions to be made. The first is between “thinking” machines and machines which limit themselves to “operating” in a manner in some ways similar to human beings. The second concerns the term of comparison used to assess their performance, which can be either the human being itself or an idealised, rational version of it. Technical applications (see below, II) are mainly a matter of “rational operating,” while the philosophical debate, the object of the last section, insists on the possibility of “human” performance and especially of “human thought.” Another fundamental distinction enlivening the philosophical debate is between so-called “weak AI” and “strong AI”: supporters of weak AI are content with considering machines that behave “as if” they were intelligent; supporters of strong AI consider machines similar to man to the point of holding self-consciousness to be possible for them. It is easy to see how these distinctions intertwine: strong AI refers to “machines thinking in a human way,” while weak AI refers to “operating machines.”
Finally, with regard to technical realisations, we can distinguish between a “functional,” or “behavioral,” approach, indifferent to the structure of the processor that is the seat of the “intelligence,” and a “structural,” “constructionist” or “connectionist” approach, which intends to achieve the performance of the human mind by reproducing its structure in some way. With a slight shift of perspective, the first approach has been called “emulationist,” the second “simulationist”: the supporters of the latter believe that a level of performance comparable to that of the human mind can be achieved only by reproducing it as faithfully as possible; the supporters of the former, on the contrary, believe that the functions of the mind are not a matter of structure but of performance, and that performance may be achieved even through completely different structures, perhaps with even greater success. Both solutions have provided a wealth of results, but the second, even though it is the minority one, bears a special importance, since it led to the creation of “neural networks.” These are extremely rough imitations of the animal mind, but they are extremely interesting from a cognitive and technical point of view, because they established connections with neuroscience which bore fruits of great significance for AI and for neuroscience itself. In the end the two approaches converge because, after some attempts at producing “dedicated” physical structures (at a hardware level), neural networks are now realised mainly as programs (at a software level) executed on ordinary computers.
2. A Short Historical Account. The idea of delegating to mechanical devices some operations typical of the human mind has its roots in ancient history. One example is arithmetical calculation with the abacus, probably invented around 5000 B.C. by the Chinese, who were probably also the first to use an automatic control device, employed to control the level of the water in rice-fields: a float used to move a floodgate, reducing the flow of water as soon as it tended to increase. In modern times the first calculating machines were created by Pascal (1623-1662), who built a mechanical adding machine (the pascaline) in the seventeenth century, and by Leibniz (1646-1716), who developed it further at the turn of the century so that it could also perform multiplications and divisions. The first “programmable” machine, capable of executing series of operations automatically, was conceived by Charles Babbage (1791-1871) around 1830, but was never built because of the mechanical difficulties involved. In the field of automatic control we must mention the speed governor of James Watt (1736-1819), which paved the way to industrial automation in the late eighteenth century. These facts reveal a long-standing interest in transferring to machines not only material work, which requires physical energy, but also intellectual labour, needed both to perform tiring calculations and to check and control the correct functioning of other devices (automation has been defined as “machines controlling other machines”).
But the developments of more specific interest for the birth of AI came in the mid 20th century. They were due to Alan Turing (1912-1954), who made two fundamental contributions. In 1936 he proposed an ideal model for a “universal” automatic calculator (known as the “Turing machine”): it is the prototype of all the electronic computers developed from the mid forties on. In 1950 Turing proposed the “imitation game,” a paradigm to establish whether a machine is “intelligent” or not. In a well-known article, Computing Machinery and Intelligence (1950), he suggests placing an observer in front of two teleprinters. One of the two teleprinters is operated by a man, the other by a woman. The observer, who does not know which is operated by the man and which by the woman, can try to find out by asking them any kind of question. One of the two interlocutors must tell the truth, while the other pretends to be of the other sex. At a certain point the interlocutor who is lying is replaced by a computer programmed to “pretend” to be a human being. When the number of mistakes made in trying to identify the computer is the same as the number made in trying to identify the lying interlocutor, the computer can be called “intelligent.” The imitation game has actually been performed several times, with quite disappointing results. In Russell and Norvig’s classification, Turing’s thought experiment provides a good example of a “machine behaving in a human way”; it represents a “behaviorist” position, which some other AI scholars consider insufficient.
Actually, the first work attributed to artificial intelligence dates back to 1943, when Warren McCulloch and Walter Pitts designed a neural network. But the most important developments, both at a theoretical level and in the creation of computer programs that served as prototypes for later work, are those attained in the ten years following Turing’s provocation. In particular, in 1956, another pioneer, John McCarthy, gathered the main scholars of the time at Dartmouth (among them Marvin Minsky, Allen Newell, Claude Shannon and Herbert Simon) for a seminar, where he proposed the name “artificial intelligence.” 1958 was a year with a wealth of results: McCarthy produced Lisp, a high-level programming language dedicated especially to AI (later followed by Prolog in 1973), and he started to develop general problem-solving programs. Other researchers began to study what we call today “genetic algorithms,” programs capable of modifying themselves automatically in order to improve their performance. In the following decades research continued, with mixed results. The sixties were characterised by results that may not seem spectacular by today’s standards, but which were promising at the time, considering the limits of the computing instruments with which they were achieved, and which were important in systematically disproving those critics who used to say “such a thing cannot be done.” In those years interesting, though mainly theoretical, developments also came from research on neural networks. At the same time the first difficulties were emerging, and researchers found themselves facing barriers that are still insurmountable today.
One serious problem is the “combinatorial explosion,” the sudden increase in calculation time when the number of variables in a problem increases; another limitation, to which we must return, is that computers can only consider “syntactic” connections, and not “semantic” contents, such as the meaning of the variables they are operating with. The seventies saw the creation of “expert systems” and their first application to medical diagnostics, as well as the first attempts to “understand” natural language (in the restrictive sense of giving prearranged answers to a limited number of questions).
In 1980 AI came out of the scientific laboratories and found significant practical applications, some of which shall be described in the next section. At the same time, as a consequence, industrial companies, especially in America and Japan, started to put on the market programs for expert systems, pattern recognition and so on, and to build microcircuits and whole computers specialised in AI applications. After a little less than twenty years of nearly complete indifference, neural networks received renewed attention in 1985, mainly thanks to the definition of new and more powerful optimisation algorithms. Besides the improvement of neural networks, the last decade of the century saw the development of new calculating procedures, deriving in particular from the theory of probability and decision theory, and, in the field of applications, the development of effective methods for the construction of expert systems and for the recognition of speech and shapes, the latter used especially in robotics and artificial vision.
II. What Artificial Intelligence does
From an engineering and strictly pragmatic point of view, AI is valued simply for its capabilities and performance, independently of the methods and mechanisms used to produce them. The point of view is therefore “emulationist” and not “simulationist”: the idea behind it is to build machines that do not necessarily “simulate” and reproduce the behavior of the human mind, but are simply able to “emulate” it selectively, as the final result of several operations. This is the thesis supported by A. Turing in the imitation game described above: he proposes to assess the “intelligence” of a machine only by its capability of showing a communicative behavior indistinguishable from that of a speaking human being. This approach has certainly been dominant in the history of AI, and it has led to the production of programs which reach a high level of competence in knowledge and in the resolution of problems considered complex. Such programs are built as “manipulators” of formal, uninterpreted symbols, so the machine can be conceived simply as a syntactic transformer with no effective “semantic” understanding of the problem (see below, III.4).
1. Basic Architecture of Artificial Intelligence Systems. Software applications behind AI systems are not sets of unchangeable information representing the solution to a problem, but an “environment” where a basic knowledge is represented, used and modified. The system examines a broad range of possibilities and it builds a solution dynamically. Every system of this kind must be able to express two kinds of knowledge in a separate and modular way: a knowledge base and an inference engine.
By “knowledge base” we mean the “module” that collects the knowledge on a “domain,” meaning the problem. We can detail the knowledge base by dividing it into two separate blocks: a) the block of statements and facts (temporary or short-term memory), b) the block of relations and rules (long-term memory). Temporary memory contains “declarative knowledge” on the particular problem to be solved: a representation made up of facts known to be true, either introduced at the beginning of the consultation or proved true by the system during the work session. The long-term memory, on the other hand, stores the rules that provide advice, suggestions and strategic guidelines for building up the store of knowledge available to solve a problem. The rules are statements composed of two units. The first is called the “antecedent” and expresses a condition; the second is called the “consequent” and triggers the action to be applied if the antecedent is proved true. The general syntax is therefore: “if ‘antecedent’, then ‘consequent’.”
The “inference engine” is the module which uses the knowledge base to reach a solution to the problem and to provide explanations. The inference engine is given the task of choosing which knowledge should be used at each moment of the solution process. In this way pieces of knowledge which taken separately would seem to have limited use are combined to reach new conclusions and express new facts. Each rule in the set representing the domain of knowledge, to be proven valid in a specific situation, must be compared with the set of facts representing present knowledge on the current case, and be satisfied by them. This is done through a matching operation in which the antecedent of the rule is compared with the facts present in the temporary memory. If the matching succeeds, the system proceeds with the series of actions listed in the consequent. If the consequent contains a conclusion, the satisfaction of the antecedent makes it possible to record that statement as a new fact in the short-term memory. The matching operation creates “inferential chains,” indicating the way the system uses the rules to perform new inferences. These chains also provide the user with an explanation of how conclusions are reached. There are two main ways of creating inferential chains starting from a set of rules: a) forward chaining. This method strives to reach a conclusion starting from the facts present in the temporary memory at the beginning of the process and applying the production rules forward. Inference is said to be guided by the antecedent, since the search for rules to be applied is based on matching the facts present in the memory with those logically combined in the antecedent of the active rule; b) backward chaining. In this case the procedure reduces the main goal to smaller problems. Once the thesis to prove is identified, production rules are applied backwards, in search of coherence with the initial data. The interpreter searches for a rule, if one exists, whose consequent is the statement to be tested for truth; then it checks the smaller sub-goals constituting the antecedent of the rule identified. In this case the inference is guided by the consequent.
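The two chaining strategies can be made concrete with a minimal sketch. Everything specific in it is an assumption made for illustration: the rule base, the fact names and the function names `forward_chain` and `backward_chain` are hypothetical and do not come from any particular expert system shell. Rules are reduced to a set of antecedent facts and a single consequent fact.

```python
# Hypothetical toy rule base: each rule is (antecedent facts, consequent fact),
# i.e. "if all antecedent facts hold, then the consequent holds".
RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_of_breath"}, "see_doctor"),
]


def forward_chain(facts, rules):
    """Antecedent-driven inference: fire every rule whose antecedent matches
    the short-term memory until no new fact can be added."""
    memory = set(facts)          # short-term memory (facts known to be true)
    chain = []                   # inferential chain, kept for explanations
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent <= memory and consequent not in memory:
                memory.add(consequent)
                chain.append((sorted(antecedent), consequent))
                changed = True
    return memory, chain


def backward_chain(goal, facts, rules, _pending=frozenset()):
    """Consequent-driven inference: a goal holds if it is a known fact, or if
    some rule concludes it and every sub-goal in its antecedent holds."""
    if goal in facts:
        return True
    if goal in _pending:         # guard against circular rules
        return False
    for antecedent, consequent in rules:
        if consequent == goal and all(
            backward_chain(sub, facts, rules, _pending | {goal})
            for sub in antecedent
        ):
            return True
    return False


facts = {"fever", "cough", "short_of_breath"}
memory, chain = forward_chain(facts, RULES)
print(sorted(memory))                               # all derived facts
print(chain)                                        # which rule produced what
print(backward_chain("see_doctor", facts, RULES))   # goal-directed check
```

The list returned as `chain` plays the role of the inferential chain used to explain a conclusion to the user; a real inference engine would also need conflict resolution between competing rules, which this sketch leaves out.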
2. Expert Systems. Expert systems are the best-known example of an application deriving from this approach. An expert system, or “knowledge-based system,” is an instrument capable of solving problems within a limited domain, but with a performance similar to that of a human expert in that domain. This means that the main task of an expert system is to assist the activity of professionals in situations where the advice of a human specialist with expertise and intelligence is usually necessary. AI researchers have identified the problems arising in the creation of such instruments, and they have concluded that it is necessary to restrict the field of application as much as possible. Therefore, compared to a human expert, these applications are more limited and superficial, lacking the completeness that comes from a competent person’s cultural knowledge. Moreover, an expert system cannot be expected to reach solutions through intuition or by skipping some logical steps, trusting in “common sense” or in mechanisms of analogy as a human being would do. Ultimately, a human expert is simulated, in more or less detail, and the system is given the capability of solving a limited range of temporary or secondary tasks. The first and best known of these systems is Mycin, which E.H. Shortliffe started to develop in 1972, and which was applied in medicine.
With regard to the kinds of problem which an expert system can solve, we can make a list, obviously far from complete: a) “diagnosis”: these systems identify the possible causes of a “malfunction” by recognising a set of symptoms, and suggest possible treatment; b) “monitoring”: these follow the development of a process through time; they control the acquisition and processing of different kinds of data, and output summary information on general conditions and estimates of the evolution of the process; c) “planning and scheduling”: once the resources and actions available are known, these identify their best usage in order to reach a certain goal in a certain time; at the same time they prompt the acquisition of new resources; d) “data and signal interpretation”: given a set of data referring to a certain issue, these make a general assessment in order to recognise the onset of some predetermined situations.
3. Games. Another field of application where this symbolic and engineering approach has found success is that of games. Artificial intelligence generally considers two-player games in which the players move in turn. It treats the development of the game as a “tree” whose “root” is the starting position and whose “leaves” are the final positions (won or lost). Because of the complexity of the games considered, it would be unthinkable even for a very powerful computer to develop the whole tree in order to choose the “best” move. For this reason it was necessary to apply some heuristics to “cut down” some of the branches and make the problem tractable. Just think of the game of chess, where the size of the problem is huge. At the beginning of the game there are 400 possible moves, which become 144,000 at the second move. Developing the whole tree of the game we would have 35^100 nodes. By applying techniques of symbolic manipulation and using powerful means to reduce the size of the search, which would otherwise be quite unfeasible, systems capable of playing chess better than a human being were nevertheless produced, though obviously using very different techniques from the ones applied by human beings: in May 1997, in New York, a machine (Deep Blue) defeated the world champion Kasparov in a six-game match. It is interesting to note that such a machine, designed at a hardware level to develop and examine search spaces in parallel very quickly (Deep Blue can examine 10^11 positions in about three minutes), relies on “brute force” more than on refined heuristic techniques to reach the best solution quickly.
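The idea of searching a game tree while cutting branches can be illustrated with alpha-beta pruning, a standard technique that the text does not name; it is swapped in here purely as an example, and it is not a description of how Deep Blue worked. The tree below is a hypothetical two-ply game in which leaves carry a score from the point of view of the player moving first; the function name and the `visited` parameter are assumptions of this sketch.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of `node`, skipping branches that cannot
    change the result. A node is either a number (leaf score for the
    maximizing player) or a list of child nodes."""
    if not isinstance(node, list):       # leaf: final position with its score
        if visited is not None:
            visited.append(node)         # record which leaves were evaluated
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:            # the opponent would never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:                # we already have a better alternative
            break
    return best


# Hypothetical tree: three moves for the first player, three replies to each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, visited=seen))     # value of the best first move
print(seen)                              # leaves actually examined
```

On this small tree only seven of the nine leaves are examined, and the value found is the same as an exhaustive search would give. In a game the size of chess the tree cannot be expanded to the leaves at all, so practical programs stop at a limited depth and score positions with a heuristic evaluation function, which this sketch does not attempt.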
4. Mathematical Demonstrations and Programming Languages. The use of logic and automation for mathematical demonstrations is another field of application where AI has found success. Logic is certainly one of the most ancient, stable and rigorous instruments used to formalise and explain the human way of reasoning. It is semantically well defined, highly declarative, and it has an extremely broad deductive apparatus. All this explains why classical logic (especially first-order logic) is so widely used in AI to represent knowledge on a problem, even though this choice does have evident limitations (see below, III.3) and does not enjoy unanimous consensus. In this regard Minsky states that logical formulas and deduction methods are not the most natural way of thinking, nor are they the methods the human mind uses to organise its knowledge and to behave intelligently. In the logical approach the knowledge base becomes a collection of statements of first-order predicate logic. Inference rules make it possible to deduce new statements (“theorems”) not explicitly contained in the starting knowledge base. The sequence of inference rules used in the derivation of a theorem is called the “proof” of the theorem. Obviously, in order to automate the procedure, the efficiency of proof becomes a crucial requirement. Most of the programs using logic in AI are based on studies of the automatic demonstration of logical theorems, and especially on the resolution method defined by J.A. Robinson in the sixties and on the strategies developed to improve the efficiency of demonstrations. These studies also produced logic programming and the language called Prolog (from PROgramming in LOGic), which is becoming one of the most interesting and innovative programming paradigms for the development of “intelligent” applications.
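A much simplified sketch may help to show what deducing a theorem by resolution looks like. Robinson’s method operates on first-order clauses and relies on unification; the version below is restricted to propositional logic, which is an assumption made to keep the example short. Clauses are sets of literals, a leading `~` marks negation, and the names `negate`, `resolve` and `entails` are hypothetical.

```python
from itertools import combinations


def negate(literal):
    """Return the complementary literal: p <-> ~p."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(c1, c2):
    """Return all resolvents of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents


def entails(knowledge_base, query):
    """Refutation proof for a single-literal query: add the negated query
    to the clauses and search for the empty clause."""
    clauses = set(knowledge_base) | {frozenset({negate(query)})}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for resolvent in resolve(c1, c2):
                if not resolvent:        # empty clause: contradiction found
                    return True
                new.add(resolvent)
        if new <= clauses:               # nothing new: the query is not provable
            return False
        clauses |= new


# Hypothetical knowledge base: "rain implies wet" (as ~rain or wet), and "rain".
kb = {frozenset({"~rain", "wet"}), frozenset({"rain"})}
print(entails(kb, "wet"))
print(entails(kb, "snow"))
```

The loop saturates the clause set blindly, which is exactly the kind of inefficiency that the strategies mentioned above were developed to avoid.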
The idea of “programming in logic” was born at the beginning of the seventies, thanks especially to some researchers at the universities of Edinburgh and Marseille. Robert Kowalski, then at the University of Edinburgh, was responsible for defining the theoretical basis of programming in logic, and especially for the proposal of interpreting programs as composed of two parts: logic plus control. Computation is reduced to theorem proving. In 1972 Alain Colmerauer’s group in Marseille was the first to create an “interpreter” for Prolog, proving that programming in logic was possible. This is radically different from the techniques normally used to write programs in traditional languages. The most widespread programming languages, such as Fortran, Pascal and C, are all based on the imperative paradigm, which defines the program as a series of commands specifying in detail which operations a computer should perform to solve a specific problem. With programming in logic, on the contrary, a problem is described in very abstract terms by a series of logical formulas. This way of representing programs allows a declarative reading of knowledge, which in turn makes it possible to describe a problem without explaining in detail how to achieve the solution. In other words, programming in logic and automatic theorem proving share the use of logic to represent knowledge and the use of deduction to solve problems. What programming in logic adds is the demonstration that logic can be used to express programs and that special proof techniques can be used to execute them.
5. The Process of Learning. Beyond the list of systems considered successful from the point of view of applications, and their undoubted limits at a more general level, it is almost universally acknowledged that machines cannot be considered intelligent until they are able to increase their knowledge and improve their abilities. According to Simon (1981), learning consists in adaptive changes in a system, in the sense that they make the system able to perform a given task in a more efficient and effective way each successive time. One way to solve this problem, if only partially, is to provide symbolic machines with means of inductive as well as deductive reasoning. Inductive reasoning proceeds from single statements referring to particular facts or phenomena (“examples”) up to universal statements, expressed as hypotheses and theories which explain the facts provided and may be able to predict new ones. However, while deductive inference preserves “truth” (in the sense of logical correctness), inductive inference does not guarantee it, and so such systems may tend to excessive generalisation and end up producing mistakes. Their approach remains symbolic, since the results of the process are new theories, new rules and generally a new or updated knowledge base. One of the best-known programs capable of learning from examples is ID3, developed by J. Ross Quinlan (between 1979 and 1983), which gave birth to commercial products capable of automatic classification. ID3 and its “descendants” have explored thousands of databases, producing identification rules in different areas (for example the diagnosis of diseases). At present, learning programs are used mainly in practical operations, to exploit the wealth of information contained in the large data collections accessible through the net or in company databases: to reveal patterns within the data and to extract information and hidden knowledge (data mining).
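Learning classification rules from examples can be sketched along the lines usually given for ID3: at each step the attribute with the highest information gain is chosen as the next test, and the examples are split on its values. The text does not describe the algorithm, so this outline follows its common textbook presentation; the data set, attribute names and function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def id3(rows, attributes, target):
    """Build a decision tree as nested dicts: {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:            # pure node: nothing left to learn
        return labels[0]
    if not attributes:                   # no test left: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attributes if a != best]
    return {best: {value: id3([r for r in rows if r[best] == value], rest, target)
                   for value in sorted({r[best] for r in rows})}}


# Hypothetical toy examples (not real medical data).
examples = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no", "flu": "yes"},
    {"fever": "no", "cough": "yes", "flu": "no"},
    {"fever": "no", "cough": "no", "flu": "no"},
]
tree = id3(examples, ["cough", "fever"], "flu")
print(tree)
```

The resulting tree reads directly as a set of “if ... then ...” rules, which is why the output of this kind of learning can be added to a symbolic knowledge base. The risk of excessive generalisation is visible too: with four examples the program concludes that fever alone decides the diagnosis.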
6. Neural Networks. Neural networks represent an approach significantly different from the symbolic one analysed so far, and they are considered part of what we have called “structural” or “connectionist” AI (see above, I.1). The basic idea is to reproduce intelligence, and learning in particular, by simulating on the computer the neural structure of an animal brain. Operating in nanoseconds, computers can store huge amounts of information very easily, and they can perform huge numbers of arithmetical calculations without mistakes, where human beings are nowhere near such performance. It is beyond any doubt, though, that a human being performs certain “simple” tasks, such as walking, speaking, interpreting a visual scene, understanding a sentence, reasoning about common events or dealing with uncertain situations, in a manner more clever and more efficient than the finest and most expensive AI programs using symbolic and functional approaches.
The idea of building an intelligent machine starting from artificial neurons can be traced back to the origins of AI, and some results were already attained by McCulloch and Pitts in 1943, when the first neural model was born; in time these were further developed by other researchers. In 1962 Rosenblatt proposed another neuron model, the “perceptron,” capable of learning through examples. A perceptron reproduces a neuron’s activity by computing a weighted sum of its inputs and emitting “1” as output if the sum is above a modifiable threshold value, or “0” otherwise. Learning of this kind consists in modifying the values of the weights. The great enthusiasm initially shown for this approach declined sharply a few years later, when Minsky and Papert underlined the perceptron’s great limitations in learning. In more recent times new neural network architectures, called “connectionist,” were proposed; they are no longer subject to the theoretical limitations of the perceptron, and they make use of powerful learning algorithms (backpropagation). This reawakened a strong interest in neural networks and enabled their development for successful applications. The “connectionist” architecture, an alternative to Von Neumann’s, is characterised by: a) a large number of very simple processing elements, similar to neurons; b) a large number of weighted connections (synapses) between the elements; c) a highly parallel, distributed control. The weights de facto encode the knowledge of a network. Learning can be regarded as dynamic variation of the connection weights.
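The perceptron described above fits in a few lines. The sketch below uses the classic error-correction rule for adjusting the weights and the threshold; the learning rate, the number of epochs and the function names are assumptions of this example. The logical AND function serves as a task that can be learned, and XOR as the standard textbook illustration of a task that a single perceptron cannot represent.

```python
def predict(weights, threshold, inputs):
    """Output 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def train(samples, epochs=20, rate=1.0):
    """Perceptron learning rule: after each wrong answer, move the weights
    towards the inputs (or away from them) and shift the threshold."""
    weights = [0.0] * len(samples[0][0])
    threshold = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, threshold, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            threshold -= rate * error    # firing more often means a lower threshold
    return weights, threshold


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w, t = train(AND)
print([predict(w, t, x) for x, _ in AND])    # matches the AND targets

w, t = train(XOR)
print([predict(w, t, x) for x, _ in XOR])    # at least one answer is wrong
```

No choice of two weights and a threshold separates the XOR cases, so training cannot succeed however long it runs; networks with intermediate layers trained by backpropagation remove this limitation.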
Learning can take place in different ways, depending on how the network is “trained.” Learning paradigms can be divided into three main classes: a) supervised learning through examples: a teacher provides the network with the answers the neurons should produce after the learning phase; b) unsupervised learning: through an internal competition, neurons specialise in discriminating between the stimuli presented at the input; c) reinforcement learning: the network receives only a qualitative indication of the correctness of its answer; a critic assesses the network’s answer and sends the neurons a positive reinforcement signal if the assessment is positive, a negative one otherwise.
In connectionist systems learning seems easier to realise, but what is learned remains hidden in the variations of real numerical values wired into the network, and it cannot be made explicit in symbolic form. Neural networks are therefore more suitable for tasks involving classification and the perception of “low level” concepts, technically arduous though they may be, such as recognition of spoken language, process control and image recognition, while conceptually complex problems, such as planning, diagnosis and design, remain the domain of symbolic AI. While neural network models are based on the simulation of the human brain, many other AI techniques are inspired by the evolution of the animal world and of social groups. Genetic algorithms, for example, are algorithms based on evolution, where learning takes place through selective processes starting from a vast population of random programs.
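The selective process behind genetic algorithms can be sketched as follows. For brevity the individuals are bit strings rather than programs, and the objective (counting the ones in the string) is a toy assumption; population size, mutation rate and the other parameters are likewise arbitrary choices made for this example.

```python
import random


def fitness(bits):
    """Toy objective (an assumption): the number of ones in the bit string."""
    return sum(bits)


def evolve(pop_size=30, length=20, generations=40, mutation_rate=0.02, seed=0):
    """Evolve a random population and return (best initial fitness, best individual)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    initial_best = max(map(fitness, population))
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]       # selection: keep the fitter half
        children = [population[0][:]]               # elitism: the best survives unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)          # one-point crossover
            child = mother[:cut] + father[cut:]
            # mutation: flip each bit with a small probability
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return initial_best, best


initial_best, best = evolve()
print(initial_best, fitness(best))
```

Because the best individual of each generation is copied unchanged into the next, the best fitness can never decrease; how quickly it improves depends on the random choices and on the parameters.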
7. Present Frontiers and New Targets. Present AI systems have been severely criticised, and they certainly are poor and disappointing when compared to the initial expectations of AI. It is true that there have been no giant steps, and the most serious problems, such as the process of learning and the representation of common sense, even though tackled and partially solved, are still far from a complete solution. At a functional level, despite the many strong points of the architecture of knowledge-based systems, such as modularity and the possibility of incremental knowledge growth, only a few commercial expert systems are effectively operational, a very low percentage compared to conventional programs. A very tough obstacle to their spread is certainly knowledge acquisition: it is very complex to extract an expert’s knowledge completely and formalise it within the knowledge base. Moreover, such systems have high maintenance and updating costs. On the other side, the alternative to the functional approach, connectionism and neural networks, does find successful applications, but these are often limited to low-level problems such as perception and recognition.
With regard to new targets, the present technological revolution, which is pushing us towards an information society, gives access to a huge amount of information that must be managed and interpreted correctly. It ranges from large company archives to online information updated in “real time,” and from knowledge that can be grasped only through its most practical manifestations, like the experience acquired by the specialist “in the field,” to detailed surveys in search of ever more detailed results. Every future development has to face an unstructured mass of heterogeneous and redundant data. Instruments for information extraction and analysis should not only be strengthened but revolutionised, in order to use this huge amount of knowledge to its full potential. The use of methodologies that apply symbolic learning and neural networks to extract such knowledge therefore becomes of crucial importance.
Furthermore, AI systems today are required to take an integrated approach to their tasks: expert systems in particular must be integrated with the rest of information engineering, and especially with current technologies such as object-oriented programming, database construction and graphical user interfaces, which were themselves at least partly born in the AI context. Another important phenomenon is the tendency of expert systems to disappear as separate applications, in favour of a more integrated vision: the result is modules that perform intelligent tasks, tightly integrated into software applications and general information systems. In this sense the idea is to build intelligent agents with deductive and inductive reasoning capabilities, responsible for particular tasks and capable of co-ordinating with other agents in a distributed environment in order to reach common goals. The functions carried out by such intelligent agents must be integrated with functions carried out by other modules, perhaps pre-dating the system itself, like the operator’s interface, database management systems (DBMS), data acquisition systems and graphic systems. In these fields AI is attaining great success, and it may attain even more in the future.
III. What Artificial Intelligence cannot do
In the last two decades of the 20th century and even today, the debate on AI is one of the hottest and most animated in philosophical research. This can actually be considered natural, since AI provokingly re-opens the problem of what the mind, intelligence and conscious intelligence are. The debate deals with two major issues: what AI can do and what is right for AI to do.
1. Weak AI and Strong AI. We should first of all make a distinction between “weak AI” and “strong AI”: we have already touched on the meaning of these two terms (see above, I.1), but it is now necessary to re-examine and specify it in more depth. Weak AI’s goal is to build machines behaving “as if” they were intelligent: machines capable of solving “all” the problems human intelligence can solve. Strong AI wants to do more: it states that machines acting in an intelligent manner necessarily have a “conscious intelligence,” a conscious mind indistinguishable from the human mind. Weak AI deals with the concrete construction or feasibility of “thinking” machines, while strong AI wants to give an answer to the abstract problem of what their “thinking” actually is. Therefore, as Russell and Norvig observe (1994), one can believe in strong AI and be a sceptic with regard to weak AI: thinking that, if they were built, intelligent machines would have conscious intelligence, but believing they cannot be built.
Weak AI does meet some opposition, but the most extreme is met by strong AI. The debate takes fire from statements such as: “the brain is a machine, therefore, in principle, one can build a machine that may perform the same activities the brain does.” Applied to the properties of the mind, this sentence is clearly reductionist in meaning, since it maintains implicitly that “mind” is the same as “brain”. It is an evidently materialistic sentence and it is extremely questionable, but since the whole debate is centred on it, we shall speak about it further (see below, IV.1).
2. What Weak AI cannot do. Much of the opposition to weak AI may be applied to strong AI as well. A first group of objections consists in apodictic statements such as: “a machine will never be able to do such a thing”. Objections of this kind have always preceded technical innovations, manifesting the psychological refusal and the panic gripping human beings when they are faced with “new” incomprehensible elements: they must therefore be looked upon with great distrust. Generally speaking such statements have never aroused any debate: they are nearly always disproved by facts. More exactly, technicians have taken up the challenge, and they have endeavoured to build machines doing exactly what was deemed impossible. The chess game described before is a very clear example (see above, II.3), also because the certainty that a machine would never be able to beat the chess champion lasted longer than any other, and it certainly did challenge technicians for a long time. It is however interesting to see “how” the computer beat the master: the latter “sees” the right move on the chessboard, he “senses” it in a manner similar to the artist, through a mental process to which we give the name “genius,” but about which we know nothing. The machine counters this with the “brute force” of a huge number of extremely fast circuits, enabling it to make a huge number of attempts in search of the move that assures the greatest chance of victory. Here we begin to see a fundamental difference between the human being and the machine, ultimately defying any reductionist approach.
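The “brute force” strategy mentioned above can be illustrated with a minimal sketch. It is not the method of any actual chess program: it assumes a toy game chosen only for brevity (players alternately remove 1, 2 or 3 stones, and whoever takes the last stone wins), and the function names are hypothetical. The point is only that the machine finds the winning move by trying every continuation, with nothing resembling the master’s intuition.

```python
def legal_moves(stones):
    """Moves available in the toy game: remove 1, 2 or 3 stones."""
    return [k for k in (1, 2, 3) if k <= stones]

def search(stones):
    """Exhaustive game-tree search (minimax in its negamax form).

    Returns +1 if the player to move can force a win, -1 otherwise.
    """
    if stones == 0:
        # The opponent has just taken the last stone: the player to move has lost.
        return -1
    # Try every move; the opponent's best result is the negation of ours.
    return max(-search(stones - k) for k in legal_moves(stones))

def best_move(stones):
    """Pick the move whose resulting position is worst for the opponent."""
    return max(legal_moves(stones), key=lambda k: -search(stones - k))

if __name__ == "__main__":
    for n in (4, 5, 7):
        print(n, "stones: value", search(n), "best move", best_move(n))
```

For a game as large as chess the same exhaustive idea is only feasible with depth limits and heuristic evaluation, which is where the sheer speed of the circuits comes in.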
At the same time, in the same article where he proposes the “imitation game” (where he anticipates many of the matters which would have spurred philosophers in the coming decades), Turing also presents and argues a bizarre list of operations that, according to his adversaries, “a machine would never be able to perform”. Some of the elements on the list, like “learning from experience” have been accomplished: at least to a certain level Turing’s foresight has already beaten AI’s incompetent adversaries. Other elements simply do not apply to the machine as a subject provided with AI, like “to look good,” or “to make someone fall in love with it”: science fiction books speak about beautiful humanoid robots and human beings falling in love with them, but this refers rather to robotics, and certainly not to AI. Another element is “to enjoy strawberries and cream” and one can devise (and something in this sense has been built) a robot with taste and smell sensors, and a program identifying a pleasant taste. Even in this case the matter pertains to robotics first of all, but with an extra problem: “to enjoy” implies an activity typical of the human mind, thus entering the boundaries of strong AI’s main problem. The same applies to the most disquieting operation: “to be the object of one’s thought,” the beginning of self-consciousness.
A variant of these objections maintains that “a machine can do only what we can order it to do,” therefore it does not have a free will of its own and its choices are conditioned. There are two opposite answers. On one side genetic algorithms and many of the applications described (see above, II) show how machines can both widen and modify the range of their options. On the other, though, these modifications could be considered somehow provided for and potentially included in the original programming. This stresses the absolute dependence of machines on the human being, but it also introduces an element of absolute novelty: the violation of the fundamental methodological paradigm of engineering, that is, design. The paradigm states that every technical element should be designed down to its last part before construction begins. Learning systems, though, and especially neural networks, are initially quite shapeless objects: the technician defines their formal structure, but the “weights” of the connections, which characterize that particular neural network so that it performs a specific task, are specified during the learning process, in a manner dependent not on the intention of the designer but on the information provided; in the end (if there is an end, that is, if the learning process does not last for the whole life of the system), the weights assume numerical values which are totally unforeseeable, and in any case of no interest to the technician who designed the net and uses it. In this sense AI techniques seem to anticipate tendencies now spreading to many other fields of engineering, especially to information engineering: the generation and use of “not designed” systems. This is the case, for example, of the Internet.
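A minimal sketch may make the point about “not designed” weights concrete. It assumes the simplest possible learning system, a single perceptron with two inputs, and hypothetical function names; real neural networks are far larger. The designer fixes only the structure and the learning rule, while the final weights depend on the examples supplied: the same code trained on two different tasks ends up with different numbers.

```python
def train_perceptron(samples, epochs=20):
    """Classic perceptron rule with integer weights, all starting at zero.

    `samples` is a list of ((x1, x2), target) pairs with targets 0 or 1.
    """
    w = [0, 0]
    b = 0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out          # -1, 0 or +1
            w[0] += err * x1            # weights move only when the output is wrong
            w[1] += err * x2
            b += err
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Two tasks, one structure: logical AND and logical OR.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
OR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

if __name__ == "__main__":
    for name, data in (("AND", AND), ("OR", OR)):
        w, b = train_perceptron(data)
        print(name, "weights:", w, "bias:", b)
```

Nothing in the source code states what the weights will be; they are a by-product of the data, which is the sense in which the finished system is not designed part by part.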
These objections are similar to “quantitative” ones, which may take the form of: “there will never be a machine powerful enough to solve such a problem”. Most of these objections too have been disproved by facts, but the matter is not totally settled. They were raised especially during the sixties, when the solution of mathematical problems came to clash with the previously described “combinatorial explosion.” In applied mathematics “intractable” problems arise when calculation time grows at least exponentially with the number of variables. This therefore represents a “practical” impossibility: the same problem may be solved if the unknown quantities are few, but it becomes unsolvable (in a reasonable amount of time) as soon as their number increases. In this regard Turing often objected that “quantity” becomes “quality,” in the sense that above certain dimensions a system’s behavior may change radically, and suddenly make the impossible possible. Turing gave the example of nuclear reactors, which go from sub-critical to critical above certain dimensions, producing energy. This is simply a particular case of the well-known property of most dynamic systems, of going from stability to instability when variations occur in their parameters. The matter would therefore be to find a “structure” which becomes capable of dominating the combinatorial explosion upon reaching certain dimensions, for instance those of a network with a sufficient number of neurons.
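The exponential growth just described can be shown with a short sketch. It assumes, purely for illustration, a problem with n yes/no variables solved by trying every combination, and a hypothetical machine speed of one billion checks per second; both the function names and the speed are assumptions, not figures from the text.

```python
from itertools import product

def count_assignments(n):
    """Enumerate every combination of n binary variables and count them."""
    return sum(1 for _ in product((0, 1), repeat=n))

def estimated_seconds(n, checks_per_second=1e9):
    """Time to try all 2**n combinations at an assumed checking speed."""
    return 2 ** n / checks_per_second

if __name__ == "__main__":
    for n in (10, 20):
        print(n, "variables:", count_assignments(n), "combinations")
    for n in (30, 60, 90):
        print(n, "variables: about", estimated_seconds(n), "seconds")
```

Each added variable doubles the work, so a faster machine only shifts the threshold by a few variables; this is why the problem is one of structure and not merely of speed.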
Another problem concerning “dimensions”, much discussed by Hubert Dreyfus (1979), is connected to the huge amount of information (knowledge base) necessary, for example, to “contextualize” spoken language, and thus eliminate its unavoidable ambiguities. This knowledge base is nothing other than what we accumulate through learning. Hence, the problem is twofold: to create “memories” of the right size, and to place the information in them. The input problem can be divided into several sub-problems: a) how to build a “background knowledge” on which to base the learning; b) how to organize the learning process (which generally speaking should be a reinforcement one) in such a way as to optimize its results; c) how to realise inductive procedures generating knowledge from experience; d) how to control the acquisition of sensory data. Dreyfus answered these questions negatively, strongly pessimistic about the chances of resolving them, but his objections have turned into a powerful stimulus to find a solution to them.
3. The Intrinsic Limits of Mathematics and Logic. The realisation of AI, even weak AI, meets severe difficulties at a theoretical level, and generates a large number of objections. There is, for instance, the “termination problem” (more commonly called the halting problem): will the execution of a certain program end, or could it theoretically go on endlessly? This problem has no general solution: Turing proved that for every algorithm proposed to decide whether programs terminate, there is a program on which that algorithm fails to give a correct answer. The greatest difficulty comes from Gödel’s “incompleteness theorem,” which has often generated a somewhat harsh debate. The incompleteness theorem states that in any (sufficiently powerful and consistent) formal logical system, it is possible to formulate true propositions which the instruments of the system can nevertheless not prove true. In a famous article, Minds, Machines and Gödel (1961), John Lucas observed that Gödel’s theorem apparently demonstrates that mechanism is wrong and minds cannot be considered equivalent to machines.
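The core of Turing’s argument about termination can be sketched in a few lines. This is only an illustration under simplifying assumptions, not Turing’s original formulation: the candidate decider `halts` is assumed to be an ordinary deterministic function taking a zero-argument program and returning True or False, and all names are hypothetical. Given any such candidate, one can build a program that does the opposite of whatever the candidate predicts about it.

```python
def make_contrary(halts):
    """Build a program that contradicts the verdict `halts` gives about it."""
    def contrary():
        if halts(contrary):
            while True:      # predicted to halt, so loop forever
                pass
        # predicted to loop, so return at once
    return contrary

def decider_is_wrong(halts):
    """Check that the candidate decider misjudges its own contrary program."""
    contrary = make_contrary(halts)
    verdict = halts(contrary)
    if verdict:
        # By construction the program would enter the infinite loop,
        # so it is not run here.
        actually_halts = False
    else:
        contrary()           # safe to run: it returns immediately
        actually_halts = True
    return verdict != actually_halts

if __name__ == "__main__":
    print(decider_is_wrong(lambda program: True))
    print(decider_is_wrong(lambda program: False))
```

Whatever rule the candidate uses, the constructed program defeats it, which is why no single algorithm can settle termination for all programs.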
There is therefore something machines cannot do: decide the truth of undecidable propositions. Man, on the contrary, can do so, since he is capable of “placing himself outside the system,” for instance by applying Gödel’s theorem to the system itself. Douglas Hofstadter (1979, 2000), even though quoting this article at length and with admiration, sarcastically objects by arguing that a man “cannot” place himself outside the system, since this would lead to infinite regression. From a reductionist point of view such a statement is correct, but it does contradict common experience, which reveals how we are capable of overcoming the limits of pure logic and looking at logical problems “from the outside.” The question was analysed also by Roger Penrose (1989), who proposed a way out. Penrose, a well-known physicist, observes in support of Lucas’s thesis that, first, if the mind is able to understand non-computational mathematics it cannot be only a formal logical system. Then he adds another argument typical of his own field. Beginning by observing the radical dichotomy between the mathematical description of quantum mechanics and that of classical physics, still true at a macroscopic level, he then considers the fact that the laws of physics are reversible with respect to time, that is, they do not take into account the irreversibility of time expressed in the second principle of thermodynamics, and recognizable also in our own experience. Hence Penrose suggests that it could be possible to discover a “new,” more complete and profound physics, which may bring together classical physics and the quantum world, and be “asymmetric” with respect to time, enabling the understanding of the nature of the mind from a physical point of view. This physics, though, would imply also a new mathematics, “containing essentially not computable elements” (cf. G. Piccinini, 1994). 
These non computational mathematics would overcome the limits of AI, founded on computational mathematics, and could include operations which Gödel’s theorem forbids AI, but not the human mind, from doing.
4. Strong AI: Syntactic Elaboration and Semantic Content. In 1980 John Searle brought a totally different objection to strong AI, giving it the form of a witty example: the “thought experiment of the Chinese room.” It says: suppose I am in a room full of Chinese ideograms and, since I do not know the language, I am provided with a handbook of rules on how to associate ideograms with other ideograms. The rules identify ideograms according to their shape with no ambiguity, and they do not require me to understand them. Now suppose that Chinese-speaking people start introducing groups of ideograms into the room and I reply, using the rules in the handbook, by manipulating these ideograms and sending back different groups of them. If the rules in the handbook specify accurately enough which groups of ideograms may be associated with the ones introduced, so that the “answers” have a meaning and are coherent with the questions, the people outside the room may erroneously suppose the person inside understands Chinese. More precisely, they may suppose that he or she has performed a “syntactic” elaboration of the message based on his or her “semantic” understanding of it, while “semantics” were actually left outside the room (cf. Searle, 1992). This is what happens in all calculators (not only in AI): they perform syntactic operations on the messages introduced, totally independent of their semantic value. Semantics remain behind at the message’s entrance into the calculator and they are given back to the message by those who receive it at the exit. With this example Searle faces one of the problems which, as explained (see above, I.2), had dampened people’s enthusiasm at the beginning of AI studies twenty years before.
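The purely syntactic character of the operation can be illustrated with a deliberately trivial sketch. The rulebook below is invented for illustration and is not taken from Searle; the names `RULEBOOK`, `FALLBACK` and `chinese_room` are hypothetical. The program matches incoming strings by their shape alone and returns the associated string, with no representation of what either one means.

```python
# Hypothetical rulebook: pairs of symbol strings, matched only by shape.
RULEBOOK = {
    "你好吗": "我很好",
    "你会说中文吗": "会",
}

# Returned when no rule matches the incoming symbols.
FALLBACK = "请再说一遍"

def chinese_room(symbols: str) -> str:
    """Look the incoming symbols up in the rulebook and hand back the reply."""
    return RULEBOOK.get(symbols, FALLBACK)

if __name__ == "__main__":
    print(chinese_room("你好吗"))
```

To an outside observer who reads Chinese the replies may look sensible, yet the lookup would work identically if every string were replaced by an arbitrary code: whatever meaning there is lies with the people who wrote the rulebook and with those who read the output.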
Perhaps because of its strength, his argument has received a huge number of objections, sometimes quite picturesque. Paul and Patricia Churchland (husband and wife, Searle’s colleagues at the University of California, Searle at Berkeley and the Churchlands at San Diego), in underscoring the fact that they considered the experiment of the Chinese room a specious syllogism, opposed it with the “experiment of the luminous room” (cf. Churchland and Churchland, 1990). But the aim of their opposition consists in denying the essential and qualitative distinction between syntax and semantics: since every mental process is situated in the brain, both would be closely interconnected aspects of cerebral activity; and since semantics are situated in the brain, their apparent distance from syntactic elaborations would be connected with the extreme structural complexity of the brain. Therefore semantics too may be transferred to machines, if these are provided with sufficiently complex circuits and algorithms. From a reductionist point of view this position is irreproachable too, but it is opposed, with several arguments, by Hubert Dreyfus (1979), who denies calculators the capability of having semantic intelligence, and even high-level syntactic capabilities, the kind enabling a "Heideggerian thematization of their presence in the world, in other words, putting oneself at stake to the point of going beyond one’s initial context, and placing oneself in other contexts of reality sometimes containing the first and always remaining conscious of it. 
From this point of view, AI’s limit is not simply, […] not noting that it is necessary to use non computable quantum-relativistic physical structures to produce computers effectively simulating intelligence and conscience, on the contrary it consists in the much more crucial fact that real, and not artificial, intelligence and conscience have the capability of connecting different logical, syntactic and semantic levels, and of putting them continuously at stake, like no computer conceivable on physical (or physico-chemical, or biological, or in any case artificial) grounds seems possibly able to do" (Rossi, 1998, pp. 90-91).
IV. Artificial Intelligence between Science and Conscience
The previous considerations lead us to a crucial matter, the “problem of conscience.” Searle explains it with the thought experiment of the “brain prosthesis,” by supposing that an operation of extremely refined microsurgery may be able to substitute one by one all the neurons of a brain with electronic microcircuits working exactly like neurons, and reproduce all synaptic connections. What would then become of human conscience? According to Searle it would vanish. On the other side Hans Moravec reconsidered the matter eight years later from a “functionalistic” point of view, and he believes it would remain unaltered. But the question is basically a matter of defining what conscience is, and this, obviously, is not something solvable with scientific methods.
1. The Criticism of Reductionism. We spoke of a “reductionist perspective” (see above, III.1), meaning the mind’s identification with its material “support,” the brain, seen as a “machine”, fully reproducible by artificial devices. Hence, if there is a difference between the mind and a machine it should be attributed either to the machine’s temporary insufficiency, remediable in the long run, or to limitations the mind still does not know it has. More generally speaking, by reductionism we mean the idea that the human mind may be simulated, at least in principle, by artificial systems capable of reproducing its performance so perfectly as to make the two indistinguishable (cf. Rossi, 1998, pp. 43-44). Some authors believe simulation may reach the point where artificial systems possess properly human characteristics, such as conscience and intentionality. Although this is their position in principle, nearly none of the authors quoted here specifies the actual variants within it or the reasons that stand behind it: it appears “natural,” as if it were the only possible one. Jerry A. Fodor prefers to call this position “materialism,” and he opposes it to the Cartesian “dualism” he rejects because of "its failure to account adequately for mental causation. If the mind is non-physical, it has no position in physical space. How, then, can a mental cause give rise to a behavioural effect that has a position in space? To put it another way, how can the non-physical give rise to the physical without violating the laws of conservation of mass, of energy and of momentum?" (Fodor, 1981, p. 114). Within materialism there are also “behaviorist” positions, deriving from psychology, and others favouring neuro-physiological aspects instead. 
According to Fodor, separate from both materialism and dualism, there are “functionalistic” positions which set aside the structure of the brain, or of the systems simulating it, and instead focus their attention on the "possibility that systems as diverse as human beings, calculating machines and disembodied spirits could all have mental states" (ibidem). But Fodor does not consider the possibility that “mental states” and the intellectual faculties that produce them are essentially very different from one another.
The reader may attempt to recognize these different positions within the previous discussion. These considerations are totally consistent with the “anti-metaphysical” attitude ruling the whole of post-Galilean and post-Cartesian scientific research. But upon reaching this frontier land between body and mind, between physical and metaphysical, that attitude attains somewhat clashing results, at times reaching paradox. With a series of bold and fascinating considerations Hofstadter denies what everybody can experience, namely that the human mind is not stopped by Gödelian undecidability. Objections to the paradox of the Chinese room hold that there is no difference between syntax and semantics, and from a practical point of view the “aesthetic” ability of the chess master is defeated by the toilsome search for the right path in a terribly complicated tree of decisions. Generally, these positions favour the rational-deductive activity of the mind over its other faculties. In particular they ignore “intuitive” intelligence, for which the experience of the chess player should make the case. It is true, there are a few generous attempts to overcome the barrier of ratio (in the etymological sense of calculation) and the consequent aporia of the incompleteness theorem. Such attempts can be found especially in Searle and in Penrose, who do consider “non-algorithmic rationality,” but they are immediately exhausted, perhaps for their insufficient openness to the metaphysical perspective, but certainly also because of the furious reaction of opponents, which forces the tiresome defence of rearguard positions.
Here we want to quote an enlightening passage of Searle’s aforementioned article, because it sheds light on the gist of the question, the diverse nature of the brain and the machine, of “natural” and “artificial”: "Computer simulations of brain processes provide models of the formal aspects of these processes. But the simulation should not be confused with duplication. The computational model of mental processes is no more real than the computational model of any other natural phenomenon. One can imagine a computer simulation of the action of peptides in the hypothalamus that is accurate down to the last synapse, but equally one can imagine a computer simulation of the oxidation of hydrocarbons in a car engine or the action of digestive processes in a stomach when it is digesting pizza. And the simulation is no more the real thing in the case of the brains than it is in the case of the car or of the stomach. Barring miracles, you could not run your car by doing a computer simulation of the oxidation of gasoline, and you could not digest pizza by running the program that simulates such digestion. It seems obvious that a simulation of cognition will similarly not produce the effects of neurobiology of cognition" (Searle, 1990, p. 23). Out of any vis polemica, AI is only a “simulation model” for natural intelligence (and at present of only some of its aspects), which like all simulation models is very useful for practical ends, but nothing more and, if we were actually able to produce something similar to conscience and intentionality, we would still have to say that it is simply a simulation of conscience and intentionality. 
So artificial intelligence must play the role of a “sign of contradiction” for scientific research: on one side it reveals the “meta-scientific” character of the choice that Fodor vividly calls “materialistic,” and it proves how vain is any attempt to justify it remaining within the limits of positive science, as considered above; on the other side it reveals the intrinsic limits of such a choice, potentially a source for unsolvable aporias and results in contrast with experience. Other positions, not programmatically closed to a metaphysical perspective, could perhaps overcome these limits.
2. What AI must not do. The moral problem of the limits within which artificial intelligence techniques should move, and of the ways they should be used, is nothing other than a particular case of the more general problem of the correct use of technological instruments. Here, however, it acquires a special meaning, since it seems to be a matter of handing over to the machine choices which man should make. Just think, for instance, about the bioethical implications of automatic therapies “decided” by an expert system. In other words, while in general the use of technical instruments meets quantitative limits (for example: “you should not use too much energy, for it exhausts reserves and devastates the environment”), here the limitations are apparently of a more qualitative kind, being a matter of knowing “which” kinds of interventions artificial intelligence should be entrusted with, and which not. The whole matter should probably not be dramatised. If the paradigm of reductionism were true, and machines were provided with intentionality as well, then such a delegation would be deeply disquieting; but if, on the contrary and more reasonably, one keeps in mind that machines are programmed by human beings, and depend on human beings even when provided with “genetic” algorithms developing the programming in a non-prearranged manner, then the problem is only a matter of establishing what level of confidence to give the “intellectual prosthesis” in planning therapies or any other economically or socially relevant operation.
The problem thus returns once again to being a quantitative matter: the prudent use of technical instruments. Nevertheless, we must not deny that the availability of programs which “decide for us” could lead operators to assume a less responsible attitude and give up their responsibility by “delegating” it to the machine; if this were the case we could no longer say that machines depend on us. In general terms we can instead say what Romano Guardini sensed many years ago, that all instruments apparently provided with “autonomous capabilities” require man to possess at every moment complete “moral” dominion over the technological systems he makes use of (cf. The End of the Modern World, 1951).
Technical aspects: I. BRATKO, Prolog Programming for Artificial Intelligence (1986) (Harlow: Addison-Wesley, 2001); E. CHARNIAK, D. MCDERMOTT, Introduction to Artificial Intelligence (Reading, MA: Addison-Wesley, 1985); L. CONSOLE, E. LAMMA, P. MELLO, M. MILANO, Programmazione Logica e Prolog (Torino: Utet, 1997); R. DAVIS, B. BUCHANAN, E. SHORTLIFFE, “Production Rules as a Representation for a Knowledge-based Consultation Program,” Artificial Intelligence 8 (1977), pp. 15-45; R. DAVIS, D.B. LENAT, Knowledge-based Systems in Artificial Intelligence (New York: McGraw-Hill, 1982); J. DOYLE, T. DEAN, “Strategic Directions in Artificial Intelligence,” ACM Computing Surveys 28 (1996), n. 4, pp. 653-670; U.M. FAYYAD, G. PIATETSKY-SHAPIRO, P. SMYTH, R. UTHURUSAMY, Advances in Knowledge Discovery and Data Mining (Menlo Park, CA: AAAI Press, 1992); J.A. FREEMAN, D.M. SKAPURA, Neural Networks. Algorithms, Applications and Programming Techniques (Reading, MA: Addison-Wesley, 1991); M. GINSBERG, Essentials of Artificial Intelligence (San Mateo, CA: Morgan Kaufmann, 1993); D. GOLDBERG, Genetic Algorithms in Search, Optimization and Machine Learning (Reading, MA: Addison-Wesley, 1989); S. GROSSBERG, Studies of Mind and Brain. Neural Principles of Learning, Perception, Development, Cognition and Motor Control (Boston: Reidel, 1982); S. HAYKIN, Neural Networks (New York: Macmillan, 1994); J.A. HERTZ, A. KROGH, R.G. PALMER, Introduction to the Theory of Neural Computation (Redwood City, CA: Addison-Wesley, 1991); K. KNIGHT, E. RICH, Artificial Intelligence (New York: McGraw-Hill, 1991); N.R. JENNINGS, M.J. WOOLDRIDGE (eds.), Agent Technology (Berlin-New York: Springer, 1998); J. LUCAS, “Minds, Machines and Gödel,” Philosophy 37 (1961), pp. 37-39; W.S. MCCULLOCH, W. PITTS, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” Bulletin of Mathematical Biophysics 5 (1943), pp. 115-137; R.S. MICHALSKI, J.G. CARBONELL, T.M. MITCHELL (eds.), Machine Learning: An Artificial Intelligence Approach (Berlin-New York: Springer, 1984); M. MINSKY, S. PAPERT, Perceptrons (Cambridge, MA: MIT Press, 1969); M. MINSKY, The Society of Mind (New York: Touchstone Editions, 1988); T.M. MITCHELL, Machine Learning (New York: McGraw-Hill, 1997); J.R. QUINLAN, “Induction of Decision Trees,” Machine Learning 1 (1986), n. 1, pp. 81-106; J.A. ROBINSON, “A Machine-Oriented Logic Based on the Resolution Principle,” Journal of the ACM 12 (1965), n. 1, pp. 23-41; F. ROSENBLATT, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms (Washington D.C.: Spartan Books, 1962); S.J. RUSSELL, P. NORVIG, Artificial Intelligence. A Modern Approach (Englewood Cliffs, NJ: Prentice Hall International, 1994); E.H. SHORTLIFFE, Computer Based Medical Consultation: Mycin (New York: Elsevier, 1976); H.A. SIMON, The Sciences of the Artificial (Cambridge, MA: MIT Press, 1981); P.H. WINSTON, Artificial Intelligence (Reading, MA: Addison-Wesley, 1992).
Interdisciplinary aspects: G. BASTI, Il rapporto mente-corpo nella filosofia e nella scienza (Bologna: Edizioni Studio Domenicano, 1991); F. BERTELÈ, A. OLMI, A. SALUCCI, A. STRUMIA, Scienza, analogia, astrazione. Tommaso d’Aquino e le scienze della complessità (Padova: Il Poligrafo, 1999); M. BUNGE, The Mind-Body Problem. A Psychobiological Approach (Oxford: Oxford Univ. Press, 1980); J.-P. CHANGEUX, L’homme neuronal (Paris: A. Fayard, 1983); P.M. and P. CHURCHLAND, “Could a Machine Think?,” Scientific American, (1990), n. 262, pp. 26-31; H. DREYFUS, What Computers Can't Do. The Limits of Artificial Intelligence (New York: Harper & Row, 1979); J. ECCLES, The Self and its Brain. An Argument for Interactionism (Berlin: Springer International, 1977); J.A. FODOR, “The Mind-Body Problem,” Scientific American, (1981), n. 244, pp. 114-123; D.R. HOFSTADTER and THE FLUID ANALOGIES RESEARCH GROUP, Fluid Concepts and Creative Analogies. Computer Models of the Fundamental Mechanisms of Thought (London: Penguin, 1998); D.R. HOFSTADTER, Gödel, Escher, Bach. An Eternal Golden Braid (1979) (London: Penguin, 2000); E. NAGEL and J.R. NEWMAN, Gödel’s Proof (London: Routledge, 1989); R. PENROSE, The Emperor's New Mind. Concerning Computers, Minds, and the Laws of Physics (Oxford: Oxford Univ. Press, 1989); R. PENROSE, Shadows of the Mind. A Search for the Missing Science of Consciousness (Reading: Vintage, 1995); G. PICCININI, “Su una critica dell’intelligenza artificiale ‘forte’,” Rivista di filosofia 85 (1994), n. 1, pp. 141-146; A. ROSSI, Il fantasma dell’intelligenza. Alla ricerca della mente artificiale (Napoli: Cuen, 1998); R.J. RUSSELL et al. (eds.), Neuroscience and the Person. Scientific Perspectives on Divine Action (Vatican City-Berkeley, CA: Vatican Observatory Publications - Center for Theology and the Natural Sciences, 1999); J. SEARLE, Intentionality. An Essay in the Philosophy of Mind (New York: Cambridge Univ. Press, 1983); J. SEARLE, “Is the Brain's Mind a Computer Program?,” Scientific American, (1990), n. 262, pp. 20-25; J. SEARLE, The Rediscovery of the Mind (Cambridge, MA - London: MIT Press, 1992); J. SEARLE, The Mystery of Consciousness (New York: New York Review of Books, 1997); A. TURING, “Computing Machinery and Intelligence,” Mind 49 (1950), pp. 433-460.