Machine Intelligence

Machine Intelligence is advanced computing that enables a technology (a machine, device, or algorithm) to interact with its environment intelligently, meaning it can take actions to maximize its chance of successfully achieving its goals. The concept of machine intelligence highlights the intersection of machine learning and artificial intelligence, as well as the broad spectrum of opportunities and approaches in the field.[1]

To understand machine intelligence better, it helps to consider the term in the context of two other terms that are proliferating in today’s tech world – “artificial intelligence” and “machine learning.” Artificial intelligence is composed of systems that allow computers to imitate human cognitive processes or perform tasks that used to be done by humans. Machine learning refers to systems that enable a computer to learn from inputs, rather than following only explicitly programmed, step-by-step instructions.

In this context, another way to explain “machine intelligence” is that, building on machine learning and artificial intelligence, the machine learns to work proactively. Theoretically, if a machine learns to extract various kinds of data, put together its own processes, and arrive at its own conclusions, you could say that this constitutes machine intelligence based on both machine learning and artificial intelligence functionalities.[2]

Machine intelligence is what’s created when machines are programmed with some (but not all) aspects of human intelligence, including learning, problem solving and prioritization. With these (limited) abilities, a machine can tackle a complex set of problems.

Machine intelligence by necessity involves deductive logic. For example, systems exhibiting true machine intelligence come to understand when they’ve made mistakes, watch out for similar data that could lead to the same mistake, and avoid repeating it.

This means that machine intelligence will have a suite of different machine learning methods available to it, as well as a battery of automation techniques, and will smartly prioritize and deploy a sequence of them in the right order, with the right timing to achieve specific business goals. You can think of machine intelligence as a higher evolution of machine learning with prioritization and goals added in - a stepping stone on the path to true AI.[3]

The Reason Behind Machine Intelligence[4]
Machine intelligence arises from a foundation of machine learning and artificial intelligence. The machines essentially learn to work proactively. In practical terms, when a machine learns to extract a variety of data, put together its own processes, and arrive at its own conclusions, you can say that this constitutes machine intelligence built on both machine learning and artificial intelligence capabilities.

Programming a machine with some aspects of human intelligence, including learning, problem solving, and prioritization, gives rise to machine intelligence. It is indeed surprising how, with these limited abilities, a machine is able to tackle a complex set of problems.

Deductive logic is essential to making machine intelligence work. For instance, systems practicing true machine intelligence are able to understand when they’ve made mistakes. Likewise, they watch out for similar data that may lead to the same mistake later, and avoid repeating it.

This is how intelligent machines are born. A well-built one will have a variety of diverse machine learning methods available to it, along with a range of automation techniques, and will smartly prioritize and deploy a sequence of them in the right order and with the right timing to achieve specific business goals. Machine intelligence is, therefore, a higher evolution of machine learning; with priorities and goals added in, it is a stepping stone on the path to true artificial intelligence.

Tests of Machine Intelligence[5]

  • Turing Test: The classic approach to determining whether a machine is intelligent is the so-called Turing test, which has been extensively debated over the last 50 years. Turing realized how difficult it would be to define intelligence directly and thus attempted to sidestep the issue by setting up his now-famous imitation game: if human judges cannot effectively discriminate between a computer and a human through teletyped conversation, then we must conclude that the computer is intelligent. Though simple and clever, the test has attracted much criticism. Block and Searle argue that passing the test is not sufficient to establish intelligence. Essentially, both argue that a machine could appear to be intelligent without having any “real intelligence,” perhaps by using a very large table of answers to questions. While such a machine might be impossible in practice due to the vast size of the table required, it is not logically impossible, in which case an unintelligent machine could, at least in theory, consistently pass the Turing test. Some consider this to bring the validity of the test into question. In response to these challenges, even more demanding versions of the Turing test have been proposed, such as the Total Turing test, the Truly Total Turing test, and the inverted Turing test. Dowe argues that the Turing test should be extended by ensuring that the agent has a compressed representation of the domain area, thus ruling out look-up-table counterarguments. Of course, these attacks on the Turing test can be applied to any test of intelligence that considers only a system’s external behavior, that is, most intelligence tests. A more common criticism is that passing the Turing test is not necessary to establish intelligence. Usually this argument is based on the fact that the test requires the machine to have a highly detailed model of human knowledge and patterns of thought, making it a test of humanness rather than of intelligence. Indeed, even small things like pretending to be unable to perform complex arithmetic quickly and faking human typing errors become important, something which clearly goes against the purpose of the test.
  • Compression Tests: Mahoney has proposed a particularly simple solution to the binary pass-or-fail problem with the Turing test: replace the Turing test with a text compression test. In essence this is somewhat similar to a “Cloze test,” where an individual’s comprehension and knowledge in a domain is estimated by having them guess missing words from a passage of text. While simple text compression can be performed with symbol frequencies, the resulting compression is relatively poor. By using more complex models that capture higher-level features such as aspects of grammar, the best compressors are able to compress text to about 1.5 bits per character for English. However, humans, who can also make use of general world knowledge, the logical structure of the argument, and so on, are able to reduce this down to about 1 bit per character. Thus the compression statistic provides an easily computed measure of how complete a machine’s models of language, reasoning, and domain knowledge are, relative to a human. To see the connection to the Turing test, consider a compression test based on a very large corpus of dialogue. If a compressor could perform extremely well on such a test, this is mathematically equivalent to being able to determine which sentences are probable at a given point in a dialogue, and which are not (compression and prediction being equivalent). Thus, as failing a Turing test occurs when a machine (or person!) generates a sentence which would be improbable for a human, extremely good performance on dialogue compression implies the ability to pass a Turing test. A recent development in this area is the Hutter Prize. In this test the corpus is a 100 MB extract from Wikipedia. The idea is that this should represent a reasonable sample of world knowledge, and thus any compressor that can perform very well on this test must have a good model of not just English, but also world knowledge in general.
One criticism of compression tests is that it is not clear whether a powerful compressor would easily translate into a general purpose artificial intelligence. Also, while a young child has a significant amount of elementary knowledge about how to interact with the world, this knowledge would be of little use when trying to compress an encyclopedia full of abstract “adult knowledge” about the world.
  • Linguistic complexity: A more linguistic approach is taken by the HAL project at the company Artificial Intelligence NV. They propose to measure a system’s level of conversational ability by using techniques developed to measure the linguistic ability of children. These methods examine things such as vocabulary size, length of utterances, response types, syntactic complexity and so on. This would allow systems to be “. . . assigned an age or a maturity level beside their binary Turing test assessment of ‘intelligent’ or ‘not intelligent’ ”. As they consider communication to be the basis of intelligence, and the Turing test to be a valid test of machine intelligence, in their view the best way to develop intelligence is to retrace the way in which human linguistic development occurs. Although they do not explicitly refer to their linguistic measure as a test of intelligence, because it measures progress towards what they consider to be a valid intelligence test, it acts as one.
  • Multiple Cognitive Abilities: A broader developmental approach is being taken by IBM’s Joshua Blue project. In this project they measure the performance of their system by considering a broad range of linguistic, social, association and learning tests. Their goal is to first pass what they call a “toddler Turing test,” that is, to develop an AI system that can pass as a young child in a setup similar to the Turing test. Another company pursuing a similar developmental approach based on measuring system performance through a broad range of cognitive tests is the a2i2 project at Adaptive AI. Rather than toddler-level intelligence, their current goal is to work toward a level of cognitive performance similar to that of a small mammal. The idea is that even a small mammal has many of the key cognitive abilities required for human-level intelligence working together in an integrated way.
  • Competitive Games: The Turing Ratio method of Masum et al. places more emphasis on tasks and games than on cognitive tests. Similar to our own definition, they propose that “. . . doing well at a broad range of tasks is an empirical definition of ‘intelligence’.” To quantify this they seek to identify tasks that measure important abilities, admit a series of strategies that are qualitatively different, and are reproducible and relevant over an extended period of time. They suggest a system of measuring performance through pairwise comparisons between AI systems that is similar to that used to rate players in the international chess rating system. The key difficulty, however, which the authors acknowledge is an open challenge, is to work out what these tasks should be, and to quantify just how broad, important and relevant each is. In our view these are some of the most central problems that must be solved when attempting to construct an intelligence test. Thus we consider this approach to be incomplete in its current state.
  • Collection of Psychometric Tests: An approach called Psychometric AI tries to address the problem of what to test for in a pragmatic way. In the view of Bringsjord and Schimanski, “Some agent is intelligent if and only if it excels at all established, validated tests of (human) intelligence.” They later broaden this to also include “tests of artistic and literary creativity, mechanical ability, and so on.” With this as their goal, their research is focused on building robots that can perform well on standard psychometric tests designed for humans, such as the Wechsler Adult Intelligence Scale and Raven’s Progressive Matrices. As effective as these tests are for humans, we believe that they are unlikely to be adequate for measuring machine intelligence. For a start, they are highly anthropocentric. Another problem is that they embody basic assumptions about the test subject that are likely to be violated by computers. For example, consider the fundamental assumption that the test subject is not simply a collection of specialized algorithms designed only for answering common IQ test questions. While this is obviously true of a human, or even an ape, it may not be true of a computer. The computer could be nothing more than a collection of specific algorithms designed to identify patterns in shapes, predict number sequences, write poems on a given subject, or solve verbal analogy problems, all things that AI researchers have worked on. Such a machine might be able to obtain a respectable IQ score [SD03], even though outside of these specific test problems it would be next to useless. If we try to correct for these limitations by expanding beyond standard tests, as Bringsjord and Schimanski seem to suggest, this once again opens up the difficulty of exactly what, and what not, to test for. Thus Psychometric AI can be considered, at least as it is currently formulated, to address this central question only partially.
  • C-Test: One perspective among psychologists who support the g-factor view of intelligence is that intelligence is “the ability to deal with complexity.” Thus, in a test of intelligence, the most difficult questions are the ones that are the most complex, because these will, by definition, require the most intelligence to solve. It follows that if we could formally define and measure the complexity of test problems using complexity theory, we could construct a formal test of intelligence. The possibility of doing this was perhaps first suggested by Chaitin. While this path presents numerous difficulties to be dealt with, we believe that it is the most natural and offers many advantages: it is formally motivated, precisely defined, and could potentially be used to measure the performance of both computers and biological systems on the same scale, without the problem of bias towards any particular species or culture. The C-Test consists of a number of sequence prediction and abduction problems similar to those that appear in many standard IQ tests. The test has been successfully applied to humans with intuitively reasonable results. Similar to standard IQ tests, the C-Test always ensures that each question has an unambiguous answer, in the sense that there is always one hypothesis consistent with the observed pattern that has significantly lower complexity than the alternatives. Other than making the test easier to score, this has the added advantage of reducing the test’s sensitivity to changes in the reference machine. The key difference from sequence problems that appear in standard intelligence tests is that the questions are based on a formally expressed measure of complexity. To overcome the problem of Kolmogorov complexity not being computable, the C-Test instead uses Levin’s Kt complexity.
In order to retain the invariance property of Kolmogorov complexity, Levin complexity requires the additional assumption that the universal Turing machines are able to simulate each other in linear time. As far as we know, this is the only formal definition of intelligence that has so far produced a usable test of intelligence. Our main criticism of the C-Test is that it is a static test limited to passive environments. As we have argued earlier, we believe that a better approach is to use dynamic intelligence tests where the agent must interact with an environment in order to solve problems.
  • Smith’s Test: Another complexity-based formal definition of intelligence that appeared recently in an unpublished report is due to W. D. Smith. His approach has a number of connections to our work; indeed, Smith states that his work is largely a “. . . rediscovery of recent work by Marcus Hutter.” Perhaps this overstates the similarities: while there are some connections, there are also many important differences. The basic structure of Smith’s definition is that an agent faces a series of problems that are generated by an algorithm. In each iteration the agent must try to produce the correct response to the problem it has been given. The problem generator then responds with a score of how good the agent’s answer was. If the agent so desires, it can submit another answer to the same problem. At some point the agent requests that the problem generator move on to the next problem, and the score the agent received for its last answer to the current problem is then added to its cumulative score. Each interaction cycle counts as one time step, and the agent’s intelligence is then its total cumulative score considered as a function of time. In order to keep things feasible, the problems must all be in the complexity class P, that is, decision problems which can be solved by a deterministic Turing machine in polynomial time.
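For reference, the Levin Kt complexity used by the C-Test can be written as follows (one standard formulation; here $U$ is a universal Turing machine, $|p|$ the length in bits of program $p$, and $\mathrm{time}(U,p,x)$ the number of steps $U$ takes to output $x$ from $p$):

```latex
\mathrm{Kt}_U(x) = \min_{p} \bigl\{\, |p| + \log \mathrm{time}(U, p, x) \;:\; U(p) = x \,\bigr\}
```

Dropping the $\log \mathrm{time}$ term recovers Kolmogorov complexity, which is uncomputable; penalizing slow programs is what makes a practically scorable test possible.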
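The bits-per-character statistic behind the compression tests above is easy to compute in practice. A minimal sketch using Python’s standard zlib compressor (a far weaker model than the specialized compressors discussed above, so its figures will sit well above the cited 1.5 bits per character on ordinary English; the function name and sample text are illustrative):

```python
import zlib

def bits_per_character(text: str) -> float:
    """Compressed size in bits divided by the number of input bytes."""
    data = text.encode("utf-8")
    compressed = zlib.compress(data, level=9)
    return len(compressed) * 8 / len(data)

# A better model of the text yields fewer bits per character: highly
# repetitive text compresses far better than a single short sentence,
# because DEFLATE's dictionary captures the redundancy.
sentence = "The quick brown fox jumps over the lazy dog. "
print(f"repeated: {bits_per_character(sentence * 200):.2f} bits/char")
print(f"once:     {bits_per_character(sentence):.2f} bits/char")
```

Substituting a stronger statistical model for zlib is exactly what Hutter Prize entrants do; the measurement itself stays this simple.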
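The interaction protocol of Smith’s test can be sketched in a few lines. This is an illustrative toy, not Smith’s actual construction: the parity task stands in for an arbitrary problem in P, all names are invented for the sketch, and the option of re-answering the same problem is omitted:

```python
import random

def make_problems(rng):
    """Problem generator: each problem is a bit string; the task
    (report its parity) is a stand-in for a decision problem in P."""
    while True:
        yield [rng.randint(0, 1) for _ in range(8)]

def score(problem, answer):
    """The generator scores the agent's answer: 1 if correct, 0 otherwise."""
    return 1 if answer == sum(problem) % 2 else 0

def run_test(agent, steps=100, seed=0):
    """Return the agent's cumulative score as a function of time,
    which is what Smith's definition takes as its intelligence."""
    rng = random.Random(seed)
    problems = make_problems(rng)
    cumulative, history = 0, []
    for _ in range(steps):
        problem = next(problems)
        cumulative += score(problem, agent(problem))
        history.append(cumulative)
    return history

# An agent that solves the toy task perfectly scores 1 per time step.
perfect_agent = lambda p: sum(p) % 2
print(run_test(perfect_agent, steps=5))  # → [1, 2, 3, 4, 5]
```

Comparing the growth rates of these cumulative-score curves across agents is the sense in which the definition measures intelligence over time.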

References

  1. “What Does Machine Intelligence Mean?” AMII.
  2. “Explaining Machine Intelligence.” Techopedia.
  3. “Understanding Machine Intelligence.” Forbes.
  4. “The Reason Behind Machine Intelligence.” WAC.
  5. “Tests of Machine Intelligence.” Shane Legg and Marcus Hutter.