In my PhD project, I do research in the area of “concept formation”. Before talking about my PhD research in more detail, I would like to use this post to give a quick introduction to this area.
A concept could be defined as follows:
A concept is an abstract idea representing the fundamental characteristics of what it represents. (Wikipedia)
In practice, a concept is a representation of a type of thing in the world: The concept of an apple is an abstract description of all apples (containing information about an apple’s typical size, color, taste, etc.) and is different from the concept of a dog or the concept of a book.
In one of my previous posts, I gave a brief introduction to the conceptual spaces framework, which tries to represent such concepts in a geometric way: every concept corresponds to a region in a space.
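To make the idea of "a concept as a region" a bit more tangible, here is a minimal sketch in Python. It rests on several assumptions that are not part of the framework itself: the two dimensions (hue and diameter), the numeric bounds, and the class name `BoxConcept` are all made up for illustration, and an axis-aligned box is simply the easiest region to write down.

```python
from dataclasses import dataclass


@dataclass
class BoxConcept:
    """A concept modelled as an axis-aligned box (illustrative simplification)."""
    name: str
    lower: tuple  # lower bound per dimension
    upper: tuple  # upper bound per dimension

    def contains(self, point):
        """Return True if the point lies inside the box on every dimension."""
        return all(lo <= x <= hi
                   for lo, x, hi in zip(self.lower, point, self.upper))


# Hypothetical dimensions: (hue in [0, 1], diameter in cm)
apple = BoxConcept("apple", lower=(0.0, 5.0), upper=(0.3, 10.0))

inside = apple.contains((0.1, 7.5))    # a reddish object of 7.5 cm
outside = apple.contains((0.1, 30.0))  # same hue, but far too large
```

An observation then counts as an instance of the concept exactly when its point falls inside the region.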
The area of “concept formation” is located within the overall field of AI. Let’s assume that we want our artificial agent to store its knowledge in the form of concepts. Because we don’t want to create all these concepts by hand (which would be a lot of work given the large number of existing concepts), we need some principled way of deriving these concepts. This is what concept formation is all about.
It seems obvious that little children are not born with a plethora of concepts already in their minds. Instead, they most likely acquire these concepts over time by observing the world and interacting with it. Although they receive helpful guidance from their parents, this parental feedback is quite scarce: Usually, parents don’t constantly provide labels for every object the child sees (“This is a ball. This is an apple. This is a dog.”), but only every now and then.
Clearly, if we want to build an intelligent machine that also makes use of concepts, we need to find a program that can discover concepts based on its interactions with the real world. The input to such a program would be the different objects that are being observed one after another. Its output should consist of a set of concepts, i.e., generalizations of the inputs. Importantly, we usually assume that the input to this program comes without labels: the program only observes objects, but it is not told which category they belong to. In other words, we are talking about an unsupervised machine learning problem. If we take into account some scarce feedback (like the feedback provided to children by adults), we end up with something very similar to semi-supervised clustering.
I would like to point out that in concept formation, one usually assumes that the observations are made one after another. This differs from most machine learning approaches, where one typically assumes that a large set of observations is given at once. The latter case could be called “batch processing”, whereas in concept formation, incremental processing is taking place: After each observation (or handful of observations), the system needs to update its conceptualization. One could say that the size of the memory for old observations is limited. Assuming a limited memory size is one way of imposing constraints on the concept formation problem, in the hope that this yields a cognitively plausible program (i.e., a program that is similar to a process occurring in the human mind).
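The difference between batch and incremental processing can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the approach developed in my PhD research: each concept is reduced to a single prototype (a running mean), and the class name `IncrementalConceptFormer` and the distance `threshold` are hypothetical choices made for this example. The important property is that every unlabeled observation is processed once and then discarded, so no memory of old observations is needed.

```python
import math


class IncrementalConceptFormer:
    """Toy incremental clusterer: one prototype (running mean) per concept."""

    def __init__(self, threshold):
        # Hypothetical parameter: maximum distance at which an observation
        # is still merged into an existing concept.
        self.threshold = threshold
        self.prototypes = []  # one mean vector per concept
        self.counts = []      # number of observations merged into each concept

    def observe(self, point):
        """Process one unlabeled observation and return its concept index."""
        if self.prototypes:
            dists = [math.dist(point, p) for p in self.prototypes]
            best = min(range(len(dists)), key=dists.__getitem__)
            if dists[best] <= self.threshold:
                # Update the running mean without storing the observation.
                self.counts[best] += 1
                n = self.counts[best]
                self.prototypes[best] = [
                    m + (x - m) / n
                    for m, x in zip(self.prototypes[best], point)
                ]
                return best
        # Nothing close enough: the observation founds a new concept.
        self.prototypes.append(list(point))
        self.counts.append(1)
        return len(self.prototypes) - 1


former = IncrementalConceptFormer(threshold=1.0)
stream = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (0.1, 0.3), (5.2, 4.8)]
assignments = [former.observe(p) for p in stream]  # one observation at a time
```

A batch algorithm would see the whole `stream` at once. Here, the conceptualization is updated after each call to `observe`, and the result can depend on the order in which the observations arrive.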
My PhD research is basically all about finding such a concept formation program that works within the conceptual spaces framework, i.e., that uses the framework of conceptual spaces in order to represent the concepts it discovers. I’ll explain my approach in more detail in future blog posts.