About half a year ago, I mentioned “Logic Tensor Networks” in my short summary of the Dagstuhl seminar on neural-symbolic computation. I think that this is a highly interesting approach, and as I intend to work with it in the future, I will briefly introduce this framework today.
Logic Tensor Networks [1] provide a nice way of combining neural networks with symbolic rules by using fuzzy logic. Well, what does that mean? Let’s approach this step by step.
As I already mentioned in one of my first posts, one of the current challenges in AI is to combine explicit, symbolic knowledge in the form of rules (like “red apples are sweet”) with implicit, subsymbolic knowledge like the weights of a neural network (e.g., of a neural network that can decide whether there is an apple in an image or not). Logic Tensor Networks (or LTNs, for short) give us a way to combine both types of knowledge with each other. Here is how this roughly works:
Every concept is represented by a logical predicate. For instance, the concept “apple” is written as a function “apple(x)” which takes as an argument any object x and returns true if this object is an apple and false if it is not. Abstract rules like “red apples are sweet” can then be formulated as logical formulas involving these predicates:
∀x: apple(x) ∧ red(x) → sweet(x)
This statement can be interpreted as follows: “For all possible objects, the following is true: If the object is both an apple and red, then this object must also be sweet.” In other words: all red apples are sweet.
This by itself is not really interesting yet – that’s the standard way of encoding symbolic knowledge, after all! So let’s take this one step further by introducing fuzzy logic:
I’ve already explained earlier that in fuzzy logic, you assume that statements don’t have to be either completely true or completely false. Instead, you allow for statements to be partially true by defining a “degree of truth” – usually a number between 0 and 1 (where 0 corresponds to “completely false” and 1 corresponds to “completely true”). Applying this to our concepts/predicates, we get a degree of membership, similarly to what I proposed in my formalization of conceptual spaces. So apple(x) will now return a number between 0 and 1 which indicates how much the object x can be considered to be an apple. This way, we can express imprecise knowledge and imprecise concept boundaries.
Of course, one can also combine multiple fuzzy truth values with each other. For instance, let’s say that apple(x) = 0.8 and red(x) = 0.5 for some object x. Now the question is: What is the degree of truth for a formula like apple(x) ∧ red(x)? The most common definition is to simply take the minimum of the two individual truth values. That is, the degree of truth for apple(x) ∧ red(x) would be the minimum of apple(x) = 0.8 and red(x) = 0.5, which is 0.5. Similarly, one can also define degrees of truth for disjunction, negation, implication, etc.
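To make this concrete, here is a minimal Python sketch of such fuzzy connectives, using the minimum/maximum definitions (the implication as ¬a ∨ b is just one common choice – LTNs can also be set up with other operators, e.g. the Łukasiewicz ones):

```python
# Minimal sketch of min/max-based fuzzy connectives (one common choice).

def f_and(a, b):       # conjunction: minimum of the two truth values
    return min(a, b)

def f_or(a, b):        # disjunction: maximum of the two truth values
    return max(a, b)

def f_not(a):          # negation
    return 1.0 - a

def f_implies(a, b):   # implication interpreted as "not a, or b"
    return f_or(f_not(a), b)

apple_x, red_x, sweet_x = 0.8, 0.5, 0.9
print(f_and(apple_x, red_x))                      # 0.5 = truth of apple(x) ∧ red(x)
print(f_implies(f_and(apple_x, red_x), sweet_x))  # 0.9 = truth of apple(x) ∧ red(x) → sweet(x)
```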
Okay, but again: Why is this special? That’s just standard fuzzy logic up to now! All right, here comes the final twist:
Every one of our concepts/predicates is now represented by a neural network and objects are represented by points in a feature space. The input to the neural network for “apple” is a point in the feature space and its output is a number between 0 and 1 – the degree of membership of this object to the “apple” concept! The neural network is therefore used to learn the membership function for the given concept. The underlying feature space can be based on features extracted from images or other sensory data. Why is this useful?
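As an illustration, here is a simplified sketch of such a predicate network in Python/PyTorch – a small feed-forward network with a sigmoid output, not the exact tensor-network layer used in the paper:

```python
import torch
import torch.nn as nn

# Simplified sketch: a predicate like "apple" as a small feed-forward network
# that maps a point in feature space to a membership degree in [0, 1].
class Predicate(nn.Module):
    def __init__(self, n_features: int, n_hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.Tanh(),
            nn.Linear(n_hidden, 1),
            nn.Sigmoid(),          # output in [0, 1] = degree of membership
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

apple = Predicate(n_features=4)    # e.g., four features extracted from an image
x = torch.rand(4)                  # one object as a point in the feature space
print(apple(x).item())             # how much x is considered to be an apple
```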
Well, we now just found a way to connect subsymbolic information (weights in a neural network that classifies for instance images) with symbolic logical information (like the rule “all red apples are sweet”). Whenever we observe a new object (e.g., based on an image), we can use the neural networks to classify it, and then we can use our formulas to do reasoning on it. That’s quite nice.
Moreover, we can use symbolic rules to guide the process of learning the neural networks’ weights. We can optimize the weights of the “apple” network not only such that it correctly classifies images as apples, but also such that the fuzzy formula apple(x) ∧ red(x) → sweet(x) is true for all points x in our feature space. The symbolic rules therefore give us additional constraints for the classification.
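Here is a hedged sketch of how this could look, reusing the Predicate class from above (the names and the weighting factor are illustrative, not taken from the LTN implementation): the rule is turned into a differentiable penalty that is added to the usual classification loss.

```python
import torch

def rule_penalty(apple, red, sweet, x_batch):
    """Penalty that is 0 when apple(x) ∧ red(x) → sweet(x) holds on the whole batch."""
    antecedent = torch.minimum(apple(x_batch), red(x_batch))   # fuzzy conjunction (min)
    truth = torch.maximum(1.0 - antecedent, sweet(x_batch))    # fuzzy implication (¬a ∨ b)
    return (1.0 - truth).mean()

# total_loss = classification_loss + rule_weight * rule_penalty(apple, red, sweet, x_batch)
```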
Furthermore, after we have learned the weights of our neural networks, we can also check whether other rules that we didn’t use during the learning process (like “apple(x) ∧ green(x) → sour(x)”) are true. So we can query our networks to find additional rules that we didn’t have before.
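Continuing the sketch from above (again with illustrative names, assuming “green” and “sour” are further Predicate networks), the universally quantified rule can be checked approximately by evaluating its fuzzy truth value on a sample of feature-space points and aggregating, e.g. with the minimum:

```python
# "green" and "sour" as further (here untrained) predicate networks.
green = Predicate(n_features=4)
sour = Predicate(n_features=4)

x_sample = torch.rand(1000, 4)                                  # sample of feature-space points
antecedent = torch.minimum(apple(x_sample), green(x_sample))    # apple(x) ∧ green(x)
truth = torch.maximum(1.0 - antecedent, sour(x_sample))         # ... → sour(x)
print(truth.min().item())  # minimum over the sample approximates the ∀-quantified truth value
```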
In my opinion, some quite nice opportunities arise from this approach.
How does all of this relate to my research on conceptual spaces? What are my concrete plans for using this framework? I’ll address these questions in one of my next blog posts – so stay tuned!
References
[1] Luciano Serafini and Artur d’Avila Garcez: “Logic Tensor Networks: Deep Learning and Logical Reasoning from Data and Knowledge”, arXiv, 2016.
I like where this is going. While I am still too naive about first-order logic to comment intelligently about possible directions, I do know from my background reading that a few labs have attacked the analogy/metaphor problem from a symbolic point of view. Once we can use conceptual spaces to glue together the connectionist level and the symbolic level, a whole world of opportunity opens to compute metaphor (besides just basic vector subtraction, like what Word2Vec uses). For example, see Selmer Bringsjord’s AI and Reasoning Lab at Rensselaer.
http://rair.cogsci.rpi.edu/