Wednesday, February 24, 2010

Jena: Node versus RDFNode

The Jena code has two representations for nodes in an RDF graph. One is the class Node, which has several subclasses: Node_Variable, Node_Literal, Node_URI, etc. The other is the interface RDFNode, which has many subinterfaces: Literal, Resource, Property, etc.

These two node representations have very different roles and very different idiomatic usages, and this doesn't appear to be spelled out anywhere in the Jena documentation. RDFNode is in the com.hp.hpl.jena.rdf.model package, whereas Node is in the com.hp.hpl.jena.graph package, but I don't think the packaging by itself is a big enough hint.

The Jena tutorials mostly talk only about the RDFNode variants, usually instantiating them by calling a "create" method on the Model. The poorly documented distinction between RDFNode and Node extends to the distinction between Model and Graph, and between Statement and Triple.

Since this information didn't appear in the documentation, we need to look at the Jena mailing list to find it:

"A key difference between Resource and Node is that Resources know which model they are in, and Nodes are general. That's what makes resource.getProperty() work. Now in a query that is not a concept that has any meaning in the general case and patterns can span graphs.

"We have found that Model/Statement/RDFNode (the API) works as an application interface but it's not the right thing for storage abstractions and the Graph/Triple/Node (the SPI) works better where the regularity is more valuable. That is, we have split the application-facing design from the sub-system-facing design."

So an instance of RDFNode is associated with a specific Model, whereas an instance of Node is free-floating and is used to build Rules, which are also model-independent. The two representations can be connected by URIs. If you have a URI Node and a Model, and you want the corresponding RDFNode, do this (or use createProperty or createLiteral as needed):
Resource r = model.createResource(node.getURI());
and if you have an RDFNode that is a URI resource, you can do this to get a Node:
Node uriNode = Node.createURI(
        ((Resource)rdfNode).getURI());
Note that the cast only works when the RDFNode really is a Resource with a URI: a Literal would fail the cast, and a blank node has no URI to pass along.
So I can understand that there are two very different appropriate interfaces, one for writing Jena apps and one for interfacing to a storage system. What I don't get is why I would ever see the storage-facing one while writing an application. Yet if I define a Rule, I need to deal in Nodes. Presumably this is because I've been constructing Rules programmatically rather than just reading them in from a file. Maybe I should stick with reading them from a file.
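
To make the model-bound versus free-floating distinction concrete, here is a toy sketch in plain Java. It is not Jena code: ToyNode, ToyGraph, and ToyResource are hypothetical stand-ins I made up for Node, Graph, and Resource, and the example.org URIs are placeholders. The only point is to show why something like getProperty() needs a node that remembers which container it came from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Toy illustration only: these classes are hypothetical, not part of Jena.
public class NodeVersusResourceSketch {

    /** Free-floating node: just a URI, with no reference to any graph. */
    static final class ToyNode {
        final String uri;

        ToyNode(String uri) {
            this.uri = Objects.requireNonNull(uri);
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof ToyNode && ((ToyNode) other).uri.equals(uri);
        }

        @Override
        public int hashCode() {
            return uri.hashCode();
        }
    }

    /** Storage-side container: holds triples made of free-floating nodes. */
    static final class ToyGraph {
        private final List<ToyNode[]> triples = new ArrayList<>();

        void add(ToyNode subject, ToyNode predicate, ToyNode object) {
            triples.add(new ToyNode[] {subject, predicate, object});
        }

        /** Returns the first object matching (subject, predicate), or null. */
        ToyNode findObject(ToyNode subject, ToyNode predicate) {
            for (ToyNode[] t : triples) {
                if (t[0].equals(subject) && t[1].equals(predicate)) {
                    return t[2];
                }
            }
            return null;
        }

        /** Wraps a free-floating node in a view bound to this graph. */
        ToyResource asResource(ToyNode node) {
            return new ToyResource(node, this);
        }
    }

    /** Application-side view: a node plus the graph it belongs to. */
    static final class ToyResource {
        private final ToyNode node;
        private final ToyGraph graph;

        ToyResource(ToyNode node, ToyGraph graph) {
            this.node = node;
            this.graph = graph;
        }

        /** Drops the graph binding and returns the bare node. */
        ToyNode asNode() {
            return node;
        }

        /** Only possible because this view knows which graph to search. */
        ToyResource getProperty(ToyNode predicate) {
            ToyNode object = graph.findObject(node, predicate);
            return object == null ? null : new ToyResource(object, graph);
        }
    }

    public static void main(String[] args) {
        ToyNode alice = new ToyNode("http://example.org/alice");
        ToyNode knows = new ToyNode("http://example.org/knows");
        ToyNode bob = new ToyNode("http://example.org/bob");

        ToyGraph first = new ToyGraph();
        ToyGraph second = new ToyGraph();
        first.add(alice, knows, bob);

        // The same bare node gives different answers depending on the graph
        // it is bound to.
        ToyResource found = first.asResource(alice).getProperty(knows);
        System.out.println("in first graph: " + (found == null ? null : found.asNode().uri));
        System.out.println("in second graph: " + second.asResource(alice).getProperty(knows));
    }
}
```

In this sketch the bare node can be shared between graphs (or used in a rule that belongs to no graph at all), while the resource wrapper is what gives an application convenient navigation. As far as I can tell, that is the same trade-off the mailing-list message describes.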

1 comment:

shellac said...

To convert an RDFNode to a Node, use:

Node node = rdfNode.asNode();

the reverse operation is:

RDFNode rdfNode = model.asRDFNode(node);