Will Ware's blog: February 2010

Saturday, February 27, 2010

Learning Ruby on Rails

I'm learning Ruby on Rails, both to help a friend with his website and to be able to put it on my resume, and I'm keeping notes as I go.

I've gotten the thing to do typical CGI script stuff, and now I'm figuring out how database access works. One big surprise is that as Rails advanced to version 2.0, one of the basic commands for setting up database access changed. Google "rails 2.0 scaffolding" for details.

Wednesday, February 24, 2010

Jena: Node versus RDFNode

The Jena code has two representations for nodes in an RDF graph. One is the class Node, which has several subclasses: Node_Variable, Node_Literal, Node_URI, etc. The other is the interface RDFNode, which has many subinterfaces: Literal, Resource, Property, etc.

These two node representations have very different roles and very different idiomatic usages, and this doesn't appear to be spelled out anywhere in the Jena documentation. RDFNode is in the com.hp.hpl.jena.rdf.model package, whereas Node is in the com.hp.hpl.jena.graph package, but I don't think the packaging by itself is a big enough hint.

The Jena tutorials mostly talk only about the RDFNode variants, usually instantiating them by calling a "create" method on the Model. The poorly documented distinction between RDFNode and Node extends to the distinction between Model and Graph, and between Statement and Triple.

Since this information doesn't appear in the documentation, I had to look to the Jena mailing list, where I found the following explanation.

A key difference between Resource and Node is that Resources know which model they are in, and Nodes are general. That's what makes resource.getProperty() work. Now in a query that is not a concept that has any meaning in the general case and patterns can span graphs.

We have found that Model/Statement/RDFNode (the API) works as an application interface but it's not the right thing for storage abstractions and the Graph/Triple/Node (the SPI) works better where the regularity is more valuable. That is, we have split the application-facing design from the sub-system-facing design.

So an instance of RDFNode is associated with a specific Model, whereas an instance of Node is free-floating, and is used to build Rules, which are also model-independent. The two representations can be connected by URIs. If you have a URI Node (called node here) and a Model, and you want the corresponding RDFNode, do this (or use createProperty or createLiteral as needed):

Resource r = model.createResource(node.getURI());

and if you have an RDFNode, you can do this to get a Node:

Node uriNode = Node.createURI(
        ((Resource)rdfNode).getURI());

So I can understand that there are two very different appropriate interfaces for writing Jena apps and for interfacing to a storage system. What I don't get is why I would ever see the latter while writing an application. If I define a Rule, I need to deal in Nodes. Presumably this is because I've been constructing Rules programmatically rather than just reading them in from a file. Maybe I should stick with the latter.
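
To make the distinction concrete for myself, here is a toy sketch in plain Java. None of these classes are Jena's: ToyNode, ToyModel and ToyResource are hypothetical names I made up, and the only point is to show why a model-bound resource can answer getProperty() while a free-floating node cannot.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeVersusResource {
    /** Free-floating: just a URI, with no idea which graph it appears in. */
    static final class ToyNode {
        final String uri;
        ToyNode(String uri) { this.uri = uri; }
    }

    /** A bag of subject/predicate/object triples, stored as plain URIs. */
    static final class ToyModel {
        final List<String[]> triples = new ArrayList<>();

        void add(String s, String p, String o) {
            triples.add(new String[] { s, p, o });
        }

        /** Wraps a URI in a resource that remembers this model. */
        ToyResource createResource(String uri) {
            return new ToyResource(this, uri);
        }
    }

    /** Model-bound: the back-reference is what makes getProperty possible. */
    static final class ToyResource {
        final ToyModel model;
        final String uri;

        ToyResource(ToyModel model, String uri) {
            this.model = model;
            this.uri = uri;
        }

        /** Returns the first object for this subject and predicate, or null. */
        String getProperty(String predicate) {
            for (String[] t : model.triples) {
                if (t[0].equals(uri) && t[1].equals(predicate)) {
                    return t[2];
                }
            }
            return null;
        }

        /** Dropping the model reference gives the free-floating form. */
        ToyNode asNode() {
            return new ToyNode(uri);
        }
    }

    public static void main(String[] args) {
        ToyModel model = new ToyModel();
        model.add("#Cat", "#is-a", "#Mammal");

        ToyNode node = new ToyNode("#Cat");               // no model attached
        ToyResource cat = model.createResource(node.uri); // now tied to model
        System.out.println(cat.getProperty("#is-a"));     // prints #Mammal
    }
}
```

The URI is the only thing the two forms share, which is why converting in either direction goes through it.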

Fixing a versioning problem with CUDA 2.3

In an earlier posting, I observed that CUDA 2.3 wants to use GCC 4.3, which is a problem for Fedora 11 and Ubuntu 9.10. I've been itching to upgrade my distribution on my NVIDIA Linux box, and particularly itching to move to Ubuntu. I found some help on Thomas Moelhave's blog. Thanks, Thomas!

In addition to his instructions, I needed to install a few extra packages.

sudo aptitude install freeglut3 \
   freeglut3-dev libxmu-dev libxi-dev

Once I did that and completed his instructions, everything worked great. The rest of my Ubuntu 9.10 installation is completely intact and happy.

Wednesday, February 17, 2010

Telomeres and aging

Recently I became aware of Sierra Sciences, a startup founded by William Andrews, previously of Geron. Andrews had done a lot of research on telomeres and telomerase.

Your cells have nuclei in them where your DNA is wadded up into packets called chromosomes. On the ends of the DNA strands there's a thing called a telomere. It protects the DNA from unravelling, like the little plastic tube on the end of your shoelace. Our telomeres shorten as we get older, and longer telomeres are strongly correlated with youth and vigor and health. There are many contributors to aging, but telomere length is currently regarded as one of the most urgent and one of the best understood.

Our reproductive cells do not suffer this effect. If we passed on shorter telomeres to our kids, they wouldn't live long, and they probably couldn't have kids of their own. To avoid that, our reproductive cells produce stuff called telomerase, which protects the telomeres from shortening. Here's the cool part: the gene for producing telomerase is present in ALL our cells, but it's only switched on in the reproductive cells. So there's a research push to find a telomerase activator that switches on the gene in all our cells. Sierra Sciences is one of the companies involved in this research.

You can buy a telomerase activator today, called TA-65. It's expensive, about $1500 per month, I think. But I haven't yet found any compelling evidence that it's a scam or a significant health risk. So I'm toying with the idea of trying it for a few months and seeing whether I feel any different.

There is also a clinical test to measure the length of your telomeres. I know it exists but I don't know much more about it at the moment.

Friday, February 05, 2010

Playing with the Jena semantic web framework

I've begun tinkering with Jena, a semantic web framework written in Java. It embodies a lot of ideas and technologies that were once considered AI or expert systems, and which I neglected at the time (1980s).

Jena homepage: http://jena.sourceforge.net/
Docs: http://jena.sourceforge.net/documentation.html
Javadoc: http://jena.sourceforge.net/javadoc/index.html

The semantic web frames a body of knowledge as a collection of three-word sentences called triples. These can be diagrammed as a directed graph such as the one below.

The corresponding three-word sentences appear below, written in N3, a human-readable formal language used by the semantic web community.

@prefix :  <#> .
:Cat :has :Fur .
:Bear :has :Fur .
:Cat :is-a :Mammal .
:Bear :is-a :Mammal .
:Mammal :has :Vertebrae .
:Whale :is-a :Mammal .
:Whale :lives-in :Water .
:Mammal :is-a :Animal .
:Fish :is-a :Animal .
:Fish :lives-in :Water .

In Jena, a Model holds one of these collections of triples. Having constructed it (or having loaded it from either a file on the internet or a file on your computer), you can apply rules that allow you to draw conclusions. So let's step through that process. First we need to read in the file.

private static final String baseUri =
    "file:///home/wware/wware-autosci/" +
    "semweb/java/simpleNet.n3#";
private static void modelReadFile(String filename,
                                  Model model) {
    try {
        File f = new File(filename);
        FileReader fr = new FileReader(f);
        model.read(fr, baseUri);
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}

To print the contents of a model, we can use the SPARQL query language, which looks a lot like SQL.

private static void printModel(Model model) {
    String queryString = 
        "SELECT ?x ?y ?z " +
        "WHERE {" +
        "    ?x ?y ?z . " +
        "}";
    Query query = QueryFactory.create(queryString);
    QueryExecution qe =
      QueryExecutionFactory.create(query, model);
    ResultSet results = qe.execSelect();
    ResultSetFormatter.out(System.out, results, query);
    qe.close();
}

and we'll call that method from our main method. I personally find it appalling that the graph above fails to recognize that fish have vertebrae, so we'll add a triple for that.

public static void main(String[] args) {
    Model model = ModelFactory.createDefaultModel();
    modelReadFile("simpleNet.rdf", model);
    model.createResource(baseUri + "Fish")
         .addProperty(model.createProperty(
                         baseUri + "has"),
                      model.createResource(
                         baseUri + "Vertebrae"));
    printModel(model);
}

The RDF file that main reads, simpleNet.rdf, was translated from the N3 above using CWM.

<rdf:RDF
xmlns="file:///home/wware/wware-autosci/semweb/java/simpleNet.n3#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="#Bear">
        <has rdf:resource="#Fur"/>
        <is-a rdf:resource="#Mammal"/>
    </rdf:Description>
    <rdf:Description rdf:about="#Cat">
        <has rdf:resource="#Fur"/>
        <is-a rdf:resource="#Mammal"/>
    </rdf:Description>
    <rdf:Description rdf:about="#Fish">
        <is-a rdf:resource="#Animal"/>
        <lives-in rdf:resource="#Water"/>
    </rdf:Description>
    <rdf:Description rdf:about="#Mammal">
        <has rdf:resource="#Vertebrae"/>
        <is-a rdf:resource="#Animal"/>
    </rdf:Description>
    <rdf:Description rdf:about="#Whale">
        <is-a rdf:resource="#Mammal"/>
        <lives-in rdf:resource="#Water"/>
    </rdf:Description>
</rdf:RDF>

There is a Model.write(OutputStream) method, so we can write a model directly to a file instead of stepping through the triples explicitly.
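
For comparison, here is roughly what the hand-rolled alternative would look like: stepping through the triples and printing each one as a line of the form <s> <p> <o> followed by a dot. This is a plain-Java sketch, not Jena code. HandRolledWriter and the example.org base URI are hypothetical, triples are just String arrays, and it only handles URIs (no literals or blank nodes).

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class HandRolledWriter {
    /** Writes each triple as one line: <subject> <predicate> <object> . */
    static void write(List<String[]> triples, OutputStream out) {
        PrintWriter pw = new PrintWriter(
            new OutputStreamWriter(out, StandardCharsets.UTF_8));
        for (String[] t : triples) {
            pw.print("<" + t[0] + "> <" + t[1] + "> <" + t[2] + "> .\n");
        }
        pw.flush(); // flush but do not close; the caller owns the stream
    }

    public static void main(String[] args) {
        String base = "http://example.org/simpleNet#"; // hypothetical base URI
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {
            base + "Cat", base + "is-a", base + "Mammal" });
        write(triples, System.out);
    }
}
```

With Jena itself, all of this collapses into the single Model.write call mentioned above.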

So how about some actual reasoning? We should be able to conclude that a cat is an animal, and has vertebrae. This will require that we define two rules of inference, rule1 ("is-a" is transitive) and rule2 (a member of a species has the things the species has).

String rules =
        "[ rule1: (?x " + baseUri+"is-a ?y) " +
                 "(?y " + baseUri+"is-a ?z) -> " +
                 "(?x " + baseUri+"is-a ?z) ] " +
        "[ rule2: (?x " + baseUri+"is-a ?y) " +
                 "(?y " + baseUri+"has ?z) -> " +
                 "(?x " + baseUri+"has ?z) ]";
Reasoner reasoner = new
    GenericRuleReasoner(Rule.parseRules(rules));
reasoner.setDerivationLogging(true);
InfModel inf =
  ModelFactory.createInfModel(reasoner, model);
printModel(inf);

Simply creating the InfModel is enough to draw all the relevant inferences. The Reasoner's setDerivationLogging method tells the model to remember the derivations that led to any new conclusions, and these derivations can be examined for debugging purposes.

PrintWriter out = new PrintWriter(System.out);
for (StmtIterator i = inf.listStatements();
             i.hasNext(); ) {
    Statement s = i.nextStatement();
    for (Iterator<Derivation> id =
             inf.getDerivation(s);
             id.hasNext(); ) {
        Derivation deriv = id.next();
        deriv.printTrace(out, true);
    }
}
out.flush();
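
For intuition about what rule1 and rule2 actually do, here is a naive forward-chaining sketch in plain Java. It is not how Jena's rule engine is implemented. ToyReasoner is a hypothetical class, triples are three-element lists of bare names rather than full URIs, and the two rules are hard-coded instead of parsed. The loop just keeps applying both rules until a pass adds nothing new.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ToyReasoner {
    /** A triple is a three-element list: subject, predicate, object. */
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** Applies both rules repeatedly until a pass derives nothing new. */
    static Set<List<String>> closure(Set<List<String>> facts) {
        Set<List<String>> known = new LinkedHashSet<>(facts);
        boolean changed = true;
        while (changed) {
            List<List<String>> derived = new ArrayList<>();
            for (List<String> a : known) {
                if (!a.get(1).equals("is-a")) {
                    continue; // both rules start with (?x is-a ?y)
                }
                for (List<String> b : known) {
                    if (!b.get(0).equals(a.get(2))) {
                        continue; // second premise must start at ?y
                    }
                    String p = b.get(1);
                    if (p.equals("is-a") || p.equals("has")) {
                        // rule1 when p is "is-a", rule2 when p is "has"
                        derived.add(triple(a.get(0), p, b.get(2)));
                    }
                }
            }
            // addAll returns true only if at least one triple was new
            changed = known.addAll(derived);
        }
        return known;
    }

    public static void main(String[] args) {
        Set<List<String>> facts = new LinkedHashSet<>();
        facts.add(triple("Cat", "is-a", "Mammal"));
        facts.add(triple("Mammal", "is-a", "Animal"));
        facts.add(triple("Mammal", "has", "Vertebrae"));

        Set<List<String>> all = closure(facts);
        System.out.println(all.contains(triple("Cat", "is-a", "Animal")));
        System.out.println(all.contains(triple("Cat", "has", "Vertebrae")));
    }
}
```

Both lines print true: the cat turns out to be an animal with vertebrae, which is the conclusion we wanted from the InfModel.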