2. Sunflower Foundation Programmer’s Guide

Author: Daniel Elenius <elenius@csl.sri.com>
Date: 2018-09-07

2.1. Introduction

Sunflower Foundation is the “core” or “backend” component of the Sunflower software suite. It was originally designed to serve the needs of the Sunflower Studio IDE, but has since evolved to serve as a generic API for working with Flora in Java. The relationship between Sunflower Foundation and Flora is similar to that between Jena and RDF/OWL.

This Programmer’s Guide is written for Java programmers. The reader is also assumed to have some familiarity with Flora. This is not a Flora tutorial. However, we recommend several other sources as tutorial and reference on the Flora language:

  • The Sunflower Studio User Manual includes a quick Flora language tutorial, which is kinder and gentler than the official Flora-2 material mentioned below.
  • The Flora website also has some tutorials.
  • The Flora manual is the official reference on the Flora language. Get the latest version in the Flora-2 svn on sourceforge, because it is updated frequently as the language evolves.
  • The XSB user manual can also be helpful, given that Flora-2 builds on XSB, and the Flora manual assumes knowledge of Prolog/XSB. Get it from the XSB svn on sourceforge.

This Guide is not intended to cover every class and method in Sunflower Foundation. Instead, all the public classes and methods are extensively documented using JavaDoc. The JavaDoc should be considered the primary, authoritative API documentation, whereas this Guide serves as a tutorial and introduction to help the user get started.

We also strongly recommend downloading Sunflower Studio and using it for creating Flora content. Experimenting with the features of Sunflower Studio can also help understand the motivation behind design decisions in Sunflower Foundation.

2.2. Parsing Flora Content

In order to syntactically analyze and manipulate Flora content, we need to parse it into a Java representation, known as a Flora Abstract Syntax Tree (AST). In Sunflower Foundation, the AST of a flora file is contained in a Flora Document. To produce a Flora Document, we need to use the Flora Parser.

To parse an entire Flora document, you need to feed the parser an InputStream containing the content:

File floraFile = new File("example.flr");
FloraDocument doc = FloraParser.parse(new FileInputStream(floraFile));

or:

String floraContent = "p(?x) :- q(?x).\n";
FloraDocument doc = FloraParser.parse(new ByteArrayInputStream(floraContent.getBytes()));

You can also give the parser a String directly:

String floraContent = "p(?x) :- q(?x).\n";
FloraDocument doc = FloraParser.parse(floraContent);

Often, Flora files import other Flora files. In such cases, the imported files also need to be parsed, in order to get a correct Namespace Mapping and correct parsing of user-defined operators that are defined in imported files. For these reasons, the Flora Document Service is normally used to produce Flora Documents. The Document Service uses the parser under the hood, but takes care of finding and parsing the relevant imported files.

It is also possible to parse just one Flora term. Flora terms are the basic units of the AST. Parsing Flora terms is useful, for example, when parsing user input strings that are to be used as queries (see Reasoner Interface).

To parse a term, use one of these methods:

String str = "p(?x)";
FloraTerm t = FloraParser.parseTerm(str);

or:

String str = "foo#bar(?x)";
NamespaceMapping nm = ...
FloraTerm t = FloraParser.parseTerm(str, nm);

There are several ways to get the appropriate NamespaceMapping, discussed in the next sub-section.

2.2.1. Namespace Mapping

The NamespaceMapping class defines a mapping between namespace prefixes and the corresponding namespace URIs.

There are some rules for how to use prefixes and namespaces in Flora:

  1. All prefixes used in a file should be defined in that file.
  2. The same prefix should not be used for a different namespace in an imported file (this rule is stricter than in RDF/OWL).

Note that when using the OWL Importer (see Importing and Exporting), the appropriate namespace declarations are added to all files.

When parsing an entire Flora file, there is no need for the user to provide a NamespaceMapping since it can be generated from the prefix declarations in the file. However, when parsing individual terms, which may contain namespace prefixes, a NamespaceMapping has to be provided to the parser to give it this contextual information.

There are several ways to produce a NamespaceMapping:

  1. Creating one manually, using the constructor and the addPrefixNS() method. This is not recommended, and probably only useful for other parts of Sunflower Foundation.
  2. Getting it from a FloraDocument, using the getNamespaceMapping() method. Warning: This only returns the NamespaceMapping for the document in question, not including prefixes defined in imported files!
  3. Getting it from the Flora Document Service.
  4. Getting it from the Flora Ontology Service.

Methods 3 and 4 are the recommended ones, and are among the main reasons to use the document and/or ontology services. See those sections for more details.
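The prefix rules above can be illustrated with a minimal sketch. It does not use the Sunflower NamespaceMapping class; PrefixMapSketch and its methods are hypothetical stand-ins that only show how a combined mapping for a file and its imports might reject a prefix that is rebound to a different namespace.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a prefix-to-URI mapping; not the Sunflower NamespaceMapping class. */
final class PrefixMapSketch {
    private final Map<String, String> prefixToUri = new LinkedHashMap<>();

    /** Registers a prefix; rejects a second, different namespace for the same prefix. */
    void add(String prefix, String uri) {
        String existing = prefixToUri.putIfAbsent(prefix, uri);
        if (existing != null && !existing.equals(uri)) {
            throw new IllegalArgumentException(
                "Prefix '" + prefix + "' is already bound to " + existing);
        }
    }

    /** Folds the prefixes of an imported file into this mapping. */
    void addAll(PrefixMapSketch imported) {
        imported.prefixToUri.forEach(this::add);
    }

    /** Returns the namespace URI for a prefix, or null if the prefix is unknown. */
    String uriFor(String prefix) {
        return prefixToUri.get(prefix);
    }

    public static void main(String[] args) {
        PrefixMapSketch local = new PrefixMapSketch();
        local.add("foo", "http://example.org/foo#");
        PrefixMapSketch imported = new PrefixMapSketch();
        imported.add("bar", "http://example.org/bar#");
        local.addAll(imported);
        System.out.println(local.uriFor("bar"));
    }
}
```

Re-declaring the same prefix with the same URI is accepted here, which matches the case where an imported file repeats a declaration of the importing file.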

2.2.2. Syntax Errors

When parsing Flora content, the Flora Parser may discover a syntax error and throw a FloraParserException. This exception stores the file position where the problem occurred, and a description of the problem.

There is a difference in how these syntax errors are handled when parsing a flora document, versus parsing a single term:

  • When parsing a Flora document, the errors are not thrown. Instead, the parser tries to continue parsing after the error if possible. All the errors are stored in the FloraDocument. After parsing, the caller should check FloraDocument.getErrors() to retrieve these errors, if any (or use FloraDocument.hasErrors() first).
  • When parsing a Flora term, any exception is thrown right away.
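The difference between the two strategies can be sketched with a toy parser. Nothing here is Sunflower code: "parsing" only checks that parentheses balance, and ErrorHandlingSketch, SyntaxError, and both methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration only: "parsing" here just checks that parentheses balance. */
final class ErrorHandlingSketch {
    /** Hypothetical error type carrying a line position and a description. */
    static final class SyntaxError extends Exception {
        final int line;

        SyntaxError(int line, String message) {
            super(message);
            this.line = line;
        }
    }

    /** Term-style parsing: the first problem is thrown right away. */
    static String parseTerm(String text, int line) throws SyntaxError {
        int depth = 0;
        for (char c : text.toCharArray()) {
            if (c == '(') depth++;
            if (c == ')' && --depth < 0) {
                throw new SyntaxError(line, "unexpected ')'");
            }
        }
        if (depth != 0) throw new SyntaxError(line, "missing ')'");
        return text.trim();
    }

    /** Document-style parsing: record each error and continue with the next line. */
    static List<SyntaxError> parseDocument(String content, List<String> parsedOut) {
        List<SyntaxError> errors = new ArrayList<>();
        String[] lines = content.split("\n");
        for (int i = 0; i < lines.length; i++) {
            try {
                parsedOut.add(parseTerm(lines[i], i + 1));
            } catch (SyntaxError e) {
                errors.add(e); // keep going so later statements are still parsed
            }
        }
        return errors;
    }
}
```

With this shape, a caller of the document-level method inspects the returned error list after the fact, while a caller of the term-level method has to handle the exception at the call site.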

At this point, you may consider skipping ahead to Flora Document, where we discuss the most important methods of that important class, or continue reading the rest of this section for more in-depth information about the parser.

2.2.3. Why a New Flora Parser?

Note that Flora comes with its own parser, written in XSB (we’ll call this the “native” parser). However, this parser is of no use for our Java API, for two reasons:

  1. It produces XSB Prolog terms, not a Java AST.
  2. It is too slow for interactive use, such as in Sunflower Studio, where we repeatedly need to parse a large number of files to produce up-to-date ASTs as the user edits the content.

We have made every effort to make the Sunflower Foundation Flora Parser behave the same way as Flora’s native parser. However, it is important to understand one fundamental difference between the two parsers: Flora’s native parser loads each parsed statement into the Knowledge Base (KB) before parsing the next line. This means that this parser is not purely syntactic or context-free: It has access to the KB and can do some forms of type checking and other semantic checks. In contrast, the Sunflower Foundation Flora parser is purely syntactic. It does not update the KB as it proceeds. It does, however, keep track of certain declarations, such as namespace prefix and operator declarations, in order to correctly parse later content.

The result of this is that the Sunflower Foundation Flora Parser is somewhat more permissive than the native parser: Some errors may not be shown until a file is actually loaded into the Flora engine.

2.2.4. Flora Lexer

Underlying the Flora Parser is the FloraLexer class. The job of the lexer is to produce a list of tokens from a Flora content string. The parser then operates on the level of tokens, rather than characters, making its job much easier.

The tokens are kept in FloraDocument (see the getTokens() method), and also in each FloraTerm created by the parser (see Flora Abstract Syntax Tree).

There are two reasons for keeping the tokens around, even when you have the AST:

  1. Each token contains information about where in the file it is located (see the start() and end() methods on FloraToken). This information can be used to locate for example a fact or rule inside a Flora file. In Sunflower Studio, we use this to highlight facts that the user is looking at in the KB Editor tab.
  2. The tokens can be used as the basis for syntax highlighting, such as in the Sunflower Studio Flora Text Editor. Because syntax highlighting has to react very quickly to editing done by the user, it is more efficient to use token-level information (and the lexer) than to use the parser for this purpose.

Warning: FloraDocument and FloraTerm objects that have been created programmatically (i.e., using their constructors) will not have token information! The same is true for terms created using FloraTerm.clone().
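To make the idea of position-carrying tokens concrete, here is a small self-contained sketch. It is not the FloraLexer: TokenSketch is a hypothetical class, and its regular expression covers only a tiny, simplified subset of token shapes (variables, names, the ":-" operator, and single punctuation characters).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy tokenizer; the token categories are simplified and not Flora's real lexical grammar. */
final class TokenSketch {
    final String text;
    final int start; // offset of the first character
    final int end;   // offset just past the last character

    TokenSketch(String text, int start, int end) {
        this.text = text;
        this.start = start;
        this.end = end;
    }

    // Variables (?x), names, the ":-" operator, or any single non-space character.
    private static final Pattern TOKEN = Pattern.compile("\\?\\w+|\\w+|:-|\\S");

    /** Splits the content into tokens that remember where they came from. */
    static List<TokenSketch> lex(String content) {
        List<TokenSketch> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(content);
        while (m.find()) {
            tokens.add(new TokenSketch(m.group(), m.start(), m.end()));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (TokenSketch t : lex("p(?x) :- q(?x).")) {
            System.out.println(t.start + "-" + t.end + " " + t.text);
        }
    }
}
```

Because every token keeps its offsets, a highlighter or editor can map a token back to the exact character range in the source text without re-parsing.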

2.2.5. Flora Keywords

Closely related to the lexer, the FloraKeywords class contains constants for all the Flora keywords. This is used by the aforementioned syntax highlighting feature. If you need to refer to a Flora keyword or operator in code, such as the \and operator, you should use the constant defined in FloraKeywords, for example:

FloraKeywords.FL_AND

The FloraVocabulary class provides an interface to check whether a given string is a Flora keyword (or delimiter, builtin operator, etc). This is primarily used for syntax highlighting. Check the JavaDoc for more information.

2.3. Flora Document

In Parsing Flora Content, we described how to produce a FloraDocument from a Flora file. A FloraDocument contains the following basic types of content (some of these may be empty sets):

  • Tokens
  • A NamespaceMapping
  • Prefix declarations
  • User-defined operator declarations
  • Facts
  • Rules
  • User-defined functions
  • Queries
  • Latent queries
  • Errors (see Syntax Errors)
  • Imports

All of these can be retrieved with the appropriate getter method, e.g., getFacts(), getLatentQueries(), and so on. See the JavaDoc for more details. These are all maintained as separate data structures, so the getters are fast, constant-time operations.

2.3.1. Derived Information

FloraDocument contains one additional data structure, which is a map from Identifier to Fact; specifically, facts containing a Flora FrameTerm on the top level. These facts are “special” in that they contain structured “ontological” information. Consequently, this map is used to build the Ontology Model.

The primary method for accessing this information is getFrameFactsForIdentifier(). There are also some additional (not constant-time!) methods that extract useful information from this mapping, such as getInstanceFrames() and getAssertedSubclassOfs().

The derived information in FloraDocument is primarily intended for building the ontology model, and you will probably not need to use these methods directly.

2.3.2. Manipulating a Flora Document

FloraDocument includes a number of methods, with names starting with add, delete, or remove, which can be used to modify its content. These methods are used by the Sunflower Studio KB Editor.

Some caution is advised when using these methods. After executing these methods, terms that were previously extracted from the document may no longer be valid. For example, if you use getFacts(), then call deleteIdentifier(), the set of facts you retrieved will no longer be valid (unless the identifier to delete did not exist in the document).

The methods do recreate correct token information, so positions of facts are updated accordingly.

2.4. Flora Abstract Syntax Tree

The AST representation in Sunflower Foundation is based on the FloraTerm interface and its various implementation classes. These classes are all in the com.sri.floralib.ast Java package. The class- and package-level JavaDoc for this package and its classes provides detailed information about how the many different kinds of Flora terms are represented in this AST. In most cases, the caller does not need to worry about what kind of AST object is at hand. Mostly, FloraTerm objects are passed to other methods. If a method needs to be called on a FloraTerm, it is usually defined on the top-level interface, e.g. getIdentifiers() or getVariables(). These methods descend through the frame structure to retrieve the appropriate information.

If you find yourself writing your own code to recursively descend through the frame structure, or to syntactically analyze or manipulate Flora terms, consider whether there may already be a FloraTerm method that does the job, or request that we add the functionality to Sunflower Foundation.

In some cases, it may be useful to examine the type of object and take some appropriate action that depends on it, e.g.:

FloraTerm t = FloraParser.parseTerm(str,nm);
if (t instanceof BinaryInfixTerm){
  BinaryInfixTerm bit = (BinaryInfixTerm)t;
  if (bit.getOperator().equals(FloraKeywords.FL_AND)){  // check if it is an "\and" term
    FloraTerm left = bit.getLeftOperand();
    FloraTerm right = bit.getRightOperand();
    //... do something with the left and right conjuncts
  }
  else if ...
}
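The same type-dependent dispatch can be tried out without the library. The sketch below uses a toy AST (ConjunctSketch, Term, Atom, and Infix are hypothetical and are not the com.sri.floralib.ast classes) to flatten nested "\and" terms into a list of conjuncts; the literal string "\and" stands in for the FloraKeywords constant.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy AST for illustration; these are not the com.sri.floralib.ast classes. */
final class ConjunctSketch {
    interface Term {}

    static final class Atom implements Term {
        final String name;

        Atom(String name) {
            this.name = name;
        }
    }

    static final class Infix implements Term {
        final String operator;
        final Term left;
        final Term right;

        Infix(String operator, Term left, Term right) {
            this.operator = operator;
            this.left = left;
            this.right = right;
        }
    }

    /** Collects the leaves of nested "\and" terms, left to right. */
    static List<Term> conjuncts(Term t) {
        List<Term> out = new ArrayList<>();
        collect(t, out);
        return out;
    }

    private static void collect(Term t, List<Term> out) {
        if (t instanceof Infix && ((Infix) t).operator.equals("\\and")) {
            Infix and = (Infix) t;
            collect(and.left, out);
            collect(and.right, out);
        } else {
            out.add(t); // any other kind of term counts as a single conjunct
        }
    }
}
```

In real code, check first whether FloraTerm already offers a method for the traversal you need, as recommended above.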

2.4.1. Manipulating the Abstract Syntax Tree

Most of the AST classes do not have setter methods. FloraTerm objects are usually treated as immutable. The exception is FrameTerm and some related classes and interfaces like Frame and FrameContent. These do contain methods that modify the term, like Frame.removeFrameContent(). These methods are used by FloraDocument to modify its content in e.g., FloraDocument.removeInstanceFrameContent() (see Manipulating a Flora Document).

2.5. Flora Document Service

The IFloraDocumentService interface provides methods to deal with import trees of Flora documents. Usually, we place Flora files, along with their imported files, in the same directory structure (i.e., with one common top-level directory for all the files).

Sunflower Foundation has one implementation of IFloraDocumentService, called FileFloraDocumentService. Sunflower Studio has an additional implementation called FloraDocumentService. There are two main differences between the two implementations:

  1. FileFloraDocumentService works with Java File objects, whereas FloraDocumentService works with Eclipse IFile objects. The IFloraDocumentService interface has a generic parameter SourceType to cover these two cases. Both implementations instantiate this parameter, i.e., they are not generic classes.
  2. FloraDocumentService implements listeners to discover changes in the underlying Flora files, and will re-parse the documents in such cases. This is to support user editing of files in the Sunflower Studio IDE. FileFloraDocumentService, which is intended for standalone use, does not have these listeners, because we don’t anticipate such a need (and we don’t want the unnecessary complexity).

We will only discuss FileFloraDocumentService from here on. We mentioned above the convention of having one top-level directory for Flora content. The constructor of the FileFloraDocumentService class makes this convention explicit: You have to provide a File for this top-level directory.

After creating a FileFloraDocumentService, you can use it to retrieve FloraDocument objects for any files in the top-level directory provided to the constructor. For example:

FileFloraDocumentService docService = new FileFloraDocumentService(new File("flora"));
FloraDocument doc = docService.getDocument(new File("flora/file1.flr"));

The document service will call the Flora parser as needed, and cache parsed documents so that they do not need to be re-parsed.
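The caching behavior described here follows a common pattern, sketched below in plain Java. DocumentCacheSketch is a hypothetical class, not FileFloraDocumentService: it takes the parser as a function, counts how often that function is invoked, and ignores imports entirely.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache keyed by file; the real service also resolves and parses imports. */
final class DocumentCacheSketch<D> {
    private final Map<File, D> cache = new HashMap<>();
    private final Function<File, D> parser;
    private int parseCount = 0;

    DocumentCacheSketch(Function<File, D> parser) {
        this.parser = parser;
    }

    /** Returns the cached document, parsing the file only on the first request. */
    D getDocument(File file) {
        D doc = cache.get(file);
        if (doc == null) {
            doc = parser.apply(file);
            parseCount++;
            cache.put(file, doc);
        }
        return doc;
    }

    /** Number of times the parser function was actually called. */
    int parseCount() {
        return parseCount;
    }
}
```

A second request for the same path returns the object produced by the first request, so callers that hold on to a document and callers that ask again see the same instance.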

You can also set and get the “active file” (using setActiveFile() and getActiveFile()). This concept is used in Sunflower Studio to indicate the file that the user is currently working on (which is used as the basis for various UI features). Similarly, getActiveDocument() returns the FloraDocument associated with the active file.

The document service can also provide the Namespace Mapping of a document, either the “local” one (i.e. the same as what you would get using FloraDocument.getNamespaceMapping()), using getDeclaredPrefixes(), or the “combined” namespace mapping of the document and all its imports, using getAllPrefixes(). The latter is very useful, because a NamespaceMapping has to be provided to FloraParser.parseTerm() (see Parsing Flora Content and Reasoner Interface), and this is the easiest way to obtain one.

Given that most Flora files have imports, we recommend using the document service to parse flora files, rather than using the Flora parser directly, for most cases.

2.6. Ontology Model

The IOntologyModel interface and its main (but abstract) implementation class, the AbstractOntologyModel, provide the following main functionalities:

  1. A unified view over the Flora “entities” (i.e., its classes, individuals, queries, and rules) from the import closure of a Flora document such that all those different entities are accessible via one central object - the ontology model.

    The ontology model provides methods to retrieve all the identifiers, classes, individuals, queries, rules, etc. from the primary document as well as its import closure.

    An AbstractOntologyModel contains references to its primary (or root) FloraDocument, as well as to all FloraDocuments resulting from (direct or indirect) imports. The AbstractOntologyModel is agnostic as to where those FloraDocuments originate from (e.g., databases, files, ...). At the time of this writing, Sunflower Foundation only supports File ontology sources, resulting in the class FileOntologyModel, which is the main (and currently only) concrete implementation of IOntologyModel. Most of its implementation is given in the abstract superclass AbstractOntologyModel. Sunflower Studio also includes an IFileOntologyModel to support ontology models produced from Eclipse’s IFiles.

    In addition, a notion of locality is implemented. For example, a class identifier is considered a local class in case it has a class frame in the primary (current) ontology document, and likewise for individuals, queries, and rules.

    A notion of generalized locality is also implemented:

    • A class is considered generalized local if it is either local, or if it has a subclass which is local, or if it is instantiated by a local individual.
    • An individual is considered generalized local if it is either local, or if it is instantiating a local class.
  2. Heuristics that “classify” identifiers as individuals or classes. In Flora, the same identifier can act both as a class and as an individual. However, the class-individual distinction is often considered useful. Hence, the ontology model tries to recognize which identifiers are individuals, and which are classes. In general, an identifier is a class if it has some class frame content somewhere in the import closure, i.e., for the identifier id to be recognized as a class, there must be some occurrence of id[| ... |] somewhere in the import closure (where ... denotes any - possibly empty - frame content). For individuals, we require that there be some occurrence of an individual frame on the identifier somewhere: id[ ... ]. However, we will also recognize an identifier as an individual if it is the argument of a class membership / instance assertion: id : C. Likewise, an identifier will be recognized as a class if it participates in some superclass axiom: id :: superclass. Please note that an identifier may have both class and individual information associated with it. In that case, it is up to the client application to decide how to handle this situation. The ontology model will happily consider the identifier as being both a class and an individual (i.e., both isClass and isIndividual can return true).

    An identifier is considered a rootIndividual in case it has individual frame content (hence, is recognized as an individual), but does not instantiate any classes (i.e., has no types). The method getRootIndividuals returns those individuals.

    Likewise, the method getRootClasses returns the set of classes which do not have superclasses, and hence, those classes are the roots of the taxonomy (see Inferences Performed by the Ontology Model and the Taxonomy for an explanation of the taxonomy). A graphical taxonomy viewer might choose to show those root classes as children of a top-level root node (the constant identifier topClassRootIdentifier might be used for this, see Section Root Identifier Constants Defined in IOntologyModel).

  3. Methods for accessing the provenance of axioms, i.e., for subclass axioms and instance assertions (also called class membership assertions). Note that these axioms typically require 2 arguments (e.g., a class and a superclass, or an instance and a class). Methods for retrieving the corresponding ontology sources of such axioms are provided, e.g., getSourcesForAssertedSubclass. In the FileOntologyModel, the returned “source” in which the corresponding axiom is asserted will be a File.

  4. Methods for modifying the content of the ontology, e.g., it is possible to add and delete individuals, classes, individual and class properties, as well as axioms to a primary or imported ontology source (and hence FloraDocument) using the ontology model.

    Clients can be informed about changes to the ontology model by registering an IOntologyModelListener (see methods in JavaDoc).

  5. Query methods for accessing the asserted information in an ontology model, i.e., there are query methods for retrieving the asserted superclasses of a class, for getting the asserted types of an individual, for retrieving the asserted property values of a class or individual, etc.

  6. Query methods for retrieving inferred information from the ontology model. For example, it is possible to retrieve the direct and indirect sub- and superclasses of a class, to retrieve the direct and indirect types of an individual, to retrieve the equivalent classes of a class, etc. The inferences performed by the ontology model are described in the next Subsection.

    A light form of inference is required in order to identify the sub- and superclasses of a class, and the direct and indirect types of an individual. This inference is performed by computing the so-called class taxonomy first, from which the desired answers can be obtained, see Inferences Performed by the Ontology Model and the Taxonomy for more information.

    In a nutshell, the taxonomy is a directed acyclic graph (DAG) whose nodes represent sets of equivalent classes (equivalence classes, for short). Each node is identified by a so-called representative, which may be any member of that equivalence class. The edges in the taxonomy represent the direct superclass / direct subclass relationship, which has to be computed on the basis of the asserted superclass axioms. We give an idea of how the taxonomy is computed in the next Subsection.

    The class taxonomy can be retrieved and constructed in a top-down fashion. Start by creating a DAG containing a single node for the identifier topClassRootIdentifier (see Root Identifier Constants Defined in IOntologyModel). Then, retrieve the taxonomy root classes from the ontology model, using getRootClasses. Make those nodes children of the topClassRootIdentifier node. Now, for each node in the DAG that does not have children (let’s call them leaf nodes), retrieve their direct subclasses from the ontology model using getDirectSubclasses, and add them as children to the corresponding parent node, until there are no more leaf nodes for which getDirectSubclasses returns a non-empty set.
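The top-down construction just described can be sketched as follows. To keep the example self-contained, it does not use IOntologyModel: the nested Model interface is a hypothetical slice exposing only getRootClasses and getDirectSubclasses, class identifiers are plain strings, and the parameter types are assumptions rather than the library's signatures.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class TaxonomyWalkSketch {
    /** Hypothetical slice of an ontology model; only the two calls used below. */
    interface Model {
        Set<String> getRootClasses();

        Set<String> getDirectSubclasses(String cls);
    }

    /** Builds a parent -> children map, starting from an artificial top node. */
    static Map<String, Set<String>> build(Model model, String topNode) {
        Map<String, Set<String>> children = new LinkedHashMap<>();
        children.put(topNode, new LinkedHashSet<>(model.getRootClasses()));
        Deque<String> pending = new ArrayDeque<>(model.getRootClasses());
        while (!pending.isEmpty()) {
            String cls = pending.removeFirst();
            if (children.containsKey(cls)) {
                continue; // already expanded via another parent (the taxonomy is a DAG)
            }
            Set<String> subs = new LinkedHashSet<>(model.getDirectSubclasses(cls));
            children.put(cls, subs);
            pending.addAll(subs);
        }
        return children;
    }
}
```

The containsKey check matters because a class with several direct superclasses is reached more than once; without it, shared subtrees would be expanded repeatedly.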

2.6.1. Inferences Performed by the Ontology Model and the Taxonomy

The central reasoning service of the ontology model is computation of the so-called taxonomy. The taxonomy is a directed acyclic graph with nodes representing sets of equivalent classes, and edges representing the direct subclass (superclass) relationship between the classes in these sets.

Note that the direct superclasses (direct subclasses) of a class may not be identical to its set of asserted superclasses (asserted subclasses, resp.).

In this example:

A[||] :: B[||].
B[||] :: C[||].
A[||] :: C[||].

the class C is an asserted superclass of A, but not a direct superclass (in fact, the last superclass axiom is redundant, as it follows logically from the first two).

Moreover, if we also add:

C[||] :: A[||].

then the taxonomy will contain only one node, as all classes A, B, C are equivalent; hence, the taxonomy contains a node representing the equivalence class {A, B, C}. Note that any of A, B, or C may become the representative for that equivalence class. Given one member of an equivalence class, it is always possible to retrieve the other members; e.g., if A was selected as the representative, then it is possible to retrieve A given B or C.

Let us assume that we also added the following axioms to the ontology model:

D[||] :: C[||].
E[||] :: D[||].

Now, D and E are subclasses of A, B, and C. However, only D is a direct subclass of them.
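The reasoning in this example can be reproduced with a naive sketch. This is not the algorithm used by the ontology model, whose implementation is not described here; TaxonomySketch is a hypothetical class that recomputes reachability over the asserted axioms on every call, which is simple but would not scale to large ontologies.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Naive taxonomy sketch over asserted "sub :: super" axioms; not the library's algorithm. */
final class TaxonomySketch {
    private final Map<String, Set<String>> asserted = new HashMap<>();

    void addSubclassAxiom(String sub, String sup) {
        asserted.computeIfAbsent(sub, k -> new TreeSet<>()).add(sup);
        asserted.computeIfAbsent(sup, k -> new TreeSet<>());
    }

    /** All classes reachable from cls via one or more asserted superclass edges. */
    Set<String> superclasses(String cls) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>(asserted.getOrDefault(cls, new TreeSet<>()));
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (seen.add(c)) todo.addAll(asserted.get(c));
        }
        return seen;
    }

    /** Classes that are mutual superclasses of cls, plus cls itself. */
    Set<String> equivalents(String cls) {
        Set<String> eq = new TreeSet<>();
        eq.add(cls);
        for (String s : superclasses(cls)) {
            if (superclasses(s).contains(cls)) eq.add(s);
        }
        return eq;
    }

    /** Strict superclasses with no other strict superclass in between. */
    Set<String> directSuperclasses(String cls) {
        Set<String> strict = superclasses(cls);
        strict.removeAll(equivalents(cls));
        Set<String> direct = new TreeSet<>(strict);
        for (String mid : strict) {
            Set<String> above = superclasses(mid);
            above.removeAll(equivalents(mid));
            direct.removeAll(above); // reachable through mid, so not direct
        }
        return direct;
    }
}
```

Note that directSuperclasses returns every member of a direct-superclass equivalence class rather than a single representative, so for D it yields A, B, and C together.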

In addition to retrieving type and super / subclass information, it is also possible to retrieve asserted property values, as well as inferred (i.e., inherited) property values. Consider:

A[| Aprop -> 123 |].
B[| Bprop -> 456 |].
E[| Eprop -> 789 |].
e : E.

The method getClassPropertyValues can be used to retrieve the asserted class property values of a class; we expect Eprop -> 789 for E. The property values are returned as a FrameTripleAssertion collection. In addition, getClassPropertyValues can also return the inherited class property values. In that case, we will also get Aprop -> 123 and Bprop -> 456.

Likewise, this kind of inference also applies to instances of classes. If we ask for the asserted individual property values of e using getIndividualPropertyValues, we will get an empty set. However, if we request to also include inherited property values, then the class property values inherited from its direct type E (Eprop -> 789), as well as those inherited from its indirect types D, C, B, and A, will be included: Eprop -> 789, Bprop -> 456, Aprop -> 123.
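The distinction between asserted and inherited class property values can be sketched in plain Java. InheritanceSketch and its methods are hypothetical and simplified: property values are integers, the flag name includeInherited is an assumption, and individuals are left out (for an individual, the same lookup would be applied to each of its types).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Sketch of asserted versus inherited class property values; all names are hypothetical. */
final class InheritanceSketch {
    private final Map<String, Set<String>> superOf = new HashMap<>();
    private final Map<String, Map<String, Integer>> classProps = new HashMap<>();

    void addSubclassAxiom(String sub, String sup) {
        superOf.computeIfAbsent(sub, k -> new LinkedHashSet<>()).add(sup);
    }

    void addClassProperty(String cls, String prop, int value) {
        classProps.computeIfAbsent(cls, k -> new TreeMap<>()).put(prop, value);
    }

    /** Values asserted on cls, optionally merged with those of all its superclasses. */
    Map<String, Integer> classPropertyValues(String cls, boolean includeInherited) {
        Map<String, Integer> result = new TreeMap<>();
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(cls);
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (!seen.add(c)) continue; // guards against cycles such as A :: B :: C :: A
            classProps.getOrDefault(c, new TreeMap<>()).forEach(result::putIfAbsent);
            if (includeInherited) {
                todo.addAll(superOf.getOrDefault(c, new LinkedHashSet<>()));
            }
        }
        return result;
    }
}
```

Loaded with the axioms and class frames from this section, the asserted lookup for E yields only Eprop, while the inherited lookup also picks up Aprop and Bprop through D and the {A, B, C} equivalence class.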

2.6.2. A Note on Completeness

Please note that the ontology model does not capture all inferences that could be obtained from Flora reasoning. If you want to have the full inferences, you need to use the Reasoner Interface to query Flora.

The computation of the taxonomy, and hence the computation of the class equivalence classes, their direct and indirect super- and subclasses, as well as the direct and indirect types of an individual, is solely based on the asserted class-superclass axioms (see Inferences Performed by the Ontology Model and the Taxonomy for examples illustrating that kind of reasoning). Inferences from Flora rules are not considered, i.e., effects from rules expressing sufficient conditions for class membership.

2.6.3. Root Identifier Constants Defined in IOntologyModel

The IOntologyModel interface defines a couple of constants, such as:

topClassRootIdentifier
topLatelQueriesRootIdentifier
topQueriesRootIdentifier
...

etc. In a graphical display of the taxonomy, these identifiers can act as root nodes. Note that these identifiers are always considered local, but in fact, they are not part of the corresponding Flora files or FloraDocuments. Hence, they cannot be used in queries, i.e., you will not be able to retrieve the direct subclasses of topClassRootIdentifier by means of getDirectSubclasses. Please use getRootClasses instead.

2.6.4. Flora Ontology Service

It is recommended to use the Flora Ontology Service, i.e., FileOntologyService, to create (and subsequently retrieve) the ontology model for a given primary source file.

To create an ontology model FileOntologyModel for a given source file, it is sufficient to simply call getOntologyModel with the source file argument. The corresponding required FloraDocument will be looked up or created automatically from the included FileFloraDocumentService. The constructor of FileOntologyService requires a reference to a FileFloraDocumentService.

The FileOntologyService then maintains a mapping from source files to the corresponding ontology models. In addition, there is a notion of a current or active ontology model.

To instantiate the Flora Ontology Service, use the following piece of code; note that new File("flora") denotes the directory in which the Flora file resides:

FileFloraDocumentService docService = new FileFloraDocumentService(new File("flora"));
FileOntologyService ontService = new FileOntologyService(docService);

Using the ontService, we can then easily construct an ontology model for a given Flora file contained in the "flora" directory:

File floraFile = new File("flora/example.flr");
IOntologyModel<File> ontologyModel = ontService.getOntologyModel(floraFile);

It is straightforward to use the documented API in order to request information or inferences from the model:

Set<Identifier> classes = ontologyModel.getAssertedClass();
Set<Identifier> superclasses = ontologyModel.getSuperclasses(identifier);

Note that the Flora Ontology Service can subsequently be used to retrieve the current ontology model, or to retrieve an ontology model for a given source file as follows:

IOntologyModel<File> ontologyModel2 = ontService.getOntologyModel(floraFile);
IOntologyModel<File> ontologyModel3 = ontService.getActiveOntologyModel();

At this point, the expression (ontologyModel3 == ontologyModel2) && (ontologyModel2 == ontologyModel) holds true.

2.7. Reasoner Interface

Executing queries and doing something with the query results is the main purpose of using Flora in the first place. Thus, the reasoner interface is perhaps the most important component of Sunflower Foundation. The interface to the reasoner is specified by IFloraEngine. There are two implementations, FloraInterprologEngine and FloraZMQClient. For now, use FloraInterprologEngine, as FloraZMQClient is still in development. However, we recommend that you use the interface wherever possible, so that you can easily switch engines in the future.

FloraInterprologEngine has two constructors. Both require an XSB command and a Flora directory as arguments.

The XSB command should be the XSB executable found in the XSB/config/[arch]/bin directory, not the symlink/BAT file in XSB/bin.

The Flora dir should be the top-level “flora2” directory of the flora installation.

The second constructor also takes a contentDir argument. This is the top-level directory to load Flora files from. It should be the same as the directory passed to the Flora Document Service. This allows Flora to find imported files using relative paths. This is critical in order for Flora files to be portable between different systems, since absolute file paths will almost never be the same on any two systems.

Though probably not often needed, you can also change this content directory after the Flora engine has been created, using setNewContentDir(). Note that this retracts knowledge of the old content directory.

IFloraEngine has convenience methods for loading and adding Flora files (loadFile(), addFile()), and for simple yes/no (or true/false) Flora commands (floraCommand()). However, the most general query method is floraQuery(). This method takes a FloraTerm (which can be produced by the Flora parser, see Parsing Flora Content, or using FloraTerm constructors). The method returns a QueryResult. This method also needs a namespace mapping (which is used to parse the query results returned by Flora).

QueryResult contains one or more Solution objects, each of which contains bindings (in the form of FloraTerm objects) for all the variables in the query. See the JavaDoc for more details.

Putting it all together, the entire process of executing a query using Sunflower Foundation might look something like this:

// Top-level content directory, shared by the document service and the engine
File contentDir = new File("flora");
FileFloraDocumentService docService = new FileFloraDocumentService(contentDir);
// xsbCmd and floraDir are the XSB command and Flora directory described above
IFloraEngine engine = new FloraInterprologEngine(xsbCmd, floraDir, contentDir.getAbsolutePath());
...
// Use the prefixes of a document as the namespace mapping for the query
FloraDocument doc = docService.getDocument(new File("flora/file1.flr"));
NamespaceMapping nm = doc.getAllPrefixes();
String queryStr = "p(?x)";
FloraTerm queryTerm = FloraParser.parseTerm(queryStr, nm);
QueryResult res = engine.floraQuery(queryTerm, nm);
// Print the binding of ?x in each solution
for (Solution sol : res.getSolutions()) {
  System.out.println("?x = " + sol.getBinding("?x"));
}

2.8. Importing and Exporting

In the development of Sunflower, we have placed a high priority on being able to use Sunflower along with existing technologies, and on easing knowledge acquisition by ingesting data from readily available sources. We describe the different import/export interfaces provided by Sunflower Foundation in the following sub-sections.

2.8.1. Importing from RDF/OWL

Sunflower Foundation can create Flora files from RDF/OWL/OWL2 files. The JavaDoc for the Jena2FloraTranslator class describes how the translation works in some detail. From an API point of view, the translator is very simple: There are only two public (and static) methods: translateMonolithic() and translateOntologyFolder(). The former translates an OWL file, plus all its imports, into one big Flora file. The latter translates a directory of OWL files to a directory of Flora files, with a one-to-one mapping between OWL and Flora files, maintaining sub-directory structures.
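The one-to-one mapping that preserves sub-directory structure can be pictured with the following sketch, which computes where a translated file would land. It is an illustration using java.nio.file only, not the Jena2FloraTranslator implementation; the class and method names are hypothetical, and the ".owl" source extension is an assumption (".flr" is the Flora extension used elsewhere in this Guide).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; not the Jena2FloraTranslator implementation.
class MirrorSketch {

    // Map an OWL file under owlDir to the corresponding Flora file under floraDir,
    // keeping the sub-directory path and swapping the file extension for ".flr".
    static Path floraTarget(Path owlDir, Path floraDir, Path owlFile) {
        Path relative = owlDir.relativize(owlFile);
        String name = relative.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return floraDir.resolve(relative).resolveSibling(base + ".flr");
    }

    public static void main(String[] args) {
        Path owlDir = Paths.get("owl");
        Path floraDir = Paths.get("flora");
        // A file in a sub-directory keeps that sub-directory in the target tree.
        System.out.println(floraTarget(owlDir, floraDir, Paths.get("owl", "people", "staff.owl")));
    }
}
```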

Note that in both cases you have to specify both a Flora “base folder” and a Flora “destination folder”. This is done in order for relative paths to be correct. The “base folder” should be the top-level Flora content folder, i.e., the content directory you provided to FileFloraDocumentService (see Flora Document Service) and FloraInterprologEngine (see Reasoner Interface).

For the “monolithic” translation, you also have to specify an OWL “base folder”. The translator will look in all files in this directory and its sub-directories to find imports (exactly like a “Local Folder Repository” in Protege 3.5).

2.8.2. Exporting to RDF/OWL

The OWL exporter assumes the same mappings between Flora and OWL as the OWL importer, so again, see the JavaDoc for Jena2FloraTranslator for details. The importer tries to create Flora files that are re-exportable to OWL. For imported OWL files that have not been further modified in Flora, a re-exported file should be quite close to the original file. However, SWRL rules are not supported by the importer, so those will be lost in the round-trip. In general, it is best to avoid translating back and forth, as it is difficult to guarantee losslessness.

For Flora files that were not created by the OWL importer, the OWL exporter has to rely on heuristics. For example, the importer preserves rdfs:range and rdfs:domain facts. Without those, domains and ranges of properties have to be “guessed”, because Flora does not have a concept of global properties with ranges and domains. The same applies to ontology URIs, which do not exist in Flora.

Similarly to the importer, the exporter allows you to translate a single Flora file to OWL, or to translate a whole Flora directory to a corresponding OWL directory.

When exporting a single file, it is possible to not include imported files. This allows for workflows such as:

  1. Develop ontologies in OWL.
  2. Export the OWL directory to Flora.
  3. Create a new Flora KB file that imports the translated Flora files.
  4. Export only the new Flora file back to OWL.

Creating a new KB in Flora can be advantageous, as it allows you to use Sunflower Studio features like Importing from CSV or SQL.

2.8.3. Importing from CSV or SQL

The CSV2Flora class provides a way of importing KB facts from a comma-separated values (CSV) file, e.g., one produced by Microsoft Excel, or exported from a database. Similarly, the SQL2Flora class allows you to import data from a live or serialized SQL database.

Both these importers treat the data to import in the same way, and both have a constructor that takes two arguments: a default prefix, and a default namespace. These are used when generating new identifiers from the content in the data. The common behavior of the two importers is implemented in the Rows2Flora class.

The importers use the following heuristics to create Flora content from the data in the table:

  1. The first column name specifies the class name of all the data in the table.
  2. All the other columns specify property names to use for the data in those columns.
  3. Each row in the table corresponds to one individual.
    • The first column contains the individual name.
    • Each of the other columns specifies a property value of the property corresponding to that column.
  4. For each cell,
    • Remove any double quotes around the value.
    • If the cell contents can be interpreted as a number, use the number value.
    • If the cell contents contain a ‘#’ character, use the value as is (i.e. it is an individual name).
    • If the cell contents can be interpreted as a currency, use the numeric value of the currency amount.
    • If the value is a date/time, treat it as a Flora \dateTime value.
    • Otherwise, use the default prefix followed by ‘#’ and the cell value.

In addition, the CSV importer assumes that each imported file contains only one table.
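The per-cell heuristics listed above can be sketched as a small classifier. This is a hypothetical illustration, not the Rows2Flora implementation: the class and method names are invented, the checks are applied in the order the list gives them, and the recognized currency format (a leading "$" with optional thousands commas) and date/time format (ISO local date-time) are assumptions. The real importers may accept other formats.

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical illustration; not the Rows2Flora implementation.
class CellSketch {
    enum Kind { NUMBER, INDIVIDUAL, CURRENCY, DATE_TIME, PREFIXED }

    // Remove one pair of surrounding double quotes, if present.
    static String stripQuotes(String raw) {
        String s = raw.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    static boolean isNumber(String s) {
        try {
            new BigDecimal(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Assumption: a currency is "$" followed by a number with optional commas.
    static boolean isCurrency(String s) {
        return s.startsWith("$") && isNumber(currencyAmount(s));
    }

    static String currencyAmount(String s) {
        return s.substring(1).replace(",", "");
    }

    // Assumption: date/time values use the ISO local date-time format.
    static boolean isDateTime(String s) {
        try {
            LocalDateTime.parse(s);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Apply the checks in the order given in the list above.
    static Kind classify(String raw) {
        String s = stripQuotes(raw);
        if (isNumber(s)) return Kind.NUMBER;
        if (s.contains("#")) return Kind.INDIVIDUAL;
        if (isCurrency(s)) return Kind.CURRENCY;
        if (isDateTime(s)) return Kind.DATE_TIME;
        return Kind.PREFIXED;
    }

    // Return the text that would be used as the cell's value.
    static String interpret(String raw, String defaultPrefix) {
        String s = stripQuotes(raw);
        switch (classify(raw)) {
            case CURRENCY: return currencyAmount(s);
            case PREFIXED: return defaultPrefix + "#" + s;
            default: return s; // numbers, individual names, date/time text
        }
    }

    public static void main(String[] args) {
        String[] cells = { "\"42\"", "ex2#Bob", "$1,250.50", "2018-09-07T10:15:30", "Alice" };
        for (String cell : cells) {
            System.out.println(classify(cell) + " " + interpret(cell, "ex"));
        }
    }
}
```

The sketch leaves date/time text unchanged; wrapping it as a Flora \dateTime literal is left to the actual importer.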

2.8.3.1. Connecting to a live SQL database

TBD.

2.8.4. XML and JSON support

FloraTerm and QueryResult objects can be written to, and read from, both XML and JSON formats.

To parse the serialized forms, use FloraXMLParser and FloraJsonParser. These both have a parseFloraTerm method. The version in the XML parser takes a DOM Element, whereas the JSON parser expects an ObjectNode (from the Jackson JSON library). As the name implies, these methods return a FloraTerm corresponding to the XML/JSON encoding. For QueryResult objects, two constructors in that class (for XML and JSON data, respectively) are used to parse the serialized versions.
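Since the XML parser works on a DOM Element rather than on a string, serialized XML first has to be turned into a DOM tree. The following sketch shows one way to obtain the root Element from a string using only the standard javax.xml.parsers API. The class name, method name, and the XML snippet are hypothetical; the actual XML encoding of a FloraTerm is described in the JavaDoc, and the call into FloraXMLParser is omitted here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical helper; not part of the Sunflower Foundation API.
class DomElementSketch {

    // Parse an XML string and return its root element.
    static Element rootElement(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder XML; the real FloraTerm encoding will differ.
        Element root = rootElement("<term name=\"p\"/>");
        System.out.println(root.getTagName());
        // The resulting Element is what would be handed to the XML parser's parseFloraTerm method.
    }
}
```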

Similarly, FloraTerm and QueryResult objects can be serialized to XML/JSON using the toXml() and toJson() methods.

In addition to all this, there is also a JSON interface to the reasoner engine itself, specified in the abstract FloraJsonClient class. This is implemented by the FloraZMQClient class.