2. Sunflower Foundation Programmer’s Guide¶
Author: Daniel Elenius <elenius@csl.sri.com>
Date: 2018-09-07
2.1. Introduction¶
Sunflower Foundation is the “core” or “backend” component of the Sunflower software suite. It was originally designed to serve the needs of the Sunflower Studio IDE, but has since evolved to serve as a generic API for working with Flora in Java. The relationship between Sunflower Foundation and Flora is similar to that between Jena and RDF/OWL.
This Programmer’s Guide is written for Java programmers. The reader is also assumed to have some familiarity with Flora. This is not a Flora tutorial. However, we recommend several other sources as tutorial and reference on the Flora language:
- The Sunflower Studio User Manual includes a quick Flora language tutorial, which is kinder and gentler than the official Flora-2 material mentioned below.
- The Flora website also has some tutorials.
- The Flora manual is the official reference on the Flora language. Get the latest version in the Flora-2 svn on sourceforge, because it is updated frequently as the language evolves.
- The XSB user manual can also be helpful, given that Flora-2 builds on XSB, and the Flora manual assumes knowledge of Prolog/XSB. Get it from the XSB svn on sourceforge.
This Guide is not intended to cover every class and method in Sunflower Foundation. Instead, all the public classes and methods are extensively documented using JavaDoc. The JavaDoc should be considered the primary, authoritative API documentation, whereas this Guide serves as a tutorial and introduction to help the user get started.
We also strongly recommend downloading Sunflower Studio and using it for creating Flora content. Experimenting with the features of Sunflower Studio can also help you understand the motivation behind design decisions in Sunflower Foundation.
2.2. Parsing Flora Content¶
In order to syntactically analyze and manipulate Flora content, we need to parse it into a Java representation, known as a Flora Abstract Syntax Tree (AST). In Sunflower Foundation, the AST of a flora file is contained in a Flora Document. To produce a Flora Document, we need to use the Flora Parser.
To parse an entire Flora document, feed the parser an InputStream containing the content:
File floraFile = new File("example.flr");
FloraDocument doc = FloraParser.parse(new FileInputStream(floraFile));
or:
String floraContent = "p(?x) :- q(?x).\n";
FloraDocument doc = FloraParser.parse(new ByteArrayInputStream(floraContent.getBytes()));
You can also give the parser a String directly:
String floraContent = "p(?x) :- q(?x).\n";
FloraDocument doc = FloraParser.parse(floraContent);
Often, Flora files import other Flora files. In such cases, the imported files also need to be parsed, in order to get a correct Namespace Mapping and correct parsing of user-defined operators that are defined in imported files. For these reasons, the Flora Document Service is normally used to produce Flora Documents. The Document Service uses the parser under the hood, but takes care of finding and parsing the relevant imported files.
It is also possible to parse just one flora term. Flora terms are the basic units of the AST. Parsing flora terms is useful for example when parsing user string input which is to be used as a query (see Reasoner Interface).
To parse a term, use one of these methods:
String str = "p(?x)";
FloraTerm t = FloraParser.parseTerm(str);
or:
String str = "foo#bar(?x)";
NamespaceMapping nm = ...
FloraTerm t = FloraParser.parseTerm(str, nm);
There are several ways to get the appropriate NamespaceMapping, discussed in the next sub-section.
2.2.1. Namespace Mapping¶
The NamespaceMapping
class defines a mapping between namespace
prefixes and the corresponding namespace URIs.
There are some rules for how to use prefixes and namespaces in Flora:
- All prefixes used in a file should be defined in that file.
- The same prefix should not be used for a different namespace in an imported file (this rule is stricter than in RDF/OWL).
Note that when using the OWL Importer (see Importing and Exporting), the appropriate namespace declarations are added to all files.
When parsing an entire Flora file, there is no need for the user to
provide a NamespaceMapping
since it can be generated from the
prefix declarations in the file. However, when parsing individual
terms, which may contain namespace prefixes, a NamespaceMapping
has to be provided to the parser to give it this contextual
information.
There are several ways to produce a NamespaceMapping:
1. Creating one manually, using the constructor and the addPrefixNS() method. This is not recommended, and probably only useful for other parts of Sunflower Foundation.
2. Getting it from a FloraDocument, using the getNamespaceMapping() method. Warning: This only returns the NamespaceMapping for the document in question, not including prefixes defined in imported files!
3. Getting it from the Flora Document Service.
4. Getting it from the Flora Ontology Service.
Methods 3 and 4 are the recommended ones, and are among the main reasons to use the document and/or ontology services. See those sections for more details.
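To illustrate why a mapping that includes imported prefixes matters, here is a minimal, self-contained sketch. It does not use the Sunflower Foundation classes: PrefixTable and its methods addPrefix, combinedWith, and expand are hypothetical stand-ins written only for this illustration, and the expansion rule (namespace URI followed by the local name) is a simplifying assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a prefix-to-namespace table; NOT the Sunflower NamespaceMapping class.
final class PrefixTable {
    private final Map<String, String> prefixToNs = new LinkedHashMap<>();

    PrefixTable addPrefix(String prefix, String namespaceUri) {
        prefixToNs.put(prefix, namespaceUri);
        return this;
    }

    // Returns a new table holding this table's prefixes plus those of an imported file.
    // Rejects a prefix that is bound to two different namespaces (the second rule above).
    PrefixTable combinedWith(PrefixTable imported) {
        PrefixTable result = new PrefixTable();
        result.prefixToNs.putAll(this.prefixToNs);
        for (Map.Entry<String, String> e : imported.prefixToNs.entrySet()) {
            String existing = result.prefixToNs.putIfAbsent(e.getKey(), e.getValue());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new IllegalStateException("Prefix '" + e.getKey() + "' is bound to two namespaces");
            }
        }
        return result;
    }

    // Expands "prefix#local" to a full name; returns empty if the prefix is unknown.
    Optional<String> expand(String prefixedName) {
        int hash = prefixedName.indexOf('#');
        if (hash < 0) {
            return Optional.of(prefixedName);
        }
        String ns = prefixToNs.get(prefixedName.substring(0, hash));
        return ns == null ? Optional.empty() : Optional.of(ns + prefixedName.substring(hash + 1));
    }

    public static void main(String[] args) {
        PrefixTable local = new PrefixTable().addPrefix("ex", "http://example.org/ex#");
        PrefixTable imported = new PrefixTable().addPrefix("foo", "http://example.org/foo#");
        // The local table alone cannot resolve a prefix declared only in the imported file.
        System.out.println(local.expand("foo#bar").isPresent());
        System.out.println(local.combinedWith(imported).expand("foo#bar").get());
    }
}
```

In this toy model, the local table corresponds to method 2 above, and the combined table corresponds to what methods 3 and 4 are described as providing.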
2.2.2. Syntax Errors¶
When parsing Flora content, the Flora Parser may discover a syntax error and throw a FloraParserException. This exception stores the file position where the problem occurred, and a description of the problem.
There is a difference in how these syntax errors are handled when parsing a flora document, versus parsing a single term:
- When parsing a Flora document, the errors are not thrown. Instead, the parser tries to continue parsing after the error if possible. All the errors are stored in the FloraDocument. After parsing, the caller should check FloraDocument.getErrors() to retrieve these errors, if any (or use FloraDocument.hasErrors() first).
- When parsing a Flora term, any exception is thrown right away.
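The difference between the two modes can be sketched with a toy checker. This is only an illustration of the "collect and continue" versus "throw immediately" pattern: ToyErrorHandling, SyntaxProblem, checkTerm, and checkDocument are hypothetical names, the only "syntax rule" checked is balanced parentheses, and statements are naively split at each period.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; the real parser checks the full Flora syntax.
final class ToyErrorHandling {

    // Hypothetical exception carrying a position, loosely modelled on the description above.
    static final class SyntaxProblem extends Exception {
        final int position;

        SyntaxProblem(int position, String message) {
            super(message);
            this.position = position;
        }
    }

    // Term mode: the first problem is thrown right away.
    static void checkTerm(String term, int offset) throws SyntaxProblem {
        int depth = 0;
        for (int i = 0; i < term.length(); i++) {
            char ch = term.charAt(i);
            if (ch == '(') {
                depth++;
            } else if (ch == ')' && --depth < 0) {
                throw new SyntaxProblem(offset + i, "unmatched ')'");
            }
        }
        if (depth > 0) {
            throw new SyntaxProblem(offset + term.length(), "missing ')'");
        }
    }

    // Document mode: each problem is recorded, and checking continues with the next statement.
    static List<SyntaxProblem> checkDocument(String content) {
        List<SyntaxProblem> errors = new ArrayList<>();
        int start = 0;
        while (start < content.length()) {
            int end = content.indexOf('.', start);
            if (end < 0) {
                end = content.length();
            }
            try {
                checkTerm(content.substring(start, end), start);
            } catch (SyntaxProblem e) {
                errors.add(e);
            }
            start = end + 1;
        }
        return errors;
    }

    public static void main(String[] args) {
        for (SyntaxProblem p : checkDocument("p(a). q(b. r(c).")) {
            System.out.println(p.getMessage() + " at offset " + p.position);
        }
    }
}
```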
At this point, you may consider skipping ahead to Flora Document, where we discuss the most important methods of that important class, or continue reading the rest of this section for more in-depth information about the parser.
2.2.3. Why a New Flora Parser?¶
Note that Flora comes with its own parser, written in XSB (we’ll call this the “native” parser). However, this parser is of no use for our Java API, for two reasons:
- It produces XSB prolog terms, not a Java AST
- It is too slow for interactive use, such as in Sunflower Studio, where we repeatedly need to parse a large number of files to produce up-to-date ASTs as the user edits the content.
We have made every effort to make the Sunflower Foundation Flora Parser behave the same way as Flora’s native parser. However, it is important to understand one fundamental difference between the two parsers: Flora’s native parser loads each parsed statement into the Knowledge Base (KB) before parsing the next line. This means that this parser is not purely syntactic or context-free: It has access to the KB and can do some forms of type checking and other semantic checks. In contrast, the Sunflower Foundation Flora parser is purely syntactic. It does not update the KB as it proceeds. It does, however, keep track of certain declarations, such as namespace prefix and operator declarations, in order to correctly parse later content.
The result of this is that the Sunflower Foundation Flora Parser is somewhat more permissive than the native parser: Some errors may not be shown until a file is actually loaded into the Flora engine.
2.2.4. Flora Lexer¶
Underlying the Flora Parser is the FloraLexer
class. The job of
the lexer is to produce a list of tokens from a Flora content
string. The parser then operates on the level of tokens, rather than
characters, making its job much easier.
The tokens are kept in FloraDocument (see the getTokens() method), and also in each FloraTerm created by the parser (see Flora Abstract Syntax Tree).
There are two reasons for keeping the tokens around, even when you have the AST:
- Each token contains information about where in the file it is located (see the start() and end() methods on FloraToken). This information can be used to locate, for example, a fact or rule inside a Flora file. In Sunflower Studio, we use this to highlight facts that the user is looking at in the KB Editor tab.
- The tokens can be used as the basis for syntax highlighting, such as in the Sunflower Studio Flora Text Editor. Because syntax highlighting has to react very quickly to editing done by the user, it is more efficient to use token-level information (and the lexer) than to use the parser for this purpose.
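As a rough illustration of what position-carrying tokens look like, the following toy lexer tokenizes a tiny subset of Flora syntax with a regular expression. ToyLexer and its Token class are hypothetical; the real FloraLexer and FloraToken cover the full language and may differ in every detail except the idea of start and end offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy lexer for illustration; NOT the Sunflower FloraLexer.
final class ToyLexer {

    static final class Token {
        final String kind;
        final String text;
        final int start; // offset of the first character
        final int end;   // offset just past the last character

        Token(String kind, String text, int start, int end) {
            this.kind = kind;
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    private static final String[] KINDS = {"VARIABLE", "IMPLIES", "SYMBOL", "PUNCT"};

    // Each named group doubles as a token kind. Characters that match no group
    // (such as whitespace) are silently skipped by Matcher.find().
    private static final Pattern TOKEN = Pattern.compile(
            "(?<VARIABLE>\\?[A-Za-z_][A-Za-z0-9_]*)"
          + "|(?<IMPLIES>:-)"
          + "|(?<SYMBOL>[A-Za-z_][A-Za-z0-9_]*)"
          + "|(?<PUNCT>[().,])");

    static List<Token> tokenize(String content) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(content);
        while (m.find()) {
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    tokens.add(new Token(kind, m.group(), m.start(), m.end()));
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("p(?x) :- q(?x).")) {
            System.out.println(t.kind + " '" + t.text + "' [" + t.start + "," + t.end + ")");
        }
    }
}
```

A syntax highlighter could map each token kind to a color and apply it to the range from start to end, without ever building an AST.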
Warning: FloraDocument and FloraTerm objects that have been created programmatically (i.e., using their constructors) will not have token information! The same is true for terms created using FloraTerm.clone().
2.2.5. Flora Keywords¶
Closely related to the lexer, the FloraKeywords
class contains
constants for all the Flora keywords. This is used by the
aforementioned syntax highlighting feature. If you need to refer to a
Flora keyword or operator in code, such as the \and
operator, you
should use the constant defined in FloraKeywords, i.e.:
FloraKeywords.FL_AND
The FloraVocabulary
class provides an interface to check whether a
given string is a Flora keyword (or delimiter, builtin operator,
etc). This is primarily used for syntax highlighting. Check the
JavaDoc for more information.
2.3. Flora Document¶
In Parsing Flora Content, we described how to produce a
FloraDocument
from a Flora file. A FloraDocument
contains the
following basic types of content (some of these may be empty sets):
- Tokens
- A
NamespaceMapping
- Prefix declarations
- User-defined operator declarations
- Facts
- Rules
- User-defined functions
- Queries
- Latent queries
- Errors (see Syntax Errors)
- Imports
All of these can be retrieved with the appropriate getter method, e.g., getFacts(), getLatentQueries(), and so on. See the JavaDoc for more details. These are all maintained as separate data structures, so the getters are fast constant-time operations.
2.3.1. Derived Information¶
FloraDocument
contains one additional data structure, which is a
map from Identifier
to Fact
; specifically, facts containing a
Flora FrameTerm
on the top level. These facts are “special” in
that they contain structured “ontological” information. Consequently,
this map is used to build the Ontology Model.
The primary method for accessing this information is getFrameFactsForIdentifier(). There are also some additional (not constant-time!) methods that extract useful information from this mapping, such as getInstanceFrames() and getAssertedSubclassOfs().
The derived information in FloraDocument
is primarily intended for
building the ontology model, and you will probably not need to use
these methods directly.
2.3.2. Manipulating a Flora Document¶
FloraDocument includes a number of methods, with names starting with add, delete, or remove, which can be used to modify its content. These methods are used by the Sunflower Studio KB Editor.
Some caution is advised when using these methods. After executing these methods, terms that were previously extracted from the document may no longer be valid. For example, if you use getFacts(), then call deleteIdentifier(), the set of facts you retrieved will no longer be valid (unless the identifier to delete did not exist in the document).
The methods do recreate correct token information, so positions of facts are updated accordingly.
2.4. Flora Abstract Syntax Tree¶
The AST representation in Sunflower Foundation is based on the FloraTerm interface, and its various implementation classes. These classes are all in the com.sri.floralib.ast Java package. The class- and package-level JavaDoc for this package and its classes provides detailed information about how the many different kinds of Flora terms are represented in this AST. In most cases, the caller does not need to worry about what kind of AST object is at hand. Mostly, FloraTerm objects are passed to other methods. If a method needs to be called on a FloraTerm, it is usually defined on the top-level interface, e.g. getIdentifiers() or getVariables(). These methods descend through the frame structure to retrieve the appropriate information.
If you find yourself writing your own code to recursively descend through the frame structure, or to syntactically analyze or manipulate Flora terms, consider whether there may already be a FloraTerm method that does the job, or request that we add the functionality to Sunflower Foundation.
In some cases, it may be useful to examine the type of object and take some appropriate action that depends on it, e.g.:
FloraTerm t = FloraParser.parseTerm(str, nm);
if (t instanceof BinaryInfixTerm) {
    BinaryInfixTerm bit = (BinaryInfixTerm) t;
    if (bit.getOperator().equals(FloraKeywords.FL_AND)) { // check if it is an "\and" term
        FloraTerm left = bit.getLeftOperand();
        FloraTerm right = bit.getRightOperand();
        //... do something with the left and right conjuncts
    }
    else if ...
}
2.4.1. Manipulating the Abstract Syntax Tree¶
Most of the AST classes do not have setter methods. FloraTerm objects are usually treated as immutable. The exception is FrameTerm and some related classes and interfaces like Frame and FrameContent. These do contain methods that modify the term, like Frame.removeFrameContent(). These methods are used by FloraDocument to modify its content in, e.g., FloraDocument.removeInstanceFrameContent() (see Manipulating a Flora Document).
2.5. Flora Document Service¶
The IFloraDocumentService
interface provides methods to deal with
import trees of Flora documents. Usually, we place Flora files,
along with their imported files, in the same directory structure
(i.e., with one common top-level directory for all the files).
Sunflower Foundation has one implementation of IFloraDocumentService, called FileFloraDocumentService. Sunflower Studio has an additional implementation called FloraDocumentService. There are two main differences between the two implementations:
- FileFloraDocumentService works with Java File objects, whereas FloraDocumentService works with Eclipse IFile objects. The IFloraDocumentService interface has a generic parameter SourceType to cover these two cases. Both implementations instantiate this parameter, i.e., they are not generic classes.
- FloraDocumentService implements listeners to discover changes in the underlying Flora files, and will re-parse the documents in such cases. This is to support user editing of files in the Sunflower Studio IDE. FileFloraDocumentService, which is intended for standalone use, does not have these listeners, because we don’t anticipate such a need (and we don’t want the unnecessary complexity).
We will only discuss FileFloraDocumentService
from here on. We
mentioned above the convention of having one top-level directory for
Flora content. The constructor of the FileFloraDocumentService
class makes this convention explicit: You have to provide a File
for this top-level directory.
After creating a FileFloraDocumentService
, you can use it to
retrieve FloraDocument
objects for any files in the top-level
directory provided to the constructor. For example:
FileFloraDocumentService docService = new FileFloraDocumentService(new File("flora"));
FloraDocument doc = docService.getDocument(new File("flora/file1.flr"));
The document service will call the Flora parser as needed, and cache parsed documents so that they do not need to be re-parsed.
You can also set and get the “active file” (using setActiveFile() and getActiveFile()). This concept is used in Sunflower Studio to indicate the file that the user is currently working on (which is used as the basis for various UI features). Similarly, getActiveDocument() returns the FloraDocument associated with the active file.
The document service can also provide the Namespace Mapping of a document, either the “local” one (i.e. the same as what you would get using FloraDocument.getNamespaceMapping()), using getDeclaredPrefixes(), or the “combined” namespace mapping of the document and all its imports, using getAllPrefixes(). The latter is very useful, because a NamespaceMapping has to be provided to FloraParser.parseTerm() (see Parsing Flora Content and Reasoner Interface), and this is the easiest way to obtain one.
Given that most Flora files have imports, we recommend using the document service to parse flora files, rather than using the Flora parser directly, for most cases.
2.6. Ontology Model¶
The IOntologyModel
interface and its main (but abstract)
implementation class, the AbstractOntologyModel
, provide the
following main functionalities:
A unified view over the Flora “entities” (i.e., its classes, individuals, queries, and rules) from the import closure of a Flora document such that all those different entities are accessible via one central object - the ontology model.
The ontology model provides methods to retrieve all the identifiers, classes, individuals, queries, rules, etc. from the primary document as well as its import closure.
An AbstractOntologyModel contains references to its primary (or root) FloraDocument, as well as to all FloraDocuments resulting from (direct or indirect) imports. The AbstractOntologyModel is agnostic as to where those FloraDocuments originate (e.g., databases, files, ...). At the time of this writing, Sunflower Foundation only supports File ontology sources, resulting in the class FileOntologyModel, which is the main (and currently only) concrete implementation of IOntologyModel. Most of its implementation is given in the abstract superclass AbstractOntologyModel. Sunflower Studio also includes an IFileOntologyModel to support ontology models produced from Eclipse’s IFiles.
In addition, a notion of locality is implemented. For example, a class identifier is considered a local class in case it has a class frame in the primary (current) ontology document, and likewise for individuals, queries, and rules.
A notion of generalized locality is also implemented:
- A class is considered generalized local if it is either local, or if it has a subclass which is local, or if it is instantiated by a local individual.
- An individual is considered generalized local if it is either local, or if it is instantiating a local class.
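The two rules for generalized locality can be written down as a small set computation. The sketch below is a hypothetical helper, not part of Sunflower Foundation: identifiers are plain strings, and the maps of asserted superclasses and asserted types are supplied by the caller. It assumes that only directly asserted axioms count, which the text above does not specify.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; NOT part of Sunflower Foundation.
final class GeneralizedLocality {

    // A class is generalized local if it is local, has a local (asserted) subclass,
    // or is instantiated by a local individual.
    static Set<String> classes(Set<String> localClasses, Set<String> localIndividuals,
                               Map<String, Set<String>> superclassesOf,
                               Map<String, Set<String>> typesOf) {
        Set<String> result = new TreeSet<>(localClasses);
        for (String cls : localClasses) {
            result.addAll(superclassesOf.getOrDefault(cls, Collections.emptySet()));
        }
        for (String ind : localIndividuals) {
            result.addAll(typesOf.getOrDefault(ind, Collections.emptySet()));
        }
        return result;
    }

    // An individual is generalized local if it is local or instantiates a local class.
    static Set<String> individuals(Set<String> localClasses, Set<String> localIndividuals,
                                   Map<String, Set<String>> typesOf) {
        Set<String> result = new TreeSet<>(localIndividuals);
        for (Map.Entry<String, Set<String>> e : typesOf.entrySet()) {
            if (!Collections.disjoint(e.getValue(), localClasses)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> supers = new HashMap<>();
        supers.put("Dog", Collections.singleton("Animal"));   // Dog :: Animal
        Map<String, Set<String>> types = new HashMap<>();
        types.put("rex", Collections.singleton("Pet"));       // rex : Pet   (rex is local)
        types.put("felix", Collections.singleton("Dog"));     // felix : Dog (felix is imported)
        Set<String> localClasses = Collections.singleton("Dog");
        Set<String> localIndividuals = Collections.singleton("rex");

        System.out.println(classes(localClasses, localIndividuals, supers, types)); // [Animal, Dog, Pet]
        System.out.println(individuals(localClasses, localIndividuals, types));     // [felix, rex]
    }
}
```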
Heuristics that “classify” identifiers as individuals or classes. In Flora, the same identifier can act as a class, and as an individual. However, the class-individual distinction is often considered useful. Hence, the ontology model tries to recognize which identifiers are individuals, and which are classes. In general, an identifier is a class if it has some class frame content somewhere in the import closure, i.e., for the identifier id to be recognized as a class, there must be some occurrence of id[| ... |] somewhere in the import closure (where ... denotes any - possibly empty - frame content). For individuals, we require that there must be some occurrence of an individual frame on the identifier somewhere: id[ ... ]. However, we will also recognize an identifier as an individual in case it is the argument of a class membership / instance assertion: id : C. Likewise, an identifier will be recognized as a class if it participates in some class superclass axiom: id :: superclass. Please note that an identifier may have both class and individual information associated with it. In that case, it is up to the client application to decide how to handle this situation. The ontology model will happily consider the identifier as being both a class and an individual (i.e., both isClass and isIndividual can return true).
An identifier is considered a rootIndividual in case it has individual frame content (hence, is recognized as an individual), but does not instantiate any classes (i.e., has no types). The method getRootIndividuals returns those individuals.
Likewise, the method getRootClasses returns the set of classes which do not have superclasses, and hence, those classes are the roots of the taxonomy (see Inferences Performed by the Ontology Model and the Taxonomy for an explanation of the taxonomy). A graphical taxonomy viewer might choose to show those root classes as children of a toplevel root node (the constant identifier topClassRootIdentifier might be used for this, see Section Root Identifier Constants Defined in IOntologyModel).
Methods for accessing the provenance of axioms, i.e., for subclass axioms and instance assertions (also called class membership assertions). Note that these axioms typically require 2 arguments (e.g., a class and a superclass, or an instance and a class). Methods for retrieving the corresponding ontology sources of such axioms are provided, e.g., getSourcesForAssertedSubclass. In the FileOntologyModel, the returned “source” in which the corresponding axiom is asserted will be a File.
Methods for modifying the content of the ontology, e.g., it is possible to add and delete individuals, classes, individual and class properties, as well as axioms to a primary or imported ontology source (and hence FloraDocument) using the ontology model.
Clients can be informed about changes to (in) the ontology model by registering an IOntologyModelListener (see methods in JavaDoc).
Query methods for accessing the asserted information in an ontology model, i.e., there are query methods for retrieving the asserted superclasses of a class, for getting the asserted types of an individual, for retrieving the asserted property values of a class or individual, etc.
Query methods for retrieving inferred information from the ontology model. For example, it is possible to retrieve the direct and indirect sub- and superclasses of a class, to retrieve the direct and indirect types of an individual, to retrieve the equivalent classes of a class, etc. The inferences performed by the ontology model are described in the next Subsection.
A light form of inference is required in order to identify the sub- and superclasses of a class, and the direct and indirect types of an individual. This inference is performed by computing the so-called class taxonomy first, from which the desired answers can be obtained, see Inferences Performed by the Ontology Model and the Taxonomy for more information.
In a nutshell, the taxonomy is a directed acyclic graph (DAG) with nodes representing sets of equivalence classes, or (class) equivalent classes for short. Each node is identified by a so-called representative, which may be any member in that equivalence set. The edges in the taxonomy represent the direct superclass / direct subclass relationship, which has to be computed on the basis of the asserted superclass axioms. We give an idea of how the taxonomy is computed in the next Subsection.
The class taxonomy can be retrieved and constructed in a top-down fashion. Start by creating a DAG containing a single node for the identifier topClassRootIdentifier (see Root Identifier Constants Defined in IOntologyModel). Then, retrieve the taxonomy root classes from the ontology model, using getRootClasses. Make those nodes children of the topClassRootIdentifier node. Now, for each node in the DAG that does not have children (let’s call them leaf nodes), retrieve its direct subclasses from the ontology model using getDirectSubclasses, and add them as children to the corresponding parent node. Repeat until there are no more leaf nodes for which getDirectSubclasses returns a non-empty set.
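The top-down construction just described can be sketched as a worklist loop. To keep the example self-contained, the ontology model is replaced by two plain inputs: a set of root classes (standing in for the result of getRootClasses) and a function from a class to its direct subclasses (standing in for getDirectSubclasses). TaxonomyTreeBuilder, its build method, and the string constant TOP are hypothetical names, and identifiers are modelled as strings.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of the top-down construction; not Sunflower Foundation code.
final class TaxonomyTreeBuilder {
    // Stand-in for the root constant defined in IOntologyModel.
    static final String TOP = "topClassRootIdentifier";

    // Returns a parent -> children map describing the DAG.
    static Map<String, Set<String>> build(Set<String> rootClasses,
                                          Function<String, Set<String>> directSubclasses) {
        Map<String, Set<String>> children = new LinkedHashMap<>();
        children.put(TOP, new LinkedHashSet<>(rootClasses));
        Deque<String> unexpanded = new ArrayDeque<>(rootClasses);
        while (!unexpanded.isEmpty()) {
            String node = unexpanded.removeFirst();
            if (children.containsKey(node)) {
                continue; // already expanded; a DAG node can be reached through several parents
            }
            Set<String> subs = new LinkedHashSet<>(directSubclasses.apply(node));
            children.put(node, subs);
            unexpanded.addAll(subs);
        }
        return children;
    }

    public static void main(String[] args) {
        // Toy data: A has direct subclasses D and F, which share the direct subclass E.
        Map<String, Set<String>> direct = new HashMap<>();
        direct.put("A", new LinkedHashSet<>(Arrays.asList("D", "F")));
        direct.put("D", Collections.singleton("E"));
        direct.put("F", Collections.singleton("E"));

        Map<String, Set<String>> dag = build(Collections.singleton("A"),
                c -> direct.getOrDefault(c, Collections.emptySet()));
        dag.forEach((parent, kids) -> System.out.println(parent + " -> " + kids));
    }
}
```

Because the taxonomy is a DAG rather than a tree, the sketch records each node once and lets several parents point to it, instead of duplicating the subtree.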
2.6.1. Inferences Performed by the Ontology Model and the Taxonomy¶
The central reasoning service of the ontology model is computation of the so-called taxonomy. The taxonomy is a directed acyclic graph with nodes representing sets of equivalent classes (equivalence classes), and edges representing the direct subclass (superclass) relationship between these equivalence classes.
Note that the direct superclasses (direct subclasses) may not be identical to the set of asserted superclasses (asserted subclasses, resp.) of a class.
In this example:
A[||] :: B[||].
B[||] :: C[||].
A[||] :: C[||].
the class C is an asserted superclass of A, but not a direct superclass (in fact, the last superclass axiom is redundant, as it follows logically from the first two).
Moreover, if we also add:
C[||] :: A[||].
then the taxonomy will contain only one node, as all classes A, B, C are equivalent; hence, the taxonomy contains a node representing the equivalence class {A, B, C}. Note that any of A, B, or C may become the representative for that equivalence class. Given the representative of an equivalence class (i.e., a set of equivalent classes), it is always possible to retrieve the other members of the equivalence class. Conversely, in case A was selected as the representative class, it is possible to retrieve A given B or C.
Let us assume that we also added the following axioms to the ontology model:
D[||] :: C[||].
E[||] :: D[||].
Now, D and E are subclasses of A, B, and C. However, only D is a direct subclass of them.
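The running example can be reproduced with a naive reachability computation over the asserted axioms. This is only a sketch under simple assumptions, not the algorithm used by the ontology model: ToyTaxonomy and its methods are hypothetical names, two classes are treated as equivalent when each reaches the other, and a direct superclass is a strict superclass that is not also a strict superclass of another strict superclass.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Naive illustration of equivalence classes and direct superclasses; not Sunflower code.
final class ToyTaxonomy {
    // class -> asserted superclasses
    private final Map<String, Set<String>> asserted = new TreeMap<>();

    void addSubclassAxiom(String sub, String sup) {
        asserted.computeIfAbsent(sub, k -> new TreeSet<>()).add(sup);
        asserted.computeIfAbsent(sup, k -> new TreeSet<>());
    }

    // Everything reachable via asserted axioms (includes cls itself only if it lies on a cycle).
    Set<String> superclasses(String cls) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>(asserted.getOrDefault(cls, new TreeSet<>()));
        while (!todo.isEmpty()) {
            String s = todo.pop();
            if (seen.add(s)) {
                todo.addAll(asserted.get(s));
            }
        }
        return seen;
    }

    // Classes that are mutually reachable with cls, plus cls itself.
    Set<String> equivalents(String cls) {
        Set<String> eq = new TreeSet<>();
        eq.add(cls);
        for (String s : superclasses(cls)) {
            if (superclasses(s).contains(cls)) {
                eq.add(s);
            }
        }
        return eq;
    }

    Set<String> strictSuperclasses(String cls) {
        Set<String> strict = superclasses(cls);
        strict.removeAll(equivalents(cls));
        return strict;
    }

    // Strict superclasses with no other strict superclass in between.
    Set<String> directSuperclasses(String cls) {
        Set<String> strict = strictSuperclasses(cls);
        Set<String> direct = new TreeSet<>(strict);
        for (String mid : strict) {
            direct.removeAll(strictSuperclasses(mid));
        }
        return direct;
    }

    public static void main(String[] args) {
        ToyTaxonomy t = new ToyTaxonomy();
        t.addSubclassAxiom("A", "B");
        t.addSubclassAxiom("B", "C");
        t.addSubclassAxiom("A", "C");
        System.out.println(t.directSuperclasses("A")); // [B]  (C is asserted but not direct)
        t.addSubclassAxiom("C", "A");
        t.addSubclassAxiom("D", "C");
        t.addSubclassAxiom("E", "D");
        System.out.println(t.equivalents("A"));        // [A, B, C]
        System.out.println(t.directSuperclasses("D")); // [A, B, C]  (one equivalence class)
        System.out.println(t.directSuperclasses("E")); // [D]
    }
}
```

The sketch recomputes reachability on every call, which is fine for a handful of classes but would be too slow for a real ontology.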
In addition to retrieving type and super / subclass information, it is also possible to retrieve asserted property values, as well as inferred (i.e., inherited) property values. Consider:
A[| Aprop -> 123 |].
B[| Bprop -> 456 |].
E[| Eprop -> 789 |].
e : E.
The method getClassPropertyValues can be used to retrieve the asserted class property values of a class - we expect Eprop -> 789 for E. The property values are returned as a FrameTripleAssertion collection. In addition, getClassPropertyValues can also return the inherited class property values. In that case, we will also be getting Aprop -> 123 and Bprop -> 456.
Likewise, this kind of inference also applies to instances of classes. If we ask for the asserted individual property values of e using getIndividualPropertyValues, we will get an empty set. However, if we request to also include inherited property values, then the class property values inherited from its direct type, E (hence Eprop -> 789), as well as the class property values inherited from its indirect types, D, C, B, A, will be included: Eprop -> 789, Bprop -> 456, Aprop -> 123.
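The inheritance behaviour in this example can be mimicked with a few maps. The sketch below is a toy model, not the ontology model's implementation: ToyPropertyInheritance, its method names, and the boolean includeInherited parameter are all hypothetical, property values are plain integers, and conflicts between inherited values for the same property are not handled.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of asserted versus inherited property values; not Sunflower code.
final class ToyPropertyInheritance {
    private final Map<String, Set<String>> assertedSupers = new HashMap<>();
    private final Map<String, Map<String, Integer>> classProps = new HashMap<>();
    private final Map<String, Set<String>> assertedTypes = new HashMap<>();

    void subclassOf(String sub, String sup) {
        assertedSupers.computeIfAbsent(sub, k -> new HashSet<>()).add(sup);
    }

    void classProperty(String cls, String prop, int value) {
        classProps.computeIfAbsent(cls, k -> new TreeMap<>()).put(prop, value);
    }

    void instanceOf(String individual, String cls) {
        assertedTypes.computeIfAbsent(individual, k -> new HashSet<>()).add(cls);
    }

    // cls itself plus everything reachable through asserted superclass axioms.
    private Set<String> selfAndSuperclasses(String cls) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(cls);
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (seen.add(c)) {
                todo.addAll(assertedSupers.getOrDefault(c, Collections.emptySet()));
            }
        }
        return seen;
    }

    Map<String, Integer> classProperties(String cls, boolean includeInherited) {
        Map<String, Integer> result = new TreeMap<>();
        Collection<String> sources = includeInherited ? selfAndSuperclasses(cls) : Collections.singleton(cls);
        for (String c : sources) {
            result.putAll(classProps.getOrDefault(c, Collections.emptyMap()));
        }
        return result;
    }

    // This toy stores no individual frames, so the asserted values of an individual are always empty.
    Map<String, Integer> individualProperties(String individual, boolean includeInherited) {
        Map<String, Integer> result = new TreeMap<>();
        if (includeInherited) {
            for (String type : assertedTypes.getOrDefault(individual, Collections.emptySet())) {
                result.putAll(classProperties(type, true));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyPropertyInheritance m = new ToyPropertyInheritance();
        m.subclassOf("A", "B"); m.subclassOf("B", "C"); m.subclassOf("A", "C");
        m.subclassOf("C", "A"); m.subclassOf("D", "C"); m.subclassOf("E", "D");
        m.classProperty("A", "Aprop", 123);
        m.classProperty("B", "Bprop", 456);
        m.classProperty("E", "Eprop", 789);
        m.instanceOf("e", "E");
        System.out.println(m.classProperties("E", false));      // {Eprop=789}
        System.out.println(m.classProperties("E", true));       // {Aprop=123, Bprop=456, Eprop=789}
        System.out.println(m.individualProperties("e", false)); // {}
        System.out.println(m.individualProperties("e", true));  // {Aprop=123, Bprop=456, Eprop=789}
    }
}
```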
2.6.2. A Note on Completeness¶
Please note that the ontology model does not capture all inferences that could be obtained from Flora reasoning. If you want to have the full inferences, you need to use the Reasoner Interface to query Flora.
The computation of the taxonomy, and hence the computation of the class equivalence classes, their direct and indirect super- and subclasses, as well as the direct and indirect types of an individual, is solely based on the asserted class-superclass axioms (see Inferences Performed by the Ontology Model and the Taxonomy for examples illustrating that kind of reasoning). Inferences from Flora rules are not considered, i.e., effects from rules expressing sufficient conditions for class membership.
2.6.3. Root Identifier Constants Defined in IOntologyModel¶
The IOntologyModel
interface defines a couple of constants, such
as:
topClassRootIdentifier
topLatelQueriesRootIdentifier
topQueriesRootIdentifier
...
etc. In a graphical display of the taxonomy, these identifiers can act as root nodes. Note that these identifiers are always considered local, but in fact, they are not part of the corresponding Flora files or FloraDocuments. Hence, they cannot be used in queries, i.e., you will not be able to retrieve the direct subclasses of topClassRootIdentifier by means of getDirectSubclasses. Please use getRootClasses instead.
2.6.4. Flora Ontology Service¶
It is recommended to use the Flora Ontology Service, i.e.,
FileOntologyService
, to create (and subsequently retrieve)
the ontology model for a given primary source file.
To create a FileOntologyModel for a given source file, it is sufficient to call getOntologyModel with the source file as argument. The required FloraDocument will be looked up or created automatically by the FileFloraDocumentService held by the ontology service. The constructor of FileOntologyService requires a reference to a FileFloraDocumentService.
The FileOntologyService
then maintains a mapping from source files
to the corresponding ontology models. In addition, there is a notion
of a current or active ontology model.
To instantiate the Flora Ontology Service, use the following piece of code; note that new File("flora") denotes the directory in which the Flora files reside:
FileFloraDocumentService docService = new FileFloraDocumentService(new File("flora"));
FileOntologyService ontService = new FileOntologyService(docService);
Using the ontService, we can then easily construct an ontology model for a given Flora file contained in the "flora" directory:
File floraFile = new File("flora/example.flr");
IOntologyModel<File> ontologyModel = ontService.getOntologyModel(floraFile);
It is straightforward to use the documented API in order to request information or inferences from the model:
Set<Identifier> classes = ontologyModel.getAssertedClass();
Set<Identifier> superclasses = ontologyModel.getSuperclasses(identifier);
Note that the Flora Ontology Service can subsequently be used to retrieve the current ontology model, or to retrieve an ontology model for a given source file as follows:
IOntologyModel<File> ontologyModel2 = ontService.getOntologyModel(floraFile);
IOntologyModel<File> ontologyModel3 = ontService.getActiveOntologyModel();
At this point, the expression (ontologyModel3 == ontologyModel2) &&
(ontologyModel2 == ontologyModel)
holds true.
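The identity guarantee above follows naturally from a service that keeps one model per source file. The following sketch shows that pattern with the standard library only. ModelCache is a hypothetical class, not the FileOntologyService, and it assumes that the most recently requested model becomes the active one, which is consistent with the example above but not stated by it.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache illustrating "one model per source file"; not the FileOntologyService.
final class ModelCache<M> {
    private final Map<File, M> models = new HashMap<>();
    private final Function<File, M> factory;
    private M active;

    ModelCache(Function<File, M> factory) {
        this.factory = factory;
    }

    // Creates the model on first request, and returns the cached instance afterwards.
    M getModel(File source) {
        M model = models.computeIfAbsent(source, factory);
        active = model; // assumption: the last requested model becomes the active one
        return model;
    }

    M getActiveModel() {
        return active;
    }

    public static void main(String[] args) {
        ModelCache<Object> cache = new ModelCache<>(file -> new Object());
        Object first = cache.getModel(new File("flora/example.flr"));
        Object second = cache.getModel(new File("flora/example.flr"));
        Object active = cache.getActiveModel();
        System.out.println((active == second) && (second == first));
    }
}
```

The lookup works with a fresh File object because java.io.File compares by path, so two File instances for the same path map to the same cache entry.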
2.7. Reasoner Interface¶
Executing queries and doing something with the query results is the main purpose of using Flora in the first place. Thus, the reasoner interface is perhaps the most important component of Sunflower Foundation. The interface to the reasoner is specified by IFloraEngine. There are two implementations, FloraInterprologEngine and FloraZMQClient. For now, use FloraInterprologEngine, as FloraZMQClient is still in development. However, we recommend that you program against the interface wherever possible, so that you can easily switch engines in the future.
FloraInterprologEngine has two constructors. Both require an XSB command and a Flora directory as arguments.
The XSB command should be the XSB executable found in the XSB/config/[arch]/bin directory, not the symlink/BAT file in XSB/bin.
The Flora dir should be the top-level “flora2” directory of the flora installation.
The second constructor also takes a contentDir
argument. This is
the top-level directory to load Flora files from. It should be the
same as the directory passed to the Flora Document Service. This
allows Flora to find imported files using relative paths. This is
critical in order for Flora files to be portable between different
systems, since absolute file paths will almost never be the same on
any two systems.
Though probably not often needed, you can also change this content
directory after the Flora engine has been created, using
setNewContentDir()
. Note that this retracts knowledge of the old
content directory.
IFloraEngine has convenience methods for loading and adding Flora files (loadFile(), addFile()), and for simple yes/no (or true/false) Flora commands (floraCommand()). However, the most general query method is floraQuery(). This method takes a FloraTerm (which can be produced by the Flora parser, see Parsing Flora Content, or using FloraTerm constructors) and a namespace mapping (which is used to parse the query results returned by Flora). It returns a QueryResult.
QueryResult
contains one or more Solution
objects, each of
which contains bindings (in the form of FloraTerm
objects) for all
the variables in the query. See the JavaDoc for more details.
Putting it all together, the entire process of executing a query using Sunflower Foundation might look something like this:
// Top-level content directory, shared by the document service and the engine.
File contentDir = new File("flora");
FileFloraDocumentService docService = new FileFloraDocumentService(contentDir);
// xsbCmd is the XSB executable; floraDir is the top-level "flora2" directory.
IFloraEngine engine = new FloraInterprologEngine(xsbCmd, floraDir, contentDir.getAbsolutePath());
...
// Parse the query using the namespace prefixes of a loaded document.
FloraDocument doc = docService.getDocument(new File("flora/file1.flr"));
NamespaceMapping nm = doc.getAllPrefixes();
String queryStr = "p(?x)";
FloraTerm queryTerm = FloraParser.parseTerm(queryStr, nm);
QueryResult res = engine.floraQuery(queryTerm, nm);
// Print the binding of ?x for each solution.
for (Solution sol : res.getSolutions()) {
    System.out.println("?x = " + sol.getBinding("?x"));
}
2.8. Importing and Exporting¶
In the development of Sunflower, we have placed a high priority on being able to use Sunflower along with existing technologies, and on easing knowledge acquisition by ingesting data from readily available sources. We describe the different import/export interfaces provided by Sunflower Foundation in the following sub-sections.
2.8.1. Importing from RDF/OWL¶
Sunflower Foundation can create Flora files from RDF/OWL/OWL2 files.
The JavaDoc for the Jena2FloraTranslator class describes how the
translation works in some detail. From an API point of view, the
translator is very simple: there are only two public (and static)
methods, translateMonolithic() and translateOntologyFolder(). The
former translates an OWL file, plus all its imports, into one big
Flora file. The latter translates a directory of OWL files to a
directory of Flora files, with a one-to-one mapping between OWL and
Flora files, maintaining sub-directory structures.
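The one-to-one mapping with preserved sub-directories can be pictured with the following sketch, which uses only the JDK. FolderMapping and floraTarget are hypothetical names invented for this illustration, and the assumption that a translated file keeps its base name with a .flr extension is ours; consult the Jena2FloraTranslator JavaDoc for the actual behavior.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not part of Sunflower Foundation. */
final class FolderMapping {

    /**
     * Maps an OWL file below owlRoot to the corresponding path below floraRoot,
     * keeping the sub-directory structure and swapping the extension for ".flr".
     * owlRoot and owlFile must both be relative or both be absolute.
     */
    static Path floraTarget(Path owlRoot, Path floraRoot, Path owlFile) {
        Path relative = owlRoot.relativize(owlFile);
        String name = relative.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String stem = dot > 0 ? name.substring(0, dot) : name;
        // Same sub-directories under the Flora root, new file name.
        return floraRoot.resolve(relative).resolveSibling(stem + ".flr");
    }

    public static void main(String[] args) {
        Path target = floraTarget(
                Paths.get("owl"), Paths.get("flora"), Paths.get("owl/travel/places.owl"));
        System.out.println(target);
    }
}
```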
Note that in both cases you have to specify both a Flora “base folder”
and a Flora “destination folder”. This is done in order for relative
paths to be correct. The “base folder” should be the top-level Flora
content folder, i.e., the content directory you provided to
FileFloraDocumentService (see Flora Document Service) and
FloraInterprologEngine (see Reasoner Interface).
For the “monolithic” translation, you also have to specify an OWL “base folder”. The translator will look in all files in this directory and its sub-directories to find imports (exactly like a “Local Folder Repository” in Protege 3.5).
2.8.2. Exporting to RDF/OWL¶
The OWL exporter assumes the same mappings between Flora and OWL as
the OWL importer, so again, see the JavaDoc for Jena2FloraTranslator
for details on this. The importer tries to create Flora files that are
re-exportable to OWL. For imported OWL files that have not been
further modified in Flora, a re-exported file should be quite close to
the original file. However, SWRL rules are not supported by the
importer, so those will be lost in the round-trip. In general, it is
best to avoid translating back and forth, as it is difficult to
guarantee losslessness.
For Flora files that were not created by the OWL importer, the OWL
exporter has to rely on heuristics. For example, the importer
preserves rdfs:range and rdfs:domain facts. Without those, domains and
ranges of properties have to be “guessed”, because Flora does not have
a concept of global properties with ranges and domains. The same
applies to ontology URIs, which do not exist in Flora.
Similarly to the importer, the exporter allows you to translate a single Flora file to OWL, or to translate a whole Flora directory to a corresponding OWL directory.
When exporting a single file, it is possible to not include imported files. This allows for workflows such as:
- Develop ontologies in OWL.
- Import the OWL directory into Flora.
- Create a new Flora KB file that imports the translated Flora files.
- Export only the new Flora file back to OWL.
Creating a new KB in Flora can be advantageous, as it allows you to use Sunflower Studio features like Importing from CSV or SQL.
2.8.3. Importing from CSV or SQL¶
The CSV2Flora class provides a way of importing KB facts from a
comma-separated values (CSV) file, e.g., one produced by Microsoft
Excel, or exported from a database. Similarly, the SQL2Flora class
allows you to import data from a live or serialized SQL database.
Both these importers treat the data to import in the same way, and
both have a constructor that takes two arguments: a default prefix,
and a default namespace. These are used when generating new
identifiers from the content in the data. The common behavior of the
two importers is implemented in the Rows2Flora class.
The importers apply the following heuristics when creating Flora content from the data in a table:
- The first column name specifies the class name of all the data in the table.
- All the other columns specify property names to use for the data in those columns.
- Each row in the table corresponds to one individual.
- The first column contains the individual name.
- Each of the other columns specifies a property value of the property corresponding to that column.
- For each cell:
  - Remove any double quotes around the value.
  - If the cell contents can be interpreted as a number, use the number value.
  - If the cell contents contain a ‘#’ character, use the value as is (i.e. it is an individual name).
  - If the cell contents can be interpreted as a currency, use the numeric value of the currency amount.
  - If the value is a date/time, treat it as a Flora \dateTime value.
  - Otherwise, use the default prefix followed by ‘#’ and the cell value.
In addition, the CSV importer assumes that each imported file contains only one table.
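The per-cell heuristics listed above can be sketched as a small classifier. This is an illustration only, not the Rows2Flora implementation: the CellHeuristics class, its Kind enum and the classify() method are hypothetical, and several details are assumptions made for the example (surrounding whitespace is trimmed, a currency is a leading currency symbol followed by a number with optional thousands separators, and only ISO-8601 date/times are recognized).

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;

/** Hypothetical helper for illustration only; not part of Sunflower Foundation. */
final class CellHeuristics {

    enum Kind { NUMBER, INDIVIDUAL, CURRENCY, DATE_TIME, PREFIXED }

    /** Result of classifying one table cell. */
    static final class Cell {
        final Kind kind;
        final String value;

        Cell(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    /** Applies the heuristics in the order in which they are listed. */
    static Cell classify(String raw, String defaultPrefix) {
        String v = raw.trim(); // assumption: surrounding whitespace is not significant
        // Remove one pair of surrounding double quotes, if present.
        if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
            v = v.substring(1, v.length() - 1);
        }
        BigDecimal number = parseNumber(v);
        if (number != null) {
            return new Cell(Kind.NUMBER, number.toPlainString());
        }
        if (v.indexOf('#') >= 0) {
            return new Cell(Kind.INDIVIDUAL, v); // already an individual name
        }
        // Assumption: currency = currency symbol + number, commas as thousands separators.
        if (v.length() > 1 && Character.getType(v.charAt(0)) == Character.CURRENCY_SYMBOL) {
            BigDecimal amount = parseNumber(v.substring(1).replace(",", ""));
            if (amount != null) {
                return new Cell(Kind.CURRENCY, amount.toPlainString());
            }
        }
        // Assumption: only ISO-8601 values such as 2018-09-07T10:15:30 count as date/times.
        try {
            LocalDateTime.parse(v);
            return new Cell(Kind.DATE_TIME, v);
        } catch (DateTimeParseException e) {
            // Not a date/time; fall through to the default case.
        }
        return new Cell(Kind.PREFIXED, defaultPrefix + "#" + v);
    }

    /** Returns the parsed number, or null if the text is not a plain number. */
    private static BigDecimal parseNumber(String s) {
        try {
            return new BigDecimal(s.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

For example, with a default prefix of ex, the cell Alice would be classified as PREFIXED with the value ex#Alice, while ex#Bob would be kept as is.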
2.8.3.1. Connecting to a live SQL database¶
TBD.
2.8.4. XML and JSON support¶
FloraTerm and QueryResult objects can be written to, and read from,
both XML and JSON formats.
To parse the serialized forms, use FloraXMLParser and
FloraJsonParser. These both have a parseFloraTerm method. The version
in the XML parser takes a DOM Element, whereas the JSON parser expects
an ObjectNode (from the Jackson JSON library). As the name implies,
these methods return a FloraTerm corresponding to the XML/JSON
encoding. For QueryResult objects, two constructors in that class (for
XML and JSON data, respectively) are used to parse the serialized
versions.
Similarly, FloraTerm and QueryResult objects can be serialized to
XML/JSON using the toXml() and toJson() methods.
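Since the XML parser works on a DOM Element rather than on raw text, serialized XML first has to be turned into a DOM tree. The sketch below shows one way to do that with the JDK's built-in XML APIs. XmlElementLoader is a hypothetical helper, and the element and attribute names in the sample string are placeholders, not the actual Sunflower XML encoding; the resulting Element is what you would hand to parseFloraTerm (see the JavaDoc for its exact signature).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper for illustration only; not part of Sunflower Foundation. */
final class XmlElementLoader {

    /** Parses an XML string and returns its root element. */
    static Element rootElement(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder markup; the real encoding is defined by Sunflower Foundation.
        Element root = rootElement("<term name=\"p\"/>");
        System.out.println(root.getTagName());
    }
}
```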
In addition to all this, there is also a JSON interface to the
reasoner engine itself, specified in the abstract FloraJsonClient
class. This is implemented by the FloraZMQClient class.