The HERBS inference engine is based on Jess, the "Java Expert System Shell"*. Jess supports the development of rule-based expert systems which can be tightly coupled to code written in Java.
Jess is in fact two things at once:
1. A scripting environment. This programmer's library, written entirely in Java, serves:
as an interpreter for the Jess language (originally inspired by the CLIPS expert system shell, which has a LISP-like syntax);
to create Java objects and call Java methods without compiling any Java code.
2. A tool for building software called expert systems, which can reason using knowledge you supply in the form of facts and rules.
In the context of HERBS, three sets of facts can be distinguished:
Facts which characterise the microbial organisms, for example taxonomic and metabolic facts on Haemophilus influenzae.
Facts corresponding to the protein families (e.g. MF_01200) predicted or observed in the organism under study:
(observed (MF_number MF_01200))
(observed (MF_number MF_00183))
Facts defining microbial metabolic pathways. As an example, the following set of facts corresponds to the definition of a pyrimidine biosynthesis pathway. This pathway is subdivided into steps, each of them involving proteins (e.g. MF_01211). These proteins can be linked by the logical connectors "and" and "or".
(define pyrimidine_biosynthesis -> and pyr-bios-step1 pyr-bios-step2 pyr-bios-step3 pyr-bios-step4 pyr-bios-step5 pyr-bios-step6)
(define pyr-bios-step1 -> and MF_01209 MF_01210)
(define pyr-bios-step2 -> and MF_00001)
(define pyr-bios-step3 -> or MF_00220)
(define pyr-bios-step3 for bacteria -> or MF_00219)
(define pyr-bios-step4 -> or pyr-bios-step4-complex)
(define pyr-bios-step4 for bacteria -> or MF_00225)
(define pyr-bios-step4-complex -> and MF_00224 MF_01211)
(define pyr-bios-step5 -> and MF_01208)
(define pyr-bios-step6 -> or MF_01200 MF_01215)
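To make the "and"/"or" semantics of these definitions concrete, here is a minimal sketch in plain Python (not Jess) of how such a definition tree could be evaluated against a set of observed protein families. The dictionary layout, the function name is_satisfied and the chosen observed set are assumptions made for illustration; the "for bacteria" variants are left out because the way HERBS chooses between general and taxon-specific definitions is not described here.

```python
# Illustrative only: a plain-Python reading of the "define" facts above.
# Each name maps to (connector, operands); operands that are not keys of
# the dictionary are treated as protein families (MF numbers).
DEFINITIONS = {
    "pyrimidine_biosynthesis": ("and", ["pyr-bios-step1", "pyr-bios-step2",
                                        "pyr-bios-step3", "pyr-bios-step4",
                                        "pyr-bios-step5", "pyr-bios-step6"]),
    "pyr-bios-step1": ("and", ["MF_01209", "MF_01210"]),
    "pyr-bios-step2": ("and", ["MF_00001"]),
    "pyr-bios-step3": ("or", ["MF_00220"]),
    "pyr-bios-step4": ("or", ["pyr-bios-step4-complex"]),
    "pyr-bios-step4-complex": ("and", ["MF_00224", "MF_01211"]),
    "pyr-bios-step5": ("and", ["MF_01208"]),
    "pyr-bios-step6": ("or", ["MF_01200", "MF_01215"]),
}


def is_satisfied(name, observed, definitions=DEFINITIONS):
    """Return True if `name` (a pathway, a step or an MF number) is satisfied."""
    if name not in definitions:  # leaf: a protein family
        return name in observed
    connector, operands = definitions[name]
    results = [is_satisfied(op, observed, definitions) for op in operands]
    return all(results) if connector == "and" else any(results)


# Hypothetical observed set: every family except MF_01208 and MF_01215.
observed = {"MF_01209", "MF_01210", "MF_00001", "MF_00220",
            "MF_00224", "MF_01211", "MF_01200"}

# Step 6 needs only one of its two families.
print(is_satisfied("pyr-bios-step6", observed))
# The whole pathway fails because step 5 (MF_01208) is not observed.
print(is_satisfied("pyrimidine_biosynthesis", observed))
```

With this observed set, step 6 is satisfied through MF_01200 alone, while the complete pathway is not, since step 5 requires MF_01208.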
All these pathway definitions are translated into the Jess language as "required" facts of the following form: (require MF_01208).
All the protein families that are not "observed" but that are potentially involved in a process of interest in the organism under study are declared as "absent", for instance: (absent MF_01208).
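The derivation of "required" and "absent" facts can be sketched in plain Python as follows. This is only one possible reading: it assumes that every protein family reachable from a definition becomes a "require" fact (including both alternatives of an "or"), and that "absent" is simply "required but not observed". The helper name required_families and the small observed set are hypothetical.

```python
# A subset of the pathway definitions is repeated here so that the sketch
# runs on its own. Layout: name -> (connector, operands).
DEFINITIONS = {
    "pyr-bios-step5": ("and", ["MF_01208"]),
    "pyr-bios-step6": ("or", ["MF_01200", "MF_01215"]),
}


def required_families(name, definitions):
    """Collect every MF number reachable from `name` (the leaves of the tree)."""
    if name not in definitions:
        return {name}
    _connector, operands = definitions[name]
    leaves = set()
    for operand in operands:
        leaves |= required_families(operand, definitions)
    return leaves


observed = {"MF_01200"}  # hypothetical observation

required = set()
for step in ("pyr-bios-step5", "pyr-bios-step6"):
    required |= required_families(step, DEFINITIONS)

# Assumption: "absent" means required but not observed.
absent = required - observed

require_facts = sorted(f"(require {mf})" for mf in required)
absent_facts = sorted(f"(absent {mf})" for mf in absent)
print(require_facts)
print(absent_facts)
```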
In addition, the specific rules required by the HERBS engine have been defined in the Jess language. A rule can infer new facts when all the facts in its left-hand side (LHS) are known, i.e. present in the knowledge base.
A Jess rule has two parts separated by the "=>" symbol (which corresponds to "then"). The first part consists of the LHS patterns (used by the engine to match facts in the knowledge base); the second part, the right-hand side (RHS), consists of actions (function calls).
Two groups of inference rules have been defined, in order:
to check the consistency of annotation at the proteome level. In this case the rules point out the missing, unexpected and ambiguous proteins (cf. Herbs : checking the consistency of proteome annotations). The following example is a rule allowing the detection of missing proteins:
(defrule missing-protein ; rule name is illustrative, the original name is not given
  (and (require ?x) (absent ?x))
  =>
  (assert (missing ?x)))
For instance, if the following facts are in the knowledge base: (require MF_01208) and (absent MF_01208), then the rule engine deduces a new fact: (missing MF_01208) in the organism under study.
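The deduction above can be reproduced with a very small forward-chaining loop. The sketch below is plain Python, not Jess: facts are modelled as tuples, the rule as a function, and the names missing_rule and run are hypothetical. Jess itself matches rules with the Rete algorithm rather than with a naive loop like this one.

```python
# Facts are tuples such as ("require", "MF_01208"); this mimics Jess
# ordered facts but is not Jess itself.

def missing_rule(facts):
    """LHS: (require ?x) and (absent ?x). RHS: assert (missing ?x)."""
    required = {fact[1] for fact in facts if fact[0] == "require"}
    absent = {fact[1] for fact in facts if fact[0] == "absent"}
    return {("missing", x) for x in required & absent}


def run(facts, rules):
    """Fire the rules until no new fact is produced (naive forward chaining)."""
    facts = set(facts)
    while True:
        new_facts = set()
        for rule in rules:
            new_facts |= rule(facts) - facts
        if not new_facts:
            return facts
        facts |= new_facts


knowledge_base = run(
    {("require", "MF_01208"), ("absent", "MF_01208"), ("require", "MF_00001")},
    [missing_rule],
)
print(sorted(knowledge_base))
```

Only MF_01208 is both required and absent, so it is the only family for which a "missing" fact is added; MF_00001 is required but not declared absent and therefore triggers nothing.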
to allow the user to progress in the knowledge of metabolic pathways. The user can choose different inference criteria. Hypotheses can then be tested according to these criteria and, if they are validated, new facts can be added to the knowledge base.
The following example is an inference rule designed to decide whether a protein should be taken into account as "observed" or "absent" data. A profile match has been defined for the HAMAP Microbial Family "?mf". If the score "?s" of the match of the protein against the profile of this family is higher than the cutoff selected by the global variable "?*the-cutoff*" (trusted, member or noise cutoff), then a new "observed" fact is inferred for ?mf.
(defrule observed-above-cutoff ; rule name is illustrative, the original name is not given
  (declare (salience 50))
  (data (MF_number ?mf) (trusted_cutoff ?t) (member_cutoff ?m) (noise_cutoff ?n) (score ?s) (status ?st))
  (test (or (and (eq ?*the-cutoff* "trusted_cutoff") (> ?s ?t))
            (and (eq ?*the-cutoff* "member_cutoff") (> ?s ?m))
            (and (eq ?*the-cutoff* "noise_cutoff") (> ?s ?n))))
  =>
  (assert (observed (MF_number ?mf) (data-status ?st) (status Observed) (source cutoff))))
Note: assert is a function call; it adds the given fact to the knowledge base.
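The decision taken by this rule can be sketched in plain Python (not Jess) as follows. The class Match, the function observed_fact, the numeric values and the status value "predicted" are all made up for illustration; only the slot names and the strict "greater than" comparison come from the rule above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Mirrors the slots of the `data` fact used by the rule."""
    mf_number: str
    trusted_cutoff: float
    member_cutoff: float
    noise_cutoff: float
    score: float
    status: str


def observed_fact(match, the_cutoff):
    """Return an `observed` fact as a dict, or None if the rule does not fire."""
    thresholds = {
        "trusted_cutoff": match.trusted_cutoff,
        "member_cutoff": match.member_cutoff,
        "noise_cutoff": match.noise_cutoff,
    }
    threshold = thresholds.get(the_cutoff)
    # The rule fires only if the score is strictly above the selected cutoff.
    if threshold is None or not match.score > threshold:
        return None
    return {"MF_number": match.mf_number, "data-status": match.status,
            "status": "Observed", "source": "cutoff"}


# Made-up numbers, for illustration only.
m = Match("MF_01200", trusted_cutoff=30.0, member_cutoff=20.0,
          noise_cutoff=10.0, score=25.0, status="predicted")

print(observed_fact(m, "trusted_cutoff"))  # score is below the trusted cutoff
print(observed_fact(m, "member_cutoff"))   # score is above the member cutoff
```

The same match is therefore rejected under the stricter trusted cutoff and accepted under the member cutoff, which is how the choice of inference criterion changes the set of "observed" facts.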
* Jess can be licensed for commercial use, and is available for academic use. Jess is a registered trademark of Sandia National Laboratories. Java and all Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.