A set of rules is triggered in Tolven whenever a document is submitted by a Tolven application or received from an external source. Document processing is accomplished in two phases. First, the document is captured and reliably stored in a database. This step says nothing about the validity of the content of the document, only that the document (message) as a whole has been permanently saved. The second phase, generally referred to as document processing, is concerned with what to do with the document. This is where rules come in. The rest of this paper is about what happens in phase two, when the document is de-queued for processing.
Tolven guarantees that rules for a single document are serialized and occur within a single unit of work (transaction) that either completely works or is completely rolled back.
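The two phases and the all-or-nothing guarantee can be pictured with a small Python sketch. This is only an illustration under stated assumptions: `DocumentStore`, `capture` and `process_next` are hypothetical names, not Tolven APIs, and an in-memory copy stands in for a real database transaction.

```python
import copy
from collections import deque


class DocumentStore:
    """Hypothetical stand-in for a capture-then-process pipeline."""

    def __init__(self):
        self.saved = []        # phase 1: documents stored as received
        self.queue = deque()   # documents waiting for phase 2
        self.state = {}        # data that rules are allowed to change

    def capture(self, document):
        # Phase 1: persist and queue; the content is not validated here.
        self.saved.append(document)
        self.queue.append(document)

    def process_next(self, rules):
        # Phase 2: run every rule for one document as a single unit of work.
        document = self.queue.popleft()
        working_copy = copy.deepcopy(self.state)
        try:
            for rule in rules:
                rule(document, working_copy)
        except Exception:
            return False           # roll back: self.state is left untouched
        self.state = working_copy  # commit all changes together
        return True
```

If any rule raises, none of the changes made by earlier rules for that document survive, while the captured document itself remains saved.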
A rule can and often does create new documents. Just like the initial document, new documents created from a rule are first captured reliably and queued for processing, which will occur independently. Spawning can be used to modularize tasks. The most important example of document spawning in Tolven applications has to do with the movement of patient information between provider organizations and the patient (PHR). For example, when a provider completes a referral transaction recorded in his/her own account, that document, or parts of it, may need to be forwarded to the patient or the referral provider.
The initial fact asserted into working memory is the document queued for evaluation. Three other global “facts” are available to rules:
- Index (MenuData)
- Metadata (MenuStructure)
These facts are special in that they cannot be retracted and, technically, they were never formally asserted as other facts are. As far as rules are concerned, they simply exist.
The Account fact (object) leads us to everything we know about the account or that is related to the account that owns the document being asserted. An account is generally associated with either a family, a clinic or a hospital department. An account can also be associated with a whole hospital or whole institution, but in that case a lot of patients would be lumped into a fairly large pool, which may not be desirable. Since the highest level of security in Tolven is applied at the account level, Tolven recommends that accounts be relatively small (say, fewer than 50,000 patients) and that in an enterprise, different departments or units exchange patient data formally. An account will typically have several users. For example, a family might have both parents as users. A clinic might include everyone needing access to patient data. (A user can be a member of more than one account, although that detail is not covered here.)
The vocabulary object represents everything we know about medical science, albeit limited to the scope of the underlying vocabularies. It is, however, important to distinguish between instance data, such as what is going on with a specific patient, and general knowledge, such as what could go wrong with patients in general or in specific categories (cohorts). This object provides access to general knowledge only.
The asserted document tells us everything about the state change (event) we are about to process. One might deduce from this point that all changes to the Tolven repository are initiated by documents. That is true, but more precisely, rules perform all state changes, and the only way to trigger a rule firing is through the creation and processing of a document. All persistent state changes must then be communicated to the database. The scope of the database in the context of rules is limited to one account and then only selected database data. Rules do not have carte blanche to modify the database at large. Technically, the “database” that can be modified by rules is constrained to the “app” schema. Rules are limited to a single account when reading data. Thus a rule executing in one account has no access to data in another account. This constraint is enforced at a fundamental level in Tolven.
The index data created by rules can, in large part, be controlled by declarative metadata. A simple example might be the name of a particular tab or menu item in the application. Thus, MenuStructure data can be considered a helper for rules. They work together to build the index (MenuData).
From these initial facts, everything else can be determined. We will see how that unfolds shortly. But it is important to note that patient, problem, medication, observation or anything specific about the situations we are interested in have not yet been asserted into working memory. Indeed, many objects you might consider intrinsic to a system like this are actually soft-coded, as will be explained shortly. For example, patient, while common, is not an intrinsic entity. Public health may focus on a case where patient plays only a minor role, if any. Research may focus on subject or cohort. So, in Tolven, the central, organizing concept (patient, encounter, cohort, etc.) can vary by Account. Nevertheless, since the initial application examples in Tolven revolve around personal and clinical health records, patient will play a key role in many of these examples.
Tolven Rules Syntax
Tolven rules are composed of a condition and an action (a when and a then).
    rule "a rule name"
    when
        aFact( )
        anotherFact( )
    then
        <do something if conditions are all true>;
    end
The condition evaluates facts asserted into the working memory of the rules engine. The actions that are carried out usually involve asserting or retracting facts. Working memory is a short-lived collection of facts upon which rules operate.
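To make the when/then cycle concrete, here is a minimal Python sketch of a working memory and a matching loop. It is a toy model, not the engine Tolven uses: `Rule` and `run` are hypothetical names, each condition looks at a single fact, and only assertion (not retraction) is supported.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    when: Callable[[object], bool]        # condition tested against one fact
    then: Callable[[object, list], None]  # action; may assert new facts


def run(rules: List[Rule], initial_facts: list) -> list:
    """Fire each rule once for every fact its condition matches."""
    working_memory = list(initial_facts)
    fired = set()
    progress = True
    while progress:
        progress = False
        for rule in rules:
            # Iterate over a snapshot so actions can append new facts safely.
            for index, fact in enumerate(list(working_memory)):
                key = (rule.name, index)
                if key not in fired and rule.when(fact):
                    fired.add(key)
                    rule.then(fact, working_memory)
                    progress = True
    return working_memory
```

The loop keeps running until no rule has anything new to match, so facts asserted by one rule become visible to the others. Working memory is discarded once `run` returns, which mirrors its short-lived nature.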
Processing a Document
Arguably, most documents contain more than a single fact. So, while an individual document event (a fact) is what triggers the rules to be run, most of what actually goes on in the rules has to do with the contents of the document. To keep things simple, we’ll assume that the document we are processing is about a single patient. However, the document may contain a list of patient problems. Each problem should be dealt with individually.
To accomplish this in rules, the first set of rules will be concerned with “exploding” the document. This process is very dependent on the structure of the specific document type. In all of these examples, we will use ASTM’s Continuity of Care Record (CCR), although the situation is nearly identical when processing HL7 or other formats.
Any type of document can be processed in this way but we will ignore other document types for the time being.
During the explosion process we will take care to avoid any policy or provider-specific rules. All we want to do at this stage is determine what “facts” can be extracted from the document. So, while the seeded fact asserted by the system was “here is a document,” the explosion rule says “there is a document of type CCR with a version number of V1.0.” Its action will be to identify patient, problems, medications, results, etc. from that document, asserting each one as a fact.
Since this is mostly mechanical and in the case of CCR is standardized, we will use the rules provided by Tolven and just assume that the individual facts about the document have been asserted. From there we can deal with facts specific to the domain without concern for the original document from which the facts are extracted.
|Class||Class being wrapped||Description|
|PatientType||ActorType||Information about the patient|
|AlertType|| ||Alerts such as allergies|
|EncounterType|| ||Information about encounters|
|FamilyHistoryType|| ||Family history observations|
|FunctionType|| ||Functional status of the patient|
|ResultType|| ||Results, including observations|
|SupportProviderType||ActorType||Support providers such as family|
|VitalSignType||ResultType||Results specifically identified as vital signs|
|ToType||ActorType||Who (or what) the CCR is to|
|FromType||ActorType||Who (or what) the CCR is from|
|PurposeType|| ||Purpose of the document|
Most of these facts can occur zero or more times in a document. The condition part of rules pays no attention to how many times a fact occurs. The rule engine will run each rule for each applicable fact. In other words, rules intrinsically handle multiple occurrences of a single type of fact.
If a fact is asserted but there is no rule needing that fact, then the fact is effectively ignored. We will now write some rules that pay attention to some of these facts.
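The explosion step, and the way unused facts are ignored, can be sketched in Python as follows. The dictionary layout of the document is an assumption made for illustration (a real CCR is XML), and `explode` and `fire` are hypothetical helper names; only the fact type names are taken from the table above.

```python
def explode(document):
    """Yield (fact_type, payload) pairs for each recognizable section."""
    if document.get("type") != "CCR":
        return
    if "patient" in document:
        yield ("PatientType", document["patient"])
    for alert in document.get("alerts", []):
        yield ("AlertType", alert)
    for result in document.get("results", []):
        yield ("ResultType", result)


def fire(facts, handlers):
    """Call the handler registered for each fact type; skip facts with none."""
    log = []
    for fact_type, payload in facts:
        handler = handlers.get(fact_type)
        if handler is not None:
            log.append(handler(payload))
    return log
```

A handler registered for `ResultType` runs once per result, however many results the document carries, and facts with no registered handler simply pass through without effect.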
Normalization to Archetypes
The level of document decomposition accomplished so far is still insufficient for most decision support. For example, there’s little to be understood knowing that the patient has one or more Social History entries. It is necessary to know more details about specific Social History items. The first step in this process is to add the description of the social history item (preferably in coded form) to the mix. For example,
    rule "Smoker Archetype Event"
    when
        $result : ResultType( $title : descriptionText, type == "Microbiology", flag == "Abnormal" )
        $test : TestType( )
        $list : List(name=="NewResults")
    then
        account.addListItem( $list, $result );
    end
In the example so far, we’ve asserted a fact that the patient is a smoker based on the document we’re processing. But it says nothing about what we already know about the patient – perhaps we already knew that the patient was a smoker.
There is a long list of Archetypes and Tolven has established a separate wiki to support the development of these archetypes. So we don’t need to go into a lot of detail here.
Glasgow Coma Scale
Reasoning over the patient
Archetypes become very powerful when we begin to reason over the patient as a whole rather than just individual events. This kind of reasoning doesn’t usually care about documents. Rules are interested in, for example, whether the patient has a certain diagnosis. It doesn’t matter where the fact came from or what message format was used.
Remember that the core Tolven framework doesn’t know that the current document is about a patient or that patient is even an important concept. So when we use the term patient here, we could just as easily be referring to a public health case, clinical trial subject, or a research cohort. But we’ll use patient as a common example in this paper.
Asserting Relevant External Facts
One way to reason over the patient would be to assert all facts already known about a patient into working memory in addition to the new facts we extracted from the current document. One might even imagine asserting facts about all patients into working memory and reasoning over everything! We could even assert all 6 million terms in UMLS! There are many technical and functional problems with this approach, so we will just dismiss it as unworkable.
Instead, we’ll focus on what will work: working memory will center around a single document “event” which we have exploded to a number of smaller facts and further expanded to appropriate archetypes. The next step will be to pull in as many other facts as appropriate to correctly process that one event. This will involve several other kinds of facts:
- Vocabulary Concepts
- Application Metadata
- Historical facts
For each of these categories of facts, we will only acquire those additional facts needed to assimilate the new facts asserted from the document.
Since application metadata is essentially a rule helper, we can and should simply load all of this metadata into working memory. We might consider the rules that deal with metadata “meta rules,” although the rules engine doesn’t know the difference.
    rule "Load application metadata"
    when
        $account : Account( )
    then
        $account.assertAll();
    end
The function here is simple: since an account is always asserted as a fact, we always call a function called assertAll(), which asserts each known metadata item for the account. The system could have done this for us initially, but we keep it here so rules have control over the process.
Once we add some meta-rules, a bunch of the boring rules we are writing as illustrations can be removed. For example, if metadata defines a list of selected patients on the account, then the meta rule simply checks if the current patient qualifies for inclusion on that list based on criteria supplied in the metadata.
In effect, these meta-rules harvest information from the current document as needed to add, change or remove data from indexes (MenuData). For illustration, we will write a few of these rules “the hard way” in order to explain what is functionally happening. Each of these rules is a “scout” looking for facts from the document (event) to add to the index that that rule represents.
    rule "Add patient to all patient list"
    when
        $pat : Patient( )
    then
        assert( new MenuData( ) );
        menuData.assertAll();
    end
Persisting Archetype Instances
Now that we’ve extracted archetypes from the document and asserted these archetype instances into working memory, one of the things we might do with archetype instances is save them in a database. This could be very handy for various purposes, such as comparison in future transactions for this patient and graphs.
Identifying the Patient
The document usually identifies one or more actors such as the Patient and maybe the Provider, support providers, and non-people actors like hospitals, labs, pharmacies, etc. We’ll focus on the Patient.
To accomplish this we will simply keep “exploding” the contents of the document continuing at the archetypes we just created.
Processing Vocabulary Concepts
Rather than directly putting an item on a particular list, we will instead assert a new fact called a list candidate. This fact says that some rule has decided to put some piece of information onto a list of some kind.
Avoid Incorrect Conditions
Say we want a rule that looks for blood pressure observations where the systolic is above 150 and the diastolic is above 100. A reading of 150 / 90 does not match. A reading of 140 / 110 does not match either. A reading of 160 / 110 does match because both values exceed their respective thresholds.
If you try coding this rule as follows, you’ll get disastrous results:
    when
        $sys: BPSystolic( testResult > 150)
        $dia: BPDiastolic( testResult > 100)
    then
        ...
What this rule does is find all systolic readings over 150 and all diastolic readings over 100 and then pair each one with the other, yielding what relational database developers are very familiar with: a Cartesian product. If you didn’t realize this error, you would think that the rules engine was broken. It’s not; it just can’t read your mind.
To correct this rule, we have to tie the individual readings to the overall BP reading.
    when
        $bp: BloodPressure( )
        $sys: BPSystolic( parent == $bp, testResult > 150)
        $dia: BPDiastolic( parent == $bp, testResult > 100)
    then
        ...
This rule says what we intend:
- There is a blood pressure.
- There is a systolic that has this BP as a parent and has a value over 150
- There is a diastolic that has this BP as a parent and has a value over 100
In other words, we need to deal with whole blood pressure observations, not just the individual components.
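The difference between the two rules is easy to reproduce outside a rules engine. The following Python sketch uses made-up readings (160 / 90 and 140 / 110, neither of which should match) and hypothetical function names; `parent` plays the role of the shared BloodPressure fact.

```python
from itertools import product

systolics = [
    {"parent": "bp1", "value": 160},
    {"parent": "bp2", "value": 140},
]
diastolics = [
    {"parent": "bp1", "value": 90},
    {"parent": "bp2", "value": 110},
]


def unjoined_matches(systolics, diastolics):
    # Mirrors the incorrect rule: each condition is tested on its own.
    return [(s, d) for s, d in product(systolics, diastolics)
            if s["value"] > 150 and d["value"] > 100]


def joined_matches(systolics, diastolics):
    # Mirrors the corrected rule: both components share one parent reading.
    return [(s, d) for s, d in product(systolics, diastolics)
            if s["parent"] == d["parent"]
            and s["value"] > 150 and d["value"] > 100]
```

With these readings, `unjoined_matches` pairs the systolic from one reading with the diastolic from the other and reports a match that no actual blood pressure supports, while `joined_matches` returns nothing.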
When the rules have finished processing a document (event), there can be some new facts asserted that will have an effect on the database. For example, we may have decided to add the patient to certain lists because of the contents of the document. One or more facts might even cause new documents (events) to be created. Toward the end of the process, we’ll show how to gather up these facts so Tolven knows to “put them where they belong”. But for now, we just need to create them. At this point, our rules are doing more than just analyzing the current event. The following are examples of things we might do:
This rule puts a new result on a results review list belonging to the user who ordered the test. The following rule says “there is a new result, there is a patient, there is a “to” actor (person or organization), and there is a new results list; if those conditions are true, put the result on the list.”
    when
        $result : ResultType( $title : descriptionText )
        $patient : PatientType( )
        $to : ToType( )
        $list : List(name=="NewResults")
    then
        account.addListItem( $list, $patient, $result, $to );
Security note: Tolven’s highest level of security is provided at the document level using encryption. Extracting information from a document for use on various lists, as we have just done, makes a system useful but at the same time slightly increases the risk of a breach because lists, while limited to authorized users, are not encrypted. This is not intended to discourage you from using patient identifiable facts from a document, only that you should only do so when there is a justifiable need.
We will improve this rule shortly but for now, it illustrates how input data is used.
In some cases, the data associated with the patient may be useful without the patient’s identity. The rule engine makes that quite easy to accomplish. Starting with the previous rule, trim it down so that it does not include any patient information (we will remove provider information also). We might do this, for example, to keep a list of positive microbiology results for hospital infectious disease monitoring. In other words, this particular list does not require anything more than the date, time and name of a particular positive result. And while we may in fact want to extract some general demographic information (zip code, gender, age range), we will just keep this example simple and not include any information about the patient at all.
    rule "Infectious disease monitor"
    when
        $result : ResultType( $title : descriptionText, type == "Microbiology", flag == "Abnormal" )
        $test : TestType( )
        $list : List(name=="NewResults")
    then
        account.addListItem( $list, $result );
    end
This rule says “there is a result with a type of ‘Microbiology’ and a flag of ‘Abnormal’, there is a test, and there is a new results list; if those conditions are true, put the result on the list.”
As shown here, facts do not need to use patient identity at all, making facts easily anonymized.
Adding the Patient to a List
While the resulting list might be useful, it might also be nice to keep a similar list by patient. Thus, we should be able to view a list of results for a specific patient, regardless of who ordered the test. So let us add another rule to take care of that requirement:
    when
        $pat : Patient( type=="patient")
        $doc : Document( type=="lab result", subject==$pat )
        $list : List(name=="Patient Results")
    then
        account.addListItem( $list, $doc );
This rule reads: “There is a patient, there is a document containing lab results for that patient, there is a patient results list; therefore, create an entry on the patient results list for the specific patient and result.” Some of these rules will be very specific at first, but later we can accomplish much more by broadening the scope of the fact(s) we are looking for. For example, we are limiting our selection of subjects to those with a type of patient. If we removed the constraint on subject type, the rule would still work but would apply to all subjects, regardless of their type.
Maintain a Disease Registry
A rule may look for a specific diagnosis in a CCR document and populate a disease registry. This example requires the use of “truth maintenance” rules. If one patient has many documents relating to a specific disease, there is no need to put that patient on the registry list more than once. Instead, each occurrence of the diagnosis simply “justifies” a single contribution to the registry. Here is an example (with some missing pieces and some liberties taken that we will resolve later): This rule says “there is a patient, there is a diabetes registry, there is a diabetes diagnosis and there is a document containing that diagnosis.”
    when
        $pat : Subject( type=="patient")
        $list : List(name=="Diabetes Registry")
        $dx : Concept( name=="diabetes")
        $doc : Document( contains $dx )
    then
        account.addListItem( $list, $pat );
The rules so far treat the assertion of new facts as irreversible: Once the fact is asserted, it appears there is no clear way to know when it should be removed except perhaps by some explicit event that says “remove patient from diabetes registry.” If, though, we were to retract the document, there is no clear connection between the document and the fact that the patient is on the diabetes registry; without that connection, we would not know that the patient should be removed from the registry. In the next example, we will see how this situation is handled.
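The idea behind truth maintenance, that each document merely justifies a single registry entry and that the entry disappears when its last justification is withdrawn, can be sketched in Python. The `Registry` class and its method names are hypothetical and are not part of Tolven or of any rules engine.

```python
from collections import defaultdict


class Registry:
    """A list whose entries exist only while something justifies them."""

    def __init__(self):
        # patient -> set of document ids that justify membership
        self._justifications = defaultdict(set)

    def justify(self, patient, document_id):
        # Many documents may justify the same patient; the entry stays single.
        self._justifications[patient].add(document_id)

    def retract(self, document_id):
        # Withdraw one document; drop patients left with no justification.
        for patient in list(self._justifications):
            self._justifications[patient].discard(document_id)
            if not self._justifications[patient]:
                del self._justifications[patient]

    def members(self):
        return sorted(self._justifications)
```

Because membership is derived from the set of justifications, retracting a document needs no explicit “remove patient from registry” event.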
You will notice that while “diagnosis” is used in the condition, it is not used at all in the consequence: Once we know that the patient is to be placed on the diabetes registry, we do not need to remember that the patient has diabetes. In fact, a user may want to keep patients with a confirmed diagnosis, as well as patients with symptoms of diabetes but no “confirmed” diagnosis on the same diabetes registry. This leads to the next level of problem that rules can help us with. Consider the following criteria for being put on the diabetes registry (this is intentionally oversimplified):
- Most recent serum glucose over 200 mg/dL
- Urinary tract infection
It is important to realize that these two facts are not likely to arrive in the same document or even at the same time. So there is no single document (event) we could instrument in order to harvest this information. (This problem is difficult or impossible to address in federated systems or pure message-based systems.)
First is the problem of keeping track of the difference between state-change facts (those in the document being processed) and historical facts (those known prior to the arrival of the document). While we do want to “reason over the patient”, not just the document, dumping all facts into one large pool means that detecting “state changes”, which is generally what we are interested in processing, would become difficult.
Second, the sheer number of facts needed in working memory can grow out of hand very quickly, resulting in slow execution and/or failure due to running out of memory. The following shows how it might look doing it the wrong way.
The first condition (line 2) is reasonable, since we have established that the rule execution starts with a document fact that is most likely about a single patient or a small number of patients. The condition on line 3 is also still reasonable given that there are probably no more than hundreds of lists in an Account that would need to be asserted as facts.
    1 when
    2     $pat : Subject( type=="patient")
    3     $list : List(name=="Diabetes Registry")
    4     Concept( name=="Urinary tract infection", $utiCode : code)
    5     Concept( name=="serum glucose", $gluCode : code)
    6     ResultType( code==$utiCode)
    7     ResultType( code==$gluCode, testResult.value > 200)
    8 then
    9     account.addListItem( $list, $pat )
Lines 4 and 5 take us over the threshold: these conditions are looking for specific concept facts. Looking for a concept this way would require that we first assert every concept into working memory. UMLS has over a million. And if we need alternate text and codes, we will need all 6 million atoms asserted as facts.
The conditions on lines 6 and 7 also suggest that every result for the patient be asserted into working memory as facts. While not millions, it could be thousands, which is too many to efficiently load from the database each time a new event is processed.
Rule engines work most efficiently when a relatively small number of facts are subject to a large number of rules. Databases work most efficiently when a large number of facts are subject to a relatively small number of “rules”. Without speculating on a one-size-fits-all solution, the fix to the problem is to remove the need to assert all concept facts into working memory. Instead, we only assert the concept facts that we need.
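One way to picture “only assert the concept facts that we need” is to collect the concept names that the rules mention and look up just those. The Python sketch below is an assumption about how this could be arranged, not a description of Tolven’s mechanism; the rule dictionaries, the `vocabulary` mapping and both function names are hypothetical.

```python
def needed_concept_names(rules):
    """Collect the concept names that any rule condition mentions."""
    names = set()
    for rule in rules:
        names.update(rule.get("concepts", []))
    return names


def assert_needed_concepts(vocabulary, rules, working_memory):
    """Assert only the referenced concepts instead of the whole vocabulary."""
    for name in sorted(needed_concept_names(rules)):
        code = vocabulary.get(name)  # hypothetical name -> code lookup
        if code is not None:
            working_memory.append(("Concept", name, code))
    return working_memory
```

However large the vocabulary grows, working memory then holds one concept fact per name actually referenced by a rule.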
Facts asserted into working memory are arranged for efficient rule execution. But an eval is simply evaluated each time the rule is considered for firing. For this reason, evals should always be at or near the end of the list of conditions in a rule so that the other conditions reduce the frequency of the eval condition being evaluated. This rule reads: “there is a patient, there is an elevated glucose list, there is no list item for the elevated glucose list and this patient, the database has a ListItem for the elevated glucose list and the patient, therefore remove that list item from the database.”
or even historical facts about the patient
Associate Result with Disease Registry
In any case, we are not going to actually use this rule so we can ignore the other problems it has. But the problem to focus on is that the rule requires all documents to be asserted into working memory. The assertion of hundreds or thousands of documents for a patient, accumulated over many years, would become very slow. Instead, we can take advantage of intermediate (summary) data that may or may not ever be viewed but exists nonetheless. This allows us some additional control and modularity within the decision making process. The rule above will be revised to look at “summary” data about the patient, not documents. Then we will show how the summary data itself is created and maintained.
    when
        $pat : Subject( type=="patient")
        $list : List(name=="Diabetes Registry")
        Summary( subject==$pat, key=="Elevated glucose")
        Summary( subject==$pat, key=="Presence of UTI")
    then
        account.addListItem( $list, $pat );
This rule says “There is a patient, there is a diabetes registry, there is a summary item for the patient reflecting elevated glucose, and there is a summary item for the (same) patient reflecting the presence of urinary tract infection. If all of that is true, then put the patient on the disease registry.” Remember that we have another rule that might justify the same patient on the same registry list due to the presence of a diabetes diagnosis. This rule will not interfere with that other rule.
Security note: The outer boundary of a rule is the account, so there is no need to explicitly state the fact that the patient must be “owned” by a particular account. Security, in this regard, is intrinsic.
You might notice that this rule has no reference to a document, yet it is documents that trigger the rule engine to go to work. Also, it appears that we will reevaluate this situation after every document is processed. Every document? Including documents for other patients? What about documents that have nothing to do with patients? Well, it may appear that way. But this is a good time to explain where the patient fact comes from. Remember, patient is not asserted initially. Here is another little rule that gets fired early on in the evaluation process:
    when
        $doc : Document($subj : subject)
    then
        assert (new Subject( $subj) )
This rule asserts a subject (patient) fact if the document has one. If we were processing a document not about a patient, any rules referencing a patient fact in the condition would not fire — very convenient. If the document we were processing referenced more than one patient, all those rules referencing a patient would fire more than once, depending on the other conditions. For example, if a document was describing a blood transfusion between two known patients, then it might say patient A’s blood type is O negative and patient B’s blood type is O positive. But multi-patient scenarios can be very complex. One document about two patients will likely carry information about both patients and the rules should not mix the two. Asserting two patients and two blood types without writing the rules correctly results in matching both patients to both blood types! We will come back to that issue later.
In practice, we will need to extract more facts from the document. Indeed, this part of rule processing is the most tedious, especially when there are many different document formats. On the other hand, a complex document may only have a small bit of information we actually need to reason over. As a general architectural principle, we will try to normalize what we do extract from a document so that the same domain rules continue to work as we harvest facts from documents arriving in different standard or proprietary formats.
As promised, we will now look at how summary data is maintained. We will just look at one summary item because the approach is the same for the others, just with different conditions. We evaluate the new document for information suggesting that the patient should be on the “elevated glucose” list (we do not need to display this list). There may be other factors that put the patient on or remove the patient from a particular list, but that does not matter: If seven rules say that the patient should be on a certain list, that is as valid as one rule saying so.
    when
        $pat : Subject( type=="patient")
        $list : List(name=="Elevated Glucose")
        $glu : Concept( name=="serum glucose")
        Document( subject==$pat, contains $glu > 200)
    then
        account.assertSummary( $list, $pat );
You will notice that this rule does not care if the patient is already on the list. We certainly do not want the patient to show up on the same list twice so we will make a small enhancement to the rule to fix this problem and make the rules run a bit faster: We include a condition saying the patient is NOT on the list (this enhancement will be short lived, but it is appropriate here).
    when
        $pat : Subject( type=="patient")
        $list : List(name=="Elevated Glucose")
        not Summary( $list, $pat )
        $glu : Concept( name=="serum glucose")
        Document( subject==$pat, contains $glu > 200)
    then
        account.assertSummary( $list, $pat );
Now, the action only occurs if the conditions are right to put the patient on the list and if the patient is currently not on the list. But we are not done. How does the patient ever get off the list? We will get to that shortly. But first…
It is time for another little rule that makes this all work. You will notice that the not Summary() condition assumes we already know about summaries. But where did the summaries come from? The following rule says if we have information about a patient, we should assert all the summary items relating to that patient. Now, this could involve dozens of summary items, but it will be many fewer items than the number of raw documents for the patient. Also, the summary items are each fairly small. While there are ways to further reduce the size of this list, the operation is quite efficient the way it is.
    when
        $pat : Subject( type=="patient")
    then
        account.assertSummaries( $pat );
As before, this rule only fires if there is a patient. It fires more than once if there is more than one patient.
Next we are going to enhance this summary item to include the date of the last observation. Remember we said the “most recent” serum glucose. By storing this date, we can determine if this new fact is newer than what is already there.
We never depend on the order of processing for this type of decision. Old results can be back-loaded from a system just being brought online or arriving out of order simply due to business processing delays. Each observation should have an effective date, what you can think of as a biological date, when the specimen was taken from the patient, which is different from when the measurement was done or when that measurement arrived into the information system.
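Keeping the “most recent” value by effective date rather than by arrival order can be sketched in a few lines of Python. The dictionary-based summary and the function name `record_glucose` are illustrative assumptions, and the dates and values are made up.

```python
from datetime import date


def record_glucose(summaries, patient, effective_date, value):
    """Keep only the observation with the latest effective date per patient."""
    current = summaries.get(patient)
    if current is None:
        # No summary exists yet for this patient, so create one.
        summaries[patient] = {"date": effective_date, "value": value}
    elif current["date"] < effective_date:
        # The new observation is biologically newer, so update in place.
        current["date"] = effective_date
        current["value"] = value
    return summaries


summaries = {}
record_glucose(summaries, "ann", date(2007, 3, 2), 240)
record_glucose(summaries, "ann", date(2007, 1, 15), 180)  # back-loaded older result
# summaries["ann"]["value"] is still 240
```

The older result arrives second but does not overwrite the newer one, because the comparison uses the effective date carried by the observation, not the time it reached the system.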
Adding a date (or a result value if we were keeping track of average or maximum values) to the summary will mean that this summary item needs to be modified. When we get a newer result, we will update the summary item with the newer date. Modifying facts like this takes a bit more care. First, when a fact is modified, it effectively must be reevaluated since there may be other rules dependent on the details of that fact. So we can call it “modify” but in fact it is the same as a ‘retract from’ and a fresh assertion of the fact into working memory. We do not need to rebuild the fact itself; we just change it in place, which is good because deleting and recreating persistent data could otherwise cause undesired side-effects in the database. While redesigning this rule, we are going to add additional modularization by breaking it up into three separate rules. Two rules will only be concerned with maintaining the most recent serum glucose observation for the patient. These rules are completely objective and can be used for many subsequent purposes. The first rule creates a summary item if it does not exist. The second rule updates the date and value in the existing summary if the result we are evaluating is newer than the date in the summary.
when
    $pat : Subject( type=="patient" )
    $list : List( name=="Most Recent Glucose" )
    not Summary( list==$list, subject==$pat )
    $glu : Concept( name=="serum glucose" )
    Document( subject==$pat, contains $glu, $date : effectiveDate, $val : value )
then
    account.assertSummary( $list, $pat, $date, $val );

when
    $pat : Subject( type=="patient" )
    $list : List( name=="Most Recent Glucose" )
    $glu : Concept( name=="serum glucose" )
    Document( subject==$pat, contains $glu, $date : effectiveDate, $val : value )
    $item : Summary( list==$list, subject==$pat, date < $date )
then
    $item.setDate( $date );
    $item.setValue( $val );
    account.modifySummary( $item );
The first rule condition says: “There is a patient, there is a most recent glucose list, there is no applicable summary item for this patient, there is a serum glucose concept, there is a document containing a glucose observation for the patient.” The action then asserts a new summary item. The second rule says: “There is a patient, there is a most recent glucose list, there is a serum glucose concept, there is a document containing a glucose observation for the patient, there is a summary item for the patient and its date is older than the date of the glucose observation in the document.” The action will update the date and value in that summary item. In so doing, this new or updated fact will trigger other rules such as the following:
when
    $pat : Subject( type=="patient" )
    $eg : List( name=="Elevated Glucose" )
    $mrglu : List( name=="Most Recent Glucose" )
    // value is in mg/dL
    Summary( list==$mrglu, subject==$pat, value > 200 )
then
    account.assertListItem( $eg, $pat );
This rule reads: “There is a patient, there is an elevated glucose list, there is a most recent glucose list, and there is a most recent glucose summary item where the patient’s value is over 200.” The consequence is to assert that the patient should be on the elevated glucose list. If the most recent glucose reading falls to 200 or below, the patient should be removed from the elevated glucose list. From this rule’s perspective, there is no concept of removing the patient from the list. Therefore, subsequent “displays” of the list will not include that patient until a new glucose value exceeds 200. The action simply does not fire if the patient should not be on the list. That is all well and good, but the database may already have a ListItem from a previous run of the rules. We saw earlier a rule asking the account to assert all summaries relating to a patient into working memory so that we could make decisions about what we already know about that item. There is a similar capability for list items, but we do not want to use that feature for this particular scenario.
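To make the behavior of these three rules concrete, here is a minimal sketch in plain Python rather than in the rule language. The names Observation, apply_observation and elevated are illustrative assumptions and are not part of Tolven. The sketch shows that the summary ends up holding the observation with the latest effective date regardless of arrival order, and that the elevated list is derived from that summary alone.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Observation:
    patient: str
    effective_date: date  # when the specimen was taken, not when it arrived
    value: float          # serum glucose in mg/dL


def apply_observation(summary, obs):
    """Rules 1 and 2: create the summary item, or update it if obs is newer."""
    current = summary.get(obs.patient)
    if current is None or current.effective_date < obs.effective_date:
        summary[obs.patient] = obs


def elevated(summary, threshold=200.0):
    """Rule 3: patients whose most recent value exceeds the threshold."""
    return {p for p, o in summary.items() if o.value > threshold}


summary = {}
# Arrival order deliberately differs from effective-date order.
for obs in [
    Observation("pat-1", date(2024, 3, 10), 180.0),
    Observation("pat-1", date(2024, 1, 5), 240.0),  # back-loaded old result
    Observation("pat-2", date(2024, 2, 1), 215.0),
]:
    apply_observation(summary, obs)
```

The back-loaded result of 240 for pat-1 is ignored because a newer observation is already in the summary, so only pat-2 is reported by elevated(summary).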
We have established that facts in the database cannot all be efficiently asserted into working memory. Yet the bookkeeping detail of whether a database fact is or is not in working memory should have no effect on the outcome of rules if we can avoid it. For example, we could find that we need a database item asserted in working memory because we want to make a decision about it, while at the same time we may need to keep track of the lack of an assertion of that same fact. Tolven provides a mechanism that keeps this bookkeeping detail from negatively affecting the outcome of our rule processing. Returning to our example, the rules, as written so far, tell us that the patient should be on the list but nothing about whether the patient was already on the list.
Let us say we wanted to congratulate patients for getting their glucose down to 200 or below. We cannot simply ask if the patient is not on the list and ship an email. The patient may have never been on the list, so our congratulations would flow continuously to an unwitting (and undeserving) patient. What we are looking for is a transition in the elevated glucose list in which the patient goes from being on the list to being off the list. But we need to avoid “transient transitions,” i.e., transitions that occur due to fact modifications (retract then assert again) during rule processing. Salience will help us with this problem.
A salience value is used in rules to control the order of processing. A lower integer value is a way for a rule to say “other(s) can go before me.” The default value is zero, meaning that if we wanted other eligible rules to go first and settle down before we fire this rule, we would specify a negative value. The numbers are relative so the actual values are unimportant as long as, when two rules would otherwise be selected for execution at the same time, the rule with the greatest salience value goes first.
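As a rough illustration of that ordering, the following Python sketch sorts an agenda of eligible rules by salience, highest first. It is only a model of the idea, not of how the rule engine is implemented, and the rule named "urgent check" is a made-up example.

```python
def fire_in_order(rules):
    """Return rule names in firing order: highest salience first.

    sorted() is stable, so rules with equal salience keep their original
    relative order here; a real rule engine makes no such promise.
    """
    return [name for name, salience in sorted(rules, key=lambda r: -r[1])]


agenda = [
    ("remove EG list items", -100),
    ("EG list Congrats", 0),       # default salience
    ("urgent check", 10),          # hypothetical rule
]
order = fire_in_order(agenda)
```

Here the rule with salience -100 lands at the end of order, which is the effect we want for rules that must wait until everything else has settled.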
We are now ready to write our transition rule that deals with what the database has and what working memory knows about the patient’s membership on the Elevated Glucose list. It is very important that this rule only fire after other rules have fired so we have assigned it a salience of -100 (assuming all other rules have a default salience of 0).
rule "remove EG list items"
salience -100
when
    $pat : Subject( type=="patient" )
    $eg : List( name=="Elevated Glucose" )
    not ListItem( list==$eg, subject==$pat )
    eval( database.hasListItem( $eg, $pat ) )
then
    database.removeListItem( $eg, $pat );
end
The database.removeListItem action does not immediately remove the item from the database. That will happen after the rules are completed. But the action does affect the outcome of the eval(database.hasListItem( $eg, $pat )) test, which will return “false” from now on. This activity is not the same as fact assertion and retraction because neither of these activities is a fact per se. For example, you would not be able to find a removeListItem fact and so you would be unable to retract that fact. This point is important when we consider the problem of sending a congratulations memo. Again, we are looking for the case where the patient was on the list and is going to be removed from the list.
rule "EG list Congrats"
when
    $pat : Subject( type=="patient" )
    $eg : List( name=="Elevated Glucose" )
    not ListItem( list==$eg, subject==$pat )
    eval( database.hasListItem( $eg, $pat ) )
then
    account.sendEmail( $pat, "congrats", <template> );
end
If this rule were to be evaluated after the “remove EG list items” rule fires, it would not fire because the eval condition would find the database to not have the patient on the list any longer. But we took care of that by using the salience value in the previous rule to ensure that the removal happens after this rule has had a chance to fire. It is a good idea to avoid sequencing in rules because it makes them brittle. But in this case, we have a very real-world problem, synchronizing the pristine and isolated working memory world with the larger long-term-memory database world to justify it.
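The interplay between the two rules can be sketched in Python as follows. ListStore is a hypothetical stand-in for the database object used in the rules: removals are recorded as pending, has_list_item answers as if they had already happened, and commit applies them after the rules complete. The run function is also an assumption; it simply fires the congratulations rule (salience 0) before the removal rule (salience -100).

```python
class ListStore:
    """Hypothetical stand-in for the `database` object in the rules."""

    def __init__(self, persisted):
        self._persisted = set(persisted)  # (list name, patient) pairs on disk
        self._pending_removals = set()    # applied only after rules complete

    def has_list_item(self, list_name, patient):
        key = (list_name, patient)
        return key in self._persisted and key not in self._pending_removals

    def remove_list_item(self, list_name, patient):
        self._pending_removals.add((list_name, patient))

    def commit(self):
        self._persisted -= self._pending_removals
        self._pending_removals.clear()


def run(db, working_memory, patients, list_name="Elevated Glucose"):
    """Fire both rules in salience order and return who was congratulated."""
    emails = []

    def congrats(p):
        if (list_name, p) not in working_memory and db.has_list_item(list_name, p):
            emails.append(p)

    def remove(p):
        if (list_name, p) not in working_memory and db.has_list_item(list_name, p):
            db.remove_list_item(list_name, p)

    rules = [(-100, remove), (0, congrats)]
    for _, rule in sorted(rules, key=lambda r: -r[0]):  # highest salience first
        for p in patients:
            rule(p)
    db.commit()
    return emails


db = ListStore({("Elevated Glucose", "pat-1")})
first = run(db, set(), ["pat-1", "pat-2"])   # pat-1 just dropped off the list
second = run(db, set(), ["pat-1", "pat-2"])  # nothing left to transition
```

The first run congratulates only pat-1, who was on the list in the database but is not in working memory. The second run congratulates no one, because the transition has already been recorded.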
Since we went to the trouble of figuring out when to remove items from the database, we should also handle the new items not yet in the database. We will use a similar technique:
rule "Add EG list items"
salience -100
when
    $pat : Subject( type=="patient" )
    $eg : List( name=="Elevated Glucose" )
    $item : ListItem( list==$eg, subject==$pat )
    eval( !database.hasListItem( $eg, $pat ) )
then
    database.addListItem( $item );
end
This rule says: “There is a patient, there is an elevated glucose list, there is a list item (in working memory) for this patient and list, there is no corresponding ListItem entry in the database, therefore add the list item to the database.”
Now we will consider a simplification of these database rules. All lists should probably be maintained in the database, not just Elevated Glucose. Rather than creating a separate rule for each list we maintain, we can do it in just two general-purpose rules:
rule "Add list items"
salience -100
when
    $subj : Subject()
    $list : List()
    $item : ListItem( list==$list, subject==$subj )
    eval( !database.hasListItem( $item ) )
then
    database.addListItem( $item );
end

rule "Remove list items"
salience -100
when
    $subj : Subject()
    $list : List()
    not ListItem( list==$list, subject==$subj )
    eval( database.hasListItem( $list, $subj ) )
then
    database.removeListItem( $list, $subj );
end
As before, the first rule says “There is a subject, there is a list, there is a list item for this list and subject (in working memory), there is no list item for this list and subject in the database, therefore add the list item to the database.” The second rule says “There is a subject, there is a list, there is no list item for this list and subject (in working memory), there is a list item for this list and subject in the database, therefore remove the list item from the database.” These rules now have nothing to do with domain rules so we can relegate them to plumbing and keep them in a separate rule package.
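Taken together, the two plumbing rules amount to a pair of set differences, restricted to the subjects and lists that are actually present in working memory. The Python sketch below illustrates that reading; sync_plan and its parameters are names invented for this example, and list items are modeled as simple (list, subject) pairs.

```python
def sync_plan(subjects, lists, wm_items, db_items):
    """Return (to_add, to_remove) for the lists and subjects in working memory.

    Items are (list name, subject) pairs. Database items for subjects or
    lists that are not in working memory are out of scope and left alone.
    """
    scope = {(l, s) for l in lists for s in subjects}
    to_add = (wm_items & scope) - db_items
    to_remove = (db_items & scope) - wm_items
    return to_add, to_remove


subjects = {"pat-1", "pat-2"}
lists = {"Elevated Glucose"}
wm_items = {("Elevated Glucose", "pat-2")}
db_items = {("Elevated Glucose", "pat-1"), ("Elevated Glucose", "pat-9")}

to_add, to_remove = sync_plan(subjects, lists, wm_items, db_items)
```

Note that the entry for pat-9 is untouched: that patient is not a subject in working memory, so neither rule has anything to say about it.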
With this background, it is now time to dig a little deeper into the event processing associated with documents. The lifecycle of a Tolven document is simple but important. During the editing phase, a document is mutable and typically has little, if any, effect on other data. It would normally be inappropriate to draw any inference based on data in an unfinished document. A document remains in the editing state during, for example, a multi-page wizard process. While the user may simply be moving from field to field or page to page, the system may be persisting that document in anticipation of the user either getting called away or ultimately pressing a “finish” button. Because it is mutable, the document should have little, if any, effect on other parts of the system. Once the application transitions the state of the document to “active” it becomes immutable, and this is the state of the document to which most rules apply. When a message is received from an external system, it usually skips the “editing” state, and the document is created in the “active” state from the outset.
Up until this point in the discussion, we have been careless in all of our rules that reference a document fact. The first rule we created is a good example. What we now really want it to say is “there is a new active document and there is a new results list, therefore, put the former on the latter.”
when
    $doc : Document( type=="lab result", state=="active" )
    $list : List( name=="New Results" )
then
    account.addListItem( $list, $doc );
If the document is not active, we ignore it. Tolven will assert the document again if its state changes, so we will catch it at that time. But before we make a mass change to add this to all our rules, what about the rest of the document lifecycle? After all, a document could have been in error, yet it caused things to happen in the system.
While documents may be revised, amended and go through a long, complex approval process, that is a variety of lifecycle “above” the lower-level lifecycle described here. Indeed, different Tolven accounts will probably have different document handling processes, perhaps themselves implemented in rules and workflow. Consider a simple revision scenario: a real-world document is created and acted upon by rules. For example, we put the patient on a diabetes disease registry because of a diagnosis found in that document. If it is later determined that the document was actually for a different patient, we will want to remove this patient from the disease registry (and perhaps place the other patient on the registry). But there are problems with that approach:
- The document is immutable so we cannot fix the document itself. Even if we could, that will not get this patient removed from the registry (or the other one added to the registry).
- Our second problem relates to truth maintenance: If one fact in one document is (somehow) retracted, it only reduces the justification for the patient being on the registry. There may be other documents also saying the patient is diabetic and they may be correct.
- The third problem is that retraction of a fact without consequences is rare in the real world: Someone may have taken action on the fact that the patient is on that registry. It may not be sufficient to simply remove the patient from the registry hoping that “no one noticed.”
To resolve this issue, our rules will have to be a little fancier. We did not say it earlier, but once a document is in the active state, it stays that way, even if it contains incorrect information. It will take another document to override, reverse, cancel or otherwise change the effects of the first document. Also, because information systems are only an approximation of reality, we could have a document that reverses a real-world action (such as canceling an order) where the original order being canceled does not exist in the database. So we will have to watch out for that situation. Of course, the other kinds of data that rule actions affect are mutable, so if we had a way to do it, rules are certainly allowed to reverse their own assertions. Within the confines of working memory, all that is needed is a retraction. But we are concerned with the database and what people have seen in the real world. We will start with our simplest example, which we have already modified just to pay attention to documents in the active state. We will also create a new rule that reacts to a document containing a nullification (telling us that the original document was incorrect and should be ignored):
when
    $doc : Document( type=="lab result", state=="active", action=="nullify", $original : original )
    $list : List( name=="New Results" )
then
    account.removeListItem( $list, $original );
In this rule, we are reacting to the correction. The action, as the name implies, unconditionally removes the original document from the list. The removal action will not fail if the document was never on the list in the first place. Notice that we are checking a new attribute, the event action. State and action have different purposes and we need both. An active state says the document is ready to be acted upon; the nullify action asserts that the original document is to be nullified. So, with the introduction of action, we will have to go back and enhance our first rule as well.
when
    $doc : Document( type=="lab result", state=="active", action=="new" )
    $list : List( name=="New Results" )
then
    account.addListItem( $list, $doc );
We did not have to include that test for action in the rule. If we wanted to include all actions on the new results list, that is perfectly OK, too. In fact, for the patient-specific result list, we could do just that; we will not remove anything from the list, just show an ever growing, chronological list of lab result transactions. It may show a document in the list and then later in the list show another document saying to ignore the first document. Which approach you take is a matter of personal end user taste but the decision may be affected by the underlying document architecture. For example, CCR documents do not provide for status changes and corrections, so they would just have to be shown in a cumulative fashion. We have shown how to do it either way now.
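The first of the two approaches, where a nullification removes the original from the list, can be sketched in Python as follows. The Document class and the process function are illustrative assumptions that mirror the state, action and original attributes used in the rules. Documents still being edited are ignored, and removing a document that was never listed is harmless.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: an active document is immutable
class Document:
    doc_id: str
    state: str                      # "editing" or "active"
    action: str                     # "new" or "nullify"
    original: Optional[str] = None  # id of the document being nullified


def process(doc, new_results):
    """Apply the add and nullify rules to the New Results list."""
    if doc.state != "active":
        return                              # unfinished documents are ignored
    if doc.action == "new":
        new_results.add(doc.doc_id)
    elif doc.action == "nullify":
        new_results.discard(doc.original)   # no error if it was never listed


new_results = set()
for d in [
    Document("d1", "editing", "new"),
    Document("d2", "active", "new"),
    Document("d3", "active", "nullify", original="d2"),
    Document("d4", "active", "nullify", original="missing"),
    Document("d5", "active", "new"),
]:
    process(d, new_results)
```

After this sequence only d5 remains on the list: d1 was never active, d2 was nullified by d3, and d4 refers to an original that the system never saw.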
We need to review a simple kind of list called the manual list. This is a list in which the user explicitly states that he/she wants an item added to or removed from the list. In general, we have no date to work with and must rely on the processing order. We still depend on documents to effect these changes since the application never directly manipulates any lists.
when
    Document( type=="manualList", state=="active", action=="add", $list : list )
    $subject : Subject()
then
    account.addListItem( $list, $subject );

when
    Document( type=="manualList", state=="active", action=="remove", $list : list )
    $subject : Subject()
then
    account.removeListItem( $list, $subject );
These rules are pretty generic. They let the document tell us which list we are talking about and the subject. If we needed a rule to intervene, perhaps in a specific case, then we can add separate rules for those situations. For example, say we want to send an email to the owner of a new patient list whenever someone manually adds a subject (patient) to that list. In this case, we will focus on the document requesting the patient be added to the list, and we will let other rules actually add the patient to the list.
when
    Document( type=="manualList", state=="active", action=="add", $list : list )
    $subject : Subject()
    List( id==$list, name=="New Patients", $owner : owner )
then
    account.sendEmail( $owner, $subject, <template> );
This rule says “there is an active document asking that a subject be added to a list, there is a list corresponding to the list mentioned in the document having a name of new patients; if this is all true, then send an email to the owner of the list.” While not obvious in the rule, the list owner is actually a collection of users. But the rule does not execute one time for each owner of the list. So we only send one email but it may go to more than one user/owner.
While it is important to understand how normalization works, this is basic rule “plumbing” that you should not have to worry about if you are writing domain level rules. These rules are declared in a separate rule package that is not concerned with domain rules. Packaging rules in this way provides a clear distinction between the problems of figuring out what it is (normalization), what it means (knowledge) and what you are going to do about it (domain rules).
We have hinted that documents can be complex and varied. It would be wrong to say that a document is a single fact even though we have treated it as such so far. To begin with, even though we suggested otherwise, during document processing there is never more than one document asserted into working memory at any given time. We can create new documents, but new documents will be the focus of attention in their own invocation of the rule environment once they are persisted and queued for rule evaluation. We may even reference other documents by ID, but we will not be looking into those documents. That sets the stage for our next challenge: Extracting the knowledge from a document into working memory. We can safely assume that all the facts we extract from a document are related to that one document. We actually demonstrated this assumption in the rule that extracted the subject from the document:
when
    $doc : Document( $subj : subject )
then
    assert( new Subject( $subj ) )
This rule (safely) broke the link between the document and the fact that there is a subject; doing so makes the rules easier to write. Now let us dig further into the document. We will use a CCR document that contains an alert section. We want to pull out the alerts for inclusion on a patient-wide alerts list. First, we need to constrain our rule to only CCR documents. Then we need to dig down a bit to find the alerts. The XML looks something like this in the document, but there are corresponding Java objects which we will use to navigate and reason over.
<ContinuityOfCareRecord xmlns="urn:astm-org:CCR">
  <Patient>
    <ActorID>102</ActorID>
  </Patient>
  <Body>
    <Alerts>
      <Alert>
        <Type>
          <Text>Allergy</Text>
        </Type>
        <Description>
          <Text>Penicillin</Text>
          <Code>
            <Value>6369005</Value>
            <CodingSystem>SNOMED-CT</CodingSystem>
            <Version></Version>
          </Code>
        </Description>
      </Alert>
    </Alerts>
    ...
  </Body>
  <Actors>
    <Actor>
      <ActorObjectID>102</ActorObjectID>
      <Person>
        <Name>
          <DisplayName>Bob</DisplayName>
        </Name>
      </Person>
    </Actor>
  </Actors>
</ContinuityOfCareRecord>
Rules that lead to the extraction of alerts and other components of CCR look something like this:
when
    Document( type=="ccr", $ccr : continuityOfCareRecord )
then
    assert( $ccr )

when
    ContinuityOfCareRecord( $body : body )
then
    assert( $body )

when
    Body( $alerts : alerts )
then
    assert( $alerts )

when
    Alerts( $alert : alert )
then
    assertList( $alert )
...
when
    $alert : AlertType( )
then
    ... do something with this alert.
The second to last rule gets an attribute called alert which is a list of AlertType objects. A special assertion method is called which will assert each of the AlertType items in the list into working memory. The last rule shows the beginnings of a domain rule that takes some action based on the presence of an alert.
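For readers who want to see the same drill-down outside the rule engine, here is a small Python sketch that walks a trimmed copy of the sample CCR with the standard xml.etree.ElementTree module. This is only an illustration of the navigation path (record, body, alerts, alert); Tolven itself works with Java objects as described above, and the extract_alerts function and the dictionary keys are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

CCR_NS = {"ccr": "urn:astm-org:CCR"}

SAMPLE = """<ContinuityOfCareRecord xmlns="urn:astm-org:CCR">
  <Patient><ActorID>102</ActorID></Patient>
  <Body>
    <Alerts>
      <Alert>
        <Type><Text>Allergy</Text></Type>
        <Description>
          <Text>Penicillin</Text>
          <Code>
            <Value>6369005</Value>
            <CodingSystem>SNOMED-CT</CodingSystem>
          </Code>
        </Description>
      </Alert>
    </Alerts>
  </Body>
</ContinuityOfCareRecord>"""


def extract_alerts(xml_text):
    """Return one dict per Alert element found under Body/Alerts."""
    root = ET.fromstring(xml_text)
    alerts = []
    for alert in root.findall("ccr:Body/ccr:Alerts/ccr:Alert", CCR_NS):
        desc = "ccr:Description/"
        alerts.append({
            "type": alert.findtext("ccr:Type/ccr:Text", namespaces=CCR_NS),
            "text": alert.findtext(desc + "ccr:Text", namespaces=CCR_NS),
            "code": alert.findtext(desc + "ccr:Code/ccr:Value", namespaces=CCR_NS),
            "system": alert.findtext(desc + "ccr:Code/ccr:CodingSystem",
                                     namespaces=CCR_NS),
        })
    return alerts


alerts = extract_alerts(SAMPLE)
```

Each dictionary in alerts plays the role of one AlertType fact asserted by assertList.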
We are going to add one more rule because we are going to stipulate that our domain rules are all based on HL7 RIM Observations and not CCR AlertTypes. So the AlertType assertion will not actually trigger anything useful since no rules are looking for CCR-based facts. You might be tempted to just modify the rules to also look for CCR types, but you should resist that temptation unless there is no other choice. We do have a choice in this case, which is to call a helper method that knows how to convert CCR to RIM, specifically AlertTypes to Observations. Then our domain rules can focus on domain problems and conversion rules can focus on conversion problems.
global CCRtoRIM rimUtil;

rule "Convert from a CCR AlertType to RIM Observation"
when
    $alert : AlertType( )
then
    assert( rimUtil.toAct( $alert ) );
end
If we had a rule that detected certain RIM observations and put them on an alert list, it would now find the converted CCR alert. For example:
rule "Alerts go on patient alert list"
when
    Observation( classCode=="OBS", code=="OINT", $value : value )
    $pat : Subject( type=="patient" )
    $list : List( name=="Alerts" )
then
    account.addListItem( $list, $pat, $value );
end
This rule says: “There is an observation of the correct type to be an Alert and it has a value, there is a patient, there is a list named Alerts. If true, add a list item if it does not already exist on the Alerts list.”
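The separation between conversion and domain logic can be illustrated with a short Python sketch. The AlertType and Observation classes below are simplified stand-ins, and to_act is a hypothetical converter: the actual CCR-to-RIM mapping performed by rimUtil is not shown in this paper, so the field choices here are assumptions. The point is only that the domain rule never inspects CCR-shaped facts.

```python
from dataclasses import dataclass


@dataclass
class AlertType:        # CCR-shaped fact (simplified)
    type_text: str
    description: str


@dataclass
class Observation:      # RIM-shaped fact (simplified)
    class_code: str
    code: str
    value: str


def to_act(alert):
    """Hypothetical conversion; the real mapping is not given in the text."""
    return Observation(class_code="OBS", code="OINT", value=alert.description)


def alerts_rule(facts, patient, alert_list):
    """Domain rule: only RIM observations are considered."""
    for f in facts:
        if isinstance(f, Observation) and f.class_code == "OBS" and f.code == "OINT":
            alert_list.add((patient, f.value))  # a set ignores duplicates


facts = [AlertType("Allergy", "Penicillin")]
facts += [to_act(f) for f in facts if isinstance(f, AlertType)]  # conversion
alert_list = set()
alerts_rule(facts, "pat-102", alert_list)
```

The AlertType fact stays in the fact list but triggers nothing; only its converted Observation reaches the alert list.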
We said in the introduction that we would be taking shortcuts at first. In many of the rules we refer to concepts by a simple name, such as “diabetes.” But that approach has several apparent problems:
- Diabetes is not the full formal name of Diabetes Mellitus, thus it could be confused with other diseases
- There are synonyms for the same exact concept
- There can be the same name for different concepts (resolved by context)
- We never specified “according to which terminology source” when stating the concept
- There are concepts such as Type I and Type II which are specific kinds of diabetes and they may or may not be intended for selection.
The point is that we will need to be much more specific when it comes to specifying concepts and at the same time we will probably have to be much more inclusive when looking for specific concepts in documents. Fortunately, the concept class we have been using has many of the capabilities we need. We will start with another concept mentioned in the introduction:
Concept( name=="Urinary tract infection" )
This condition looks for a concept in working memory by the name specified. But Tolven does not load all 6.4 million unique UMLS concept names into working memory. That would hardly fit the intent of “working memory.” Also, it would do no good to look up a UMLS concept by name during rule processing. In fact, this name is a local name useful only within the rule base in which it is created. We need to understand how that concept fact was created and asserted into working memory and what the concept fact actually has in it. Regarding the name, we could have called this concept harmonica but it would still refer to the concept urinary tract infection underneath. The name is only meaningful to our rules.
A global variable called knowledge is available to the rule base. It is an instance of the Tolven class knowledge. This object will be the font of all concept facts introduced into working memory. The knowledge object will even help a concept perform certain tasks after it has been asserted into working memory. So, early on in the rule evaluation process, we will need to ask the knowledge object to create the concept facts we will need to run our rules.
then knowledge.assertSourceConcept( "Urinary tract infection", "68566005", "SNOMED CT" )
That was relatively painless. So long as everyone that talks to us provides the SNOMED code for UTI, we should be good. But what if we get a fact with an ICD-9 code for UTI? Would we reject it? Should we create another concept based on the ICD-9 code? Actually, there are two mechanisms at work. First, the underlying vocabulary may already know about the mapping between SNOMED CT and ICD-9, in which case there is nothing extra we need to do. But if not, and we wanted to consider both codes to mean the same thing, we simply create two concepts with the same name and the two different source codes. Of course, this practice is risky. You should depend on the underlying terminology to provide mapping unless you have no other choice.
So our concept assertions might look like this.
then
    knowledge.assertSourceConcept( "Urinary tract infection", "68566005", "SNOMED CT" )
    knowledge.assertSourceConcept( "Urinary tract infection", "599.0", "ICD-9-CM" )
Now it no longer matters if there is an underlying mapping between the two concepts or not since we have made the mapping explicit by naming both of them the same. One reason we may need to do this is if a code has to be made up for a new concept or because the mapping does not exist in UMLS. Another variation yielding the same result is to use the UMLS concept code. And since both concepts used above map to the same UMLS concept unique identifier, we only need to state it once.
then knowledge.assertSourceConcept( "Urinary tract infection", "C0042029", "UMLS" )
Before you start running for the hills, we are not going to define all concepts this way, only the ones we need for rule evaluation. For example, if we had no need to find UTIs in any rule, we would never need to create the UTI concept in working memory, and that would not prevent us from processing UTIs.
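The effect of giving two source codes the same local name can be sketched in Python. The Knowledge class below is a hypothetical, much-reduced stand-in for the knowledge global: it keeps only the explicitly asserted codes and does none of the UMLS lookups or hierarchy matching that the real object performs.

```python
from collections import defaultdict


class Knowledge:
    """Hypothetical stand-in for the `knowledge` global (explicit codes only)."""

    def __init__(self):
        self._codes = defaultdict(set)  # local name -> {(code, source)}

    def assert_source_concept(self, name, code, source):
        self._codes[name].add((code, source))

    def matches(self, name, code, source):
        return (code, source) in self._codes.get(name, set())


knowledge = Knowledge()
knowledge.assert_source_concept("Urinary tract infection", "68566005", "SNOMED CT")
knowledge.assert_source_concept("Urinary tract infection", "599.0", "ICD-9-CM")
```

With both assertions in place, a diagnosis coded either way matches the local name, while a concept we never asserted matches nothing.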
Remember that urinary tract infection is our local name. We have not revealed what either SNOMED CT or ICD-9 calls these concepts. UMLS has many synonyms for these concepts, so it would not help to know right now anyway. What we do need to understand is how Tolven matches facts extracted from documents to concept facts. It would of course be possible to match directly on strings, avoiding the need for concept facts. For example,
when Diagnosis( code=="68566005", CodingSystem=="SNOMED CT" )
would work, but it unnecessarily couples the rule to the content. A single rule could have dozens of other variations as well, making the rule nearly impossible to manage. Because of variations in terminology usage over time, you would have to be adjusting your domain rules to get them to match codes that are better managed by terminology experts. So we need to rewrite the match to be something like this:
when
    $concept : Concept( name=="Urinary tract infection" )
    $dx : Diagnosis( $code : code )
    eval( $concept.hasInstance( $code ) )
This rule reads “there is a concept we will refer to as urinary tract infection, there is a diagnosis (extracted from the document), the urinary tract infection concept object has found a match to the diagnosis fact.” The hasInstance method on concept is more than a simple string comparison. It will look through all UMLS “atoms” it has for the concept to find a match on the code specified in the diagnosis object. This method will also see if the code specified matches a more specific code than the top-level concept. For example, if the diagnosis fact is SNOMED CT code 4009004, Lower Urinary Tract Infection, then it too would match since Lower Urinary Tract Infection is a kind of the more general urinary tract infection concept according to SNOMED. The concept object has other methods that provide both more strict and more liberal behavior but hasInstance is the most common method.
Technical note: Tolven rules are mostly expressed as first-order logic, but sometimes other methods can be more efficient. That is where the eval predicate comes in. In this case, due to the size of the terminology space, Tolven uses relational algebra with the result returned to the rule engine as a simple boolean. Further, Tolven makes an additional optimization: rather than walking hierarchies of concepts at all, Tolven trades off (disk) space for performance by flattening the hierarchies. For example, chlamydial prostatitis is a kind of lower urinary tract infection which is a kind of urinary tract infection. One could assert all of those facts into working memory and conclude that chlamydial prostatitis is a urinary tract infection. If we only had three concepts in our vocabulary, we would probably do just that. But with 1.3 million concepts and many more relationships between them, what would be the point? When reasoning over instance data, as we are doing here, the vocabulary is relatively static. So Tolven enumerates all relationships in the database so that there is exactly one hop between any two concepts regardless of the distance of the relationship in the classification. In this case, Tolven lets the database engine handle caching and indexing for efficiency.
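The flattening idea can be shown with a small Python sketch that precomputes every (descendant, ancestor) pair from the direct is-a links, so that a later membership test is a single lookup. This is an in-memory illustration only; Tolven stores the enumerated relationships in the database, and the flatten and has_instance functions here are assumptions made for the example.

```python
def flatten(is_a):
    """Return every (descendant, ancestor) pair implied by direct is-a links."""
    closure = set()
    for child in is_a:
        stack = list(is_a[child])
        while stack:
            parent = stack.pop()
            if (child, parent) not in closure:   # also guards against cycles
                closure.add((child, parent))
                stack.extend(is_a.get(parent, ()))
    return closure


IS_A = {
    "chlamydial prostatitis": ["lower urinary tract infection"],
    "lower urinary tract infection": ["urinary tract infection"],
}
FLAT = flatten(IS_A)  # computed once, while the vocabulary is static


def has_instance(concept, candidate):
    """One-hop test: is candidate the concept itself or any descendant of it?"""
    return candidate == concept or (candidate, concept) in FLAT
```

Two direct links become three flattened pairs, and the question "is chlamydial prostatitis a urinary tract infection?" is answered without walking the hierarchy at query time.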
Once we have matched on a concept, we may want to do something with the concept we found. But as we did earlier with the diabetes registry, we could simply forget about the concept after it has been used to find a matching value from the document. After all, if someone is on the diabetes registry, that is all we care about at that level. If we want to know the patient’s specific diagnosis, or the conditions that put the patient on that registry, users can find that elsewhere if they are interested.
Now consider a different scenario where we need to post all diagnoses we find for the patient on a single combined list and we do not want duplicates on the list. In this case, we do not really want to categorize the diagnoses, so if both urinary tract infection and lower urinary tract infection are found, we want each to appear separately on the list. We do, though, want to normalize the diagnosis to SNOMED CT so that if a SNOMED UTI and an ICD-9 UTI diagnosis for the patient are found, only one would appear on the diagnosis summary list.
For this solution, we work directly with the knowledge object. Concept objects are not needed.
when
    $pat : Subject( type=="patient" )
    $eg : List( name=="Diagnosis Summary" )
    $dx : Diagnosis( $code : code )
then
    account.addListItem( $eg, $pat, knowledge.coerce( $code, "SNOMED CT" ) );
This rule says “there is a Diagnosis Summary list, there is a patient, and there is a diagnosis fact (extracted from the document and asserted during normalization). If that is true, coerce the code to SNOMED CT and store the resulting code in the list if the code does not already exist in that list.” In this rule, the condition logic carries over to the action portion of the rule, in the coerce function. While the function could be expressed in rules, it would not be of much benefit but would tend to clutter memory with a lot of trivial (mapping) facts. Additionally, the addListItem function performs the duplicate removal quite easily using a set collection. The downside of this approach is that no fact is asserted to represent the conclusion. If such a fact were needed, a proxy fact could also be asserted into working memory representing the conclusion. There is no limit as to what can be done in the then portion of a rule.
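A minimal Python sketch of coercion followed by set-based duplicate removal is shown below. The TO_SNOMED table, the coerce signature and add_diagnoses are all assumptions made for illustration; in Tolven the mapping comes from the underlying terminology rather than from a hand-written table. The codes are the ones used earlier in this paper.

```python
# Hypothetical mapping table keyed by (code, coding system).
TO_SNOMED = {
    ("599.0", "ICD-9-CM"): "68566005",       # urinary tract infection
    ("68566005", "SNOMED CT"): "68566005",
    ("4009004", "SNOMED CT"): "4009004",     # lower urinary tract infection
}


def coerce(code, source):
    """Return the SNOMED CT code, or None when no mapping is known."""
    return TO_SNOMED.get((code, source))


def add_diagnoses(summary, patient, diagnoses):
    """Add each coerced diagnosis; the set silently drops duplicates."""
    for code, source in diagnoses:
        snomed = coerce(code, source)
        if snomed is not None:
            summary.add((patient, snomed))


summary = set()
add_diagnoses(summary, "pat-1", [
    ("68566005", "SNOMED CT"),
    ("599.0", "ICD-9-CM"),      # same concept, different coding system
    ("4009004", "SNOMED CT"),   # more specific concept, kept separately
])
```

Three diagnosis facts produce two list entries: the SNOMED and ICD-9 codes for urinary tract infection collapse into one, while lower urinary tract infection stays on the list as its own item.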
Rule Authoring Environment
The initial version of Tolven requires that rules be authored, compiled and deployed at the source level by a Java developer. In the future, this will be enhanced by an end-user friendly rule authoring tool. In any case, rules are internally partitioned by Tolven account. This means each account can have its own rules. In general, rules are prevented from seeing any data from or taking any action on any other account.
Tolven allows rules to be shared between accounts. There is the obvious physician-to-physician sharing (“here are my disease registry rules”), but physician-to-patient sharing can also be very useful. A physician may advise the patient on a particular regimen and ask the patient to record results and notify the physician if a certain limit is reached. The physician can supply a (wizard) template to the patient to enter the self-observed results and a rule that says to put the results on a list for easy viewing. The rule can also create a notification to the physician when conditions warrant (the rule can only queue the notification; the user will have to approve the release of information to the physician).