Model Verification

An important activity in model construction is verification of the model structure through analysis of the conditional independence and dependence (CID) statements encoded by the structure.

Letting arrows point from effect to cause instead of from cause to effect is one of the most frequent mistakes. It can lead to severe problems with specifying the CPTs (e.g., if arrows point from a large number of effect variables to a single cause variable) and to wrong inference results.

As an example of the practical use of CID analyses to perform structural model verification, consider the problem of constructing a model for detecting fraudulent use of credit cards. Assume we have the following three Boolean variables:

A: Two or more PCs bought within a few days using the same credit card.

B: Credit card used almost at the same time at different locations.

C: Fraud.

The model A \(\rightarrow\) C \(\leftarrow\) B may at first glance appear to be natural (A and B are used as “inputs” to determine the probability of C).

From the rules of d-separation, however, we can see that this model tells us that observing A (or B) does not provide us with any information about B (or A) when C is unknown. This is wrong, as observing A (or B) increases our belief in C, which in turn increases our belief that we might also observe B (or A). Therefore, this model simply gives wrong probabilities!
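To see this concretely, here is a minimal brute-force sketch of the collider model A \(\rightarrow\) C \(\leftarrow\) B in plain Python. The CPT numbers are purely hypothetical, chosen only for illustration; marginalizing out C shows that the probability of B is the same whether or not A is observed.

```python
from itertools import product

# Hypothetical numbers purely for illustration.
p_a = {True: 0.01, False: 0.99}          # P(A)
p_b = {True: 0.01, False: 0.99}          # P(B)
# P(C = true | A, B): fraud is likely if either suspicious pattern is present.
p_c_given_ab = {(True, True): 0.95, (True, False): 0.7,
                (False, True): 0.7, (False, False): 0.001}

def joint_collider(a, b, c):
    """Joint probability under the model A -> C <- B."""
    pc = p_c_given_ab[(a, b)]
    return p_a[a] * p_b[b] * (pc if c else 1 - pc)

def prob_b_given_a(a_obs):
    """P(B = true | A = a_obs) with C unobserved."""
    num = sum(joint_collider(a_obs, True, c) for c in (True, False))
    den = sum(joint_collider(a_obs, b, c)
              for b, c in product((True, False), repeat=2))
    return num / den

print(prob_b_given_a(True))   # 0.01
print(prob_b_given_a(False))  # 0.01 -> observing A tells us nothing about B
```

Whatever numbers are put into the CPT for C, this structure forces P(B | A) = P(B) as long as C is unobserved, which contradicts our understanding of the domain.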

The model A \(\leftarrow\) C \(\rightarrow\) B, on the other hand, makes A and B dependent when we have no (definite) knowledge about C (i.e., observing A (or B) will increase our belief that we will also observe B (or A)). When we have definite knowledge about C (i.e., the value of C is known), then this model tells us that A and B are independent, which seems to be quite reasonable (if we know for sure that we are considering a fraud case, then observing A (or B) will not change our belief about whether or not we are going to observe B (or A)).
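For comparison, here is the same kind of brute-force sketch for the model A \(\leftarrow\) C \(\rightarrow\) B, again with purely hypothetical CPT numbers: observing A raises the probability of B considerably, while conditioning on C makes A and B independent.

```python
# Hypothetical numbers purely for illustration.
p_c = {True: 0.001, False: 0.999}                 # P(C): prior probability of fraud
p_a_given_c = {True: 0.3, False: 0.0001}          # P(A = true | C)
p_b_given_c = {True: 0.3, False: 0.0001}          # P(B = true | C)

def joint_common_cause(a, b, c):
    """Joint probability under the model A <- C -> B."""
    pa = p_a_given_c[c] if a else 1 - p_a_given_c[c]
    pb = p_b_given_c[c] if b else 1 - p_b_given_c[c]
    return p_c[c] * pa * pb

def prob_b(a_obs=None, c_obs=None):
    """P(B = true | evidence) by brute-force summation over the joint."""
    def matches(a, c):
        return (a_obs is None or a == a_obs) and (c_obs is None or c == c_obs)
    num = sum(joint_common_cause(a, True, c)
              for a in (True, False) for c in (True, False) if matches(a, c))
    den = sum(joint_common_cause(a, b, c)
              for a in (True, False) for b in (True, False) for c in (True, False)
              if matches(a, c))
    return num / den

print(prob_b())                          # ~0.0004   : P(B) with no evidence
print(prob_b(a_obs=True))                # ~0.225    : P(B | A) -- A and B are dependent
print(prob_b(c_obs=True))                # 0.3       : P(B | C)
print(prob_b(a_obs=True, c_obs=True))    # 0.3       : P(B | A, C) = P(B | C)
```

The last two lines illustrate the d-separation statement directly: once C is instantiated, the additional observation of A leaves the probability of B unchanged.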