PC Algorithm

Two constraint-based algorithms are available for structure learning: The PC algorithm and the NPC algorithm.

The Hugin PC algorithm, which is a variant of the original PC algorithm due to Spirtes, Glymour & Scheines (2000), belongs to the class of constraint-based learning algorithms. The basic idea of these algorithms is to derive a set of conditional independence and dependence statements (CIDs) by statistical tests.

The algorithm performs the following steps:

One important thing to note about the PC algorithm is that, in general, it will not be able to derive the direction of all the links from data, and thus some links will be directed randomly. This means that the learned structure should be inspected, and if any links seem counterintuitive (e.g., sweat causes fever, instead of the other way around), one might consider using the Learning Wizard (which provides a means of specifying structural domain knowledge) or the NPC algorithm, which allows the user to interactively decide on the directionality of undirected links.

Traditional constraint-based learning algorithms produce provably correct structures under the assumptions of infinite data sets, perfect tests, and DAG faithfulness (i.e., that the data can be assumed to be simulated from a probability distribution that factorizes according to a DAG). In the case of limited data sets, however, these algorithms often derive too many conditional independence statements. Also, they may in some cases leave out important dependence relations.

Generally, it is recommended to use the NPC algorithm, as the resulting graph will be a better map of the (conditional) independence relations represented in the data. In particular, when the data set is small, the NPC algorithm should be the one preferred. The NPC algorithm, however, has longer running times than the PC algorithm.