Data Dependences

The Data Dependences page of the Learning Wizard allows you to investigate the strengths of the marginal dependences between pairs of variables, using a slider.

Please note that the actions performed in the Data Dependences page have no effect on the resulting Bayesian-network model. The purpose of this page is only to provide insight into the strengths of the pairwise dependences.

Interpretation of the Learned Graph

The Data Dependences page of the Learning Wizard initially shows the independence graph learned from data. This graph is directed and contains no directed cycles (i.e., it is a directed acyclic graph, or DAG). The DAG represents the conditional and marginal dependences and independences found in the data. The links, however, cannot necessarily be interpreted as causal links. The directions of the links only ensure that the dependences and independences found can be read from the DAG, using, for example, Pearl’s d-separation criterion.
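To illustrate how independences can be read from a DAG, the following is a minimal sketch of a d-separation test. It is not the Learning Wizard's implementation; it uses the standard "moralized ancestral graph" formulation of the criterion, and the function name `d_separated` and the `parents` dictionary layout are assumptions made for this example.

```python
from itertools import combinations

def d_separated(parents, x, y, given):
    """Return True if x and y are d-separated by the set `given`.

    `parents` maps each node to the set of its parents in the DAG
    (hypothetical layout chosen for this sketch).
    """
    # Step 1: keep only x, y, the conditioning nodes, and their ancestors.
    relevant, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents.get(node, ()))

    # Step 2: moralize, i.e. link the parents of each common child
    # and drop the directions of all links.
    neighbours = {node: set() for node in relevant}
    for child in relevant:
        child_parents = parents.get(child, set())
        for p in child_parents:
            neighbours[child].add(p)
            neighbours[p].add(child)
        for p, q in combinations(child_parents, 2):
            neighbours[p].add(q)
            neighbours[q].add(p)

    # Step 3: search from x without passing through conditioning nodes.
    blocked, seen, stack = set(given), {x}, [x]
    while stack:
        node = stack.pop()
        for nb in neighbours[node]:
            if nb not in seen and nb not in blocked:
                seen.add(nb)
                stack.append(nb)
    return y not in seen

# Chain A -> B -> C: A and C have no direct link but are marginally dependent.
chain = {"B": {"A"}, "C": {"B"}}
# Collider A -> C <- B: A and B are marginally independent.
collider = {"C": {"A", "B"}}
```

In the chain, `d_separated(chain, "A", "C", set())` is false while `d_separated(chain, "A", "C", {"B"})` is true. In the collider, `A` and `B` are d-separated marginally but not once `C` is given.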

Notice that the DAG is learned only indirectly, based on measures of conditional and marginal dependences and independences found in the data.

To further investigate the dependences and independences found in the data, an undirected graph can be shown, where each link represents a marginal dependence with strength larger than one minus the current slider value. To see how to switch between the directed and the undirected graphs, see the description of the toolbar.

The Toolbar

The Data Dependences page contains a toolbar that provides the following functions:

Show Directed Graph

This function displays a directed independence graph (i.e., a DAG). The DAG is displayed by pressing the Show Directed Graph button.

Please note that the Show Directed Graph and the Show Undirected Graph modes are mutually exclusive (i.e., the two associated buttons act as a pair of radio buttons).

Show Undirected Graph

This function displays an undirected independence graph. This graph is not necessarily identical to the directed graph with the directed links replaced by undirected ones. Each link in the undirected graph represents a marginal dependence with strength greater than one minus the current slider value. The undirected graph is displayed by pressing the Show Undirected Graph button.

As mentioned above, please note that the Show Directed Graph and the Show Undirected Graph modes are mutually exclusive (i.e., the two associated buttons act as a pair of radio buttons).

p-Value

Whether or not there is a link between a pair of variables, say A and B, in the independence graph learned from the data depends on the degree to which A and B are (conditionally) (in)dependent: if they are marginally dependent, there will be a link; otherwise there won’t be a link. This degree is quantified through so-called p-values associated with the hypothesis that the two variables are (conditionally) independent. For each (small) set, C, of conditioning variables, a p-value for {A,B} is computed. Informally, this value measures how compatible the data are with the hypothesis that A and B are conditionally independent given C: the smaller the p-value, the stronger the evidence that A and B are dependent. The marginal p-value is the p-value corresponding to C={}.

The marginal dependence between A and B is defined as one minus the marginal p-value associated with {A,B}. Thus, a marginal dependence of 0 means that A and B are completely independent, and 1 means that they are completely dependent.
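As an illustration of the definition above, the sketch below computes a marginal p-value and the corresponding marginal dependence for two binary variables. It assumes a Pearson chi-square test of independence on a 2x2 table of counts with all row and column totals positive. This page does not say which test the Learning Wizard uses, so the test choice and the function names are assumptions made for this example only.

```python
import math

def marginal_p_value(table):
    """Chi-square p-value for a 2x2 table [[n00, n01], [n10, n11]] of counts.

    Assumes every row total and column total is positive.
    """
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in cell (i, j) if A and B were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With one degree of freedom, the chi-square upper tail probability
    # equals erfc(sqrt(statistic / 2)).
    return math.erfc(math.sqrt(statistic / 2))

def marginal_dependence(table):
    """Marginal dependence, defined as one minus the marginal p-value."""
    return 1.0 - marginal_p_value(table)

# Counts spread evenly over all cells: no evidence of dependence.
balanced = [[25, 25], [25, 25]]
# All counts on the diagonal: A and B always take the same value.
diagonal = [[50, 0], [0, 50]]
```

For `balanced` the p-value is 1, so the marginal dependence is 0. For `diagonal` the p-value is far below 1E-10, so the marginal dependence is indistinguishable from 1 in floating-point arithmetic.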

Slider

The current slider value represents a threshold such that only links in the current (directed or undirected) graph with marginal p-values less than the threshold are shown (or, equivalently, links with marginal dependence greater than one minus the threshold value). Thus, the slider provides a means of detecting the marginal strengths of the links. This can be very useful in determining which links should be forced to be included (see the help page for the Structural Constraints page of the Learning Wizard for a more detailed discussion of this issue).
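The thresholding rule can be sketched as a simple filter. The dictionary of links and the function name `visible_links` are hypothetical and serve only to illustrate the rule stated above.

```python
def visible_links(marginal_p_values, threshold):
    """Return the links whose marginal p-value is below the slider threshold.

    `marginal_p_values` maps a link, written as a pair of variable names,
    to its marginal p-value (hypothetical layout chosen for this sketch).
    """
    return {link for link, p in marginal_p_values.items() if p < threshold}

# Made-up p-values for three links.
example_p_values = {("A", "B"): 1e-12, ("B", "C"): 0.03, ("A", "C"): 0.4}

shown_loose = visible_links(example_p_values, 0.05)
shown_strict = visible_links(example_p_values, 1e-10)
```

Here `shown_loose` contains the links A-B and B-C, while `shown_strict` contains only A-B. The comparison is made on p-values rather than on one minus the p-value because, in floating-point arithmetic, `1 - 1e-12` and `1 - 1e-20` cannot be told apart reliably, whereas the p-values themselves can.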

Stretch Slider Scale

The value of the lower endpoint of the slider can be decreased by pressing the Stretch Slider Scale button or by dragging the slider ticks downwards with the mouse.

Please note that the minimum value of the lower endpoint of the slider equals the maximum of the smallest floating point number available and the smallest marginal p-value over all links in the graph.
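The bound described above can be written as a one-line computation. This sketch assumes that "the smallest floating point number available" corresponds to the smallest positive normalized double, which Python exposes as `sys.float_info.min`; the function name is hypothetical.

```python
import sys

def minimum_lower_endpoint(marginal_p_values):
    """Smallest value the lower slider endpoint can take.

    Taking the maximum guards against p-values that have underflowed to 0.
    """
    return max(sys.float_info.min, min(marginal_p_values))

# A p-value that underflowed to zero is replaced by the smallest float.
floor_after_underflow = minimum_lower_endpoint([0.0, 0.2])
# Otherwise the smallest marginal p-value itself is the bound.
floor_regular = minimum_lower_endpoint([1e-30, 0.2])
```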

Compress Slider Scale

The value of the lower endpoint of the slider can be increased by pressing the Compress Slider Scale button or by dragging the slider ticks upwards with the mouse.

Please note that the maximum value of the lower endpoint of the slider equals 1E-10.

Importing Information

Pressing the “Import” button imports all network information, such as node positions, labels, and sizes, from a net-file. This can be very useful if the data relates to a network whose structure is known. In that case, you can simply import the labels and positions of the nodes. The learned network can then easily be compared to the existing one.