Single Policy Update

A LIMID is solved by Single Policy Updating (SPU). SPU is an iterative algorithm that updates one policy at a time, while the other policies are held fixed, and terminates when all policies have converged (i.e., a further iteration changes none of them). The algorithm usually finds the globally optimal policies, but it can get stuck at a local maximum.
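The iteration described above can be illustrated with a small Python sketch. It is not the HUGIN implementation: it assumes a toy problem with two decisions that have no parents and no chance nodes, so each policy reduces to a single chosen option, and the names single_policy_update, UTILITY and toy_utility are hypothetical. The utility table is chosen so that one starting strategy converges to a local maximum and another to the global one.

```python
from typing import Callable, List, Sequence, Tuple


def single_policy_update(
    expected_utility: Callable[[Tuple[int, ...]], float],
    n_options: Sequence[int],
    initial_strategy: Sequence[int],
) -> List[int]:
    """Toy SPU: improve one decision at a time until nothing changes."""
    strategy = list(initial_strategy)
    changed = True
    while changed:
        changed = False
        for i, count in enumerate(n_options):
            # Score every option of decision i with the other policies fixed.
            def score(option: int) -> float:
                trial = strategy[:i] + [option] + strategy[i + 1:]
                return expected_utility(tuple(trial))

            best = max(range(count), key=score)
            # Switch only on a strict improvement, so ties cannot cause cycling.
            if score(best) > score(strategy[i]):
                strategy[i] = best
                changed = True
    return strategy


# Hypothetical utility table over two binary decisions (d1, d2).
UTILITY = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}


def toy_utility(choice: Tuple[int, ...]) -> float:
    return UTILITY[choice]


stuck = single_policy_update(toy_utility, [2, 2], [0, 0])
better = single_policy_update(toy_utility, [2, 2], [0, 1])
print(stuck, UTILITY[tuple(stuck)])
print(better, UTILITY[tuple(better)])
```

Starting from (0, 0), neither decision can improve on its own, so the sketch stops at utility 1.0 even though (1, 1) yields 2.0. Starting from (0, 1), the first decision switches and the sketch reaches (1, 1).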

The parents specified for the decision nodes determine which observations are taken into account when decisions are made. Ideally, we would specify all observations as parents, but this may not be practical because the size of a policy table is exponential in the number of parents. We therefore often omit less important observations from the parents of decision nodes (for example, old observations are typically less important than new ones) in order to reduce the size of the policy tables.
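The exponential growth mentioned above is easy to check with a little arithmetic. The Python sketch below assumes that a policy table holds one entry per decision option for every configuration of the parents; the function names and the example numbers (a decision with 3 options and binary parents) are hypothetical.

```python
from math import prod
from typing import Sequence


def parent_configurations(parent_state_counts: Sequence[int]) -> int:
    """Number of joint configurations of the parents (1 if there are none)."""
    return prod(parent_state_counts)


def policy_table_entries(n_options: int, parent_state_counts: Sequence[int]) -> int:
    """Assumed table size: one entry per option per parent configuration."""
    return n_options * parent_configurations(parent_state_counts)


# A decision with 3 options and an increasing number of binary parents.
for n_parents in (2, 5, 10, 20):
    print(n_parents, policy_table_entries(3, [2] * n_parents))
```

With binary parents, each additional parent doubles the table, so 20 parents already require more than three million entries under these assumptions.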

If not all relevant information has been specified as parents, it can be useful to recompute policies whenever new information becomes available. This is because the computation of policies takes all existing observations into account (in addition to future observations specified as parents of decision nodes).

Propagation of evidence and calculation of posterior distributions are performed under the strategy computed by the latest Single Policy Update (or under the initial strategy specified by the user, if Single Policy Update has not been performed). SPU assumes that entered evidence has been propagated.

The SPU algorithm computes the probability distribution and expected utility function over the states of each chance and decision node in the LIMID.

Single Policy Updating is invoked by pressing the SPU button on the Run Mode Tool Bar.

See also Reset uninstantiated policies, Store policies and Recall stored policies.


Figure 1: The SPU button from the Run Mode Tool Bar.