
Learning about parameters

Spiegelhalter et al. (1993) describe techniques for estimating the parameters (i.e. the conditional probabilities) of such a network. The parameters are represented by additional nodes connected to a set of replicate networks, one for each of a set of cases. The parameters can be given independent Dirichlet distributions and, with complete data, standard conjugate Bayesian updating is straightforward. With incomplete data a number of different analytic approximations have been suggested, but in fact a simulation solution is easily implemented.
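As a reminder of what conjugate updating means with complete data, the following minimal Python sketch adds observed category counts to the Dirichlet prior parameters and reports the posterior mean. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the asia2 data.

```python
def dirichlet_update(alpha, counts):
    """Posterior Dirichlet parameters: prior parameters plus observed counts."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    return [a + n for a, n in zip(alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Uniform Dirichlet(1,1) prior and hypothetical counts (3 "no", 7 "yes").
posterior = dirichlet_update([1, 1], [3, 7])
print(posterior, dirichlet_mean(posterior))
```

With missing parent values the counts themselves are uncertain, which is why the incomplete-data case needs approximation or simulation.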

Figure 24 illustrates the asia2 network, in which theta.b represents the unknown conditional probability of bronchitis? given smoking?. Note that theta.b replaces the known conditional probability matrix p.bronchitis used in the first asia network described above. The observed part of the network is represented by the replicated plates. We illustrate learning about theta.b with a dataset of five cases, in which smoking status is not observed for case 2, who has bronchitis and dyspnoea, or for case 3, whose only positive feature is the x-ray.

Figure 24: Graphical model for the asia2 example, with an additional node theta.b representing the unknown conditional probability of bronchitis? given smoking?

The data file now does not contain values for p.bronchitis, but does have observed data for five cases.

Data for asia2 example

list(p.asia         = c(0.99, 0.01),
     p.tuberculosis = c(0.99, 0.01,
                        0.95, 0.05),
     p.smoking      = c(0.50, 0.50),
     p.lung.cancer  = c(0.99, 0.01,
                        0.90, 0.10),
     p.xray         = c(0.95, 0.05,
                        0.02, 0.98),
     p.dyspnoea     = c(0.9, 0.1,
                        0.2, 0.8,
                        0.3, 0.7,
                        0.1, 0.9),

     asia         = c(1,1,1,1,1),
     smoking      = c(2,NA,NA,1,1),
     tuberculosis = c(1,1,1,1,1), 
     lung.cancer  = c(2,1,1,1,1), 
     bronchitis   = c(2,2,1,2,1), 
     xray         = c(2,1,2,2,1), 
     dyspnoea     = c(2,2,1,2,2))

The BUGS code (shown below) now requires the observables to be vectors, and places independent Dirichlet prior distributions with parameters (1,1) (i.e. uniform priors) on each of the unknown conditional distributions p(bronchitis | smoking=no) and p(bronchitis | smoking=yes).

Asia2: model specification in BUGS

model Asia2;
const
   N = 5;  # number of cases
var
   asia[N],smoking[N],tuberculosis[N],lung.cancer[N],
   bronchitis[N],either[N],xray[N],dyspnoea[N],
   p.asia[2],p.smoking[2],p.tuberculosis[2,2],theta.b[2,2],
   p.lung.cancer[2,2],p.xray[2,2],p.dyspnoea[2,2,2],prior[2];
data in "asia2.dat";
{
for (i in 1:N){
   smoking[i]      ~ dcat(p.smoking[]);
   tuberculosis[i] ~ dcat(p.tuberculosis[asia[i],]);
   lung.cancer[i]  ~ dcat(p.lung.cancer[smoking[i],]);
   bronchitis[i]   ~ dcat(theta.b[smoking[i],]);
   either[i]      <- max(tuberculosis[i],lung.cancer[i]);
   xray[i]         ~ dcat(p.xray[either[i],]);
   dyspnoea[i]     ~ dcat(p.dyspnoea[either[i],bronchitis[i],]);
   }
#  priors for unknown probabilities
for (j in 1:2){
   theta.b[j,] ~ ddirch(prior[]);   # theta.b = p(bronchitis | smoking)
   prior[j] <- 1;
   }    
}
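
To make the simulation idea concrete, here is a hand-written Gibbs sampler in Python for the same model. It is a sketch under stated assumptions, not a description of how BUGS works internally. It assumes the coding 1 = no, 2 = yes, and it updates only the unknown quantities: the two rows of theta.b and the smoking status of the cases where it is missing. Terms for asia, tuberculosis, xray and dyspnoea are omitted because their parents are fully observed, so they do not affect these updates. The function name gibbs and its arguments are hypothetical.

```python
import random

# Observed data from the asia2 listing; None marks an unobserved value.
# Assumed coding: 1 = no, 2 = yes.
smoking = [2, None, None, 1, 1]
lung_cancer = [2, 1, 1, 1, 1]
bronchitis = [2, 2, 1, 2, 1]
p_smoking = [0.50, 0.50]
p_lung_cancer = [[0.99, 0.01], [0.90, 0.10]]  # rows: smoking = no, yes


def gibbs(n_iter=20000, burn_in=1000, seed=1):
    """Return (P(smoker) per missing case, mean P(bronchitis=yes | smoking))."""
    rng = random.Random(seed)
    missing = [i for i, v in enumerate(smoking) if v is None]
    s = [v if v is not None else rng.choice((1, 2)) for v in smoking]
    smoker_hits = {i + 1: 0 for i in missing}  # keyed by 1-based case number
    theta_sum = {1: 0.0, 2: 0.0}               # keyed by smoking level
    for it in range(burn_in + n_iter):
        # theta[level] given the rest is Beta(1 + #bronchitis, 1 + #no bronchitis),
        # the two-category form of the Dirichlet(1,1) conjugate update.
        theta = {}
        for level in (1, 2):
            b = [bronchitis[i] for i, v in enumerate(s) if v == level]
            theta[level] = rng.betavariate(1 + b.count(2), 1 + b.count(1))
        # smoking[i] given the rest is proportional to
        # p(smoking) * p(lung.cancer | smoking) * p(bronchitis | smoking).
        for i in missing:
            w = []
            for level in (1, 2):
                p_b = theta[level] if bronchitis[i] == 2 else 1.0 - theta[level]
                w.append(p_smoking[level - 1]
                         * p_lung_cancer[level - 1][lung_cancer[i] - 1] * p_b)
            s[i] = 2 if rng.random() * (w[0] + w[1]) < w[1] else 1
        if it >= burn_in:
            for i in missing:
                smoker_hits[i + 1] += (s[i] == 2)
            for level in (1, 2):
                theta_sum[level] += theta[level]
    p_smoker = {k: v / n_iter for k, v in smoker_hits.items()}
    theta_mean = {k: v / n_iter for k, v in theta_sum.items()}
    return p_smoker, theta_mean


p_smoker, theta_mean = gibbs()
print(p_smoker, theta_mean)
```

The returned dictionaries are keyed by case number and by smoking level respectively; theta_mean[2] is the estimated probability of bronchitis among smokers.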

Analysis

A run of 100,000 iterations after a 1,000-iteration burn-in took 36 seconds and gave posterior mean estimates (standard deviations) of p(bronchitis | smoking=no) = 0.52 (0.21) and p(bronchitis | smoking=yes) = 0.66 (0.23). In addition, the estimated probabilities that cases 2 and 3 are smokers are 0.56 and 0.37 respectively.
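
Because only two smoking values are missing, the posterior for this small dataset can also be checked by exact enumeration. The Python sketch below is an illustration and not part of the BUGS distribution. It sums over the four possible completions of smoking for cases 2 and 3, integrating out theta.b analytically under the uniform Dirichlet(1,1) priors. It assumes the coding 1 = no, 2 = yes, and it drops the asia, tuberculosis, xray and dyspnoea terms, which are constant across completions because their parents are fully observed. All function and variable names are hypothetical.

```python
from itertools import product
from math import factorial

# Observed data from the asia2 listing; None marks an unobserved value.
smoking = [2, None, None, 1, 1]
lung_cancer = [2, 1, 1, 1, 1]
bronchitis = [2, 2, 1, 2, 1]
p_smoking = [0.50, 0.50]
p_lung_cancer = [[0.99, 0.01], [0.90, 0.10]]  # rows: smoking = no, yes


def evidence(n_yes, n_no):
    """Integral of t**n_yes * (1-t)**n_no over a uniform prior on t."""
    return factorial(n_yes) * factorial(n_no) / factorial(n_yes + n_no + 1)


def enumerate_posterior():
    missing = [i for i, v in enumerate(smoking) if v is None]
    total = 0.0
    p_smoker = {i + 1: 0.0 for i in missing}  # keyed by 1-based case number
    mean_bronchitis = {1: 0.0, 2: 0.0}        # keyed by smoking level
    for fill in product((1, 2), repeat=len(missing)):
        s = list(smoking)
        for i, value in zip(missing, fill):
            s[i] = value
        # Weight of this completion: known-probability terms ...
        weight = 1.0
        for i, level in enumerate(s):
            weight *= p_smoking[level - 1]
            weight *= p_lung_cancer[level - 1][lung_cancer[i] - 1]
        # ... times the marginal likelihood of bronchitis in each smoking group.
        counts = {}
        for level in (1, 2):
            b = [bronchitis[i] for i, v in enumerate(s) if v == level]
            counts[level] = (b.count(2), b.count(1))
            weight *= evidence(*counts[level])
        total += weight
        for i, value in zip(missing, fill):
            if value == 2:
                p_smoker[i + 1] += weight
        # Conditional posterior mean of P(bronchitis = yes | smoking = level).
        for level, (n_yes, n_no) in counts.items():
            mean_bronchitis[level] += weight * (1 + n_yes) / (2 + n_yes + n_no)
    p_smoker = {k: v / total for k, v in p_smoker.items()}
    mean_bronchitis = {k: v / total for k, v in mean_bronchitis.items()}
    return p_smoker, mean_bronchitis


p_smoker, mean_bronchitis = enumerate_posterior()
print(p_smoker, mean_bronchitis)
```

Under these assumptions the enumeration gives probabilities of about 0.56 and 0.37 that cases 2 and 3 are smokers, and a posterior mean of about 0.66 for bronchitis among smokers, in line with the simulation. For non-smokers it gives about 0.48 for the presence of bronchitis. Its complement, 0.52, is the value quoted above, which suggests that figure may refer to theta.b[1,1] (bronchitis absent).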



Daniel Farewell
Mon Sep 13 16:39:37 BST 1999