Due to family matters, I will not be visiting Ostersund as many times in 2006 as I did in 2005 and 2004. When I do visit, there will be MANY hours of lecturing. These are important hours, because I'm one of those old-fashioned professors who still believes that lectures can provide more useful information than the book alone can, especially when the course materials are as difficult as they are for this course. So please do your best to attend all lectures.

During the first visit, I will cover many topics (see the syllabus at this web site), most of which are very important for your project. After those 3 days, you will have most, if not all, of the tools needed to do the project, depending upon which project you choose.

In the UC Irvine repository, many of the data sets involve continuous variables. Although the Bayesian Reasoner also works for continuous variables, it is a bit harder to set up. For our purposes, you can just convert all continuous variables into discrete variables by simply defining range bins. For example, if variable X has a minimum value of 10 and a maximum of 20, then you could define 4 bins of equal width: 1: [10, 12.5), 2: [12.5, 15), 3: [15, 17.5), 4: [17.5, 20]. Each bin includes its lower bound but not its upper bound (except the last bin, which includes 20), so that every value falls into exactly one bin. Then you can convert every X value into 1, 2, 3 or 4. So you'll need to write a preprocessor routine for the data files. You might want to make the number of bins a parameter, so that the system discretizes the data accordingly.
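
As a rough illustration of such a preprocessor, here is a minimal Python sketch of equal-width binning. The function names (discretize, discretize_column) are made up for this example, and taking the minimum and maximum from the data itself is an assumption; you may prefer to fix the range by hand or to use bins of unequal width.

```python
def discretize(value, lo, hi, n_bins):
    """Map a continuous value in [lo, hi] to a bin number 1..n_bins.

    Bins have equal width; each bin includes its lower bound, and the
    last bin also includes hi.
    """
    if not lo <= value <= hi:
        raise ValueError("value %r is outside [%r, %r]" % (value, lo, hi))
    if hi == lo:
        # Degenerate case: every value is identical, so use a single bin.
        return 1
    width = (hi - lo) / n_bins
    index = int((value - lo) / width)
    # value == hi would land in bin n_bins + 1, so clamp it to the last bin.
    return min(index, n_bins - 1) + 1


def discretize_column(values, n_bins):
    """Discretize a list of numbers, taking the range from the data itself."""
    lo, hi = min(values), max(values)
    return [discretize(v, lo, hi, n_bins) for v in values]


if __name__ == "__main__":
    xs = [10, 12.49, 12.5, 15, 17.5, 20]
    print(discretize_column(xs, 4))
```

To apply this to a data file, you would read each continuous column into a list, call discretize_column on it with your chosen number of bins, and write the resulting bin numbers back out in place of the original values.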

Most of the UC Irvine data does NOT come with a pre-defined causal topology. For the simpler data sets, you may be able to design a reasonable topology yourself. Remember that even though your topology may not be the absolute best, many topologies are useful and will yield a reasonably simple set of conditional probability tables. So just design a topology and stick with it. If this proves too difficult, you can skip the UC Irvine database for homework #1 and generate a second artificial data set instead. But it is a good idea to get familiar with the Irvine data (and to write a few data preprocessing routines), since we will be using it for future homeworks.

I have written up the official text for homework 1. The homework descriptions are available via the "Homework Texts" link on the course homepage. Check the "UC Irvine" link for hundreds of possible data sets for use on homework 1 and other homeworks. We will cover the material for several of the homeworks during the lectures of January 24-26. Try to browse through chapters 13-14, 18 and 19 before that time. Do not expect to understand it all. I will try to cover the most important points during my lectures.