Advice to Masters Students

Welcome to the wonderful world of graduate studies!

Although my official title is "veileder", which translates directly into something like "the person who leads the way", I am more of a "veiviser", which translates directly into "the person who shows the way". I cannot LEAD very many of you anywhere, because I simply do not have time to go all the places (i.e., learn all the things) that you will go in the next year. I've been to some of those places, and stayed for varying lengths of time, but not, in most cases, as long as you will. In short, I can point you in the right direction, but the rest is up to you, almost exclusively!

This may be the first time in your academic career that you have received so much freedom and responsibility at the same time. I was in your shoes once, and I know that it is not easy. Self-discipline is extremely important in this situation. Twelve (or fewer) months is not a lot of time for the task that lies ahead, so get started NOW and keep focused on your project and the necessary background topics.

Your basic task is to investigate an interesting cutting-edge area of artificial intelligence (AI) and understand it in a) enough breadth to both write a 10-15 page chapter on the main research in that area, and to relate your own work to that body of previous research; and b) enough depth to implement a system from that field, either one of your own invention or one based on the work of others, and apply that system to an interesting problem of your choosing, but of general interest to researchers in your selected field of study.

The Thesis Document

Your final report should consist of these 5 primary chapters:

I. Introduction

A 5-10 page introduction to your topic area and MOTIVATION for the project that you have chosen within that area. A VERY HIGH LEVEL description of your system is important to have here, but it should be very briefly dealt with in this chapter. The most important information in this section (and arguably in the entire thesis) is a statement of one or a few central research questions that your project is designed to investigate. This statement should relate, in some way, to interesting, relevant, and well-documented open research issues in your chosen field. This statement should very visibly be the main motivation for the remainder of the thesis. It should appear as a separate paragraph, preferably in an italic or even bold font, just to be sure that the reader gets it. After viewing this section, the reader should never be in doubt as to WHY you are doing this work.

Very general background information about the PROBLEM area (not the SOLUTION techniques) should also appear in this section, particularly those aspects of the problem that motivate your research.

Throughout the thesis document, you should refer back to these research issues just to make sure that both you and the reader understand how each aspect of your work relates to them.

II. Background

15-20 pages of background in your research area. This should include a) a good explanation of the problem area, b) a general discussion of common solution techniques for this and similar problems, and c) a brief description of several (5-10) key systems that have been used to solve similar problems. 1-2 paragraphs per system is normally sufficient to convey the main contributions. These systems should be those whose motivating research questions and/or final results are most relevant to your motivations and desired results. The closer a piece of work is to your intended project, the more that you should write about it. To write a whole page (or even 2) about a closely-related system (possibly with a system-overview diagram) is not unusual and will often increase the educational value of this section.

III. Methodology

20-30 pages describing the detailed design of your computer system. This does NOT mean source code, but a thorough discussion of the key components of your system and a solid justification for their selection. Diagrams are essential! UML diagrams are typically NOT useful in this chapter. You should be able to express your system in terms of more free-form pictures that convey the essential concepts and processes, without burdening the reader with diagrams of the actual classes and methods that you implemented. If one or 2 high-level UML diagrams aid in this process, that's fine, but don't dump all your class, attribute and method names on the reader. The reader is not trying to reimplement your system, but to get a deep understanding of WHAT your system does. You do need to explain HOW your system functions, but not at the level of the small programming details. Focus on the CONCEPTUAL level, not the implementation level.

For example, if you build a simulator of some physical system, the proper level of implementation to show the reader involves the EQUATIONS (e.g., differential equations, constraint equations), not the actual code to implement those equations. If you have a lot of equations, then a diagram or two showing the main objects in the system and their main interactions is helpful. For example, you might diagram a neural network and the sensors and wheels of the robot that the network controls. In addition, drawings of the different neuron layers and their interconnections often convey the essence of a neural network. Equations for the nodes in each layer, along with those of the synaptic learning rules, complete the picture such that an eager reader could begin to reimplement your system. Deeper details than this are best relegated to an appendix or a web page.
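As a rough illustration of that level of detail, the equations for a small robot controller might look like the sketch below. Everything in it is an assumption made for the example: a single hidden layer, sigmoid units, a plain Hebbian rule on the output weights, and the symbol names (x_i for sensor inputs, h_j for hidden units, y_k for motor outputs, eta for the learning rate). Your own system will have its own equations.

```latex
% Hypothetical two-layer controller: sensor inputs x_i, hidden units h_j, motor outputs y_k.
\begin{align}
h_j &= \sigma\Big(\sum_i w_{ij}\, x_i + b_j\Big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \\
y_k &= \sigma\Big(\sum_j v_{jk}\, h_j + c_k\Big) \\
\Delta v_{jk} &= \eta\, h_j\, y_k \qquad \text{(plain Hebbian rule with learning rate } \eta\text{)}
\end{align}
```

Three lines like these, together with a drawing of the layers, tell the reader more about what the controller does than several pages of class listings.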

IV. Results and Discussion

In these 15-30 pages, you should a) describe the results of running your system on different problem scenarios, and b) analyze those results. Analysis consists of both pointing out interesting patterns in the data and trying to explain them in terms of the structure of your system.

It is fine to divide this chapter into two: one for results and one for a detailed discussion of them.

In many cases, an evaluator is LESS concerned with the actual results and MORE focused on your ability to explain those results. Good analysis can easily be a 1-2 letter-grade difference in a thesis. This is where you can show your skills as a scientist by setting up proper experiments and formally assessing their outcome. For example, if using an evolutionary algorithm to solve a problem, the results of a single run have absolutely no statistical significance. 20-50 runs are preferable.
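The point about repeated runs can be made concrete with a minimal sketch. The algorithm (a toy (1+1) evolutionary algorithm), the problem (OneMax: maximize the number of 1-bits), the run count of 30, and the normal-approximation confidence interval are all assumptions chosen for illustration; substitute your own system and whatever statistical test suits your data.

```python
import random
import statistics


def one_plus_one_ea(seed, n_bits=30, max_evals=2000):
    """Toy (1+1) EA on OneMax; returns the number of offspring evaluated
    before the all-ones string is reached (or max_evals if it never is)."""
    rng = random.Random(seed)  # one generator per run keeps runs reproducible
    parent = [rng.randint(0, 1) for _ in range(n_bits)]
    fitness = sum(parent)
    for evals in range(1, max_evals + 1):
        # Flip each bit independently with probability 1/n_bits.
        child = [b ^ (rng.random() < 1.0 / n_bits) for b in parent]
        child_fitness = sum(child)
        if child_fitness >= fitness:
            parent, fitness = child, child_fitness
        if fitness == n_bits:
            return evals
    return max_evals  # budget exhausted without reaching the optimum


def summarize(samples):
    """Mean, sample standard deviation, and an approximate 95% interval for the mean."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    half_width = 1.96 * stdev / len(samples) ** 0.5  # normal approximation
    return mean, stdev, (mean - half_width, mean + half_width)


N_RUNS = 30  # one independent run per seed
results = [one_plus_one_ea(seed) for seed in range(N_RUNS)]
mean, stdev, (low, high) = summarize(results)
print(f"runs={N_RUNS} mean={mean:.1f} stdev={stdev:.1f} 95% CI=({low:.1f}, {high:.1f})")
```

Reporting the mean together with its spread, rather than the best single run, is what lets a reader judge whether a difference between two system variants is real or just noise.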

In addition to quantitative (i.e. statistical) analyses, qualitative explanations are also important. In many cases, you will lack conclusive, quantitative proof that A caused B, but you should still give a qualitative argument for the A-B relationship if, in your opinion, it helps to explain your results.

V. Conclusion

In these 5-10 pages, you should sum up your project and discuss the key implications of your work WITH RESPECT TO THE GENERAL RESEARCH QUESTIONS posed in the introduction. This section should be at the same HIGH level on which the research questions were posed. Details of system behavior are restricted to the Results section, above. In general, this section should provide the final "take home message" from your work. What can other researchers in your field learn from your work?

It is okay to add a few anecdotal descriptions in the conclusion, such as how much trouble you had working with a robot or how much you learned by implementing a recurrent network from scratch, but these should ONLY be side commentaries, not the main theme. The thesis document is typically NOT intended as a description of the PROCESS that you've been through. It should only contain the key RESULTS of that process, in terms of what you have a) learned about your field, and b) contributed to your field.

VI. Appendix (Optional)

The UML and/or source code for some of the key system modules (or the whole system if it is not too long) can appear here.

General Comments on the Thesis Document

The page lengths are not strict, but only serve as guidelines. A typical masters thesis is 50-100 pages long. Much shorter, and people will begin to question how much work you have really put into the project. Much longer, and few people will take the time to read it in detail.

The REQUIRED SECTIONS of the thesis ARE strict. The thesis is a scholarly piece of work in computer science, and it must thereby abide by the standards of the field. A failure to include each of the 5 chapters listed above will almost inevitably result in a lower grade. ANY significant deviation from this general section organization needs to be approved by your advisor prior to writing.

Phases of Project Research

In a nutshell, there are 3 main phases to masters-project work. The lengths of these will vary among students taking the two different masters lines: liberal arts and engineering.

Background Reading

This involves reading A LOT of background material: you should expect to read several hundred pages of technical articles within your field. Conference proceedings are a great place to start: each article is only 6-10 pages long, so you can cover a good many pieces of related work in a day or two. Journal articles are also useful, but they tend to be longer, so do not get too hung up in a journal article unless you find the topic very interesting and relevant. If a conference article interests you, then there's a decent chance that the author(s) also have a longer version in some journal.

Avoid the temptation of implementing a system before you have done a good deal of background reading. You may end up building something that addresses no research issues (that people in your field recognize as important), or you may "reinvent the wheel". If you can hack up small tests in a few hours, that's fine, but don't spend weeks programming until you've thoroughly investigated the literature in your field.

Implementation

In this phase, you can continue background reading, but by now, you should have found a relatively specialized topic area, so your reading will be focused on that "niche". In addition, you should decide on a system to implement and begin designing and writing code. Although you will surely need to modify your system throughout the remainder of the project, you should get it reasonably stable by the end of this period.

Write-Up and (unfortunately) More Coding

Full speed ahead!! During this phase, you will need to fine tune the system and run it on a few problem cases. Once those results are in, the easy (???) part begins: writing it all down. Some people really enjoy this phase - others despise it. Regardless of your opinion of writing, this is the most important aspect of the project. If you have done good work but cannot write it up well, nobody will ever understand your true contribution, and your grade will suffer. However, if you have not implemented anything outstanding but have written a good essay on the field and a good description of your system and its pros and cons, then, believe it or not, you have still done "research" and you will be rewarded at least partially for your attempt and for being able to analyze and discuss the results in an intelligent manner.

During the write-up, you will invariably come across problems with your system, some of which are minor but some that require changes to the system code, reruns of the test cases, etc. This code rewriting is almost unavoidable in a computer-science project. BUDGET for it when you plan your project! Don't assume that you can keep implementing right up until a week or 2 before the deadline and only then begin writing. The deep thinking and analysis of one's work during the write-up will almost always uncover flaws. Many of these flaws require only minor changes to the code, but the rerunning and reanalysis of the test cases can often take several days.

General Observations

After over a decade of master- and PhD-thesis advising, I've seen a lot of good projects... and a few that I wish I could forget. This section summarizes some of the good and bad things that I've seen in hopes that they will aid current students in steering in the proper direction.

In general, a good thesis process covers each of the 3 phases above, with approximately equal time devoted to each. A good thesis includes the sections listed above.

The very best theses do find interesting open questions in contemporary research and make some real headway in solving them, but these theses are quite rare. Please remember that not all research is about success and breakthroughs. We all have that ambition, but the odds are not in our favor. Most research projects end in a whimper, not a bang. You begin with high hopes for your system, but in the end, it may only do 20% of what you dreamed. The important thing is not to quit and call the whole project a failure, but to analyze the results and write them up. Your work can then help others who are interested in a similar problem, both by telling them what to do and what NOT to do. That's research - both the successes and the failures.

I hope that every masters thesis can turn into a publishable piece of research, but this is also overly ambitious. Not every project will make a new contribution to the field, and that is fine. This is not a PhD thesis!! My main concern is that each masters student "comes up to speed" in an area of research. This happens via reading (lots of it!!!) and (in my fields of study - evolutionary computation and artificial life) implementation.

If you are very astute, you will find a "hole" in contemporary research; and if you are very clever, you will design a system to fill that void. Finding the holes is non-trivial, but good places to start are the "future work" and "discussion" sections of research papers. Also, review articles on particular specialized topics are often extremely useful in this regard, since they tend to give general overviews of both the contributions and weaknesses in a field.

Many masters students do not find any significant holes that they are able to fill. Again, that is okay. It is not a requirement of the masters degree, although it may be the difference between an "A" and a "B" or "C" grade. In general, you do not need to fill a significant hole to get an "A", but you have to a) identify the hole, b) make a serious, well-justified attempt to fill it, and c) analyze very thoroughly the successes and failures of your attempt.

Although the publication of a masters thesis is a possibility, this is the exception, not the rule. The publication itself will be its own additional reward. So please do not get too hung up on the publishing. Just focus on your chosen topic. Learn as much as you can, and in the course of doing so, you may stumble onto something (a hole) that you can turn into something very significant. "Chance favors the well-prepared mind", as Louis Pasteur said. So do that preparation as thoroughly as possible, and maybe lightning will strike! But if it does not, you can still get a good grade by doing well-justified work and writing an interesting report that displays your new-found knowledge.

What a Thesis is NOT

A masters thesis is not a fancy implementation with a post-hoc justification. We all know how fun it is to write software, but programming does not earn you a high mark from a major university such as NTNU. Your programming MUST be driven by interesting research questions; otherwise it is simply not research, not at all. If you do all the programming first and THEN try to find research issues that it fortuitously addresses, you will, in all probability, come up with only very lame justifications that an examiner will easily detect and strictly penalize. So resist the temptation to start programming until you have a general idea of the research question. You might program to EXPLORE possible angles on a research question, but do not blindly begin programming in hopes of eventually figuring out your motivation. This is akin to building a house first and then trying to add on the basement afterwards. In both cases, you'll be doing a lot of difficult digging!

A masters thesis in AI is not simply a discussion about something that interests you. It must contain the basic elements of the scientific method: hypothesis, methods, results, analysis, etc. You are a scientist exploring a question, and as a COMPUTER scientist, your natural tool of choice is the computer program.

Remember, the computer program in and of itself is not the RESULT. Although this may be the case in certain engineering disciplines, in AI, the program is the TOOL for PRODUCING the results. So never conclude that since a) you've pointed out an interesting research issue and b) you've written a bug-free program, then c) you are done. The behavior of your system must be analyzed with respect to the research goals. Failure to report and analyze these results and to explicitly link them to your research goals is often the difference between a good (A-B) and a bad (D-F) grade.

Common Problems

There are many reasons why theses get less than an "A" grade. In general, the A is a good goal to have, but it requires a lot of work. We don't give out that many of them.

Many very good projects end up with a B or C grade due to a variety of problems. The main ones are discussed below.

Time management is clearly a big problem for many students. This typically does not rear its ugly head until a few weeks before the due date. By then, it's normally too late to salvage a top grade. Students fail to recognize that writing up takes a long time and that it can involve time-consuming returns to coding.

A related problem occurs when students SEE an issue that, in a perfect world, would only take a day or 2 (or hour or 2) of recoding to correct, but they simply have no time left to go there. So the thesis includes hints as to how their system could be improved. If those hints are simple things, then a good evaluator will immediately ask, "Why the #&%! didn't (s)he do that?" The answer is clearly "time constraints", but, unfortunately, that is rarely considered a valid excuse (for a project with such long duration). So the letter grade takes a hit. The student MAY opt to not mention this issue in the report, but if the evaluator sees it anyway, then the consequences can be even worse. Either way, the student loses. The moral is to start writing up several months before the deadline such that, when you REALLY begin to think about your system and results (as a normal side-effect of the writing process), the problems that you uncover will have a decent chance of being rectified.

When it comes to DIFFICULT improvements of the system, these are easily listed as "future work", which an evaluator will normally view as a purely positive aspect of your thesis. Namely, it indicates that you are very aware of your system's strengths and weaknesses and have given deep thought to how you would improve the system if you had a few more MONTHS or YEARS to work on it.

Some students do a lot of background work but do not RELATE their own work to it. This weakness is easily detected in the conclusion chapter, where some students merely reiterate their system and its results, without ever connecting to the key research issues or the contributions of others. For those students who give an oral presentation, the weakness appears when an evaluator asks them to compare their work to that of Dr. X, and the student draws a complete blank as to a) what Dr. X did, and/or b) the similarities and differences between their work and that of Dr. X.

This weakness is deemed highly problematic by most evaluators for a very simple reason: most masters projects do not produce earth-shattering results but, rather, represent "first steps" into a field by the student. So since the final result will probably not be something extremely useful for the student in their future career, the evaluator hopes that, at least, the student has learned considerable concepts and techniques from a research field, which they may then be able to RELATE to problems that they will encounter in their careers. If they cannot RELATE them to something that they have worked on for the past year, then it's a bad sign!

A really bad thesis is one that lacks any connection to significant research issues. You may hack up an awesome program with fancy 3-d graphics and a complete "fan base" of users, but if you can't describe your system as a valid attempt to solve a problem that RESEARCHERS in AI consider noteworthy, then you'll be looking at a D or worse for a grade. The differences between research and development are real and extremely significant for an academic piece of work. Don't lose sight of them in the midst of those months of programming.

For a description of NTNU's guidelines for evaluating masters theses, look here.