CSc 335
Project 1 -- Break that Code
Due Wednesday, April 15, 2009
Objectives
- To practice with programs running multiple threads
- To understand and articulate how concurrency
using shared data structures can lead to incorrect results
- To begin using the Nachos thread system as well
as system calls like Fork and Yield
- To break completely working code...
Your Mission
In this project, you'll start to analyze and use the Nachos system to
test out a multi-threaded program. Using your doubly-linked list
from the last project, you will create a program where multiple
threads use a shared linked list simultaneously. Finally, you'll set
up situations to cause your linked list to produce incorrect results
(or even crash!) By doing this, you'll demonstrate how concurrency
can cause serious problems -- and motivate the need for protection
mechanisms.
Setting Up NACHOS
The first step is to get a copy of Nachos, compile it, and run
the test program to see how it currently does multi-threading.
Log into antipasto and get a
terminal window so you can run commands. If you're on one of the PASTA
machines, log into antipasto after logging into your home account
with ssh (secure shell):
ssh <username>@antipasto
where you replace <username> above with your username.
- Use the cp -R command to copy the following directory
and everything inside it into your own antipasto account:
/home/csc335pub/nachos/nachos-3.4/code
- Then use chmod to reset the permissions on
this directory so that only you (the owner) can read and modify it.
Remember to use man pages for help with commands like
cp and chmod!
- The files in the threads subdirectory will be the main ones
you'll work with in this project. So make threads the current
directory.
- Type
make depend
while being in the threads directory.
As you read in the
makefile tutorial, this readies the makefile by creating the
correct dependencies
which determine when a file must be recompiled.
- Compile the code in this directory by typing make.
You'll see the g++ compiler do its thing with a bunch of
files (and spit out a bunch of deprecated and other warnings -- which
you should ignore). The last thing that should be echoed back to
you when running the makefile is a linking of all the object (*.o) files.
- A nachos executable should appear in your directory.
(List the directory contents to be sure.)
Run it by typing nachos.
You may have to type ./nachos if the current directory
is not in your path. If it worked, you should see a program that
creates two threads and bounces back and forth between them
as they each iterate through a loop various numbers of times.
Congrats. You've just run your first program on the Nachos simulator.
Now it's time to dive into the Nachos source code to understand
what just ran and how it works. The threads directory contains
a bunch of files that you should study to understand the output
you just saw. Find the main function where the program
starts and begin tracing through the code. Use grep to help
you search for keywords in various files so you can follow the trail.
On a separate piece of paper, answer the following questions:
- What function is responsible for printing
things like thread 0
looped 3 times? What file is this in?
- What function makes Nachos suspend the current thread and
(re)start the other?
- In what function is the secondary thread actually Forked?
Explain what the two arguments of the Fork line are actually
doing in the context of this program.
- You can turn on DEBUGging flags with the -d option when
running Nachos. Turn on the one that shows thread debugging statements
and run Nachos again (you'll have to read comments to figure out how
to do this). According to these debugging statements,
what are the names of the two threads?
- In what function (and in what file) is the primary thread
being created? Hint: once you know the name of the primary
thread, use grep to search for all instances of it.
- When a new thread is created with Fork, does that
new thread start immediately? If so, where is this happening?
If not, what happens to this new thread once Forked?
Explain how you determined this.
You'll be turning in your answers as part of a writeup for this
project.
Part 1: Make your own multi-threaded program
Create a new file called dllist-nachostest.cc that
contains a function modeled on the ThreadTest
function located in threadtest.cc. This function,
call it DLListTest(), will run various concurrency
tests using a shared doubly-linked list depending on what
test number it is given.
For your first test, make a function that creates a certain
number of threads (you can hard code this number, but
make it easy to change). Each thread should do the following:
- Call AddN() (from the last project) a certain
number of times. Again, you can hard code this number, but
make it easy to change. This will add n nodes to the list.
Print the list after n nodes have been added.
- This thread should then suspend itself and allow another thread
to proceed.
- Finally, call RemoveN() the same number of times
that it called AddN() previously. Print each item removed.
Print the list after n nodes have been removed.
If this were just a single-threaded program, this code would
remove exactly what it had just inserted. But with multiple threads,
each thread has the possibility of removing different nodes than
what it had originally inserted. Make sure you understand why!
Before trying to compile this, you must tell Nachos
that you have new files that it must take into account
(dllist-nachostest.cc plus all of the doubly-linked list files
from the last project. Be sure all of these files are in
the threads directory.)
The g++ compiler knows
what files to compile by what is
listed in the Makefile.common file
in the code parent directory.
Take a look at it now and find the lists
of *.h and *.cc files
that it uses to compile. Make the following changes:
- In the THREAD_H list, add ../threads/DLList.h
- In the THREAD_C list, add:
- ../threads/DLList.cc
- ../threads/DLList-driver.cc
- ../threads/dllist-nachostest.cc
- and remove ../threads/threadtest.cc from the list
- In the THREAD_O list, add:
- DLList.o
- DLList-driver.o
- dllist-nachostest.o
- and remove threadtest.o from the list
Then, go back into the threads directory and type
make depend
so that Nachos will recognize the changes in Makefile.common.
Remember, whenever you want Nachos to compile new files of
your own making, you must add them to Makefile.common
in the parent directory, go back to the correct subdirectory,
and then type make depend. Only then will
the makefile be ready to compile.
PITFALL ALERT: This section may have seemed like
it was presented in a very step-by-step manner, but I've actually left
out a lot of nuances related either to C++ or Nachos. It is your
responsibility to read through documentation and look at other files
in the threads directory to fill in details. And ask for help
from your classmates and myself!
Debug your code until it works. Congratulations! You now have a
program
where multiple threads are inserting and deleting from a shared data
structure. By varying the number of threads and the
number of times AddN() and RemoveN() are called,
you can make the threads interact in different ways. But this test
doesn't cause the code to work incorrectly...yet.
Part 2: Break it down
As we've seen in class, there are places where a context switch may
happen that will cause shared variables to change (or not
change) inappropriately. In this part, your job is to
cause those kinds of errors to occur.
- First, think of interleavings involving your linked-list code
that would cause errors to occur. You must think of at
least four (4) different types of errors.
- For each error, cause it to occur by adding another test case to
your DLListTest() function. In a real OS, you don't
get to control where context switches occur, but in Nachos you
can! Use the currentThread->Yield() function, which will
yield control to a waiting thread.
You may think of situations where you want one thread to get to
a particular location in your code and then yield, but then
a second thread will get to that same point in your code but
should not yield. That is, you may want to specify
a point in your code for a yield to happen conditionally,
i.e. it should happen for one thread but not another.
To handle this, I recommend implementing
a 2-D array of booleans called yieldData. Position
yieldData[i][j] represents what should happen at
location j in the code once that location has already been
reached i times. If true, the thread should yield. Otherwise,
it should not. You will also need to keep track of the number
of times each location of interest has already been reached.
Then, in your test case, you can initialize the yieldData
array so that yielding will happen where and when you want it.
At the location where you want conditional yielding to occur,
just call YieldIfShould():
void YieldIfShould(int loc)
{
// Given this unique location, loc, this function
// yields the current thread if it ought to. It knows
// whether to do this or not by consulting the yieldData array.
}
Now you can set up points in your code where different threads
may or may not yield when they reach those points. This will
open up many more possibilities of interleaving errors for you.
- Finally, document your test cases in the writeup
you started earlier for this project. Each test case should
contain
- a test number. I should be able to run Nachos using
nachos -q <testNum> to run this test.
- a text description of the error
- a "play-by-play" of the error as it is occurring. Include
what files and line numbers are involved. For example:
Thread 0: Insert() method. Thread finds list empty and yields [file DLList.cc, line 82]
Thread 1: also calls Insert() method. Finds list empty and completes. Thread completes.
Thread 0: Resumes Insert() [line 83]. Updates firstNode pointer to new node in [line 85].
This is incorrect since we have now lost the node added by Thread 1.
Remember, you need at least four different types of
errors.
Note: When you move on to a new test case, please do not remove
calls to
YieldIfShould in your code that dealt with previous test cases.
You can
"disable" those locations simply by putting falses in the right places
in the
yieldData array. When I grade this, I should be able to simply
run your test cases straight away in any order.
Grading
This project is worth 50 points:
- 15 points for the working multi-threaded test case of Part 1
- 25 points for the four different error-producing test cases
of Part 2
- 10 points for the writeup. The harder it is for me to understand
what your
test cases are doing, the fewer points you will receive.
On paper, please turn in your writeup and source code for all code you
have
written for this project. (Do not print out Nachos files that
have not changed.) You should also turn in a copy of the output
for all five test cases (one case for Part 1, four cases for Part 2).
Electronically, you will be submitting your entire code directory
to me.
Submit it using the
submit script you used in the last project:
/home/csc335pub/submit csc335 <full path to your code
directory>
Having trouble? Don't wait until the last minute! Come see me and get
your
$80 worth.
Administrative statement
Programming assignments, like homework
assignments, are individual projects. I encourage you to
talk to
others about the general nature of the project and ideas about how to
pursue it.
However, the technical work, the writing, and the inspiration behind
these must
be substantially your own. If any person besides you contributes in any
way to
the project, you must credit their work on your homework. Similarly, if
you
include information that you have gleaned from other published sources,
you must
cite them as references. Looking at, and/or copying, other people's
programs or
written work is inappropriate, and will be considered cheating.
Return to
Project Index