[ This content is protected and may not be shared, uploaded, or distributed. ]
(Please also check out PA4 FAQ.)
Lab 9 has three parts:
Part A (lab9a) - introduction to condition variables (useful for PA4):
This part has no coding.
Please do the following:
Make sure you read the code and understand how to use a mutex, condition variable, and lock so that
one thread can correctly send "work" to a target thread while the target thread can correctly wait for
work to arrive into its "work queue" when you do the next part of this lab.
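As a rough illustration of the pattern Part A asks you to study, here is a minimal sketch of a work queue of integers protected by a mutex and a condition variable. It is not a copy of "cv.cpp"; the global names m, cv, and q are assumptions, and the real file may be organized differently.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

static std::mutex m;                 // protects q
static std::condition_variable cv;   // signaled when q becomes non-empty
static std::queue<int> q;            // the "work queue"

// Sender side: add work while holding the mutex, then notify the waiter.
void add_work(int w) {
    {
        std::lock_guard<std::mutex> guard(m);
        q.push(w);
    }
    cv.notify_one();
}

// Target side: sleep until the queue is non-empty, then take one item.
// The predicate form of wait() re-checks the queue after every wakeup,
// so spurious wakeups are handled.
int wait_for_work() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return !q.empty(); });
    assert(!q.empty());
    int w = q.front();
    q.pop();
    return w;
}
```

The key point is that the queue is only ever touched with the mutex locked, and the waiting thread checks the queue itself rather than relying on the notification alone.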
Part B (lab9b) - better reaper thread (useful for PA4):
This lab is web server-only.
Please continue to use "lab4data" as the root directory of your server for the rest of this lab.
In this part of the lab, you need to modify your reaper thread to only wake up when there is "work" to perform
(i.e., when there is a dead connection thread to join).
Please first do the following:
- Change directory into the "lab9" directory mentioned above.
- Copy your code for Part B of Lab 8 into the current directory and create a Makefile
so that when you type "make lab9b", an executable called lab9b will be created.
- We will use the test data from Lab 4.
Please download lab4data.tar.gz into that directory and type:
tar xvf lab4data.tar.gz
This should create a subdirectory called "lab4data" with a bunch of files in it.
The commandline syntax for lab9b is: "lab9b PORT LOGFILE".
Your multithreaded web server with console in Part B of Lab 8 was
sending data to clients at the rate of about 1 KB per second. This means that most of the time,
your connection-handling threads were sleeping and there was very little contention among your
threads. So, even if you have synchronization bugs in your code, chances are they won't show up.
In this part of the lab, we will first speed things up so that if you have synchronization bugs in your
code, bad things are more likely to happen.
Since we are done with PA3, we no longer need to control the data sending rate of your server.
Please remove all your token bucket related throttling code.
To expose potential synchronization bugs in your code, please do the following to make your connection-handling threads busier.
When you are writing 1 KB into the socket, don't write all 1 KB in one shot. Instead, stay in a loop, write one byte at a time, and call usleep(250) to sleep
for a quarter of a millisecond before writing the next byte.
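The byte-at-a-time loop described above could look roughly like the sketch below. The function name write_slowly() is hypothetical, and plain write() is used here as a stand-in for whatever write helper your server already has (such as better_write()).

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Write n bytes to fd one byte at a time, sleeping 250 microseconds
// (a quarter of a millisecond) after each byte.
// Returns n on success or -1 if a write fails.
ssize_t write_slowly(int fd, const char* buf, size_t n) {
    assert(buf != nullptr || n == 0);
    size_t sent = 0;
    while (sent < n) {
        ssize_t rc = write(fd, buf + sent, 1);
        if (rc == 1) {
            sent++;
            usleep(250);
        } else if (rc < 0 && errno == EINTR) {
            continue;   // interrupted by a signal, try the same byte again
        } else {
            return -1;  // socket closed or other error
        }
    }
    return (ssize_t)n;
}
```

In a real server you would probably also update the connection's byte counter inside this loop, which is left out here.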
Let's also fix the reaper thread in Part B of Lab 8.
The reaper thread wakes up every 250ms. Most of the time, there's nothing for it to do. It would be nicer if
it only wakes up when there is "work" to be performed. The only "work" for a reaper thread is to join with a connection-handling thread.
Who is providing work for the reaper thread? When a connection-handling thread dies, it produces work for the reaper thread.
We will have the connection-handling thread send a notification to the reaper thread when such work has been produced
and has been added to a "work queue".
The recipe for sending a notification and using a "work queue" is in Part A above.
Please review the "cv.cpp" code and make sure you understand what the add_work() and wait_for_work() functions are doing
and why this is the only safe way for one thread to send data to another thread.
In "cv.cpp", we have a queue of integers. In general, this queue can be a queue of "work objects".
For this part of the lab, please copy the add_work() and wait_for_work() functions
(and whatever global variables you need) from "cv.cpp" into your code
and make sure it compiles without error. Then change the queue of integers into a queue of Connection objects
and change the argument of add_work() to be a shared pointer to a Connection object and have
wait_for_work() return a shared pointer to a Connection object.
It would also make sense to rename these functions to be reaper_add_work() and reaper_wait_for_work()
since we will have all kinds of work queues in future labs and assignments!
Then the reaper thread's infinite loop would look like the following:
    do forever
        c = reaper_wait_for_work()
        if (c == NULL) then
            break out of infinite loop
        else
            join with c.thread
            lock mutex
                print required message
                close c.orig_socket
                remove c from connection list
            unlock mutex
        end-if
    end-do
    do forever
        lock mutex
        if connection list is empty then
            unlock mutex
            break out of infinite loop
        end-if
        c2 = any connection in connection list
        unlock mutex
        join with c2.thread
        lock mutex
            print required message
            close c2.orig_socket
            remove c2 from connection list
        unlock mutex
    end-do
Please note that in the above pseudo-code, reaper_wait_for_work() can return a NULL pointer
to break out of the first infinite loop. The idea here is that we need a way to ask the reaper thread
to self-terminate, and this can be achieved by adding a NULL pointer (or a special Connection object)
to the reaper thread's work queue. When should this be done? It should be done when no new connections
are possible (i.e., the connection list can never grow longer). Therefore,
this can be done when the user has entered the "quit" console command and the
listening socket has been set to (-1), or any time afterwards (since no new connection will be possible).
Of course, this should be done before the main thread joins with the reaper thread.
The 2nd infinite loop is there to make sure that the reaper thread will not self-terminate until all connection-handling threads are joined.
When a connection-handling thread sets its socket file descriptor in a Connection object to (-1),
since this thread is about to self-terminate, it should do (make sure that you are doing
this when the mutex is unlocked):
reaper_add_work(c)
This will add the Connection object to the queue of Connection objects and will notify/wakeup the reaper thread
to join with this thread and the reaper thread will remove the connection from the connection list.
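To make the reaper flow more concrete, here is a minimal sketch of the two reaper loops together with reaper_add_work() and reaper_wait_for_work(). It makes several assumptions: the Connection struct is a stripped-down stand-in with a single thread_ptr field, the names main_mutex, reaper_cv, reaper_q, and connection_list are hypothetical, the work queue is protected by the same main mutex as the connection list, and logging and closing the original socket are left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <list>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

struct Connection {                            // stand-in for the lab's class
    int conn_number = -1;
    std::shared_ptr<std::thread> thread_ptr;   // connection-handling thread
};

static std::mutex main_mutex;                  // guards everything below
static std::condition_variable reaper_cv;
static std::queue<std::shared_ptr<Connection>> reaper_q;
static std::list<std::shared_ptr<Connection>> connection_list;

// Must be called with main_mutex unlocked, since it locks main_mutex itself.
void reaper_add_work(std::shared_ptr<Connection> c) {
    {
        std::lock_guard<std::mutex> guard(main_mutex);
        reaper_q.push(c);
    }
    reaper_cv.notify_one();
}

std::shared_ptr<Connection> reaper_wait_for_work() {
    std::unique_lock<std::mutex> lock(main_mutex);
    reaper_cv.wait(lock, [] { return !reaper_q.empty(); });
    auto c = reaper_q.front();
    reaper_q.pop();
    return c;
}

// Join with the thread while the mutex is unlocked, then update the list.
static void join_and_remove(const std::shared_ptr<Connection>& c) {
    assert(c && c->thread_ptr);
    c->thread_ptr->join();
    std::lock_guard<std::mutex> guard(main_mutex);
    connection_list.remove(c);   // log message and close() omitted here
}

void reaper() {
    for (;;) {                   // first loop: wake up only when there is work
        auto c = reaper_wait_for_work();
        if (!c) break;           // NULL means "please self-terminate"
        join_and_remove(c);
    }
    for (;;) {                   // second loop: join with whatever is left
        std::shared_ptr<Connection> c2;
        {
            std::lock_guard<std::mutex> guard(main_mutex);
            if (connection_list.empty()) break;
            c2 = connection_list.front();
        }
        join_and_remove(c2);
    }
}
```

Note that join() is always called with the mutex unlocked, so a slow connection thread cannot lock out the other threads.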
When you are done with implementing lab9b, please do the following:
- Change directory into the "lab9" directory mentioned above.
- Type "script lab9b.script" to start a transcript. Then type:
make clean
make lab9b
./lab9b 12345 lab9b.logfile
- You should see the command prompt. Type the following commands:
help
status
- Run two more Terminal windows and cd into the same "lab9" directory.
- In the second Terminal window, type:
wget -O x http://127.0.0.1:12345/textbooks-1-small.jpg
- In the third Terminal window, type:
wget -O y http://127.0.0.1:12345/viterbi-seal-rev-770x360.png
- The download speed no longer matters. Wait for downloads to start in both the 2nd and 3rd windows.
- In the first window, type "status" and make sure that it sees two active connections (connections 1 and 2).
- In the first window, type "close 1". One of the wget programs should reconnect using a different connection.
- In the first window, type "status" and make sure that it sees two active connections (connections 2 and 3).
- In the first window, type "close 2". The other wget program should reconnect using a different connection.
- In the first window, type "status" and make sure that it sees two active connections (connections 3 and 4).
- In the first window, type "quit" and make sure that the server has shut down gracefully.
- Type:
cat lab9b.logfile
and make sure that you can see the following:
- there is exactly one "Connection closed" log entry for each connection number
- there is exactly one "Reaper has joined with connection thread" log entry for each connection number
- Repeat by starting your server in the first window and starting the same downloads in the 2nd and 3rd window. Wait 5 seconds and type the following in the first window:
status
quit
- Type "exit" to close the transcript.
Alternatively, you can also do everything inside one Terminal and run tmux.
You can split the screen vertically into 3 panes.
Part C (lab9c) - two threads per connection (useful for PA4):
A socket is bidirectional and full duplex, meaning that you can read from a socket and write to the socket simultaneously!
So far, we have been implementing a client-server model where a client initiates a request, then the server responds, then
the client sends another request, then the server responds, and so on. There is no need to read and write a socket simultaneously
in a typical client-server model of network communication.
To get ready for PA4 where you will be implementing router functionalities in a peer-to-peer network architecture,
you will first use two threads to handle a connection. One socket-reading thread to read from the socket and one socket-writing thread
to write to the socket. For this lab, we will continue from Part B above to implement the final version of our multithreaded web server!
Please first do the following:
- Change directory into the "lab9" directory mentioned above.
- Copy your "lab9b.cpp" into "lab9c.cpp" and modify the Makefile
so that when you type "make lab9c", an executable called lab9c will be created.
The commandline syntax for lab9c is: "lab9c PORT LOGFILE".
Since we are still implementing a web server, when the socket-reading thread gets a request from a client, it needs to ask the corresponding
socket-writing thread to send a response. In order for these two threads to "communicate", the socket-writing thread will use a "work queue"
similar to the work queue for the reaper thread. The difference is that, in this case, the "work" is not a connection, but an HTTP request message
(or can simply be the request line in the HTTP request header, since we are ignoring everything else in the request header).
This would mean that there needs to be a "work queue" (with its own mutex and CV) within each connection!
In the larger picture, we have a top-level mutex, which we have been using to access the connection list and to synchronize the
main thread, the reaper thread, the console threads, and connection-handling threads.
The mutex within a connection would then be a 2nd-level mutex. In order to avoid deadlocks, if you need to lock both the top-level mutex
AND a 2nd-level mutex, you must follow the rule
for using a "lock hierarchy" (which we have covered in class when we discussed "multithreading part 3 - mutex").
For this lab, it means that you must not try to lock the top-level mutex when you are holding a 2nd-level mutex.
If you violate this rule, you may end up with a deadlock.
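The lock hierarchy rule can be illustrated with a small sketch. All names here (Conn, top_mutex, conn_list, total_bytes_sent(), record_bytes()) are hypothetical and only meant to show the locking order, not how your server must be structured.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <vector>

struct Conn {
    int bytes_sent = 0;
    std::mutex m;                      // 2nd-level mutex, one per connection
};

static std::mutex top_mutex;           // top-level mutex
static std::vector<std::shared_ptr<Conn>> conn_list;   // guarded by top_mutex

// Needs both levels: lock the top-level mutex FIRST, then a 2nd-level mutex.
int total_bytes_sent() {
    std::lock_guard<std::mutex> top(top_mutex);
    int total = 0;
    for (auto& c : conn_list) {
        std::lock_guard<std::mutex> second(c->m);
        total += c->bytes_sent;
    }
    return total;
}

// Needs only the 2nd level. While c.m is held, this code must never try to
// lock top_mutex, otherwise it could deadlock with total_bytes_sent().
void record_bytes(Conn& c, int n) {
    assert(n >= 0);
    std::lock_guard<std::mutex> second(c.m);
    c.bytes_sent += n;
}
```

Because every thread acquires the mutexes in the same order (top level before 2nd level), no cycle of threads waiting on each other can form.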
Your new Connection object should look like the following (you need to define a Message class to represent an HTTP request message,
and you need to fill out the code for add_work() and wait_for_work() member functions for this new Connection object):
    class Connection {
    public:
        int conn_number; /* -1 means that the connection is not initialized properly */
        int socket_fd; /* -1 means closed by connection-handling thread, -2 means closed by console thread and connection may still be active */
        int orig_socket_fd; /* copy of the original socket_fd */
        int bytes_sent; /* number of bytes of response body written into the socket */
        int response_body_length; /* size of the response body in bytes, i.e., Content-Length */
        shared_ptr<thread> read_thread_ptr; /* shared pointer to a socket-reading thread */
        shared_ptr<thread> write_thread_ptr; /* shared pointer to a socket-writing thread */
        /* the next 3 objects are for the socket-reading thread to send work to the corresponding socket-writing thread */
        shared_ptr<mutex> m; /* this is a "2nd-level" mutex */
        shared_ptr<condition_variable> cv;
        queue<shared_ptr<Message>> q;
        /* default constructor: every field gets a defined value */
        Connection() : conn_number(-1), socket_fd(-1), orig_socket_fd(-1), bytes_sent(0), response_body_length(0), read_thread_ptr(NULL), write_thread_ptr(NULL), m(NULL), cv(NULL) { }
        Connection(int c, int s, shared_ptr<thread> tr, shared_ptr<thread> tw) {
            conn_number = c;
            socket_fd = orig_socket_fd = s;
            read_thread_ptr = tr;
            write_thread_ptr = tw;
            bytes_sent = response_body_length = 0;
            m = make_shared<mutex>();
            cv = make_shared<condition_variable>();
        }
        void add_work(shared_ptr<Message> msg) { /* your code goes here */ }
        shared_ptr<Message> wait_for_work() { /* your code goes here */ }
    };
Since we are replacing a connection-handling thread with a socket-reading thread and a socket-writing thread,
you would replace talk_to_client() with two first procedures, which I will call read_from_client() and write_to_client(),
and split the code in talk_to_client() so that read_from_client() only reads from the socket and write_to_client()
only writes to the socket.
The basic idea is that in read_from_client(), you read one HTTP request message and you create and send a Message object
to the socket-writing thread to send back an HTTP response. We can derive the pseudo-code for read_from_client() and write_to_client()
from the pseudo-code for talk_to_client() in Lab 8. But we need to add some code to make
the socket-reading thread and the socket-writing thread work together.
One thing we need to figure out is how these threads should self-terminate and who is going to join with which dead thread.
When you shut down and close a socket, how would the
socket-reading thread and the socket-writing thread respond? Which thread would typically die first?
Should the socket-reading thread join with the socket-writing thread or the other way around? Or should the reaper thread join with both of them?
There seem to be quite a few design choices to make. I'm going to go with one way that seems to work.
The basic idea is that we can have the socket-reading thread join with the socket-writing thread and have the
reaper thread join with the socket-reading thread. So, when you enter a "close" command in the console
and the console thread closes a socket and sets it to (-2), what should be the sequence of events that would happen?
For the socket-writing thread, soon after you close the socket, better_write() should return with an error code
and you should use that to get out of its infinite loop and self-terminate.
For the socket-reading thread, soon after you close the socket, read_a_line() should also return with an error code
and you should use that to get out of its infinite loop. Once it gets out of the infinite loop, it should join with
the socket-writing thread because it knows that the socket-writing thread is either dead already or is about to die!
Right before the socket-reading thread self-terminates, it should wake up the reaper thread so that the reaper thread can join with it.
(The reaper thread for this part of the lab must not join with the socket-writing thread since it has already been joined.)
The pseudo-code (not necessarily complete) for the socket-reading thread would look like the following:
    c = connection object in argument of first procedure
    lock mutex
    unlock mutex
    do forever /* persistent connection */
        msg = read HTTP request
        if msg is NULL or error then
            break;
        end-if
        c.add_work(msg)
    end-do
    lock mutex
        if c.socket ≥ 0 then
            shutdown c.socket
        end-if
        set c.socket to (-1)
    unlock mutex
    /* signal the socket-writing thread to self-terminate */
    c.add_work(NULL)
    /* join with the socket-writing thread */
    c.write_thread_ptr.join()
    /* add dead connection to the reaper work queue */
    reaper_add_work(c)
Please note that reaper_add_work(c) in the last line above is to
add work for the reaper thread in part B of this lab (where the top-level mutex is involved)
and c.add_work(msg) is to add work for the socket-writing thread (where a 2nd-level mutex is involved).
The socket-reading thread would break out of the infinite loop when it cannot read an HTTP request message.
When that happens, it uses c.add_work(NULL) to send a NULL message to the socket-writing thread.
The idea here is that when the socket-writing thread sees a NULL message, it should break out of its infinite loop and self-terminate.
The pseudo-code (not necessarily complete) for the socket-writing thread would look like the following:
    c = connection object in argument of first procedure
    lock mutex
    unlock mutex
    do forever /* persistent connection */
        msg = c.wait_for_work()
        if msg is NULL or error then
            break;
        end-if
        ... /* send response, break out of infinite loop if socket_fd is < 0 */
    end-do
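The hand-off between the two threads, including the NULL message used for termination, can be sketched as follows. This is only an illustration under simplifying assumptions: Message holds just a request line, Conn is a reduced stand-in for the Connection class with the mutex and CV stored directly rather than through shared pointers, no socket is involved (the "response" is recorded in a vector named handled), and the list of requests is passed in as an argument instead of being read from a socket.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

struct Message { std::string request_line; };   // hypothetical minimal Message

struct Conn {                                   // stand-in for Connection
    std::mutex m;                               // 2nd-level mutex
    std::condition_variable cv;
    std::queue<std::shared_ptr<Message>> q;
    std::vector<std::string> handled;           // touched only by the writer

    void add_work(std::shared_ptr<Message> msg) {
        {
            std::lock_guard<std::mutex> guard(m);
            q.push(msg);
        }
        cv.notify_one();
    }
    std::shared_ptr<Message> wait_for_work() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return !q.empty(); });
        auto msg = q.front();
        q.pop();
        return msg;
    }
};

// Socket-writing side: handle messages until a NULL message arrives.
void write_to_client(Conn& c) {
    for (;;) {
        auto msg = c.wait_for_work();
        if (!msg) break;                          // told to self-terminate
        c.handled.push_back(msg->request_line);   // stands in for a response
    }
}

// Socket-reading side: forward each request, then send NULL and join.
void read_from_client(Conn& c, std::thread& writer,
                      const std::vector<std::string>& requests) {
    for (const auto& r : requests) {
        c.add_work(std::make_shared<Message>(Message{r}));
    }
    c.add_work(nullptr);        // signal the writer to self-terminate
    assert(writer.joinable());
    writer.join();              // reader joins with writer
    // In the lab, reaper_add_work(c) would be called at this point.
}
```

Since the NULL message goes through the same queue as the real requests, the writer finishes every request that was queued before it and then terminates.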
Please also make the following minor changes from before since we have two threads per connection and we are sending the HTTP response body one byte at a time:
- When the "status" console command is entered, please print the following to cout for each active connection:
[#]\tClient: IP:PORT, Socket: SOCKET_FD, Bytes sent: BYTES_SENT out of FILESIZE\n
where "#" is a connection number, BYTES_SENT is the number of bytes of the current response body you have written into the socket,
and FILESIZE is the size of the response body in bytes.
- When the socket-writing thread is about to self-terminate, you must print the following line into the log file:
[TIMESTAMP] [#]\tSocket-writing thread has terminated\n
- After the socket-reading thread has joined with the socket writing thread and is about to self-terminate, you must print the following line into the log file:
[TIMESTAMP] [#]\tSocket-reading thread has joined with socket-writing thread\n
- After the reaper thread has joined with a socket-reading thread, you must print the following line into the log file:
[TIMESTAMP] [#]\tReaper thread has joined with socket-reading thread\n
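For the "status" output format in the list above, one possible way to build the line is sketched below. The function name format_status_line() and its parameters are hypothetical, and the client address is assumed to be available already as an "IP:PORT" string.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build one "status" line for an active connection in the required format:
// [#]\tClient: IP:PORT, Socket: SOCKET_FD, Bytes sent: BYTES_SENT out of FILESIZE\n
std::string format_status_line(int conn_number, const std::string& ip_port,
                               int socket_fd, int bytes_sent, int file_size) {
    assert(bytes_sent >= 0 && file_size >= 0);
    std::ostringstream out;
    out << "[" << conn_number << "]\tClient: " << ip_port
        << ", Socket: " << socket_fd
        << ", Bytes sent: " << bytes_sent
        << " out of " << file_size << "\n";
    return out.str();
}
```

Keeping the formatting in one function makes it easier to copy the needed fields out of the Connection object while the appropriate mutex is held and to print after unlocking.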
When you are done with implementing lab9c, please do the following:
- Change directory into the "lab9" directory mentioned above.
- Type "script lab9c.script" to start a transcript. Then type:
make clean
make lab9c
./lab9c 12345 lab9c.logfile
- You should see the command prompt. Type the following commands:
help
status
- Run two more Terminal windows and cd into the same "lab9" directory.
- In the second Terminal window, type:
rm -rf dir2; mkdir dir2; cd dir2
wget -r -l 1 http://127.0.0.1:12345/persistent.html
cd ..
- In the third Terminal window, type:
rm -rf dir3; mkdir dir3; cd dir3
wget -r -l 1 http://127.0.0.1:12345/persistent.html
cd ..
- The download speed no longer matters. Wait for downloads to start in both the 2nd and 3rd windows.
- In the first window, type "status" and make sure that it sees two active connections (connections 1 and 2).
- In the first window, type "close 1". One of the wget programs should reconnect using a different connection.
- In the first window, type "status" and make sure that it sees two active connections (connections 2 and 3).
- In the first window, type "close 2". The other wget program should reconnect using a different connection.
- In the first window, type "status" and make sure that it sees two active connections (connections 3 and 4).
- In the first window, type "quit" and make sure that the server has shut down gracefully.
- Type:
cat lab9c.logfile
and make sure that you can see the following:
- there is exactly one "Connection closed" log entry for each connection number
- there is exactly one "Socket-writing thread has terminated" log entry for each connection number
- there is exactly one "Socket-reading thread has joined with socket-writing thread" log entry for each connection number
- there is exactly one "Reaper thread has joined with socket-reading thread" log entry for each connection number
- Repeat by starting your server in the first window and starting the same downloads in the 2nd and 3rd window. Wait 10 seconds and type the following in the first window:
status
quit
- Type "exit" to close the transcript.
Alternatively, you can also do everything inside one Terminal and run tmux.
You can split the screen vertically into 3 panes.
All pseudo-code is incomplete and error checking is often left out in pseudo-code.
Since some details are left out, depending on how you write your code, you may create race conditions
and you may need to fix your code so that your program won't freeze or crash.
Feel free to send your questions (and not your code) to the instructor.
Pseudo-code for lab9b:
Please see the above for the pseudo-code for the improved reaper thread.
Pseudo-code for lab9c:
Please see the above for the pseudo-code for the socket reading thread
and the socket writing thread.
Now that we are using lock hierarchy, we are in the position to fix a race condition bug that
was created in Lab 8 when we added a console thread. In the pseudo-code for the console thread in Lab 8,
we shutdown c.socket and set c.socket to -2. Since we have the main mutex locked, these two operations are locked together inside an atomic operation
with respect to the main mutex. But when we write to the socket (using better_write()), we don't have any mutex locked! This means that it's possible
that the socket writing thread can write to the socket right after the socket is shutdown but before it's set to -2. This would cause a SIGPIPE signal to get delivered
to your process and your process will abort and crash!
Since we are using a 2nd-level mutex starting with Part C of this lab, we can use that to solve this race condition problem by putting
the code for shutting down c.socket and setting it to -2 inside a critical section with respect to the mutex inside the same connection (we will refer to
this mutex as c.m). Also, we should only write to c.socket when we have c.m locked.
Now the relevant part of the console thread code would become (the two lines between c.m.lock and c.m.unlock form the critical section with respect to c.m):
c.m.lock // level 2 mutex
shutdown c.socket
c.socket = (-2)
c.m.unlock // level 2 mutex
and every time you need to call better_write(c.socket,...), you must do:
c.m.lock // level 2 mutex
better_write(c.socket,...)
c.m.unlock // level 2 mutex
This solves the race condition problem, although there is still one problem.
There was a reason why we would only call better_write() when the main mutex was unlocked and the reason was that
better_write() may take a long time to return and we didn't want to lock out other threads for a long time.
Now if better_write() takes a long time to return, then c.m.lock in the console thread can take a long time to return
and since that code is executed with the main mutex locked, we are back to the same problem! Therefore, what we should do is that, before we call c.m.lock
in the console thread, we should unlock the main mutex. So, the relevant part of the console thread code now becomes (please note
that you only have to do this in the console thread when you call shutdown() on the socket and set it to -2):
// we are still inside a critical section with respect to the main mutex
unlock mutex // level 1 mutex
c.m.lock // level 2 mutex
shutdown c.socket
c.socket = (-2)
c.m.unlock // level 2 mutex
lock mutex // level 1 mutex
// we are again inside the critical section with respect to the main mutex
VERY IMPORTANT: If you use the code above, please remember that if you unlock the main mutex,
the connection list may be modified by another thread and Connection objects can be added or deleted!
Therefore, if for some reason you are iterating through the connection list when the above code is executed,
you must not touch the iterator you were using (or if you are using an array index, you should not use it any more) since it may have become invalid or out-dated.
It may be better that you don't lock the main mutex at the end and just break out of the loop.
What about the socket-reading thread? If read_a_line() returns -1, you must shutdown() the socket and set the socket_fd inside the Connection
object to -1. To avoid SIGPIPE, you should do this with critical section code. Since you don't have the main mutex locked at this point, you should do:
c.m.lock // level 2 mutex
shutdown c.socket
c.socket = (-1)
c.m.unlock // level 2 mutex
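The idea of only touching the socket while c.m is locked can be sketched as follows. The names Conn, guarded_shutdown(), and guarded_write() are hypothetical, the mutex is stored directly in the struct rather than through a shared pointer, and plain write() is used as a stand-in for better_write().

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct Conn {              // stand-in for the lab's Connection
    int socket_fd = -1;
    std::mutex m;          // 2nd-level mutex (c.m in the text)
};

// Shut the socket down and mark it closed inside ONE critical section with
// respect to c.m. Pass -2 from the console thread, -1 from the reading thread.
void guarded_shutdown(Conn& c, int new_value) {
    assert(new_value < 0);
    std::lock_guard<std::mutex> guard(c.m);
    if (c.socket_fd >= 0) {
        shutdown(c.socket_fd, SHUT_RDWR);
    }
    c.socket_fd = new_value;
}

// Write only while holding c.m, and refuse to write if the socket has
// already been marked closed. This is what avoids writing to a socket that
// was shut down a moment ago (and therefore avoids SIGPIPE).
ssize_t guarded_write(Conn& c, const char* buf, size_t n) {
    std::lock_guard<std::mutex> guard(c.m);
    if (c.socket_fd < 0) {
        return -1;
    }
    return write(c.socket_fd, buf, n);
}
```

With this arrangement a write can never land between the shutdown and the update of socket_fd, because both happen while c.m is held. The caller in the console thread is still responsible for unlocking the main mutex before calling guarded_shutdown().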
Below is the grading breakdown:
- (1 pt) submitted a valid lab9.tar.gz file with all the required files using the submission procedure below
- (1 pt) content in "lab9a.script", "lab9b.script", and "lab9c.script" is correct
- (1 pt) content in "lab9b.logfile" and "lab9c.logfile" is correct
- (1 pt) "Makefile" works for "make lab9b" and "make lab9c"
- (1 pt) source code of your server program looks right
Minimum deduction is 0.5 pt for anything that's incorrect.
Please note that for the "Makefile" item, you can only get credit for it if your "source code" is relevant to this lab; therefore, you can only get as many points as the "source code" item
in the best case.
Please keep in mind that even though lab grading is "light", it doesn't mean that you can just put anything
into your submission! It's still your responsibility to make sure that the files in your submission contain
information that's relevant to the tests you were supposed to run.
Use the "more" command to view your script/log files to make sure that they contain the right information.
If a file has the wrong stuff in it, you should delete it and create the file again and verify.
If most of the stuff in your script/log files is wrong and you did not notice it, we will most likely have to take points off.
To submit your work, you must first tar all the files you want to submit into a tarball and
gzip it to create a gzipped tarfile named "lab9.tar.gz".
Then you upload "lab9.tar.gz" to our Bistro submission server.
Change into the "lab9" directory you have created above and enter the following command
to create your submission file "lab9.tar.gz" (if you don't have any ".h" files, don't include "*.h*" at the end):
tar cvzf lab9.tar.gz lab9*.script lab9*.logfile Makefile *.c* *.h*
ls -l lab9.tar.gz
The last command shows you how big the created "lab9.tar.gz" file is.
If "lab9.tar.gz" is larger than 1MB in size, the submission server will not accept it.
If you use an IDE, the IDE may put your source code in subdirectories. In that case,
you need to modify the commands above so that you include ALL
the necessary source files and subdirectories (and don't include any binary files)
and make sure that your code can be compiled without the IDE since the grader is not allowed to use an IDE to compile your code.
You should read the output of the above commands carefully to make sure that "lab9.tar.gz" is created properly.
If you don't understand the output of the above commands, you need to learn how to read it!
It's your responsibility to ensure that "lab9.tar.gz" is created properly.
To check the content of "lab9.tar.gz", you can use the following command:
tar tvf lab9.tar.gz
Please read the output of the above command carefully to see what files were included in "lab9.tar.gz"
and what their file sizes are, and make sure that they make sense.
Please enter your USC e-mail address and your submission PIN below. Then click on the Browse button
and locate and select your submission file (i.e., "lab9.tar.gz").
Then click on the Upload button to submit your "lab9.tar.gz".
(Be careful what you click! Do NOT submit the wrong file!)
If you see an error message, please read the dialogbox carefully and fix what needs to be fixed and repeat the procedure.
If you don't know your submission PIN, please visit this web site to have your PIN e-mailed to your USC e-mail address.
If the command is executed successfully and if everything checks out,
a ticket will be issued to you to let you know "what" and "when"
your submission made it to the Bistro server. The next web page you
see would display such a ticket and the ticket should look like
the sample shown in the submission web page
(of course, the actual text would be different, but the format should be similar).
Make sure you follow the Verify Your Ticket instructions
to verify the SHA1 hash of your submission to make sure that you did not accidentally submit the wrong file.
Also, an e-mail (showing the ticket) will be sent to your USC e-mail address.
Please read the ticket carefully to know exactly "what" and "when"
your submission made it to the Bistro server.
If there are problems, please contact the instructor.
It is extremely important that you also verify your submission
after you have submitted "lab9.tar.gz" electronically to make
sure that everything you have submitted is everything you wanted us to grade.
If you don't verify your submission and
you end up submitting the wrong files, please understand that due to our fairness policy,
there's absolutely nothing we can do.
Finally, please be familiar with the Electronic Submission Guidelines
and information on the bsubmit web page.