Lab #9

[ This content is protected and may not be shared, uploaded, or distributed. ]

(Please also check out PA4 FAQ.)

Lab 9 has three parts:

Part A - introduction to condition variable
Part B - better reaper thread
Part C - two threads per connection

Part A (`lab9a`) - introduction to condition variable (useful for PA4):

This part has no coding. Please do the following:
Create an empty directory (call it "lab9") and change directory into it.
Download lab9data.tar.gz into the current directory and type:
    tar xvf lab9data.tar.gz
This should create a subdirectory called "lab9data" with a Makefile and several .cpp files in it which we have covered in class when we discussed "multithreading part 4 - condition variables".
Type "script lab9a.script" to start a transcript. (If your command shell is bash, the "foreach" command below will not work and you should first type "tcsh" to change your command shell to tcsh before proceeding.) Then type:
    uname -a
    cat /etc/os-release
    cd lab9data
    make clean
    foreach f (cv)
        make $f
        ./a.out
    end
    make clean
    cd ..
Type "exit" to close the transcript.
Make sure you read the code and understand how to use a mutex, a condition variable, and a lock so that one thread can correctly send "work" to a target thread while the target thread correctly waits for work to arrive in its "work queue". You will need this for the next part of this lab.
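The actual "cv.cpp" from lab9data is not reproduced here. The following is only a minimal sketch of the same pattern, assuming a queue of integers protected by one mutex and one condition variable. The global names m, cv, and q and the demo_sum() helper are hypothetical and may not match the class code.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical globals; the names used in cv.cpp may differ.
static std::mutex m;
static std::condition_variable cv;
static std::queue<int> q;

// Sender side: enqueue while holding the mutex, then wake one waiting thread.
void add_work(int w)
{
    {
        std::lock_guard<std::mutex> guard(m);
        q.push(w);
    }
    cv.notify_one();
}

// Receiver side: sleep until the queue is non-empty, then take the front item.
int wait_for_work()
{
    std::unique_lock<std::mutex> lock(m);
    // The predicate is re-checked after every wakeup, so a spurious wakeup
    // or a notification sent before the wait started is handled correctly.
    cv.wait(lock, [] { return !q.empty(); });
    assert(!q.empty());
    int w = q.front();
    q.pop();
    return w;
}

// Demo: a producer thread sends 1..n and the caller sums what it receives.
int demo_sum(int n)
{
    int sum = 0;
    std::thread producer([n] { for (int i = 1; i <= n; i++) add_work(i); });
    for (int i = 0; i < n; i++) sum += wait_for_work();
    producer.join();
    return sum;
}
```

The key point of the sketch is that the queue is only ever touched with the mutex locked, and that the receiver waits on a condition ("queue is not empty") rather than on the notification itself.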

Part B (`lab9b`) - better reaper thread (useful for PA4):

This lab is web server-only. Please continue to use "lab4data" as the root directory of your server for the rest of this lab. In this part of the lab, you need to modify your reaper thread to only wake up when there is "work" to perform (i.e., when there is a dead connection thread to join). Please first do the following:
Change directory into the "lab9" directory mentioned above.
Copy your code for Part B of Lab 8 into the current directory and create a Makefile so that when you type "make lab9b", an executable called lab9b will be created.
We will use the test data from Lab 4. Please download lab4data.tar.gz into that directory and type:
    tar xvf lab4data.tar.gz
This should create a subdirectory called "lab4data" with a bunch of files in it.
The commandline syntax for lab9b is: "lab9b PORT LOGFILE".
Your multithreaded web server with console in Part B of Lab 8 was sending data to clients at the rate of about 1 KB per second. This means that most of the time, your connection-handling threads were sleeping and there was very little contention among your threads. So, even if you have synchronization bugs in your code, chances are they won't show up.
In this part of the lab, we will first speed things up so that if you have synchronization bugs in your code, bad things will be more likely to happen. Since we are done with PA3, we no longer need to control the data sending rate of your server. Please remove all your token bucket related throttling code. To expose potential synchronization bugs in your code, please do the following to make your connection-handling threads busier. When you are writing 1 KB into the socket, don't write all 1 KB in one shot. Instead, stay in a loop, write one byte at a time, and call usleep(250) to sleep for a quarter of a millisecond before writing the next byte.
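The byte-at-a-time loop described above could be sketched as follows. The function name slow_write() is hypothetical, and the sketch calls write() directly; in your server you would presumably go through your own better_write() instead.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Writes n bytes from buf to fd one byte at a time and sleeps for 250
// microseconds after each byte. Returns n on success, or -1 if write() fails.
// slow_write() is a hypothetical name used only for this illustration.
int slow_write(int fd, const char *buf, int n)
{
    assert(buf != nullptr && n >= 0);
    for (int i = 0; i < n; ) {
        ssize_t rc = write(fd, buf + i, 1);
        if (rc == 1) {
            i++;
            usleep(250);    /* a quarter of a millisecond */
        } else if (rc == -1 && errno == EINTR) {
            continue;       /* interrupted before writing; retry the same byte */
        } else {
            return -1;      /* error (e.g., the socket has been shut down) */
        }
    }
    return n;
}
```

With this loop, a 1 KB chunk takes at least about a quarter of a second to write, and the thread goes in and out of sleep roughly a thousand times per chunk, which is what makes races more likely to surface.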
Let's also fix the reaper thread in Part B of Lab 8. The reaper thread wakes up every 250ms. Most of the time, there's nothing for it to do. It would be nicer if it only wakes up when there is "work" to be performed. The only "work" for a reaper thread is to join with a connection-handling thread.
Who is providing work for the reaper thread? When a connection-handling thread dies, it produces work for the reaper thread. We will have the connection-handling thread send a notification to the reaper thread when such work has been produced and added to a "work queue". The recipe for sending a notification and using a "work queue" is in Part A above. Please review the "cv.cpp" code and make sure you understand what the add_work() and wait_for_work() functions are doing and why this is the only safe way for one thread to send data to another thread. In "cv.cpp", we have a queue of integers. In general, this queue can be a queue of "work objects". For this part of the lab, please copy the add_work() and wait_for_work() functions (and whatever global variables you need) from "cv.cpp" into your code and make sure it compiles without error. Then change the queue of integers into a queue of Connection objects, change the argument of add_work() to be a shared pointer to a Connection object, and have wait_for_work() return a shared pointer to a Connection object. It would also make sense to rename these functions to reaper_add_work() and reaper_wait_for_work() since we will have all kinds of work queues in future labs and assignments! Then the reaper thread's infinite loop would look like the following:
    do forever
        c = reaper_wait_for_work()
        if (c == NULL) then
            break out of infinite loop
        else
            join with c.thread
            lock mutex
            print required message
            close c.orig_socket
            remove c from connection list
            unlock mutex
        end-if
    end-do
    do forever
        lock mutex
        if connection list is empty then
            unlock mutex
            break out of infinite loop
        endif
        c2 = any connection in connection list
        unlock mutex
        join with c2.thread
        lock mutex
        print required message
        close c2.orig_socket
        remove c2 from connection list
        unlock mutex
    end-do
Please note that in the above pseudo-code, reaper_wait_for_work() can return a NULL pointer to break out of the first infinite loop. The idea here is that we need a way to ask the reaper thread to self-terminate, and this can be achieved by adding a NULL pointer (or a special Connection object) to the reaper thread's work queue. When should this be done? It should be done when no new connection is possible (i.e., the connection list can never grow longer). Therefore, this can be done when the user enters the "quit" console command and the listening socket is set to (-1), or any time afterwards (since no new connection will be possible). Of course, this should be done before the main thread joins with the reaper thread. The 2nd infinite loop is there to make sure that the reaper thread will not self-terminate until all connection-handling threads are joined.
When a connection-handling thread sets its socket file descriptor in a Connection object to (-1), since this thread is about to self-terminate, it should do (make sure that you are doing this when the mutex is unlocked):
    reaper_add_work(c)
This will add the Connection object to the queue of Connection objects and will notify/wakeup the reaper thread to join with this thread and the reaper thread will remove the connection from the connection list.
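To make the two reaper loops concrete, here is a minimal sketch under simplifying assumptions: the Connection struct is a stripped-down stand-in (no sockets, no logging), the "joined" flag and the demo_reaper() helper are hypothetical, the connection list is a local vector, and the reaper runs in the calling thread. Because the connection list is only touched by one thread here, the top-level mutex from the pseudo-code is omitted.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Stripped-down stand-in for the lab's Connection class (an assumption).
struct Connection {
    int conn_number = -1;
    bool joined = false;                       // hypothetical bookkeeping flag
    std::shared_ptr<std::thread> thread_ptr;
};

static std::mutex reaper_m;
static std::condition_variable reaper_cv;
static std::queue<std::shared_ptr<Connection>> reaper_q;

void reaper_add_work(std::shared_ptr<Connection> c)
{
    {
        std::lock_guard<std::mutex> guard(reaper_m);
        reaper_q.push(c);
    }
    reaper_cv.notify_one();
}

std::shared_ptr<Connection> reaper_wait_for_work()
{
    std::unique_lock<std::mutex> lock(reaper_m);
    reaper_cv.wait(lock, [] { return !reaper_q.empty(); });
    std::shared_ptr<Connection> c = reaper_q.front();
    reaper_q.pop();
    return c;
}

// Demo: n "connection" threads queue themselves as they die, and a NULL
// sentinel (what "quit" would add) ends the first loop. Returns the number
// of threads joined by the two loops together.
int demo_reaper(int n)
{
    std::vector<std::shared_ptr<Connection>> conn_list;
    for (int i = 1; i <= n; i++) {
        auto c = std::make_shared<Connection>();
        c->conn_number = i;
        conn_list.push_back(c);
    }
    for (auto &c : conn_list) {
        c->thread_ptr = std::make_shared<std::thread>([c] { reaper_add_work(c); });
    }
    reaper_add_work(nullptr);   /* sentinel: no more new connections */

    int joined = 0;
    for (;;) {                  /* first loop: join whatever gets reported */
        std::shared_ptr<Connection> c = reaper_wait_for_work();
        if (c == nullptr) break;
        if (!c->joined) { c->thread_ptr->join(); c->joined = true; joined++; }
    }
    for (auto &c : conn_list) { /* second loop: join everything left over */
        if (!c->joined) { c->thread_ptr->join(); c->joined = true; joined++; }
    }
    return joined;
}
```

Note that the sentinel may be dequeued before some of the connection threads have reported themselves. That is exactly the situation the second loop is there for, so every thread is joined exactly once either way.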
When you are done with implementing lab9b, please do the following:
Change directory into the "lab9" directory mentioned above.
Type "script lab9b.script" to start a transcript. Then type:
    make clean
    make lab9b
    ./lab9b 12345 lab9b.logfile
You should see the command prompt. Type the following commands:
    help
    status
Run two more Terminal windows and cd into the same "lab9" directory.
In the second Terminal window, type:
    wget -O x http://127.0.0.1:12345/textbooks-1-small.jpg
In the third Terminal window, type:
    wget -O y http://127.0.0.1:12345/viterbi-seal-rev-770x360.png
The download speed no longer matters. Wait for downloads to start in both the 2nd and 3rd windows.
In the first window, type "status" and make sure that it sees two active connections (connections 1 and 2).
In the first window, type "close 1". One of the wget programs should reconnect using a different connection.
In the first window, type "status" and make sure that it sees two active connections (connections 2 and 3).
In the first window, type "close 2". The other wget program should reconnect using a different connection.
In the first window, type "status" and make sure that it sees two active connections (connections 3 and 4).
In the first window, type "quit" and make sure that the server has shut down gracefully.
Type:
    cat lab9b.logfile
and make sure that you can see the following:

there is exactly one "Connection closed" log entry for each connection number
there is exactly one "Reaper has joined with connection thread" log entry for each connection number
Repeat by starting your server in the first window and starting the same downloads in the 2nd and 3rd window. Wait 5 seconds and type the following in the first window:
    status
    quit
Type "exit" to close the transcript.
Alternatively, you can also do everything inside one Terminal and run tmux. You can split the screen vertically into 3 panes.

Part C (`lab9c`) - two threads per connection (useful for PA4):

A socket is bidirectional and full duplex, meaning that you can read from a socket and write to the socket simultaneously! So far, we have been implementing a client-server model where a client initiates a request, then the server responds, then the client sends another request, then the server responds, and so on. There is no need to read and write a socket simultaneously in a typical client-server model of network communication.
To get ready for PA4 where you will be implementing router functionalities in a peer-to-peer network architecture, you will first use two threads to handle a connection: one socket-reading thread to read from the socket and one socket-writing thread to write to the socket. For this lab, we will continue from Part B above to implement the final version of our multithreaded web server!
Please first do the following:

Change directory into the "lab9" directory mentioned above.
Copy your "lab9b.cpp" into "lab9c.cpp" and modify the Makefile so that when you type "make lab9c", an executable called lab9c will be created.
The commandline syntax for lab9c is: "lab9c PORT LOGFILE".
Since we are still implementing a web server, when the socket-reading thread gets a request from a client, it needs to ask the corresponding socket-writing thread to send a response. In order for these two threads to "communicate", the socket-writing thread will use a "work queue" similar to the work queue for the reaper thread. The difference is that, in this case, the "work" is not a connection, but an HTTP request message (or simply the request line in the HTTP request header, since we are ignoring everything else in the request header). This would mean that there needs to be a "work queue" (with its own mutex and CV) within each connection! In the larger picture, we have a top-level mutex, which we have been using to access the connection list and to synchronize the main thread, the reaper thread, the console threads, and connection-handling threads. The mutex within a connection would then be a 2nd-level mutex. In order to avoid deadlocks, if you need to lock both the top-level mutex AND a 2nd-level mutex, you must follow the rule for using a "lock hierarchy" (which we have covered in class when we discussed "multithreading part 3 - mutex"). For this lab, it means that you must not try to lock the top-level mutex when you are holding a 2nd-level mutex. If you violate this rule, you may end up with a deadlock.
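As an illustration of the lock hierarchy rule only, here is a small sketch. The Conn struct, top_mutex, conn_list, and both functions are hypothetical names made up for this example. The first function takes the top-level mutex and then a 2nd-level mutex (allowed); the second releases the 2nd-level mutex before it takes the top-level mutex (also allowed). What the rule forbids is asking for the top-level mutex while a 2nd-level mutex is still held.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical, reduced connection object for illustration.
struct Conn {
    std::mutex m;           /* 2nd-level mutex */
    int bytes_sent = 0;
};

static std::mutex top_mutex;                          /* top-level mutex */
static std::vector<std::shared_ptr<Conn>> conn_list;  /* guarded by top_mutex */

// Allowed order: top-level mutex first, then a 2nd-level mutex.
int total_bytes_sent()
{
    std::lock_guard<std::mutex> top(top_mutex);
    int total = 0;
    for (auto &c : conn_list) {
        std::lock_guard<std::mutex> second(c->m);
        total += c->bytes_sent;
    }
    return total;
}

// A thread that holds c->m lets go of it BEFORE asking for the top-level mutex.
void record_and_register(std::shared_ptr<Conn> c, int n)
{
    {
        std::lock_guard<std::mutex> second(c->m);
        c->bytes_sent += n;
    }   /* c->m is released here */
    std::lock_guard<std::mutex> top(top_mutex);
    conn_list.push_back(c);
}
```

This sketch is only about the order in which the mutexes are acquired. How long a mutex is held while another thread is waiting for it is a separate concern.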
Your new Connection object should look like the following (you need to define a Message class to represent an HTTP request message, and you need to fill out the code for add_work() and wait_for_work() member functions for this new Connection object):
    class Connection {
    public:
        int conn_number; /* -1 means that the connection is not initialized properly */
        int socket_fd; /* -1 means closed by connection-handling thread, -2 means closed by console thread and connection may still be active */
        int orig_socket_fd; /* copy of the original socket_fd */
        int bytes_sent; /* number of bytes of response body written into the socket */
        int response_body_length; /* size of the response body in bytes, i.e., Content-Length */

        shared_ptr<thread> read_thread_ptr; /* shared pointer to a socket-reading thread */
        shared_ptr<thread> write_thread_ptr; /* shared pointer to a socket-writing thread */

        /* the next 3 objects are for the socket-reading thread to send work to the corresponding socket-writing thread */
        shared_ptr<mutex> m; /* this is a "2nd-level" mutex */ 
        shared_ptr<condition_variable> cv;
        queue<shared_ptr<Message>> q;

        Connection() : conn_number(-1), socket_fd(-1), read_thread_ptr(NULL), write_thread_ptr(NULL), m(NULL), cv(NULL) { }
        Connection(int c, int s, shared_ptr<thread> tr, shared_ptr<thread> tw) {
            conn_number = c;
            socket_fd = orig_socket_fd = s;
            read_thread_ptr = tr;
            write_thread_ptr = tw;
            bytes_sent = response_body_length = 0;
            m = make_shared<mutex>();
            cv = make_shared<condition_variable>();
        }
        void add_work(shared_ptr<Message> msg) { /* your code goes here */ }
        shared_ptr<Message> wait_for_work() { /* your code goes here */ }
    };
Since we are replacing a connection-handling thread with a socket-reading thread and a socket-writing thread, you would replace talk_to_client() with two first procedures, which I will call read_from_client() and write_to_client(), and split the code in talk_to_client() so that read_from_client() only reads from the socket and write_to_client() only writes to the socket. The basic idea is that in read_from_client(), you read one HTTP request message and you create and send a Message object to the socket-writing thread to send back an HTTP response. We can derive the pseudo-code for read_from_client() and write_to_client() from the pseudo-code for talk_to_client() in Lab 8. But we need to add some code to make the socket-reading thread and the socket-writing thread work together.
One thing we need to figure out is how these threads should self-terminate and who is going to join with which dead thread. When you shut down and close a socket, how would the socket-reading thread and the socket-writing thread respond? Which thread would typically die first? Should the socket-reading thread join with the socket-writing thread or the other way around? Or should the reaper thread join with both of them? There seem to be quite a few design choices to make. I'm going to go with one way that seems to work.
The basic idea is that we can have the socket-reading thread join with the socket-writing thread and have the reaper thread join with the socket-reading thread. So, when you enter a "close" command in the console and the console thread closes a socket and sets it to (-2), what should be the sequence of events that would happen? For the socket-writing thread, soon after you close the socket, better_write() should return with an error code and you should use that to get out of its infinite loop and self-terminate. For the socket-reading thread, soon after you close the socket, read_a_line() should also return with an error code and you should use that to get out of its infinite loop. Once it gets out of the infinite loop, it should join with the socket-writing thread because it knows that the socket-writing thread is either dead already or is about to die! Right before the socket-reading thread self-terminates, it should wake up the reaper thread so that the reaper thread can join with it. (The reaper thread for this part of the lab must not join with the socket-writing thread since it has already been joined.)
The pseudo-code (not necessarily complete) for the socket-reading thread would look like the following:
    c = connection object in argument of first procedure
    lock mutex
    unlock mutex
    do forever /* persistent connection */
        msg = read HTTP request
        if msg is NULL or error then
            break;
        end-if
        c.add_work(msg)
    end-do
    lock mutex
    if c.socket ≥ 0 then
        shutdown c.socket
    end-if
    set c.socket to (-1)
    unlock mutex
    /* signal the socket-writing thread to self-terminate */
    c.add_work(NULL)
    /* join with the socket-writing thread */
    c.write_thread_ptr.join()
    /* add dead connection to the reaper work queue */
    reaper_add_work(c)
Please note that reaper_add_work(c) in the last line above is to add work for the reaper thread in Part B of this lab (where the top-level mutex is involved) and c.add_work(msg) is to add work for the socket-writing thread (where a 2nd-level mutex is involved). The socket-reading thread would break out of the infinite loop when it cannot read an HTTP request message. When that happens, it uses c.add_work(NULL) to send a NULL message to the socket-writing thread. The idea here is that when the socket-writing thread sees a NULL message, it should break out of its infinite loop and self-terminate.
The pseudo-code (not necessarily complete) for the socket-writing thread would look like the following:
    c = connection object in argument of first procedure
    lock mutex
    unlock mutex
    do forever /* persistent connection */
        msg = c.wait_for_work()
        if msg is NULL or error then
            break;
        end-if
        ... /* send response, break out of infinite loop if socket_fd is < 0 */
    end-do
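The hand-off between the two threads can be sketched without any sockets. In the sketch below (all assumptions for illustration, not the required implementation), Conn is a reduced stand-in for the Connection class, Message holds only a request line, the "socket" is replaced by a list of request lines, the writer records each request in a hypothetical "handled" vector instead of sending a response, and demo_connection() plays the role of whoever joins with the socket-reading thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

struct Message {                       /* hypothetical: request line only */
    std::string request_line;
};

struct Conn {                          /* reduced stand-in for Connection */
    std::mutex m;                      /* 2nd-level mutex */
    std::condition_variable cv;
    std::queue<std::shared_ptr<Message>> q;
    std::vector<std::string> handled;  /* touched by the writer thread only */

    void add_work(std::shared_ptr<Message> msg) {
        {
            std::lock_guard<std::mutex> guard(m);
            q.push(msg);
        }
        cv.notify_one();
    }
    std::shared_ptr<Message> wait_for_work() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return !q.empty(); });
        std::shared_ptr<Message> msg = q.front();
        q.pop();
        return msg;
    }
};

// Socket-writing thread: handle requests until the NULL message arrives.
void write_to_client(std::shared_ptr<Conn> c)
{
    for (;;) {
        std::shared_ptr<Message> msg = c->wait_for_work();
        if (msg == nullptr) break;
        c->handled.push_back(msg->request_line);  /* a server would respond here */
    }
}

// Socket-reading thread: forward each request, then stop and join the writer.
void read_from_client(std::shared_ptr<Conn> c, std::shared_ptr<std::thread> writer,
                      const std::vector<std::string> &requests)
{
    for (const std::string &line : requests) {
        auto msg = std::make_shared<Message>();
        msg->request_line = line;
        c->add_work(msg);
    }
    c->add_work(nullptr);   /* signal the socket-writing thread to self-terminate */
    writer->join();         /* join with the socket-writing thread */
}

std::vector<std::string> demo_connection(const std::vector<std::string> &requests)
{
    auto c = std::make_shared<Conn>();
    auto writer = std::make_shared<std::thread>(write_to_client, c);
    std::thread reader([c, writer, &requests] { read_from_client(c, writer, requests); });
    reader.join();          /* stands in for the reaper thread */
    return c->handled;
}
```

Because the NULL message goes through the same queue as the real requests, the writer finishes everything that was queued before it before it terminates.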
Please also make the following minor changes from before since we have two threads per connection and we are sending HTTP response body one byte at a time:
When the "status" console command is entered, please print the following to cout for each active connection:
    [#]\tClient: IP:PORT, Socket: SOCKET_FD, Bytes sent: BYTES_SENT out of FILESIZE\n
where "#" is a connection number, BYTES_SENT is the number of bytes of the current response body you have written into the socket, and FILESIZE is the size of the response body in bytes.
When the socket-writing thread is about to self-terminate, you must print the following line into the log file:
    [TIMESTAMP] [#]\tSocket-writing thread has terminated\n
After the socket-reading thread has joined with the socket writing thread and is about to self-terminate, you must print the following line into the log file:
    [TIMESTAMP] [#]\tSocket-reading thread has joined with socket-writing thread\n
After the reaper thread has joined with a socket-reading thread, you must print the following line into the log file:
    [TIMESTAMP] [#]\tReaper thread has joined with socket-reading thread\n
When you are done with implementing lab9c, please do the following:
Change directory into the "lab9" directory mentioned above.
Type "script lab9c.script" to start a transcript. Then type:
    make clean
    make lab9c
    ./lab9c 12345 lab9c.logfile
You should see the command prompt. Type the following commands:
    help
    status
Run two more Terminal windows and cd into the same "lab9" directory.
In the second Terminal window, type:
    rm -rf dir2; mkdir dir2; cd dir2
    wget -r -l 1 http://127.0.0.1:12345/persistent.html
    cd ..
In the third Terminal window, type:
    rm -rf dir3; mkdir dir3; cd dir3
    wget -r -l 1 http://127.0.0.1:12345/persistent.html
    cd ..
The download speed no longer matters. Wait for downloads to start in both the 2nd and 3rd windows.
In the first window, type "status" and make sure that it sees two active connections (connections 1 and 2).
In the first window, type "close 1". One of the wget programs should reconnect using a different connection.
In the first window, type "status" and make sure that it sees two active connections (connections 2 and 3).
In the first window, type "close 2". The other wget program should reconnect using a different connection.
In the first window, type "status" and make sure that it sees two active connections (connections 3 and 4).
In the first window, type "quit" and make sure that the server has shut down gracefully.
Type:
    cat lab9c.logfile
and make sure that you can see the following:

there is exactly one "Connection closed" log entry for each connection number
there is exactly one "Socket-writing thread has terminated" log entry for each connection number
there is exactly one "Socket-reading thread has joined with socket-writing thread" log entry for each connection number
there is exactly one "Reaper thread has joined with socket-reading thread" log entry for each connection number
Repeat by starting your server in the first window and starting the same downloads in the 2nd and 3rd window. Wait 10 seconds and type the following in the first window:
    status
    quit
Type "exit" to close the transcript.
Alternatively, you can also do everything inside one Terminal and run tmux. You can split the screen vertically into 3 panes.

Templates

All pseudo-code is incomplete and error checking is often left out in pseudo-code. Since some details are left out, depending on how you write your code, you may create race conditions and you may need to fix your code so that your program won't freeze or crash. Feel free to send your questions (and not your code) to the instructor.

Pseudo-code for `lab9b`:

Please see the above for the pseudo-code for the improved reaper thread.

Pseudo-code for `lab9c`:

Please see the above for the pseudo-code for the socket reading thread and the socket writing thread.
Now that we are using a lock hierarchy, we are in a position to fix a race condition bug that was created in Lab 8 when we added a console thread. In the pseudo-code for the console thread in Lab 8, we shut down c.socket and set c.socket to -2. Since we have the main mutex locked, these two operations are locked together inside an atomic operation with respect to the main mutex. But when we write to the socket (using better_write()), we don't have any mutex locked! This means that it's possible that the socket-writing thread can write to the socket right after the socket is shut down but before it's set to -2. This would cause a SIGPIPE signal to get delivered to your process and your process will abort and crash!
Since we are using a 2nd-level mutex starting with Part C of this lab, we can use that to solve this race condition problem by putting the code for shutting down c.socket and setting it to -2 inside a critical section with respect to the mutex inside the same connection (we will refer to this mutex as c.m). Also, we should only write to c.socket when we have c.m locked. Now the relevant part of the console thread code would become (blue color is for critical section code with respect to c.m and red color is for critical section code with respect to the main mutex):
                c.m.lock // level 2 mutex
                shutdown c.socket
                c.socket = (-2)
                c.m.unlock // level 2 mutex
and every time you need to call better_write(c.socket,...), you must do:
                c.m.lock // level 2 mutex
                better_write(c.socket,...)
                c.m.unlock // level 2 mutex
This solves the race condition problem, although there is still one problem. There was a reason why we would only call better_write() when the main mutex was unlocked: better_write() may take a long time to return and we didn't want to lock out other threads for a long time. Now if better_write() takes a long time to return, then c.m.lock in the console thread can take a long time to return and, since that code is executed with the main mutex locked, we are back to the same problem! Therefore, before we call c.m.lock in the console thread, we should unlock the main mutex. So, the relevant part of the console thread code now becomes (please note that you only have to do this in the console thread when you call shutdown() on the socket and set it to -2):
                // we are still inside a critical section with respect to the main mutex
                unlock mutex // level 1 mutex
                c.m.lock // level 2 mutex
                shutdown c.socket
                c.socket = (-2)
                c.m.unlock // level 2 mutex
                lock mutex // level 1 mutex
                // we are again inside the critical section with respect to the main mutex
VERY IMPORTANT: If you use the code above, please remember that if you unlock the main mutex, the connection list may be modified by another thread and Connection objects can be added or deleted! Therefore, if for some reason you are iterating through the connection list when the above code is executed, you must not touch the iterator you were using (or if you are using an array index, you should not use it any more) since it may have become invalid or out-dated. It may be better that you don't lock the main mutex at the end and just break out of the loop.
What about the socket-reading thread? If read_a_line() returns -1, you must shutdown() the socket and set the socket_fd inside the Connection object to -1. To avoid SIGPIPE, you should do this with critical section code. Since you don't have the main mutex locked at this point, you should do:
                c.m.lock // level 2 mutex
                shutdown c.socket
                c.socket = (-1)
                c.m.unlock // level 2 mutex
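The console-side and writer-side critical sections above could be sketched in C++ as follows. This is only an illustration under assumptions: Conn is a reduced stand-in for the Connection class, guarded_write() and console_close() are hypothetical names, write() is used where your server would call better_write(), and the check for socket_fd >= 0 before shutdown() is an extra guard added in this sketch.

```cpp
#include <cassert>
#include <mutex>
#include <sys/socket.h>
#include <unistd.h>

struct Conn {                   /* reduced stand-in for Connection */
    std::mutex m;               /* level 2 mutex (c.m) */
    int socket_fd = -1;
};

static std::mutex main_mutex;   /* level 1 mutex */

// Writer side: look at socket_fd and write only while holding c.m, so the
// socket cannot be shut down between the check and the write.
// Returns the number of bytes written, or -1 if the socket is closed or
// write() fails.
int guarded_write(Conn &c, const char *buf, int n)
{
    std::lock_guard<std::mutex> guard(c.m);
    if (c.socket_fd < 0) return -1;
    return (int)write(c.socket_fd, buf, n);
}

// Console side: the caller holds main_mutex through main_lock. The level 1
// mutex is released before waiting for the level 2 mutex and is re-acquired
// afterwards.
void console_close(Conn &c, std::unique_lock<std::mutex> &main_lock)
{
    assert(main_lock.owns_lock());
    main_lock.unlock();                     /* level 1 mutex */
    {
        std::lock_guard<std::mutex> guard(c.m);
        if (c.socket_fd >= 0) {
            shutdown(c.socket_fd, SHUT_RDWR);
            c.socket_fd = -2;
        }
    }                                       /* c.m is released here */
    main_lock.lock();                       /* level 1 mutex */
}
```

After console_close() returns, any iterator or index into the connection list that the caller obtained earlier should be treated as stale, because the list could have changed while the main mutex was unlocked.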

Grading

Below is the grading breakdown:

(1 pt) submitted a valid lab9.tar.gz file with all the required files using the submission procedure below
(1 pt) content in "lab9a.script", "lab9b.script", and "lab9c.script" is correct
(1 pt) content in "lab9b.logfile" and "lab9c.logfile" is correct
(1 pt) "Makefile" works for "make lab9b" and "make lab9c"
(1 pt) source code of your server program looks right

Minimum deduction is 0.5 pt for anything that's incorrect. Please note that for the "Makefile" item, you can only get credit for it if your "source code" is relevant to this lab; therefore, you can only get as many points as the "source code" item in the best case.

Please keep in mind that even though lab grading is "light", it doesn't mean that you can just put anything into your submission! It's still your responsibility to make sure that the files in your submission contain information that's relevant to the tests you were supposed to run. Use the "more" command to view your script/log files to make sure that they contain the right information. If a file has the wrong stuff in it, you should delete it and create the file again and verify. If most of the stuff in your script/log files is wrong and you did not notice it, we will most likely have to take points off.

Submission

To submit your work, you must first tar all the files you want to submit into a tarball and gzip it to create a gzipped tarfile named "lab9.tar.gz". Then you upload "lab9.tar.gz" to our Bistro submission server.

Change into the "lab9" directory you have created above and enter the following command to create your submission file "lab9.tar.gz" (if you don't have any ".h" files, don't include "*.h*" at the end):

    tar cvzf lab9.tar.gz lab9*.script lab9*.logfile Makefile *.c* *.h*
    ls -l lab9.tar.gz

The last command shows you how big the created "lab9.tar.gz" file is. If "lab9.tar.gz" is larger than 1MB in size, the submission server will not accept it.

If you use an IDE, the IDE may put your source code in subdirectories. In that case, you need to modify the commands above so that you include ALL the necessary source files and subdirectories (and don't include any binary files) and make sure that your code can be compiled without the IDE since the grader is not allowed to use an IDE to compile your code.

You should read the output of the above commands carefully to make sure that "lab9.tar.gz" is created properly. If you don't understand the output of the above commands, you need to learn how to read it! It's your responsibility to ensure that "lab9.tar.gz" is created properly.

To check the content of "lab9.tar.gz", you can use the following command:

    tar tvf lab9.tar.gz

Please read the output of the above command carefully to see what files were included in "lab9.tar.gz" and what are their file sizes and make sure that they make sense.

Please enter your USC e-mail address and your submission PIN below. Then click on the Browse button and locate and select your submission file (i.e., "lab9.tar.gz"). Then click on the Upload button to submit your "lab9.tar.gz". (Be careful what you click! Do NOT submit the wrong file!) If you see an error message, please read the dialogbox carefully and fix what needs to be fixed and repeat the procedure. If you don't know your submission PIN, please visit this web site to have your PIN e-mailed to your USC e-mail address.

When this web page was last loaded, the time at the submission server at merlot.usc.edu was 27Nov2025-18:59:12. Reload this web page to see the current time on merlot.usc.edu.

If the command is executed successfully and if everything checks out, a ticket will be issued to you to let you know "what" and "when" your submission made it to the Bistro server. The next web page you see would display such a ticket and the ticket should look like the sample shown in the submission web page (of course, the actual text would be different, but the format should be similar). Make sure you follow the Verify Your Ticket instructions to verify the SHA1 hash of your submission to make sure that you did not accidentally submit the wrong file. Also, an e-mail (showing the ticket) will be sent to your USC e-mail address. Please read the ticket carefully to know exactly "what" and "when" your submission made it to the Bistro server. If there are problems, please contact the instructor.

It is extremely important that you also verify your submission after you have submitted "lab9.tar.gz" electronically to make sure that everything you have submitted is everything you wanted us to grade. If you don't verify your submission and you end up submitting the wrong files, please understand that due to our fairness policy, there's absolutely nothing we can do.

Finally, please be familiar with the Electronic Submission Guidelines and information on the bsubmit web page.

(5 points total)

Condition Variable
