Lab #13

[ This content is protected and may not be shared, uploaded, or distributed. ]

(Please also check out PA5 FAQ.)

Lab 13 has two parts:

Part A - traceroute, a 353NET application
Part B - more tests

Part A (`lab13a`) - traceroute, a `UDT` application (useful for PA5):

In this lab, we will implement a UDP-like application for the 353NET.
At the end of Part B of Lab 12, there was no UDT receiver to which to deliver UDT messages. We will add UDT receivers in this part of the lab. We will continue with Part B of your Lab 12 code, add UDT receivers, and implement a traceroute application that runs on top of UDT (on the real Internet, traceroute runs on top of UDP).
Please start by making a copy of your "lab12b.cpp" code and call it "lab13a.cpp" and add another target to the Makefile so that when you type "make lab13a", an executable called lab13a will be created. The usage information (i.e., commandline syntax) for "lab13a" is as follows:
    lab13a CONFIGFILE
where CONFIGFILE is a configuration file similar to the ones used in Lab 12.
You can delete the code for the udtsend console command (but keep all the other UDT related code) since we will be sending real UDT messages starting from this lab. There is no need to send UDT test messages any more.
In Lab 12, the message body of a UCASTAPP message is just a single line of text (not containing characters such as "\r", "\n", and "\0") because we were just testing things out. For a real UCASTAPP message, the message body must have a structure. If NEXT_LAYER in a UCASTAPP message is 1 (i.e., next layer is UDT), the message body can either be a single-line UDT application message or a multiple-line UDT application message.
A single-line UDT application message has the following format:
    353UDT/1.0 UDTMSGTYPE ARGS
where ARGS are arguments whose number and meaning depend on the value of UDTMSGTYPE. There must not be any "\r", "\n", or "\0" characters anywhere in this line (and there is no delimiter that tells you where this line ends in this part of a message).
A UDT application message can also be a multi-line message and it has the following format:
    353UDT/1.0 UDTMSGTYPE\r\n
    ...\r\n
    UDT-Content-Length: NUMBER\r\n
    \r\n
    NUMBER bytes of UDT application data
Using UDTMSGTYPE, you can determine if you have a single-line or a multiple-line UDT application message.
The way you should think about this is that a UDT application message is encapsulated inside a UCASTAPP message, and/or that a UDT application message is the payload of a UCASTAPP message that encapsulates it. We have a message within a message, and that's how encapsulation works.
When you receive such a message, you should read the entire message body of a UCASTAPP message as binary data (similar to what we do with HTTP response messages). The number of bytes of binary data you must read is the value of the Content-Length key in the UCASTAPP message header. Since all 353NET messages are relatively short, you can just allocate a buffer of Content-Length bytes and then read the data from the socket and copy into the buffer.
Some students notice that all 353NET messages are text messages and would use a C++ string to store the entire message. Please note that since all connections are persistent connections, if you don't do this very carefully, you may end up copying data from the next message into the C++ string and you may not be able to find the beginning of the next message later! Therefore, if you want to use a C++ string to store a UDT application message, you can allocate a buffer one byte bigger than the value of the Content-Length key in the UCASTAPP message header. After you copy the correct number of bytes from the socket into this buffer, you can put a "\0" into the last byte of the buffer and convert the entire buffer into a C++ string. (You can modify the ReadBinaryFromSocket() function in the PA2 FAQ by allocating a buffer that's one byte longer.) To make coding simpler, you can even check whether there are any "\n" characters anywhere in the C++ string. If there are, then the payload is a multi-line UDT application message. Otherwise, it's a single-line UDT application message.
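The buffer handling described above can be sketched as follows. This is only an illustration under stated assumptions: read_body_as_string() and is_multi_line_udt() are hypothetical names (this is not the ReadBinaryFromSocket() function from the PA2 FAQ), and the sketch calls read() directly on a file descriptor and treats any read error as a failure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#include <unistd.h>

// Hypothetical helper: read exactly `content_length` bytes from `fd` and
// return them in `body`. Returns false if the peer closes the connection
// or an error occurs before all the bytes have arrived.
bool read_body_as_string(int fd, std::size_t content_length, std::string &body)
{
    assert(fd >= 0);
    std::vector<char> buf(content_length + 1);      // one extra byte for '\0'
    std::size_t total = 0;
    while (total < content_length) {
        // Never ask for more than what is left of THIS message, so bytes
        // of the next message stay in the socket.
        ssize_t n = read(fd, buf.data() + total, content_length - total);
        if (n <= 0) return false;                   // EOF or error
        total += static_cast<std::size_t>(n);
    }
    buf[content_length] = '\0';
    body.assign(buf.data(), content_length);
    return true;
}

// A payload containing '\n' is treated as a multi-line UDT message.
bool is_multi_line_udt(const std::string &body)
{
    return body.find('\n') != std::string::npos;
}
```

Because the loop never requests more than the remaining bytes of the current message, the first byte of the next message on a persistent connection is still waiting in the socket when the function returns.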
For this lab, we only have the following single-line UDT application messages:

353UDT/1.0 PING SESID
353UDT/1.0 PONG SESID
353UDT/1.0 TTLZERO SESID
In the above, SESID is a unique session identifier. (For this lab, since a node cannot run multiple 353NET applications, we can just use a sequence/serial number to distinguish different invocations of the traceroute application at a node. Alternatively, you can call GetObjID() with the 2nd argument being something like "ses" to get a network-wide unique identifier, although this may be overkill.)
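As a rough sketch, the three single-line messages above could be parsed and formatted like this. The UdtMessage struct and the function names parse_udt_line() and format_udt_line() are made up for this illustration; only the "353UDT/1.0 UDTMSGTYPE SESID" layout comes from the lab.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical container for a parsed single-line UDT application message.
struct UdtMessage {
    std::string type;    // "PING", "PONG", or "TTLZERO"
    std::string sesid;   // session identifier
};

// Parse "353UDT/1.0 UDTMSGTYPE SESID". Returns false for anything else,
// including payloads that contain "\r", "\n", or "\0".
bool parse_udt_line(const std::string &payload, UdtMessage &out)
{
    if (payload.find_first_of("\r\n") != std::string::npos) return false;
    if (payload.find('\0') != std::string::npos) return false;

    std::istringstream in(payload);
    std::string version, type, sesid;
    if (!(in >> version >> type >> sesid)) return false;
    if (version != "353UDT/1.0") return false;
    if (type != "PING" && type != "PONG" && type != "TTLZERO") return false;

    out.type = type;
    out.sesid = sesid;
    return true;
}

// Build the payload for a single-line UDT application message.
std::string format_udt_line(const std::string &type, const std::string &sesid)
{
    assert(!type.empty() && !sesid.empty());
    return "353UDT/1.0 " + type + " " + sesid;
}
```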
In Lab 12, when a node decrements the TTL of a UCASTAPP message to zero, it simply drops the message. For this lab, in addition to dropping the message, this node must first check to see if the message is a PING UDT application message. If it is, it must initiate a new UCASTAPP message whose destination node is the source node of the UCASTAPP message it dropped, whose TTL is the max_ttl value in its configuration file, whose NEXT_LAYER is 1, and whose message body is a TTLZERO UDT application message with the SESID extracted from the PING UDT application message. This message informs the source node of the PING message that the PING message failed to find its target.
As mentioned in Lab 12, you should write a function called udt_send() to initiate a UCASTAPP message with a UDT payload and start sending it towards the destination node using the current node's forwarding table (udt_send() was mentioned on slide 4 of the lecture slides on section (3.4) of the textbook on the principles of reliable data transfer). If you didn't do this in Lab 12, it's highly recommended that you first write this function and call this function when your UDT application needs to send a UDT application message to a target node.
If your node's NodeID equals the DEST_NODEID in a UCASTAPP message you get, you must deliver this message to its upper layer. For this lab, if the NEXT_LAYER is 1, you must deliver this message to the application layer that handles UDT application messages. This application layer would parse the message and figure out what to do. There are 3 cases.

If the UDT application message is a PING message, this node must initiate a new UCASTAPP message whose destination node is the source node of the UCASTAPP message it received, whose TTL is the max_ttl value in its configuration file, whose NEXT_LAYER is 1, and whose message body is a PONG UDT application message with the SESID extracted from the PING UDT application message. This message informs the source node of the PING message that the target of the PING message has been reached successfully.

If the UDT application message is a TTLZERO message, then this node must be the node that's running the traceroute application. This node should then compare the SESID in the message with its session ID. If they are not equal, you should ignore this message. Otherwise, you must deliver the message to the traceroute application (this is where traceroute would return from line [7] of the pseudo-code).

If the UDT application message is a PONG message, then this node must be the node that's running the traceroute application. This node should then compare the SESID in the message with its session ID. If they are not equal, you should ignore this message. Otherwise, you must print a required message to the console.
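The three cases above amount to a small decision function. The sketch below is one possible way to write it; UdtAction and on_udt_delivered() are hypothetical names, and passing an empty current_sesid is assumed to mean that no traceroute is running at this node.

```cpp
#include <cassert>
#include <string>

// Hypothetical result of handling a UDT message delivered to this node.
enum class UdtAction {
    ReplyPong,       // PING: send a PONG back to the source node
    DeliverTtlZero,  // TTLZERO for the current session: hand to traceroute
    DeliverPong,     // PONG for the current session: hand to traceroute
    Ignore           // wrong session or unknown message type
};

// `type` and `sesid` come from the parsed UDT message; `current_sesid` is
// the session ID of the traceroute running at this node (empty if none).
UdtAction on_udt_delivered(const std::string &type,
                           const std::string &sesid,
                           const std::string &current_sesid)
{
    assert(!sesid.empty());   // the parsing step should guarantee this
    if (type == "PING") return UdtAction::ReplyPong;
    if (sesid != current_sesid) return UdtAction::Ignore;
    if (type == "TTLZERO") return UdtAction::DeliverTtlZero;
    if (type == "PONG") return UdtAction::DeliverPong;
    return UdtAction::Ignore;
}
```

Note that a PING is answered regardless of session ID, since the session belongs to the node that sent the PING and not to the node that receives it.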
Please add the following "traceroute" console command:
Name: traceroute
Arguments: target (Please ignore additional arguments if they are present.)
Description: Run the traceroute application. Unlike the real traceroute Linux/Unix program, which sends 3 packets in each round, you must send only one PING message in each round. Also, before running this command, you must run BFS to clean up the adjacency list since there may be unreachable nodes in your adjacency list data structure. If target is the NodeID of the node whose console you are typing into, you must print the following to the console:
    Cannot traceroute to yourself\n
Otherwise, you must start the traceroute application, even if target is the node ID of a node that's not running (i.e., unreachable). As you run the traceroute application, you must display the result in each step of traceroute according to the traceroute pseudo-code.
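One way to handle the command line for this console command is sketched below. The function names parse_traceroute() and traceroute_precheck() are made up for this illustration and are not part of any provided code.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true and sets `target` if `line` is "traceroute TARGET ...".
// Any additional arguments after TARGET are ignored.
bool parse_traceroute(const std::string &line, std::string &target)
{
    std::istringstream in(line);
    std::string cmd;
    if (!(in >> cmd) || cmd != "traceroute") return false;
    if (!(in >> target)) return false;     // missing target
    return true;
}

// Returns the error text to print, or an empty string if traceroute
// should be started (even when the target turns out to be unreachable).
std::string traceroute_precheck(const std::string &my_nodeid,
                                const std::string &target)
{
    assert(!my_nodeid.empty());
    if (target == my_nodeid) return "Cannot traceroute to yourself\n";
    return "";
}
```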
You should use the str_timestamp_diff_in_seconds() function in "my_timestamp.cpp" to format the difference (in seconds) between two timestamps (which were obtained by calling gettimeofday()).
The traceroute pseudo-code looks pretty straightforward. The question is, how should you wait for a response to come back and wait for a timeout simultaneously? One common way to implement an application that waits for responses and timeouts is to use an event-driven framework. Your application should sit in an infinite loop and wait for events to happen and handle the events one at a time.
How do you wait for an event? You create a work queue with a mutex (you can share the main mutex and you don't have to create a new one) and a condition variable and you call wait_for_work()! In this case, the work is an Event object and you can design your own Event objects. You can put a UDT application message (or just part of it) inside an Event object.
Where do events come from? They come from lower layers of the protocol stack! They can also come from timeouts. In every iteration of the traceroute application, you need to start a timer that expires after msg_lifetime seconds have elapsed. To implement this, you can create a timer thread to simply call usleep(msg_lifetime×1000000). When usleep() returns, you create a timeout event and call add_work() to add the timeout event to the event queue mentioned above! At the end of the iteration, you must join() with this timer thread before you go on to the next iteration.
It's important to understand that events can also come from hardware! In a way, you can think of timers as hardware devices and you can think of messages from the Internet as something that came from the Network Interface Card (NIC). When a timer goes off or when data arrives at the NIC, these "events" arrive asynchronously with respect to the timing of your code. Using the concept of a "work queue", we can turn these asynchronous hardware events into synchronous software events (and they would appear as Event objects for your software to deal with). Also, these hardware events, in a way, are not "compatible" with each other. By turning them into Event objects, your code can deal with them in a uniform way.
Here's another issue. If you implement a timer thread as mentioned above, you need to be able to cancel a timer if you have received either a TTLZERO message or a PONG message. How can you cancel a timer? There's something called "thread cancellation" in the pthread library, but unfortunately, the standard C++ that we are required to use (i.e., c++11) does not support thread cancellation! (A newer C++ standard, c++20, has finally started to support thread cancellation, but you have to use a different thread object called jthread. Please do not use jthread since the grader is required to compile your code with "g++ -g -Wall -std=c++11".)
Since there is no way to cancel a timer thread, we need to have the timer thread self-terminate. One way to do this is to use a global variable (conceptually) to indicate if the timer thread needs to self-terminate and have the timer thread wake up periodically to check the value of this global variable. So, instead of sleeping for the full msg_lifetime, you must use a countdown timer so that the timer thread can terminate in a timely fashion. For example, you can have a timer that "ticks" every 250 milliseconds. In that case, in order to sleep for msg_lifetime seconds, you can simply wait for msg_lifetime×4 "timer ticks" to occur. So, when you start the timer thread, you would set the timer's counter to be msg_lifetime×4. Every time you return from usleep(250000), you decrement the counter. If the counter reaches 0, you generate a timeout event. If not, you check the global variable to see if the timer has been canceled and self-terminate if it has. In this case, you must not generate a timeout event. For this lab, using a global variable is fine since we can only run one traceroute application at any particular node. (Please note that this creates a race condition if the timer expires at about the same time you receive a TTLZERO message or a PONG message. In order to avoid this race condition, a simple solution is to set msg_lifetime large enough so that this race condition is unlikely to occur. For the purpose of this lab, we will assume that msg_lifetime is large enough. If for some reason your distributed algorithm is running too slowly and you keep getting timeouts when you are not supposed to, you should debug your code and find out why your code is running too slowly.)
Please see the PA5 spec regarding what information to log for the PING, PONG, and TTLZERO messages.
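A minimal sketch of such an event queue is shown below. The Event layout and the EventQueue class are assumptions made for this illustration (you can design your own Event objects), and the sketch uses its own mutex for simplicity even though sharing the main mutex is also fine. Only the names add_work() and wait_for_work() come from the text above.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>

// Hypothetical event types that the traceroute application cares about.
enum class EventType { TtlZero, Pong, Timeout };

// Hypothetical Event object carrying part of a UDT application message.
struct Event {
    EventType type;
    std::string src_nodeid;   // empty for Timeout events
    std::string sesid;
};

class EventQueue {
public:
    // Called by lower layers or by a timer thread.
    void add_work(const Event &ev)
    {
        std::lock_guard<std::mutex> lock(m_);
        q_.push_back(ev);
        cv_.notify_one();
    }

    // Called by the traceroute application; blocks until an event arrives.
    Event wait_for_work()
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        assert(!q_.empty());
        Event ev = q_.front();
        q_.pop_front();
        return ev;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Event> q_;
};
```

With this design, a message from the network and a timeout look the same to the application: both are simply Event objects returned by wait_for_work(), in the order in which they were added.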
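The countdown idea above might look roughly like the following in c++11. This is a sketch under assumptions, not a required design: the CountdownTimer class and its members are hypothetical, it uses std::this_thread::sleep_for() in place of usleep(), and it uses a std::atomic<bool> member in place of the conceptual global variable.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical self-terminating countdown timer.
class CountdownTimer {
public:
    CountdownTimer(int ticks, std::chrono::milliseconds tick_len,
                   std::function<void()> on_timeout)
        : ticks_(ticks), tick_len_(tick_len),
          on_timeout_(std::move(on_timeout)), canceled_(false)
    {
        assert(ticks > 0);
    }
    ~CountdownTimer() { cancel(); join(); }

    void start()  { thread_ = std::thread(&CountdownTimer::run, this); }
    void cancel() { canceled_.store(true); }   // timer thread notices at its next tick
    void join()   { if (thread_.joinable()) thread_.join(); }

private:
    void run()
    {
        int remaining = ticks_;
        for (;;) {
            std::this_thread::sleep_for(tick_len_);
            if (--remaining == 0) {            // counter reached 0: timeout
                on_timeout_();
                return;
            }
            if (canceled_.load()) return;      // canceled: no timeout event
        }
    }

    const int ticks_;
    const std::chrono::milliseconds tick_len_;
    const std::function<void()> on_timeout_;
    std::atomic<bool> canceled_;
    std::thread thread_;
};
```

For a timer that ticks every 250 milliseconds, ticks would be msg_lifetime×4 and on_timeout would be a function that calls add_work() with a timeout event. As in the description above, a cancel that arrives during the very last tick can still lose the race against the timeout.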
We will use a set of slightly different configuration files from what's used in Lab 12. Please do the following:
Create an empty directory (call it "lab13") and change directory into it.
Download lab13data.tar.gz into that directory and type:
    tar xvf lab13data.tar.gz
This should create a subdirectory called "lab13data" with several .ini files in it which we will use as configuration files for our nodes.
When you are done with implementing lab13a, please do the following:
Change directory into the "lab13" directory mentioned above.
Open 6 Terminals and change directory into the same directory.
In the first Terminal window, type "script lab13a-12000.script" to start a transcript.
In the 2nd Terminal window, type "script lab13a-12002.script" to start another transcript.
In the 3rd Terminal window, type "script lab13a-12004.script" to start another transcript.
In the 4th Terminal window, type "script lab13a-12006.script" to start another transcript.
In the 5th Terminal window, type "script lab13a-12008.script" to start another transcript.
In the 6th Terminal window, type "script lab13a-12010.script" to start another transcript.
In the first Terminal window, type:
    uname -a
    cat /etc/os-release
    make clean
    make lab13a
    ./lab13a lab13data/lab13-12000.ini
In the 2nd Terminal window, type "./lab13a lab13data/lab13-12002.ini".
In the 3rd Terminal window, type "./lab13a lab13data/lab13-12004.ini".
In the 4th Terminal window, type "./lab13a lab13data/lab13-12006.ini".
In the 5th Terminal window, type "./lab13a lab13data/lab13-12008.ini".
In the 6th Terminal window, type "./lab13a lab13data/lab13-12010.ini".
Your network should look like the following (from the configuration files):
                       +-------+
                    /--+ 12010 +--------------------------\
                    |  +-------+                          |
                    |                                     |
    +-------+   +---+---+     +-------+   +-------+   +---+---+
    | 12000 +---+ 12002 +-----+ 12004 +---+ 12006 +---+ 12008 |
    +-------+   +-------+     +-------+   +-------+   +-------+
Type "netgraph" and "forwarding" in all windows to make sure that they all agree on the same network topology and all the forwarding tables look correct.
In the first window (where you are running :12000), type "traceroute :12000" and you should see:
    Cannot traceroute to yourself
In the first window, type "traceroute :12002" and you should see the following printout at the end:
    [TIMESTAMP] i UCASTAPP :12002 1 - 42 MSGID1 :12000 :12002 1 353UDT/1.0 PING SESID1
    [TIMESTAMP] r UCASTAPP :12002 9 - 42 MSGID2 :12002 :12000 1 353UDT/1.0 PONG SESID1
    1 - :12002, RTT1
    :12002 is reached in 1 steps
where TIMESTAMP, MSGID1, MSGID2, SESID1 would depend on the actual value of the timestamp, message IDs, and session ID, and RTT1 should be much less than a second (and the last line does not have to be grammatically correct). The values of all TIMESTAMP should be distinct and the session IDs must match. The first two lines above are UCASTAPP log messages that are printed to cout. They show that node :12000 initiated a PING message with a TTL of 1 and node :12002 received that PING message, initiated a PONG message (with a TTL of 9), and sent it back to node :12000.
In the 2nd window, you should see:
    [TIMESTAMP] r UCASTAPP :12000 1 - 42 MSGID1 :12000 :12002 1 353UDT/1.0 PING SESID1
    [TIMESTAMP] i UCASTAPP :12000 9 - 42 MSGID2 :12002 :12000 1 353UDT/1.0 PONG SESID1
where the message IDs and session IDs should match what you saw in the 1st window.
In the first window, type "traceroute :12012". Since :12012 is not running, each iteration should time out after 5 seconds (so successive iterations are 8 seconds apart, including the 3-second sleep between iterations), and you should see the following printout:
    1 - *
    2 - *
    3 - *
    4 - *
    5 - *
    6 - *
    7 - *
    8 - *
    9 - *
    traceroute: :12012 not reached after 9 steps
Please note that since node :12012 is not running, you should not see any UCASTAPP log messages in any window.
In the first window, type "traceroute :12008" and you should see the following in the 1st window:
    [TIMESTAMP] i UCASTAPP :12002 1 - 43 MSGID3 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] r UCASTAPP :12002 9 - 46 MSGID4 :12002 :12000 1 353UDT/1.0 TTLZERO SESID2
    1 - :12002, RTT2
    [TIMESTAMP] i UCASTAPP :12002 2 - 43 MSGID5 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] r UCASTAPP :12002 8 - 46 MSGID6 :12010 :12000 1 353UDT/1.0 TTLZERO SESID2
    2 - :12010, RTT3
    [TIMESTAMP] i UCASTAPP :12002 3 - 43 MSGID7 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] r UCASTAPP :12002 7 - 43 MSGID8 :12008 :12000 1 353UDT/1.0 PONG SESID2
    3 - :12008, RTT4
    :12008 is reached in 3 steps
and the following in the 2nd window:
    [TIMESTAMP] r UCASTAPP :12000 1 - 43 MSGID3 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] i UCASTAPP :12000 9 - 46 MSGID4 :12002 :12000 1 353UDT/1.0 TTLZERO SESID2
    [TIMESTAMP] r UCASTAPP :12000 2 - 43 MSGID5 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] f UCASTAPP :12010 1 - 43 MSGID5 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] r UCASTAPP :12010 9 - 46 MSGID6 :12010 :12000 1 353UDT/1.0 TTLZERO SESID2
    [TIMESTAMP] f UCASTAPP :12000 8 - 46 MSGID6 :12010 :12000 1 353UDT/1.0 TTLZERO SESID2
    [TIMESTAMP] r UCASTAPP :12000 3 - 43 MSGID7 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] f UCASTAPP :12010 2 - 43 MSGID7 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] r UCASTAPP :12010 8 - 43 MSGID8 :12008 :12000 1 353UDT/1.0 PONG SESID2
    [TIMESTAMP] f UCASTAPP :12000 7 - 43 MSGID8 :12008 :12000 1 353UDT/1.0 PONG SESID2
and the following in the 5th window:
    [TIMESTAMP] r UCASTAPP :12010 1 - 43 MSGID7 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] i UCASTAPP :12010 9 - 43 MSGID8 :12008 :12000 1 353UDT/1.0 PONG SESID2
and the following in the 6th window:
    [TIMESTAMP] r UCASTAPP :12002 1 - 43 MSGID5 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] i UCASTAPP :12002 9 - 46 MSGID6 :12010 :12000 1 353UDT/1.0 TTLZERO SESID2
    [TIMESTAMP] r UCASTAPP :12002 2 - 43 MSGID7 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] f UCASTAPP :12008 1 - 43 MSGID7 :12000 :12008 1 353UDT/1.0 PING SESID2
    [TIMESTAMP] r UCASTAPP :12008 9 - 43 MSGID8 :12008 :12000 1 353UDT/1.0 PONG SESID2
    [TIMESTAMP] f UCASTAPP :12002 8 - 43 MSGID8 :12008 :12000 1 353UDT/1.0 PONG SESID2
What's going on above is that node :12000 initiated a PING message with a TTL of 1 (MSGID3) and node :12002 received that PING message, initiated a TTLZERO message with a TTL of 9 (MSGID4), and sent it back to node :12000. In the first window, you should see a line of printout that corresponds to receiving a TTLZERO message.
Then node :12000 initiated another PING message with a TTL of 2 (MSGID5) and node :12002 received that PING message and forwarded it to node :12010. Node :12010 then initiated a TTLZERO message with a TTL of 9 (MSGID6) and sent it to node :12002, and node :12002 decremented the TTL and forwarded it to node :12000. In the first window, you should see a line of printout that corresponds to receiving a TTLZERO message.
Then node :12000 initiated another PING message with a TTL of 3 (MSGID7) and node :12002 received that PING message and forwarded it to node :12010. Node :12010 received that PING message and forwarded it to node :12008. Node :12008 then initiated a PONG message with a TTL of 9 (MSGID8) and sent it to node :12010, node :12010 decremented the TTL and forwarded it to node :12002, and node :12002 decremented the TTL and forwarded it to node :12000. In the first window, you should see a line of printout that corresponds to receiving a PONG message.
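The walk-through above can be reproduced with a tiny simulation. This is only a sketch for checking your understanding of the TTL handling: probe() and Reply are hypothetical names, the path is given by hand rather than taken from a forwarding table, and the target is assumed to answer with a PONG whatever TTL remains (consistent with :12002 answering a PING that was sent with a TTL of 1).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of the response to one PING.
struct Reply {
    std::string type;   // "TTLZERO" or "PONG"
    std::string from;   // NodeID of the node that initiated the response
};

// `path` lists the nodes after the source, in forwarding order, ending at
// the target. `ttl` is the TTL the source puts in the PING message.
Reply probe(const std::vector<std::string> &path, int ttl)
{
    assert(ttl >= 1 && !path.empty());
    std::size_t hop = 0;
    while (hop + 1 < path.size()) {
        // An intermediate node decrements the TTL; at zero it drops the
        // PING and initiates a TTLZERO message.
        if (--ttl == 0) return Reply{"TTLZERO", path[hop]};
        ++hop;                         // otherwise it forwards the PING
    }
    return Reply{"PONG", path[hop]};   // the PING reached its target
}
```

For the path :12002, :12010, :12008, calling probe() with a TTL of 1, 2, and 3 yields a TTLZERO from :12002, a TTLZERO from :12010, and a PONG from :12008, matching the three rounds described above.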
In the 4th window (where you are running :12006), type "traceroute :12000" and you should see the following in the 4th window:
    [TIMESTAMP] i UCASTAPP :12004 1 - 43 MSGID9 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] r UCASTAPP :12004 9 - 46 MSGID10 :12004 :12006 1 353UDT/1.0 TTLZERO SESID3
    1 - :12004, RTT5
    [TIMESTAMP] i UCASTAPP :12004 2 - 43 MSGID11 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] r UCASTAPP :12004 8 - 46 MSGID12 :12002 :12006 1 353UDT/1.0 TTLZERO SESID3
    2 - :12002, RTT6
    [TIMESTAMP] i UCASTAPP :12004 3 - 43 MSGID13 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] r UCASTAPP :12004 7 - 43 MSGID14 :12000 :12006 1 353UDT/1.0 PONG SESID3
    3 - :12000, RTT7
    :12000 is reached in 3 steps
and the following in the 3rd window:
    [TIMESTAMP] r UCASTAPP :12006 1 - 43 MSGID9 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] i UCASTAPP :12006 9 - 46 MSGID10 :12004 :12006 1 353UDT/1.0 TTLZERO SESID3
    [TIMESTAMP] r UCASTAPP :12006 2 - 43 MSGID11 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] f UCASTAPP :12002 1 - 43 MSGID11 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] r UCASTAPP :12002 9 - 46 MSGID12 :12002 :12006 1 353UDT/1.0 TTLZERO SESID3
    [TIMESTAMP] f UCASTAPP :12006 8 - 46 MSGID12 :12002 :12006 1 353UDT/1.0 TTLZERO SESID3
    [TIMESTAMP] r UCASTAPP :12006 3 - 43 MSGID13 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] f UCASTAPP :12002 2 - 43 MSGID13 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] r UCASTAPP :12002 8 - 43 MSGID14 :12000 :12006 1 353UDT/1.0 PONG SESID3
    [TIMESTAMP] f UCASTAPP :12006 7 - 43 MSGID14 :12000 :12006 1 353UDT/1.0 PONG SESID3
and the following in the 2nd window:
    [TIMESTAMP] r UCASTAPP :12004 1 - 43 MSGID11 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] i UCASTAPP :12004 9 - 46 MSGID12 :12002 :12006 1 353UDT/1.0 TTLZERO SESID3
    [TIMESTAMP] r UCASTAPP :12004 2 - 43 MSGID13 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] f UCASTAPP :12000 1 - 43 MSGID13 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] r UCASTAPP :12000 9 - 43 MSGID14 :12000 :12006 1 353UDT/1.0 PONG SESID3
    [TIMESTAMP] f UCASTAPP :12004 8 - 43 MSGID14 :12000 :12006 1 353UDT/1.0 PONG SESID3
and the following in the 1st window:
    [TIMESTAMP] r UCASTAPP :12002 1 - 43 MSGID13 :12006 :12000 1 353UDT/1.0 PING SESID3
    [TIMESTAMP] i UCASTAPP :12002 9 - 43 MSGID14 :12000 :12006 1 353UDT/1.0 PONG SESID3
In the 4th window (where you are running :12006), type "traceroute :12000". As soon as you see:
    1 - :12004, RTT8
type "quit" in the 3rd window (where you are running :12004). You should then see the following printout in the 4th window:
    2 - :12010, RTT9
    3 - :12002, RTT10
    4 - :12000, RTT11
    :12000 is reached in 4 steps
You should see UCASTAPP log messages in all windows and they are omitted here.
In the 3rd window, type "./lab13a lab13data/lab13-12004.ini" to restart :12004.
In the 4th window (where you are running :12006), type "traceroute :12000" and you should see the following printout:
    1 - :12004, RTT12
    2 - :12002, RTT13
    3 - :12000, RTT14
    :12000 is reached in 3 steps
You should see UCASTAPP log messages in all windows and they are omitted here.
Type "quit" in all six windows.
In the 6th Terminal window, type the following to display all the log files (to see that SAYHELLO and LSUPDATE messages are properly logged):
    more lab13data/*.log
Type "exit" in all six windows to close the transcripts.
To save typing some of the commands above, a tmux script, "tmux-lab13a.txt", is provided in the "lab13data" directory created above and you can run it by typing:
    lab13data/tmux-lab13a.txt
Please note that it's provided for your convenience (i.e., to save typing) and it may not be exactly the same as the above sequence. Please see PA2 FAQ regarding how to use tmux in general.

Part B (`lab13b`) - more tests (useful for PA5):

There is no new code to write. We just want to make sure that your code works correctly under different test cases.
In a Terminal window, change directory into the "lab13" directory mentioned above and type the following command:
    lab13data/changelogging.csh UCASTAPP 1
The above command will modify all the configuration files inside the lab13data directory and change the value of the UCASTAPP key in the [logging] section to 1 to mean that all UCASTAPP log messages must be logged into the log file specified by the "logfile" key in the [startup] section of the configuration file.
Open 5 more Terminals and change directory into the same directory.
In the first Terminal window, type "script lab13b-12000.script" to start a transcript.
In the 2nd Terminal window, type "script lab13b-12002.script" to start another transcript.
In the 3rd Terminal window, type "script lab13b-12004.script" to start another transcript.
In the 4th Terminal window, type "script lab13b-12006.script" to start another transcript.
In the 5th Terminal window, type "script lab13b-12008.script" to start another transcript.
In the 6th Terminal window, type "script lab13b-12010.script" to start another transcript.
In the 1st Terminal window, type "./lab13a lab13data/lab13-12000.ini".
In the 2nd Terminal window, type "./lab13a lab13data/lab13-12002.ini".
In the 3rd Terminal window, type "./lab13a lab13data/lab13-12004.ini".
In the 4th Terminal window, type "./lab13a lab13data/lab13-12006.ini".
In the 5th Terminal window, type "./lab13a lab13data/lab13-12008.ini".
In the 6th Terminal window, type "./lab13a lab13data/lab13-12010.ini".
Your network should look like the following:
                       +-------+
                    /--+ 12010 +--------------------------\
                    |  +-------+                          |
                    |                                     |
    +-------+   +---+---+     +-------+   +-------+   +---+---+
    | 12000 +---+ 12002 +-----+ 12004 +---+ 12006 +---+ 12008 |
    +-------+   +-------+     +-------+   +-------+   +-------+
Type "netgraph" and "forwarding" in all windows to make sure that they all agree on the same network topology and all the forwarding tables look correct.
Do the following in the 1st, 2nd, and 4th windows as quickly as possible to run traceroute in parallel:
In the 1st window, type "traceroute :12008".
In the 2nd window, type "traceroute :12008".
In the 4th window, type "traceroute :12000".
In the 1st window, you should eventually see:
    1 - :12002, RTT1
    2 - :12010, RTT2
    3 - :12008, RTT3
    :12008 is reached in 3 steps
In the 2nd window, you should eventually see:
    1 - :12010, RTT4
    2 - :12008, RTT5
    :12008 is reached in 2 steps
In the 4th window, you should eventually see:
    1 - :12004, RTT6
    2 - :12002, RTT7
    3 - :12000, RTT8
    :12000 is reached in 3 steps
Do the following as quickly as possible to run traceroute in parallel in the 4th and 1st windows:
In the 4th window (where you are running :12006), type "traceroute :12000". As soon as you see:
    1 - :12004, RTT9
type "quit" in the 3rd window (where you are running :12004).
As quickly as you can, type "traceroute :12006" in the 1st window.
In the 4th window, you should eventually see:
    2 - :12010, RTT10
    3 - :12002, RTT11
    4 - :12000, RTT12
    :12000 is reached in 4 steps
In the 1st window, you should eventually see:
    1 - :12002, RTT13
    2 - :12010, RTT14
    3 - :12008, RTT15
    4 - :12006, RTT16
    :12006 is reached in 4 steps
Type "quit" in all six windows.
In the 6th Terminal window, type the following to display all the log files (to see that SAYHELLO, LSUPDATE, and UCASTAPP messages are properly logged):
    more lab13data/*.log
Type "exit" in all six windows to close the transcripts.
To save typing some of the commands above, a tmux script, "tmux-lab13b.txt", is provided in the "lab13data" directory created above and you can run it by typing:
    lab13data/tmux-lab13b.txt
Please note that it's provided for your convenience (i.e., to save typing) and it may not be exactly the same as the above sequence. Please see PA2 FAQ regarding how to use tmux in general.

Templates

All pseudo-code is incomplete and error checking is often left out in pseudo-code. Since some details are left out, depending on how you write your code, you may create race conditions and you may need to fix your code so that your program won't freeze or crash. Feel free to send your questions (and not your code) to the instructor.

Pseudo-code for `traceroute`:

Suppose that in the console of node X, the user types a "traceroute Y" command (i.e., Y is the NodeID of the target node). Node X must first run BFS and then execute the following algorithm (please note that the code below is written as if it's a function):
        [ 1]    call GetObjID() to get a sesid (or use a serial/sequence number)
        [ 2]    for (i=1; i ≤ max_ttl; i++) {
        [ 3]        run BFS to make sure the adjacency list is up-to-date
        [ 4]        start_time = gettimeofday();
        [ 5]        start msg_lifetime timer
        [ 6]        if Y is reachable, send PING(sesid) message with ttl=i to Y
        [ 7]        wait for response or timeout
        [ 8]        if (response is TTLZERO(sesid)) {
        [ 9]            cancel timer
        [10]            now = gettimeofday();
        [11]            print message to console with RTT = now - start_time
        [12]        } else if (response is PONG(sesid)) {
        [13]            cancel timer
        [14]            now = gettimeofday();
        [15]            print message to console with RTT = now - start_time
        [16]            join with timer thread
        [17]            return
        [18]        } else if (msg_lifetime timer expires) {
        [19]            print timeout information
        [20]        }
        [21]        join with timer thread
        [22]        sleep 3 seconds
        [23]    }
        [24]    print finished message
In line [2] above, max_ttl is the Max TTL value of your node.
In line [5] above, msg_lifetime is the Message Life Time of your node.
In line [11] above, you must print the following to the console:
        i - NODEID, RTT\n
where NODEID is the SRC_NODEID in the TTLZERO message and RTT is measured in seconds.
In line [15] above, you must print the following to the console:
        i - NODEID, RTT\n
        NODEID is reached in i steps\n
where NODEID is the SRC_NODEID in the PONG message (which must be node Y in this case) and RTT is measured in seconds.
In line [19] above, you must print the following to the console:
        i - *\n
where the asterisk character (i.e., "*") means that no response was received in step i (i.e., got a timeout instead).
In line [24] above, you must print the following to the console:
        traceroute: Y not reached after MAXTTL steps\n
where MAXTTL is the value of max_ttl used above.
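The four required printouts can be collected into small helper functions, sketched below. The helper names are hypothetical, and rtt is assumed to be the string already produced by str_timestamp_diff_in_seconds().

```cpp
#include <cassert>
#include <string>

// Line [11] and the first line of [15]: "i - NODEID, RTT\n"
std::string hop_line(int i, const std::string &nodeid, const std::string &rtt)
{
    assert(i >= 1);
    return std::to_string(i) + " - " + nodeid + ", " + rtt + "\n";
}

// Second line of [15]: "NODEID is reached in i steps\n"
std::string reached_line(const std::string &nodeid, int i)
{
    assert(i >= 1);
    return nodeid + " is reached in " + std::to_string(i) + " steps\n";
}

// Line [19]: "i - *\n"
std::string timeout_line(int i)
{
    assert(i >= 1);
    return std::to_string(i) + " - *\n";
}

// Line [24]: "traceroute: Y not reached after MAXTTL steps\n"
std::string not_reached_line(const std::string &target, int max_ttl)
{
    return "traceroute: " + target + " not reached after " +
           std::to_string(max_ttl) + " steps\n";
}
```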
In the real world Traceroute application, ICMP messages are involved. For this lab, our UDT application messages are the equivalent of these ICMP messages.
Please note that when a node is running the traceroute application, its console must not respond to any other console commands.
A session ID (sesid) is needed because it's possible (although very unlikely) that you may receive a TTLZERO or a PONG message for the wrong session of traceroute! Let's say that immediately after you finish one traceroute command, you run another traceroute command from your console. In this case, there may be TTLZERO or PONG messages for the previous instance of traceroute still floating around in the 353NET, and they may get delivered to your node while you are running the second instance of traceroute. Therefore, when you are running traceroute, you should only pay attention to TTLZERO or PONG messages with the current session ID.
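One way to express this session ID check is sketched below. The UdtResponse struct, its field names, and the Verdict enum are hypothetical, and representing sesid as a string is an assumption; adapt them to however your code parses UDT messages.

```cpp
#include <string>

// Hypothetical parsed form of a UDT response (field names are illustrative).
struct UdtResponse {
    std::string type;        // "TTLZERO" or "PONG"
    std::string sesid;       // session ID carried in the message
    std::string src_nodeid;  // SRC_NODEID of the responding node
};

enum class Verdict { Ignore, Hop, Reached };

// Decide what the traceroute loop should do with a response.
// Anything that does not carry the current session ID is ignored.
Verdict classify_response(const UdtResponse &r, const std::string &current_sesid)
{
    if (r.sesid != current_sesid) {
        return Verdict::Ignore;  // leftover from an earlier traceroute
    }
    if (r.type == "TTLZERO") {
        return Verdict::Hop;     // an intermediate node answered
    }
    if (r.type == "PONG") {
        return Verdict::Reached; // the target node answered
    }
    return Verdict::Ignore;
}
```

When the verdict is Ignore, the traceroute loop would simply keep waiting for a response or a timeout, without canceling the timer.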

Timer Object

You should probably use a Timer object of some sort (or you can design your own). It should probably have the following parts (the list may not be complete):

It should have a timer thread object in it (or a shared pointer to a thread).
It should have an expiration time.
It should have a cancellation flag so that another thread can "cancel" the timer.
It should have a start() function to begin the count down.
It should have a stop() function to set the cancellation flag.
It should know which work queue to add_work() to if the timer has expired.
In a way, this sounds a little bit similar to a Connection object! In that case, do you need a list to keep track of all the Timer objects? Do you need a reaper thread to reap dead timers? If you use a timer reaper thread to join with dead timer threads, then you must not attempt to join with a timer thread anywhere else in your code (such as in the pseudo-code above, whose structure you should then probably modify).
In general, the pseudo-code for a timer thread (that would wake up every 0.1 second to see if it has been canceled) would look something like the following:
    t = Timer object in argument of timer thread's first procedure
    ticks_remaining = t.expiration_time * 10
    do forever
        sleep for 0.1 seconds
        lock mutex
        if t.canceled then
            unlock mutex
            return
        end-if
        unlock mutex
        ticks_remaining = ticks_remaining - 1
        if ticks_remaining ≤ 0 then
            break;
        end-if
    end-do
    create a timeout event and add_work() to wake up some thread
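The timer thread pseudo-code above could be turned into C++ along the following lines. This is a minimal sketch, not a required design: the class layout and member names are assumptions, and a std::function callback stands in for "create a timeout event and add_work()" since the work queue type is specific to your code.

```cpp
#include <chrono>
#include <cmath>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class Timer {
public:
    // expiration_seconds: how long to count down.
    // on_expire: stand-in for "create a timeout event and add_work()".
    Timer(double expiration_seconds, std::function<void()> on_expire)
        : ticks_remaining_(static_cast<int>(std::lround(expiration_seconds * 10))),
          on_expire_(std::move(on_expire)) {}

    ~Timer() { stop(); join(); }

    // Begin the countdown in a separate thread.
    void start() { thread_ = std::thread(&Timer::run, this); }

    // Set the cancellation flag; the thread notices within about 0.1 second.
    void stop() {
        std::lock_guard<std::mutex> guard(m_);
        canceled_ = true;
    }

    // Wait for the timer thread to finish (must not be called from that thread).
    void join() {
        if (thread_.joinable()) thread_.join();
    }

private:
    void run() {
        for (;;) {
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
            {
                std::lock_guard<std::mutex> guard(m_);
                if (canceled_) return;  // canceled: no timeout event
            }
            if (--ticks_remaining_ <= 0) break;
        }
        on_expire_();  // timer expired without being canceled
    }

    int ticks_remaining_;
    std::function<void()> on_expire_;
    std::mutex m_;
    bool canceled_ = false;
    std::thread thread_;
};
```

As in the pseudo-code, a cancellation that arrives after the last check but before the timeout event is created is not caught, so the receiver of the timeout event may still need to recognize and ignore a stale timeout. If you use a reaper thread instead, the destructor and join() above would have to change accordingly.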

Socket-reading Thread

Starting with this lab, your node needs to act like a router, which means that you need to decide whether to forward a packet or send it "up the protocol stack". Since a UCASTAPP message needs to be "routed", you can do the following in your socket-reading thread (some details are left out; you need to make sure to do the right things).
When you have received a UCASTAPP message, you check to see if your node is the DEST_NODEID. If it is, then you send the message "up the protocol stack" (i.e., add it to a work queue, if applicable, in the upper layer of the "protocol stack"). Adding something to a work queue is what "demultiplexing" is all about! When the lower level of a protocol stack creates an event or a message that needs to be "delivered" to a higher level of the protocol stack, you need to figure out which work queue to add work to. Since this lab and PA5 are just toy exercises, you don't have to design and structure your code to have a real "protocol stack". But the idea is still valid.
If your node is not the DEST_NODEID, then your node should help to route the message towards the DEST_NODEID. How do you do that? You use the forwarding table (make sure it's up-to-date) to see who is the "next-hop router" and find the Connection object that corresponds to that active neighbor and call add_work() to the work queue of the socket-writing thread inside that Connection object.
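The deliver-or-forward decision described above might be sketched as follows. The RouteDecision struct, the route_ucastapp name, and the use of a std::map from destination NodeID to next-hop NodeID as the forwarding table are all assumptions for illustration. Dropping a message when no route exists is also an assumption, and TTL handling is left out.

```cpp
#include <map>
#include <string>

// Outcome of the routing decision for one UCASTAPP message.
struct RouteDecision {
    enum Kind { DeliverUp, Forward, Drop } kind;
    std::string next_hop;  // only meaningful when kind == Forward
};

// forwarding_table maps a destination NodeID to the next-hop NodeID
// (hypothetical representation).
RouteDecision route_ucastapp(const std::string &my_nodeid,
                             const std::string &dest_nodeid,
                             const std::map<std::string, std::string> &forwarding_table)
{
    if (dest_nodeid == my_nodeid) {
        // This node is the destination: hand the message to the upper layer.
        return {RouteDecision::DeliverUp, ""};
    }
    auto it = forwarding_table.find(dest_nodeid);
    if (it == forwarding_table.end()) {
        // No known route (assumed behavior: drop the message).
        return {RouteDecision::Drop, ""};
    }
    // Forward via the neighbor named in the forwarding table.
    return {RouteDecision::Forward, it->second};
}
```

For a Forward decision, the socket-reading thread would then look up the Connection object for next_hop and call add_work() on the work queue of that Connection's socket-writing thread.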

Grading

Below is the grading breakdown:

(1 pt) submitted a valid lab13.tar.gz file with all the required files using the submission procedure below
(1 pt) content of "lab13a-12000.script", "lab13a-12002.script", "lab13a-12004.script", "lab13a-12006.script", "lab13a-12008.script", "lab13a-12010.script" is correct
(1 pt) content of "lab13b-12000.script", "lab13b-12002.script", "lab13b-12004.script", "lab13b-12006.script", "lab13b-12008.script", "lab13b-12010.script" is correct
(1 pt) "Makefile" works for "make lab13a"
(1 pt) source code of your server program looks right

Minimum deduction is 0.5 pt for anything that's incorrect. Please note that for the "Makefile" item, you can only get credit for it if your "source code" is relevant to this lab; therefore, you can only get as many points as the "source code" item in the best case.

Please keep in mind that even though lab grading is "light", it doesn't mean that you can just put anything into your submission! It's still your responsibility to make sure that the files in your submission contain information that's relevant to the tests you were supposed to run. Use the "more" command to view your script/log files to make sure that they contain the right information. If a file has the wrong stuff in it, you should delete it, create the file again, and verify it. If most of the stuff in your script/log files is wrong and you did not notice it, we will most likely have to take points off.

Submission

To submit your work, you must first tar all the files you want to submit into a tarball and gzip it to create a gzipped tarfile named "lab13.tar.gz". Then you upload "lab13.tar.gz" to our Bistro submission server.

Change into the "lab13" directory you have created above and enter the following command to create your submission file "lab13.tar.gz" (if you don't have any ".h" files, don't include "*.h*" at the end):

    tar cvzf lab13.tar.gz lab13*.script Makefile *.c* *.h*
    ls -l lab13.tar.gz

The last command shows you how big the created "lab13.tar.gz" file is. If "lab13.tar.gz" is larger than 1MB in size, the submission server will not accept it.

If you use an IDE, the IDE may put your source code in subdirectories. In that case, you need to modify the commands above so that you include ALL the necessary source files and subdirectories (and don't include any binary files) and make sure that your code can be compiled without the IDE since the grader is not allowed to use an IDE to compile your code.

You should read the output of the above commands carefully to make sure that "lab13.tar.gz" is created properly. If you don't understand the output of the above commands, you need to learn how to read it! It's your responsibility to ensure that "lab13.tar.gz" is created properly.

To check the content of "lab13.tar.gz", you can use the following command:

    tar tvf lab13.tar.gz

Please read the output of the above command carefully to see which files were included in "lab13.tar.gz" and what their file sizes are, and make sure that they make sense.

Please enter your USC e-mail address and your submission PIN below. Then click on the Browse button and locate and select your submission file (i.e., "lab13.tar.gz"). Then click on the Upload button to submit your "lab13.tar.gz". (Be careful what you click! Do NOT submit the wrong file!) If you see an error message, please read the dialog box carefully, fix what needs to be fixed, and repeat the procedure. If you don't know your submission PIN, please visit this web site to have your PIN e-mailed to your USC e-mail address.

If the command is executed successfully and if everything checks out, a ticket will be issued to you to let you know "what" and "when" your submission made it to the Bistro server. The next web page you see would display such a ticket and the ticket should look like the sample shown in the submission web page (of course, the actual text would be different, but the format should be similar). Make sure you follow the Verify Your Ticket instructions to verify the SHA1 hash of your submission to make sure that you did not accidentally submit the wrong file. Also, an e-mail (showing the ticket) will be sent to your USC e-mail address. Please read the ticket carefully to know exactly "what" and "when" your submission made it to the Bistro server. If there are problems, please contact the instructor.

It is extremely important that you also verify your submission after you have submitted "lab13.tar.gz" electronically to make sure that everything you have submitted is everything you wanted us to grade. If you don't verify your submission and you end up submitting the wrong files, please understand that due to our fairness policy, there's absolutely nothing we can do.

Finally, please be familiar with the Electronic Submission Guidelines and information on the bsubmit web page.

(5 points total)
