[ This content is protected and may not be shared, uploaded, or distributed. ]
(Please also check out PA4 FAQ.)
Lab 10 has two parts:
Part A (lab10a) - saying hello (useful for PA4):
Starting with this lab, we will impmlement layers 2 and 3 router functionalities. In the real world,
layer 4 and below are implemented inside the operating system. Since we are not doing kernel hacking
in this class, we will implement these layers 2 and 3 functionalities in an overlay network, i.e.,
an application-level network, layered on top of the Internet (also known as "over-the-top network" mentioned in Ch 2).
We will use graph terminology to describe the topology of an overlay network.
An end-system in an overlay network will be referred to as a node.
Nodes in this network are part server, part client, and part router.
In this lab, we will impmlement layer 2 (linke layer) functionalities in such an overlay network.
In Ch 2, we talked about the peer-to-peer architecture, which can also be considered as an overlay network.
We will start with your Part C of Lab 9 code and add a little bit of
your Part B of Lab 4 code to it. Also, we will be running our own
application-layer protocol, which is similar to the HTTP protocol in terms of message formats, but that's
where the similarity stops. We will call this the "353NET protocol".
The spec for the 353NET protocol is part of PA4.
So, we will be referring to the PA4 spec.
Please do the following:
- Create an empty directory (call it "lab10") and change directory into it.
- Download lab10data.tar.gz into that directory and type:
tar xvf lab10data.tar.gz
This should create a subdirectory called "lab10data" with several .ini files in it
which we will use as configuration files for our nodes.
The usage information (i.e., commandline syntax) for "lab10a" is as follows:
lab10a CONFIGFILE
where CONFIGFILE is a configuration file (in the INI format in Lab 5)
that contains all the information your node needs in order to function for this lab.
Every node in your overlay network will use a slightly different configuration file.
Here's one of the configuration files lab10data/lab10-12000.ini you will use for this lab:
[startup]
host=
port=12000
logfile=lab10data/12000.log
[topology]
:12000=:12002
:12002=:12000,:12004,:12010,:12012
:12004=:12002,:12006,:12012
:12006=:12004,:12008
:12008=:12006,:12010
:12010=:12002,:12008
:12012=:12002,:12004
[map]
; +-------+
; /--+ 12010 +--------------------------\
; | +-------+ |
; | |
; +-------+ +---+---+ +-------+ +-------+ +---+---+
; | 12000 +---+ 12002 +-----+ 12004 +---+ 12006 +---+ 12008 |
; +-------+ +---+---+ +---+---+ +-------+ +-------+
; | |
; | +-------+ |
; \--+ 12012 +--/
; +-------+
[params]
max_ttl=9
msg_lifetime=8
neighbor_retry_interval=4
[logging]
SAYHELLO=0
LSUPDATE=0
UCASTAPP=0
For the definitions of the sections and keys in this file, please see the PA4 spec.
For this lab, you can igore the max_ttl and msg_lifetime keys in the [params] section and
the LSUPDATE and UCASTAPP keys in the [logging] section.
The file name ("lab10data/lab10-12000.ini") contains "12000" because the
"port" key in the [startup] section of this file has a value of "12000".
All other configuration files will be almost identical to this one, except for the "port" and
"logfile" keys in the [startup] section.
In the [startup] section,
the value for the "host" key is the hostname of the node that's using this configuration file
and the value for the "port" key is the TCP port number this node listens on (i.e., it's the welcome/well-known port of the node that's using this configuration file).
The unique identifier (known as the peer-to-peer NodeID, or simply the NodeID)
of this node is simply the value of its "host" key,
followed by the colon character (i.e., ":"), followed by the value of its "port" key.
In the above example, the NodeID of the node that's using this configuration file would be ":12000".
Please note that a hostname may be an empty string. When you write your code and if you need to connect to a host whose hostname
is an empty string, you must use the compiler-defined string LOCALHOST as the actual hostname when you call create_client_socket_and_connect().
The [topology] section describes the network topology of the 353NET if every node in the 353NET is up and running.
You must assume that every node in the 353NET has the same [topology] section in its CONFIGFILE.
The [topology] section is structured as an adjacency list!
Each key in this section is a NodeID and the corresponding
value is a comma-separated list of NodeIDs that are the neighbors of this node. If you have the NodeID of a node,
you can simply use that as a key to lookup its neighbors in this section of the CONFIGFILE. Therefore, each node only needs to use
one line in this section of its configuration file. When your node comes up, you should find out who its neighbors
are supposed to be and should try to connect to all of them.
Each connection is a persistent TCP connection where you can send and receive multiple 353NET protocol messages.
In case not all of your neighbors are up and running, you
should set a timer that equals to the number of seconds specified by the "neighbor_retry_interval" key in the [params] section
of the configuration file. When the timer expires, you should attempt to connect to all the neighbors to which you are not currently connected
and repeat this process over and over again.
The [map] section only has commented lines. It shows you what the overall network map should look like
if all nodes are up and running. If only some nodes are running, you can simply remove the nodes that are down
and all edges connecting to their neighbors.
For this lab, the main job is for each node to connect to all its neighbors, if they are up and running. After you have established
a TCP connection with your neighbor, you must exchange a "SAYHELLO" message to introduce yourself to your neighbor.
Recall that a TCP server can only know the ephemeral port number used by a connected peer (by calling getpeername()
or get_ip_and_port_for_server() in my_socket.cpp).
In order for it to know the NodeID of the connecting node, the connecting node must explicitly send the NodeID information over the connection.
This is done using a "SAYHELLO" message in the 353NET which looks like the following:
353NET/1.0 SAYHELLO\r\n
TTL: 1\r\n
Flood: 0\r\n
From: NODEID\r\n
Content-Length: 0\r\n
\r\n
Since we don't really have client and servers in the 353NET and all nodes are peers, we don't make a distinction between request and response messages.
The format of a 353NET message is very similar to the format of an HTTP message. There's the 353NET message header (which are just lines of text,
each ends with "\r\n"), followed by an empty line (which is just "\r\n"), then followed by a message body.
The size of the message body, in bytes, corresponds to the "Content-Length" key in the message header.
The "NODEID" is the NodeID of the sender for this particular message type.
When you disconnect from your neighbor, it would be nice to be polite to say "good-bye". Although typically, there is no time to say "good-bye"
and we will not do that.
You should start with your code in Part C of Lab 9 and change your code to parse SAYHELLO message
so you can find out who you are connected to (using the "From" key in the message header). You need to add a string to your
Connection object to remember the NodeID of this neighbor. I will refer to this field as "neighbor_nodeid".
For this lab, once you have received a SAYHELLO message from your neighbor and updated your data structure, what do you need to do?
Nothing! The only thing that can happen with this connection is a disconnect!
When that happens, read_a_line() would return a value ≤ 0 and your socket-reading thread would proceed to terminate the
connection (as you did in Part C of Lab 9). Simple! Right? Well, there is one complication. Where are the clients?
For this lab,
the job of a node is to make sure that they are connected to all its neighbors if they are up and running!
One way to do this is to write a "neighbors thread", whose job is to keep trying to connect to all the node's neighbors.
Periodically, the neighbors thread would wake
up and go down a list of neighbors to see which ones are active and which ones are not (i.e., inactive).
For each inactive neighbor, say node X, the neighbors thread would act like a client (see your Part B of Lab 4 code)
and try to connect to it. If create_client_socket_and_connect() fails, it would mean that node X is not up, and your neighbors thread
will move on to the next inactive neighbor.
If create_client_socket_and_connect() succeeds and you got a new socket, you must then
create a Connection object to maintain the new socket and create socket-reading and socket-writing threads for it
(similar to what you have done in your Part C of Lab 9 code).
The difference between this connection and the connection you create
inside the infinite loop in the main thread is that, for this connection, you already know the "neighbor_nodeid" and it's X
since you are the one who initiated the connection!
You must immediately send a SAYHELLO message into the connection and set the value of the From key in the message header
to be this node's NodeID to introduce yourself to
node X and you must wait for a SAYHELLO message back from node X before making this connection an
active connection. Please note that in this case, when you get a SAYHELLO
message from node X, it's not very useful since you already know that you are connected to node X. But we have to wait for a SAYHELLO
message from node X because it's the protocol, i.e., you must exchange a pair of SAYHELLO messages on a connection before declaring
that connection an active connection.
What about the connection you've made inside the infinite loop in the main thread? When that Connection object is created, it
doesn't even know what to put into the "neighbor_nodeid" field. Clearly, that connection cannot be declared to
be an active connection since there is missing information. You must immediately send a
SAYHELLO message into the connection and wait for a SAYHELLO message to come back. Once you
get a SAYHELLO message from the other side, you can set the "neighbor_nodeid" field inside the
Connection object and declare this connection an active connection.
Therefore, the bottomline is that in both cases, you can only declare a connection active when you have receive a SAYHELLO message
over a connection.
There is one more issue. What if two neighboring nodes got started at exactly the same time
and simultaneously connect to each other and we would end up with two connections between these two nodes.
This would mess up the network topology because now you have two edges between a pair of nodes.
To make sure that this must never happen, we need to modify the algorithm in main thread a little bit.
Before you set the "neighbor_nodeid" field inside the Connection object to NODEID in the SAYHELLO message header,
you must check to see if there is already a connection to NODEID in the list of connections. Please note that this list of connections
may contain a connection created by the neighbors thread which may not be active, yet (because it's waiting
for a SAYHELLO message from its peer, although it already knows the identify of the peer). If it turns out that such a connection already
exists, you must shutdown and close this connection immediately to avoid having two connections to the same neighbor.
For logging of messages, you must use the log message format specified in PA4
and use the values in the [logging] section of the CONFIGFILE to
determine if you should log a particular message into a log file or print it to cout.
When you receive a SAYHELLO message, you must print the following log message to cout or the log file:
[TIMESTAMP] r SAYHELLO NODEID 1 - 0\n
where TIMESTAMP is the current time in the same format as a PA2 timestamp,
r means that the message was received,
NODEID is the value of the From key in the message header,
1 is the value of the TTL key in the message header,
- means that the value of the Flood key in the message header is 0, and the
0 there is the value of the Content-Length key in the message header.
Similarly, when you send a SAYHELLO message, you must print the following log message to cout or the log file:
[TIMESTAMP] i SAYHELLO NODEID 1 - 0\n
where i means that the message was initiated by this node.
As far as the console goes, you need to implement the "neighbors" and "quit" commands of PA4.
If the user enters a command that you don't recognize, you must print:
Command not recognized. Valid commands are:\n
\tneighbors\n
\tquit\n
Since you will be running multiple nodes, each in its Terminal window,
you must use "NODEID> " (i.e., the NodeID of the node you are interacting with,
followed by a "greater than" symbol, followed by a space character) as the command prompt for your console thread.
When you are done with implementing lab10a, please do the following:
- Change directory into the "lab10" directory mentioned above.
- Type "script lab10a-12000.script" to start a transcript. Then type:
uname -a
cat /etc/os-release
make clean
make lab10a
- For this part of the lab, if all nodes are up, the network topology would look like:
+-------+ +-------+
| 12000 +---+ 12002 +
+-------+ +-------+
- In the Terminal window, type:
./lab10a lab10data/lab10-12000.ini
- When you get the command prompt (which should look like ":12000> "), type the following:
neighbors
You should see the following:
:12000 has no active neighbors
- Run another Terminal window and cd into the same "lab10" directory.
- Type "script lab10a-12002.script" to start a transcript. Then type:
./lab10a lab10data/lab10-12002.ini
- In both windows, you should see that both nodes have both sent and received SAYHELLO messages.
- When you get the command prompt in the 2nd window (which should look like ":12002> "),
wait for a couple of seconds and type:
neighbors
You should see the following:
Active neighbors of :12002:
:12000
- Type "neighbors" in the first window and you should see the following:
Active neighbors of :12000:
:12002
- In the first window, type:
quit
and :12000 should self-terminate.
- Type "neighbors" in the 2nd window and it should say that it has no active neighbors.
- In the 1st window, type:
./lab10a lab10data/lab10-12000.ini
- You should see SAYHELLO messages being exchanged between nodes :12000 and :12002.
- In both Terminal windows, type:
neighbors
to make sure that they are connected with each other.
- Type "quit" in both windows and make sure that both servers self-terminate gracefully.
- Type "exit" in both windows to close the transcripts.
To save typing all some of the commands above, a tmux script, "tmux-lab10a.txt" is provided
in the "lab10data" directory created above and you can run it by typing:
lab10data/tmux-lab10a.txt
Plesae note that it's provided for your convenience (i.e., to save typing) and it may not be exactly the same as the above sequence.
Please see PA2 FAQ regarding how to use tmux in general.
Part B (lab10b) - four nodes (useful for PA4):
There is no new code to write for this part of the lab. Just want to make sure that your code works correctly when more nodes are activated.
For this part of the lab, if all nodes are up, the network topology would look like:
+-------+ +-------+ +-------+
| 12000 +---+ 12002 +-----+ 12004 |
+-------+ +---+---+ +---+---+
| |
| +-------+ |
\--+ 12012 +--/
+-------+
- Change directory into the "lab10" directory mentioned above.
- Open three more Terminal windows and change directory into the same directory.
- In the first Terminal window, type "script lab10b-12000.script" to start a transcript.
- In the 2nd Terminal window, type "script lab10b-12002.script" to start another transcript.
- In the 3rd Terminal window, type "script lab10b-12004.script" to start another transcript.
- In the 4th Terminal window, type "script lab10b-12012.script" to start another transcript.
- In the first Terminal window, type:
./lab10a lab10data/lab10-12000.ini
- In the 2nd Terminal window, type:
./lab10a lab10data/lab10-12002.ini
- In the 3rd Terminal window, type:
./lab10a lab10data/lab10-12004.ini
- In the 4th Terminal window, type:
./lab10a lab10data/lab10-12012.ini
- Type "neighbors" in the 1st Terminal window and you should see:
Active neighbors of :12000:
:12002
- Type "neighbors" in the 2nd Terminal window and you should see (order does not matter):
Active neighbors of :12002:
:12000,:12004,:12012
- Type "neighbors" in the 3rd Terminal window and you should see (order does not matter):
Active neighbors of :12004:
:12002,:12012
- Type "neighbors" in the 4th Terminal window and you should see (order does not matter):
Active neighbors of :12012:
:12002,:12004
- Now type "quit" in the 2nd Terminal window to bring down node :12002.
- Type "neighbors" in the 1st Terminal window and you should see:
:12000 has no active neighbors
- Type "neighbors" in the 3rd Terminal window and you should see:
Active neighbors of :12004:
:12012
- Type "neighbors" in the 4th Terminal window and you should see:
Active neighbors of :12012:
:12004
- In the 2nd Terminal window, type:
./lab10a lab10data/lab10-12002.ini
- You should see SAYHELLO messages being exchanged in all other windows.
- Wait for a few seconds and type "neighbors" in all four windows to make sure that the network is formed correctly again.
- Type "quit" in all four windows.
The "more" command will pause when it shows a screen full of data or finishes displaying the current file.
Just keep pressing the spacebar on your keyboard to see the next page of data or the next file.
- In the 4th Terminal window, type the follow to display all the log files (and press the <SPACE> bar to advance to the next file):
more lab10data/*.log
You should see that all the log files are empty.
- Type "exit" in all four windows to close the transcripts.
To save typing all some of the commands above, a tmux script, "tmux-lab10b.txt" is provided
in the "lab10data" directory created above and you can run it by typing:
lab10data/tmux-lab10b.txt
Plesae note that it's provided for your convenience (i.e., to save typing) and it may not be exactly the same as the above sequence.
Please see PA2 FAQ regarding how to use tmux in general.
Part C (lab10c) - logging into log file (useful for PA4):
There is no new code to write for this part of the lab. We will just repeat everything in Par B of this lab
but log everything into the log file instead.
You can also run the provided "tmux-lab10c.txt" script by doing:
lab10data/tmux-lab10c.txt
All pseudo-code is incomplete and error checking is often left out in pseudo-code.
Since some details are left out, depending on you how write your code, you may create race conditions
and you may need to fix your code so that your program won't freeze or crash.
Feel free to send your questions (and not your code) to the instructor.
Pseudo-code
Code in red are critical section codes.
Please note that the pseudo-code is not necessarily complete.
It is very important to understsand that it is the design of this distribute protocol that
when node X initiates a connection with node Y (i.e., node X acts like a client and node Y acts like a server),
node X must send SAYHELLO first. When node Y has received SAYHELLO, it will then send SAYHELLO
to node X. The pseudo-code below assumes that this is the design and this is the protocol.
This design choice affects the design of the main thread, the socket-reading thread,
and the neighbors thread.
Main Thread
create console thread
create reaper thread
create neighbors thread
do forever
client socket = my_accept()
if client socket is (-1) then
break out of infinite loop
end-if
lock mutex
if listening socket is (-1) then
shutdown and close client socket
unlock mutex
break out of infinite loop
end-if
create connection object c
create connection-reading thread tr
set c.read_thr to tr
create connection-writing thread tw
set c.write_thr to tw
c.neighbor_nodeid = null
add c to connection list
unlock mutex
end-do
join with console thread
join with reaper thread
join with neighbors thread
Socket-reading Thread
c = connection object in argument of first procedure
lock mutex
unlock mutex
/* first message read must be a SAYHELLO message */
m = read message from c.socket
log SAYHELLO received message
if c.neighbor_nodeid == null then
/* connection c was created in main thread */
lock mutex
foreach connection d ≠ c in connection list do
if d.neighbor_nodeid != null and d.neighbor_nodeid == m.from then
declare c a duplicate connection
end-if
end-for
unlock mutex
if c is not a duplicate connection then
lock mutex
c.neighbor_nodeid = m.from
w = create_message(SAYHELLO)
/* say hello back */
c.add_work(w)
unlock mutex
end-if
else
/* connection c was created in neighbors thread */
lock mutex
foreach connection d ≠ c in connection list do
if d.neighbor_nodeid != null and d.neighbor_nodeid == m.from then
declare c a duplicate connection
end-if
end-for
unlock mutex
end-if
if c is not a duplicate connection then
do forever
m = read message from c.socket
lock mutex
if listening socket is (-1) or m is NULL then
unlock mutex
break;
end-if
unlock mutex
/* process message, should never get here for this lab */
end-do
end-if
lock mutex
if c.socket ≥ 0 then
shutdown c.socket
end-if
set c.socket to (-1)
unlock mutex
/* signal the socket-writing thread to self-terminate */
c.add_work(NULL)
/* join with the socket-writing thread */
c.write_thread_ptr.join();
/* add dead connection to the reaper work queue */
reaper_add_work(c)
Socket-writing Thread
c = connection object in argument of first procedure
lock mutex
unlock mutex
do forever
w = c.wait_for_work()
if w == NULL then
break;
else if w.msg_type == SAYHELLO then
write SAYHELLO message into c.socket
end-if
end-do
Neighbors Thread
read CONFIGFILE to create a list of potential neighbors
do forever
l = copy of list of potential neighbors
lock mutex
if listening socket is (-1) then
unlock mutex
break;
else
for each connection c in connection list
if c.neighbor_nodeid is not null then
l.remove(c.neighbor_nodeid)
end-if
end-for
end-if
unlock mutex
/* l is now a list of inactive neighbors */
for each node n in l
socket = create_client_socket_and_connect(n)
if socket ≥ 0 then
lock mutex
create connection object c
create connection-reading thread tr
set c.read_thr to tr
create connection-writing thread tw
set c.write_thr to tw
c.neighbor_nodeid = n.nodeid
add c to connection list
/* say hello first */
w = create_message(SAYHELLO)
c.add_work(w)
unlock mutex
end-if
end-for
sleep for neighbor_retry_interval seconds
end-do
Please note that since the above code first checks to see if the listening socket is (-1) or not,
you must make sure that you create the listening socket before you create the neighbors thread (or any other child thread).
Below is the grading breakdown:
- (1 pt) submitted a valid lab10.tar.gz file with all the required
files using the submission procedure below
- (1 pt) content in "lab10a-12000.script" and
"lab10a-12002.script" are correct
- (1 pt) content in "lab10b-12000.script",
"lab10b-12002.script",
"lab10b-12004.script",
"lab10b-12012.script",
"lab10c-12000.script",
"lab10c-12002.script",
"lab10c-12004.script", and
"lab10c-12012.script", are correct
- (1 pt) "Makefile" works for "make lab10a"
- (1 pt) source code of your node looks right
Minimum deduction is 0.5 pt for anything that's incorrect.
Please note that for the " Makefile" item, you can only get credit for it if your "source code" is relevant to this lab; therefore, you can only get as many points as the "source code" item
in the best case.
Please keep in mind that even though lab grading is "light", it doesn't mean that you can just put anything
into your submission! It's still your responsibility to make sure that the files in your submission contains
information that's relevant to the tests you were supposed to run.
Use the "more" command to view your script/log files to make sure that they contain the right information.
If a file has the wrong stuff in it, you should delete it and create the file again and verify.
If most of the stuff in your script/log files are wrong and you did not notice it, we will most likely have to take points off.
To submit your work, you must first tar all the files you want to submit into a tarball and
gzip it to create a gzipped tarfile named " lab10.tar.gz".
Then you upload " lab10.tar.gz" to our Bistro submission server.
Change into the "lab10" directory you have created above and enter the following command
to create your submission file "lab10.tar.gz" (if you don't have any ".h" files, don't include "*.h*" at the end):
tar cvzf lab10.tar.gz lab10*.script Makefile *.c* *.h*
ls -l lab10.tar.gz
The last command shows you how big the created " lab10.tar.gz" file is.
If " lab10.tar.gz" is larger than 1MB in size, the submission server will not accept it.
If you use an IDE, the IDE may put your source code in subdirectories. In that case,
you need to modify the commands above so that you include ALL
the necessary source files and subdirectories (and don't include any binary files)
ane make sure that your code can be compiled without the IDE since the grader is not allowed to use an IDE to compile your code.
You should read the output of the above commands carefully to make sure that "lab10.tar.gz" is created properly.
If you don't understand the output of the above commands, you need to learn how to read it!
It's your responsibility to ensure that "lab10.tar.gz" is created properly.
To check the content of "lab10.tar.gz", you can use the following command:
tar tvf lab10.tar.gz
Please read the output of the above command carefully to see what files were included in " lab10.tar.gz"
and what are their file sizes and make sure that they make sense.
Please enter your USC e-mail address and your submission PIN below. Then click on the Browse button
and locate and select your submission file (i.e., "lab10.tar.gz").
Then click on the Upload button to submit your "lab10.tar.gz".
(Be careful what you click! Do NOT submit the wrong file!)
If you see an error message, please read the dialogbox carefully and fix what needs to be fixed and repeat the procedure.
If you don't know your submission PIN, please visit this web site to have your PIN e-mailed to your USC e-mail address.
When this web page was last loaded, the time at the submission server at merlot.usc.edu was
27Nov2025-18:59:17.
Reload this web page to see the current time on merlot.usc.edu.
If the command is executed successfully and if everything checks out,
a ticket will be issued to you to let you know "what" and "when"
your submission made it to the Bistro server. The next web page you
see would display such a ticket and the ticket should look like
the sample shown in the submission web page
(of course, the actual text would be different, but the format should be similar).
Make sure you follow the Verify Your Ticket instructions
to verify the SHA1 hash of your submission to make sure what you did not accidentally submit the wrong file.
Also, an e-mail (showing the ticket) will be sent to your USC e-mail address.
Please read the ticket carefully to know exactly "what" and "when"
your submission made it to the Bistro server.
If there are problems, please contact the instructor.
It is extreme important that you also verify your submission
after you have submitted "lab10.tar.gz" electronically to make
sure that every you have submitted is everything you wanted us to grade.
If you don't verify your submission and
you ended up submit the wrong files, please understand that due to our fairness policy,
there's absolutely nothing we can do.
Finally, please be familiar with the Electronic Submission Guidelines
and information on the bsubmit web page.
|