Lab #1

[ This content is protected and may not be shared, uploaded, or distributed. ]

(Please also check out PA1 FAQ.)

Lab 1 has four parts:

Part A - install and get familiar with Ubuntu
Part B - hello, Makefile, gdb
Part C - line counting program
Part D - using system calls to count lines

Part A (`lab1a`) - install and get familiar with Ubuntu:

There is no coding for this part of the lab.
Download time can multiply when many people are downloading large files simultaneously! Therefore, it's a good idea to download ubuntu-16.04.6-desktop-i386.iso onto your desktop before you come to the first lab to save time during the lab (this file is 1.64GB in size). You should also download VirtualBox onto your desktop. Go to the VirtualBox "older builds" download web page, and download version 6.0.14 (or newer) for your platform. Last I checked, for Windows, this file (VirtualBox-6.0.14-*-Win.exe) is 170MB in size, and for Mac OS X, this file (VirtualBox-6.0.14-*-OSX.dmg) is 138MB in size. By the way, you should always download a "stable release" and not download anything experimental, no matter how great it sounds!
Follow the instructions as closely as possible to install a standard 32-bit Ubuntu 16.04 into VirtualBox on your laptop or desktop. The expectation for this class is that you will do all your programming assignments and have them graded in a standard 32-bit Ubuntu 16.04 system running inside VirtualBox. If you don't have a laptop or a desktop that meets the system requirement, you must send an e-mail to the instructor so that the instructor can figure out a way to proceed (and if you are thinking about purchasing a new laptop, I would recommend a machine that runs an Intel/AMD CPU, as opposed to an ARM-based CPU, since an ARM-based CPU will most likely have trouble running VirtualBox).
Start a Terminal program and try out some commonly used Unix/Linux commands:
    ls
    ls -a
    ls -l
    echo "hello"
    echo -n "hello"
    echo `date`
    echo `date +%m%d%y-%H%M%S`
    cat /etc/os-release
    more /etc/os-release
    mkdir tmp
    pwd
    cd tmp
    pwd
    ls
    cd ..
    cp /etc/os-release tmp
    cp tmp/os-release tmp/abc
    ls -aF tmp
    mv tmp/os-release tmp/xyz
    ls -aF tmp
    man g++
    set i = 1
    @ i = $i + 4
    echo "i is now $i"
    rm tmp/abc
    touch tmp/defg
    ps -x
    ps -auxw
    pico tmp/xyz
    exit
You need to make sure that you understand exactly what each of these commands does.
If you are doing your labs on viterbi-scf1.usc.edu (most likely because the only machine you have access to is the new Mac with an M1 CPU), in addition to running the above commands, you should try out the following on your laptop (from a "terminal" program) and make sure you understand what's going on:
    echo "Hello" > hello.txt
    scp hello.txt YOURLOGIN@viterbi-scf1.usc.edu:
where YOURLOGIN is the local-part of your USC E-mail address (and use your USC E-mail password when you are prompted for a password). [BC: added 1/13/2023 (from slide 5 of Lecture 2)] Then ssh to viterbi-scf1.usc.edu by doing:
    ssh -X -Y YOURLOGIN@viterbi-scf1.usc.edu
Then do the following on viterbi-scf1.usc.edu:
    more ~/hello.txt
    cp ~/hello.txt ~/hello-copy.txt
    ls -l ~/hello.txt ~/hello-copy.txt
Then do the following on your laptop:
    scp YOURLOGIN@viterbi-scf1.usc.edu:hello-copy.txt .
    ls -l hello.txt hello-copy.txt
    diff hello.txt hello-copy.txt
If you see any error messages, you should investigate and figure out how to fix the issues.
Please note that from the content of /etc/os-release, the grader can tell what system you are running on. On Ubuntu, it would say that the OS name is "Ubuntu". On viterbi-scf1.usc.edu, it would say that the OS name is "CentOS Linux".

Part B (`lab1b`) - hello, Makefile, `gdb`:

There is very little coding for this part of the lab since you can just copy everything from the provided template below or from a downloaded file below.
Create an empty directory (call it "lab1") and change directory into it.
Download lab1data.tar.gz into that directory and type:
    tar xvf lab1data.tar.gz
This should create a subdirectory called "lab1data" with a bunch of text files in it.
Write a standard "hello" program in "lab1b.cpp" (in C++) and write a "Makefile" so that when you type "make lab1b", an executable called lab1b will get created. (Please note that these files are in lab1data.tar.gz already, so you really don't have to write them. Just do "cd lab1data".) When you type "make clean", all ".o" files and lab1b will be deleted, and when you type "./lab1b", "Hello World!" will get printed (the string you print needs to be "Hello World!\n" so that the next commandline prompt is displayed at the beginning of a line).
Create a "transcript" of compiling and running your "hello" program and include it in your submission. To create a "transcript", first cd (change directory) into the directory that has your files. Then type "script lab1b.script" to start your transcript. In the terminal, please enter the following commands:
    uname -a
    cat /etc/os-release
    make clean
    make lab1b
    ./lab1b
    gdb lab1b
    (gdb) break main
    (gdb) run
    (gdb) next
    (gdb) where
    (gdb) list
    (gdb) next
    (gdb) next
    (gdb) cont
    (gdb) run aa bbb cccc
    (gdb) next
    (gdb) cont
    (gdb) quit
    make clean
    exit
Please note that "(gdb) " above is the prompt used by the debugger (gdb). Please spend some time understanding what you are seeing.
A file called "lab1b.script" will be created and it's a "transcript" of the session you created with the "script" command. You can use a text editor to view this file.

Part C (`lab1c`) - line counting program:

There are only a few lines of code to write for this part of the lab since you just need to convert the pseudo-code below into C++ code.
Write a program in "lab1c.cpp" and modify the "Makefile" in lab1b by adding a new target/section so that when you type "make lab1c", an executable called lab1c will get created, when you type "make clean", all ".o" files, lab1b, and lab1c will be deleted, and when you type "lab1c filename", it will print the number of characters in each line in filename and finally print the number of lines in filename. For example, if filename contains 3 lines and the line lengths are 9, 19, and 29, your printout should look like:
    9
    19
    29
    number of lines read: 3
Please note that the length of a line must not include the '\n' character at the end of that line. Also, if the last line of a text file is not empty and it is not terminated with a '\n' character, it's known as an "incomplete last line" and that's not an error condition! If you have an incomplete last line, you must increment the line count and count that line as a line! If filename does not contain an incomplete last line, the number in the last line in your printout must match the printout of the "wc" program (since the "wc" program counts the number of '\n' characters in the input file):
    wc -l filename
On the other hand, if filename does contain an incomplete last line, the number in the last line in your printout must be one more than in the printout of the "wc" program above.
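The relationship between the two counts can be sketched as follows. This is only an illustration that works on an in-memory string, not the code you are asked to write; the names LineSummary and summarize_lines are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the provided lab code): summarizes a
// buffer under the two counting rules described above.
struct LineSummary {
    std::size_t newline_count;   // what "wc -l" would report
    bool incomplete_last_line;   // true if data follows the final '\n'
    std::size_t lines_read;      // what lab1c is expected to report
};

LineSummary summarize_lines(const std::string& data)
{
    LineSummary r{0, false, 0};
    for (char ch : data) {
        if (ch == '\n') r.newline_count++;
    }
    // A non-empty buffer whose last byte is not '\n' ends in an incomplete line.
    r.incomplete_last_line = !data.empty() && data.back() != '\n';
    r.lines_read = r.newline_count + (r.incomplete_last_line ? 1 : 0);
    return r;
}
```

For example, summarize_lines("ab\ncd") gives a newline_count of 1 (what "wc -l" would print) and a lines_read of 2, because "cd" is an incomplete last line.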
Here is an example of a file with an incomplete last line, called "incomplete.txt" (this file was created in Notepad in Windows and this is known as a DOS/Windows text file which always has an incomplete last line). Please right-click on the link and select Save As to save it into a file to get the exact content of this file (7 bytes long). If you don't save this file correctly and end up with a file whose file size is not 7, you must download it again in the correct way. If you do:
    wc -l incomplete.txt
it will say that this file has 2 lines. But your code must produce the following printout:
    2
    2
    1
    number of lines read: 3
To see a hexdump of "incomplete.txt" so you know exactly what's in it, do:
    xxd -g 1 incomplete.txt
and you should see:
    00000000: 31 0d 0a 32 0d 0a 33                             1..2..3
In the above printout, the first column is the byte offset into the file (in hex) of the first byte of the current line, the middle column is the hex value of 16 bytes of data, and the 3rd column is the ASCII representation of the data bytes (a period is used if there is no "printable" version of the corresponding data byte). A hex value of 0d corresponds to the '\r' character (also known as <CR>) and 0a corresponds to the '\n' character (also known as <LF>).
The xxd program is an important debugging tool when you are getting data that doesn't match what you are supposed to get. When that happens, you can do a hexdump of the file you have created and compare that against the hexdump of the data you are supposed to get and find the very first byte that's different between them and figure out where your code went wrong.
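If you ever want to do the same kind of comparison from inside a program, the idea can be sketched in a few lines of C++. This is a rough sketch and not a replacement for xxd; hex_bytes and first_difference are hypothetical helper names, and both work on data already held in a string.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: renders each byte as two hex digits, separated by
// spaces, similar to the middle column of "xxd -g 1".
std::string hex_bytes(const std::string& data)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < data.size(); i++) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        unsigned int value = static_cast<unsigned char>(data[i]);
        std::snprintf(buf, sizeof(buf), "%02x", value);
        if (i > 0) out += ' ';
        out += buf;
    }
    return out;
}

// Hypothetical helper: byte offset of the first difference between a and b,
// or -1 if the two buffers are identical.
long first_difference(const std::string& a, const std::string& b)
{
    std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) return static_cast<long>(i);
    }
    if (a.size() != b.size()) return static_cast<long>(n);
    return -1;
}
```

With the 7 bytes of "incomplete.txt", hex_bytes("1\r\n2\r\n3") returns "31 0d 0a 32 0d 0a 33", which matches the xxd printout above. Comparing against a Unix-style version of the same text, first_difference("1\r\n2", "1\n2") returns 1, the offset of the extra '\r'.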
Create a "transcript" of compiling and running your "lab1c" program and include it in your submission. To create a "transcript", first cd (change directory) into the directory that has your files. Then type "script lab1c.script" to start your transcript. If your command shell is bash, the "foreach" command below will not work and you should first type "tcsh" to change your command shell to tcsh before proceeding (and type "exit" afterwards). In the terminal, please enter the following commands:
    uname -a
    make clean
    make lab1c
    foreach f (Makefile *.c* *.h*)
        echo "=== $f ==="
        ./lab1c $f
        echo -n "wc: "
        wc -l $f
    end
    make clean
    exit
A file called "lab1c.script" will be created and it's a "transcript" of the session you created with the "script" command. You can use a text editor to view this file.

Part D (`lab1d`) - using system calls to count lines:

Write a program in "lab1d.cpp" and modify the "Makefile" in lab1c by adding a new target/section so that when you type "make lab1d", an executable called lab1d will get created, when you type "make clean", all ".o" files, lab1b, lab1c, and lab1d will be deleted, and when you type "lab1d filename", its behavior should be exactly the same as lab1c. The main difference here is that you are required to make system calls. First, you must use the open() system call to open "filename" in the following way:
    int fd = open(argv[1], O_RDONLY);
To see the manual for the open() system call, please do:
    man -s 2 open
If open() failed, fd would be a negative value and you must do the following immediately to print an error message and then quit your program:
    perror("open");
    exit(1);
The perror() function prints an error-specific message for the most recent failed system call (based on the value of errno). The exit() function terminates your program.
Otherwise (i.e., if fd ≥ 0), fd is known as a file descriptor that represents an opened file maintained by the OS. For Linux, the OS maintains a file descriptor table for each running program, each entry inside this table points to a file opened with the open() system call. The array index to access this table is called a file descriptor (or fd for short). For a Linux program, the size of this file descriptor table is typically 1024. Therefore, a valid file descriptor is ≥ 0 and < 1024.
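To see what file descriptor values look like in practice, here is a small sketch. The helper name open_and_report is made up for this example, and unlike your lab code it returns -1 instead of calling exit(1) so that it can be tried on several paths in one run.

```cpp
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: opens path read-only, prints the descriptor number,
// closes the file again, and returns the descriptor number (or -1 on failure).
int open_and_report(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");   /* e.g., "open: No such file or directory" */
        return -1;
    }
    /* fd is just an index into this program's file descriptor table */
    printf("opened %s as file descriptor %d\n", path, fd);
    close(fd);
    return fd;
}
```

Calling open_and_report("/etc/os-release") on your system should print a small non-negative number, since file descriptors 0, 1, and 2 are normally already in use for standard input, standard output, and standard error. Calling it with a path that does not exist should print an error message from perror() and return -1.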
Second, please copy the read_a_line() function in "my_readwrite.cpp" into your code. For this part of the lab, your code must contain the following loop structure to read lines from the input file (equivalent variation is fine):
    for (;;) {
        string line;
        if (read_a_line(fd, line) <= 0) {
            break;
        }
        // your code to print line length
    }
    // your code to print number of lines read
Remember, a line length must not count a trailing '\n' character.
The read_a_line() function is a very important function for this class. When you are doing "sockets programming", you will need to use this function as is. Therefore, it's worthwhile to understand exactly what's going on inside of this function. Before we go into that, you must first understand that you should ONLY use this function if you are expecting to read a line of ASCII text characters! Do NOT use this function to read binary data! If you use this function to read binary data, you may get unexpected results!
Below is the read_a_line() code (I started with the code in "my_readwrite.cpp" and changed some variable names and got rid of all the comments):
    [ 1]  int read_a_line(int fd, string& line)
    [ 2]  {
    [ 3]      string s = "";
    [ 4]      int idx = 0;
    [ 5]      char ch = '\0';
    [ 6]
    [ 7]      for (;;) {
    [ 8]          int bytes_read = read(fd, &ch, 1);
    [ 9]          if (bytes_read < 0) {
    [10]              if (errno == EINTR) {
    [11]                  continue;
    [12]              }
    [13]              return (-1);
    [14]          } else if (bytes_read == 0) {
    [15]              if (idx == 0) return (-1);
    [16]              break;
    [17]          } else {
    [18]              s += ch;
    [19]              idx++;
    [20]              if (ch == '\n') {
    [21]                  break;
    [22]              }
    [23]          }
    [24]      }
    [25]      line = s;
    [26]      return idx;
    [27]  }
Each time this function is called, it reads one line of ASCII text from the file descriptor fd and returns it in line (i.e., the 2nd argument). The return value of this function is either the number of bytes read from the file descriptor (which is also the length of the returned string), or -1 if there is an error or if end-of-file has been reached and nothing can be returned.
This function stays in an infinite loop, calling the read() system call on line 8 to read one byte at a time from the file descriptor. The read() system call returns the number of bytes read from the file descriptor fd. Since we are reading a maximum of one byte (given by the 3rd argument in our call to read()), the only possible return values of read() are 1, 0, or -1 in this scenario. If read() returns 1, it means that a byte was successfully read into ch and this byte is appended to the result string. If read() returns 0, it means that end-of-file has been reached. If we haven't read anything since we have entered this function, we return -1 on line 15 to tell the application that it should not call this function again. (If the application calls this function again, it will keep returning -1 and we would end up in an infinite loop.) If read() returns -1, it means that there is an error reading from the file descriptor. As it turns out, there is a special error condition (indicated by errno == EINTR on line 10) where we are just supposed to ignore the error and try again. This is kind of weird, but this can happen since we are doing "system programming". More importantly, if we don't handle this condition correctly, bad things can happen!
Please read the above code carefully and make sure you fully understand how it works and why it works. Later on in the semester, when you are doing "sockets programming", the fd argument will be a "socket file descriptor", which you get when you open a connection to another application (running either on the same machine or on a machine across the Internet). So, reading a line of text from a file is exactly the same as reading a line of text from an Internet connection (if you do it this way)! In Linux, data coming from a file and data coming from a network/Internet connection have the same byte stream abstraction, i.e., the data is just a stream of bytes. In a byte stream abstraction, we don't know when or where the byte stream will end until we hit the end of the byte stream or we get an error! Also, there is no way to "go back" when you are reading data from a stream of bytes.
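To convince yourself that read_a_line() does not care what kind of file descriptor it is given, you can try it on something that is not a regular file. The sketch below uses a pipe as a stand-in for a socket (this is my substitution for illustration; sockets come later in the semester). The helper lines_from_pipe is a made-up name, and it assumes the data is small enough to fit in the pipe buffer in one write() call. The read_a_line() code is the same as above with the line numbers removed.

```cpp
#include <errno.h>
#include <unistd.h>

#include <string>
#include <vector>

using namespace std;

int read_a_line(int fd, string& line)
{
    string s = "";
    int idx = 0;
    char ch = '\0';

    for (;;) {
        int bytes_read = read(fd, &ch, 1);
        if (bytes_read < 0) {
            if (errno == EINTR) {
                continue;
            }
            return (-1);
        } else if (bytes_read == 0) {
            if (idx == 0) return (-1);
            break;
        } else {
            s += ch;
            idx++;
            if (ch == '\n') {
                break;
            }
        }
    }
    line = s;
    return idx;
}

// Hypothetical demo helper: pushes data through a pipe and reads it back
// one line at a time from the read end of the pipe.
vector<string> lines_from_pipe(const string& data)
{
    vector<string> lines;
    int fds[2];
    if (pipe(fds) < 0) return lines;

    ssize_t written = write(fds[1], data.data(), data.size());
    close(fds[1]);  // once the write end is closed, read() can return 0
    if (written == (ssize_t)data.size()) {
        for (;;) {
            string line;
            if (read_a_line(fds[0], line) <= 0) break;
            lines.push_back(line);
        }
    }
    close(fds[0]);
    return lines;
}
```

For example, lines_from_pipe("GET /\r\nhi") returns two strings, "GET /\r\n" and "hi". Please note that the '\r' and '\n' characters are still in the first string (nothing is thrown away), and the second string is an incomplete last line that is returned when read() reports the end of the byte stream.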
Finally, when you are done with the file, you should do:
    close(fd);
Once you have closed a file, you should not use fd.
When you are done with your code, please create a "transcript" of compiling and running your "lab1d" program and include it in your submission. To create a "transcript", first cd (change directory) into the directory that has your files. Then type "script lab1d.script" to start your transcript. If your command shell is bash, the "foreach" command below will not work and you should first type "tcsh" to change your command shell to tcsh before proceeding. In the terminal, please enter the following commands:
    make clean
    make lab1d
    foreach f (Makefile *.c* *.h*)
        echo "=== $f ==="
        ./lab1d $f
        echo -n "wc: "
        wc -l $f
    end
    make clean
    exit
A file called "lab1d.script" will be created and it's a "transcript" of the session you created with the "script" command. You can use a text editor to view this file.
One important take-away for this lab is that library functions such as getline() can hide data from the input stream from you. If you want to make sure that you don't miss a single byte from the input stream, using system calls such as read() may be the best way. Later on, when you do your networking labs and programming assignments, it's important that you don't miss a single byte of data from the input stream.
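Here is one small way to see this for yourself. The sketch below uses an in-memory string stream instead of a file, and getline_all is a made-up helper name; it simply collects whatever std::getline() hands back.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns every line that std::getline() produces
// when reading from the given data.
std::vector<std::string> getline_all(const std::string& data)
{
    std::istringstream in(data);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);   // the '\n' delimiter has already been discarded
    }
    return lines;
}
```

Both getline_all("abc\n") and getline_all("abc") return a single string, "abc". Looking only at the returned strings, you cannot tell whether the '\n' was in the input or not. With read_a_line(), the first input would give you "abc\n" and the second would give you "abc", so every byte is accounted for.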

Templates

All pseudo-code is incomplete and error checking is often left out in pseudo-code. Feel free to send your questions (and not your code) to the instructor.

This is your first lab, so the code for lab1b.cpp and Makefile are provided in the "lab1data" directory!

Code for `lab1b`:

Your "lab1b.cpp" should look like:
        /* C++ standard include files first */
        #include <iostream>
        #include <iomanip>
        #include <string>

        using namespace std;
    
        /* C system include files next */
        #include <sys/time.h>
        #include <arpa/inet.h>
        #include <netdb.h>

        /* C standard include files next */
        #include <errno.h>
        #include <unistd.h>
        #include <stdlib.h>
        #include <stdio.h>
        #include <fcntl.h>
        #include <string.h>
        #include <signal.h>

        /* your local include files next */
        /* nothing to #include for this program */

        int main(int argc, char *argv[])
        {
            /* this is how to iterate through commandline arguments */
            for (int i=1; i < argc; i++) cout << argv[i] << endl;

            cout << "Hello World!" << endl;
            return 0;
        }
Please note that the above code is a good starting point for all your lab and programming assignments! When you need to add an additional #include line for your assignment, make sure you add it at the bottom of the right section. The term "system include files" refers to include file names that are surrounded by a pair of angled brackets. The term "local include files" refers to include file names that are surrounded by a pair of double quotes.
Your "Makefile" should look like (the indented lines must be indented with <TAB> characters and non-indented lines must not have leading blank characters):
        MYDEFS = -g -Wall -std=c++11 -DLOCALHOST=\"127.0.0.1\"

        lab1b: lab1b.cpp 
                g++ ${MYDEFS} -o lab1b lab1b.cpp
    
        clean:
                rm -f lab1b *.o

Pseudo-code for `lab1c`:

This is pretty straightforward, although one thing you need to be careful about is that a file stream's fail() condition is set to true when getline() cannot read anything from the file. Therefore, in the code below, we call std::getline() first, and then we check the fail() condition of the file stream.
    count = 0
    while (true) do:
        getline(file, line)
        if (file.fail()) then
            break
        end-if
        print line length
        count++
    end-while
    print "number of lines read: " + count
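If you want to see exactly when fail() becomes true, here is a small experiment on an in-memory string stream (not a file, and not the lab solution). The helper name fail_after_each_getline is made up for this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: calls std::getline() the given number of times on
// data and records the value of fail() after each call.
std::vector<bool> fail_after_each_getline(const std::string& data, int calls)
{
    std::istringstream in(data);
    std::vector<bool> result;
    std::string line;
    for (int i = 0; i < calls; i++) {
        std::getline(in, line);
        result.push_back(in.fail());
    }
    return result;
}
```

For the input "ab\ncd" (which has an incomplete last line) and 3 calls, the recorded values are false, false, true. The second call hits end-of-file while reading "cd", but since it did read something, fail() is still false and that line gets counted. Only the third call, which cannot read anything, sets fail() to true.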

Pseudo-code for `lab1d`:

Please note the similarity between this code and the pseudo-code of lab1c above.

    count = 0
    while (true) do:
        return_value = read_a_line(fd, line)
        if (return_value <= 0) then
            break
        end-if
        print line length
        count++
    end-while
    print "number of lines read: " + count

Grading

Below is the grading breakdown:

(1 pt) install standard 32-bit Ubuntu 16.04 into VirtualBox on your laptop or desktop (or do the lab on viterbi-scf1.usc.edu)
(1 pt) submitted a valid lab1.tar.gz file with all the required files using the submission procedure below
(1 pt) content in "lab1b.script", "lab1c.script", and "lab1d.script" are correct (including all the gdb printout)
(1 pt) "Makefile" works for "make lab1c" and "make lab1d"
(1 pt) source code of your "lab1c.cpp" and "lab1d.cpp" looks right

Minimum deduction is 0.5 pt for anything that's incorrect. Please note that for the "Makefile" item, you can only get credit for it if your "source code" is relevant to this lab; therefore, you can only get as many points as the "source code" item in the best case.

Please keep in mind that even though lab grading is "light", it doesn't mean that you can just put anything into your submission! It's still your responsibility to make sure that the files in your submission contain information that's relevant to the tests you were supposed to run. Use the "more" command to view your script/log files to make sure that they contain the right information. If a file has the wrong stuff in it, you should delete it and create the file again and verify. If most of the stuff in your script/log files is wrong and you did not notice it, we will most likely have to take points off.

Submission

To submit your work, you must first tar all the files you want to submit into a tarball and gzip it to create a gzipped tarfile named "lab1.tar.gz". Then you upload "lab1.tar.gz" to our Bistro submission server.

Change into the "lab1" directory you have created above and enter the following command to create your submission file "lab1.tar.gz" (if you don't have any ".h" files, don't include "*.h*" at the end):

    tar cvzf lab1.tar.gz lab1*.script Makefile *.c* *.h*
    ls -l lab1.tar.gz

The last command shows you how big the created "lab1.tar.gz" file is. If "lab1.tar.gz" is larger than 1MB in size, the submission server will not accept it.

If you use an IDE, the IDE may put your source code in subdirectories. In that case, you need to modify the commands above so that you include ALL the necessary source files and subdirectories (and don't include any binary files) and make sure that your code can be compiled without the IDE since the grader is not allowed to use an IDE to compile your code.

You should read the output of the above commands carefully to make sure that "lab1.tar.gz" is created properly. If you don't understand the output of the above commands, you need to learn how to read it! It's your responsibility to ensure that "lab1.tar.gz" is created properly.

To check the content of "lab1.tar.gz", you can use the following command:

    tar tvf lab1.tar.gz

Please read the output of the above command carefully to see what files were included in "lab1.tar.gz" and what are their file sizes and make sure that they make sense.

Please enter your USC e-mail address and your submission PIN below. Then click on the Browse button and locate and select your submission file (i.e., "lab1.tar.gz"). Then click on the Upload button to submit your "lab1.tar.gz". (Be careful what you click! Do NOT submit the wrong file!) If you see an error message, please read the dialog box carefully and fix what needs to be fixed and repeat the procedure. If you don't know your submission PIN, please visit this web site to have your PIN e-mailed to your USC e-mail address.


If the command is executed successfully and if everything checks out, a ticket will be issued to you to let you know "what" and "when" your submission made it to the Bistro server. The next web page you see would display such a ticket and the ticket should look like the sample shown in the submission web page (of course, the actual text would be different, but the format should be similar). Make sure you follow the Verify Your Ticket instructions to verify the SHA1 hash of your submission to make sure that you did not accidentally submit the wrong file. Also, an e-mail (showing the ticket) will be sent to your USC e-mail address. Please read the ticket carefully to know exactly "what" and "when" your submission made it to the Bistro server. If there are problems, please contact the instructor.

It is extremely important that you also verify your submission after you have submitted "lab1.tar.gz" electronically to make sure that everything you have submitted is everything you wanted us to grade. If you don't verify your submission and you end up submitting the wrong files, please understand that due to our fairness policy, there's absolutely nothing we can do.

Finally, please be familiar with the Electronic Submission Guidelines and information on the bsubmit web page.

(5 points total)

Install Standard 32-bit Ubuntu 16.04, Unix/Linux Commands, Hello, Counting, Makefile, GDB
