Programming Assignment #5

(100 points total)

Memory Management

Due 11:45PM 7/29/2025 (firm)

[ This is a BACKUP server. Do NOT use unless the primary server is DOWN. ]

[ This content is protected and may not be shared, uploaded, or distributed. ]

This spec is private (i.e., only for students who took or are taking CSCI 350 at USC). You do not have permissions to display this spec at a public place (such as a public bitbucket/github). You also do not have permissions to display the code you write to implementation this spec at a public place since your code was written to implement a private spec. (If a prospective employer asks you to post your code, please tell them that you do not have permissions to do so; but you can send them a private copy and ask them to keep it private.)

Shortcuts:

The goal of this assignment is to thoroughly understand how memory management and paging works in operating systems. We will achieve this goal by implementing Copy-on-Write (CoW) fork feature in xv6.

Start by reading Chapter 2 of the xv6 book. This is a very good guide to the details of how paging works in xv6.

Your "standard" system already has all the tools you need. To begin, open a Terminal and type the following: By default, your command shell is bash and your current working directory is /home/student.

    cd ~/cs350
    mkdir pa5
    cd pa5
    wget --user=USERNAME --password=PASSWORD http://bourbon.usc.edu/cs350-m25/programming/pa5/xv6-pa5-src.tar.gz
where USERNAME and PASSWORD are the username and password used to access protected content from our class web site.

After you have downloaded the xv6-pa5-src.tar.gz file, do:

    tar xvf xv6-pa5-src.tar.gz
    cd xv6-pa5-src
To make your life easier and our testing easier, you should run xv6 on only a single CPU (the default is two). To do this, in your Makefile, replace CPUS := 2 with CPUS := 1. When the grader grade this assignment, the grader will only grade with CPUS := 1.

Now you are all set up!

Make sure you watch the lecture that covers the PA5 slides.

Prior to updating the memory system of xv6, you will develop a tool to assist in testing the current memory usage of the processes in the system. To do so, enhance the capability of the xv6’s process details viewer. Start by running xv6 and then press <Ctrl+p>l;. You should see a detailed account of the currently running processes. Try running any program (e.g., usertests) and press <Ctrl+p>l; again (during and after its execution). The <Ctrl+p>l; command provides important information on each of the processes. However, it reveals no information regarding the current memory state of each process. Add the required changes to the function handling <Ctrl+p>l; (see procdump() in proc.c) so that the user space memory state of all the processes in the system will also be printed as written below. The memory state includes the mapping from virtual pages to physical pages (for all the used pages) and the value of the writable flag.

For each process, the following should be displayed (words in italics should be replaced by the appropriate values): Process information currently printed with <Ctrl+p>l; page mappings:

    virtual page number -> physical page number, writable?
    ...
    virtual page number -> physical page number, writable?
For example, in a system with 2 processes, the information should be displayed as follows:
    1 sleep init 80104907 80104647 8010600a 80105216 801063d9 801061dd
    1 -> 300, y
    200 -> 500, n
    2 sleep sh 80104907 80100966 80101d9e 8010113d 80105425 80105216 801063d9
    1 -> 306, y
    200 -> 500, n
The above is showing that the first process (init) has 2 pages. Virtual page number 1 is mapped to physical page number 300 and its writable bit is set. The virtual page number 200 is mapped to physical page number 500 and its writable bit is not set. The second process (sh) also has 2 pages. Virtual page number 1 is mapped to physical page number 306 and its writable bit is set. The virtual page number 200 is mapped to physical page number 500 and its writable bit is not set. The hex numbers after a process name are return addresses in the process's call stack.

Note that:

  • The values above are made up.
  • Don’t print anything for pages or page tables that are currently unused.
  • Only print pages that are user pages (PTE_U).
  • The information obtained from ^P will be used for the grading.

Most changes will be in vm.c.

Upon execution of a fork() system call, a process is duplicated together with its memory. The XV6 implementation of fork() entails copying all the memory pages to the child process, a scheme which may require a long time to complete due to the many memory accesses needed. In a normal program, many memory pages will have no change at all following fork() (e.g., the text segment), furthermore, many times fork() is followed by a call to exec, in which copying the memory becomes redundant. In order to account for these cases, modern operating systems employ a scheme of copy-on-write, in which a memory page is copied only when it needs to be modified. This way, the child and parent processes share equal memory pages and unneeded memory accesses are prevented.

In this task you will implement the COW optimization in XV6. To accomplish this task, change the behavior of the fork() system call so that it will not copy memory pages, and the virtual memory of the parent and child processes will point to the same physical pages. Note that it is OK to copy page tables of a parent process (though you may also share/cow the page tables if you would like to).

In order to prevent a change to a shared page (and be able to copy it), render each shared page as read-only for both the parent and the child. Now, when a process would try and write to the page, a page fault will be raised and then the page should be copied to a new place (and the previous page should become writeable once again).

In order to be able to distinguish between a shared read-only page and a non-shared read-only page, add another flag to each virtual page to mark it as a shared page. Notice that a process may have many children, which in turn, also have many children, thus a page may be shared by many processes. For this reason, it becomes difficult to decide when a page should become writeable. To facilitate such a decision, add a counter for each physical page to keep track of the number of processes sharing it (it is up to you to decide where to keep these counters and how to handle them properly). Since a limit of 64 processes is defined in the system, a counter of 1 byte per page should suffice. Since the wait system call deallocates all of the process's user space memory, it requires additional attention – remember do not deallocate the shared pages.

Note that:

  • When a page fault occurs, the faulty address is written to the cr2 register.
  • The CR3 register holds a pointer to the top-level page directory, and entries from this page table are cached in the hardware managed TLB. The OS has no control over the TLB; it can only build the page table. Whenever any changes are made to the page table, the TLB entries may not be valid anymore. So, whenever you make any changes to the page table of a process, you must re-install that page table by writing its address into CR3, using the following function provided by xv6.
        lcr3(V2P(pgdir));
    This operation ensures that the old TLB entries are flushed as well. Note that xv6 already does this TLB flush when switching context and address spaces, but you may have to do it additionally in your code when you modify any page table entries as part of your CoW implementation.

Use testcow-parent.c (already included in your repository):
    $ testcow-parent
    testcow-parent: space=40952
    1 sleep  init 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57210, y
    2 -> 57206, y
    2 sleep  sh 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57140, y
    1 -> 57138, n
    3 -> 57135, y
    3 run    testcow-parent
    0 -> 57060, y
    2 -> 57057, y
    3 -> 57276, y
    4 -> 57277, y
    5 -> 57279, y
    6 -> 57280, y
    7 -> 57281, y
    8 -> 57282, y
    9 -> 57283, y
    10 -> 57284, y

    Parent process before changing values

    1 sleep  init 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57210, y
    2 -> 57206, y
    2 sleep  sh 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57140, y
    1 -> 57138, n
    3 -> 57135, y
    3 run    testcow-parent
    0 -> 57060, n
    2 -> 57134, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57283, n
    10 -> 57284, n
    4 zombie testcow-parent
    0 -> 57060, n
    2 -> 57057, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57283, n
    10 -> 57284, n

    Parent process after changing values

    1 sleep  init 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57210, y
    2 -> 57206, y
    2 sleep  sh 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57140, y
    1 -> 57138, n
    3 -> 57135, y
    3 run    testcow-parent
    0 -> 57060, n
    2 -> 57134, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57056, y
    10 -> 57055, y
    4 zombie testcow-parent
    0 -> 57060, n
    2 -> 57057, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57283, n
    10 -> 57284, n
    $ 
Use testcow-child.c (already included in your repository)
    $ testcow-child
    testcow-child: space=40952
    1 sleep  init 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57210, y
    2 -> 57206, y
    2 sleep  sh 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57140, y
    1 -> 57138, n
    3 -> 57135, y
    3 run    testcow-child
    0 -> 57060, y
    2 -> 57057, y
    3 -> 57276, y
    4 -> 57277, y
    5 -> 57279, y
    6 -> 57280, y
    7 -> 57281, y
    8 -> 57282, y
    9 -> 57283, y
    10 -> 57284, y

    Child process before changing values

    1 sleep  init 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57210, y
    2 -> 57206, y
    2 sleep  sh 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57140, y
    1 -> 57138, n
    3 -> 57135, y
    3 sleep  testcow-child 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57060, n
    2 -> 57134, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57283, n
    10 -> 57284, n
    4 run    testcow-child
    0 -> 57060, n
    2 -> 57057, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57283, n
    10 -> 57284, n

    Child process after changing values

    1 sleep  init 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57210, y
    2 -> 57206, y
    2 sleep  sh 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57140, y
    1 -> 57138, n
    3 -> 57135, y
    3 sleep  testcow-child 80103db7 80103e59 801048b7 80105979 801056b4
    0 -> 57060, n
    2 -> 57134, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57283, n
    10 -> 57284, n
    4 run    testcow-child
    0 -> 57060, n
    2 -> 57057, y
    3 -> 57276, n
    4 -> 57277, n
    5 -> 57279, n
    6 -> 57280, n
    7 -> 57281, n
    8 -> 57282, n
    9 -> 57056, y
    10 -> 57055, y
    $

The grading guidelines has been made available. Please run the commands in the grading guidelines on a standard 32-bit Ubuntu 16.04 system. It is possible that there are bugs in the guidelines. If you find bugs, please let the instructor know as soon as possible. (Note: if I change the grading guidelines, I will make an announcement in the class Piazza. Please remember that you are required to read my posts in the class Piazza.)

The grading guidelines is the only grading procedure we will use to grade your program. No other grading procedure will be used. Please note that the grader may use a different set of test programs and commandline arguments when grading your submission. (We may make minor changes if we discover bugs in the script or things that we forgot to test.) It is strongly recommended that you run your code through the commands in the grading guidelines.

If you have made all your code changes in-place (which you are requied to do), it will be very easy to create a gzipped tarfile for submission using the Bistro system. To create your submission file, just do the following:
    make pa5-submit
and the following command should run:
    tar cvzf pa5-submit.tar.gz \
            Makefile \
            pa5-README.txt \
            proc.c \
            proc.h \
            syscall.c \
            syscall.h \
            sysproc.c \
            user.h \
            usys.S \
            defs.h \
            trap.c \
            mmu.h \
            vm.c
Please note that you are not permitted to change any other existing files. If you submit another existing file, it will be deleted before grading. If you give instructions to the grader to modify another existing file (other than Config.mk), it will disregarded.

A pa5-README.txt template file is provided here. You must save it as your pa5-README.txt file and follow the instructions in it to fill it out with appropriate information and include it in your submission. You must not delete a single line from w1-README.txt. Please make sure that you satisfy all the README requirements.

To check the content of pa5-submit.tar.gz, you can use the following command:

    tar tvzf pa5-submit.tar.gz
Please read the output of the above command carefully to see what files were included in pa5-submit.tar.gz and what are their file sizes and make sure that they make sense.

Please enter your USC e-mail address and your submission PIN below. Then click on the Browse button and locate and select your submission file (i.e., pa5-submit.tar.gz). Then click on the Upload button to submit your pa5-submit.tar.gz. (Be careful what you click! Do NOT submit the wrong file!) If you see an error message, please read the dialogbox carefully and fix what needs to be fixed and repeat the procedure. If you don't know your submission PIN, please visit this web site to have your PIN e-mailed to your USC e-mail address.

When this web page was last loaded, the time at the submission server (merlot.usc.edu) was 28Nov2025-01:56:08. Reload this web page to see the current time on merlot.usc.edu.

USC Email:
 @usc.edu
Submission PIN:
Event ID (read-only):
Submission File Full Path:
 
 
If the command is executed successfully and if everything checks out, a ticket will be issued to you to let you know "what" and "when" your submission made it to the Bistro server. The next web page you see would display such a ticket and the ticket should look like the sample shown in the submission web page (of course, the actual text would be different, but the format should be similar). Make sure you follow the Verify Your Ticket instructions to verify the SHA1 hash of your submission to make sure what you did not accidentally submit the wrong file. Also, an e-mail (showing the ticket) will be sent to your USC e-mail address. Please read the ticket carefully to know exactly "what" and "when" your submission made it to the Bistro server. If there are problems, please contact the instructor.

It is extreme important that you also verify your submission after you have submitted pa5-submit.tar.gz electronically to make sure that every you have submitted is everything you wanted us to grade. If you don't verify your submission and you ended up submit the wrong files, please understand that due to our fairness policy, there's absolutely nothing we can do.

Finally, please be familiar with the Electronic Submission Guidelines and information on the bsubmit web page.