Operating Systems and Systems Integration
Tutorial on Memory Management, Deadlock and Operating System Types: Solutions
Nick Urbanik, ver. 1.4

1 Background

1.1 Memory management

Virtual memory: a method of managing memory automatically by the operating system, so that the combined size of memory available to all the processes running on the computer may be more than the physical memory available, by using the hard disk to hold what is not in physical memory.

Swapping: moving all the memory contents of a process from ram to hard disk, and back again, as the operating system needs memory to run other processes. This is inefficient for large processes, and results in holes: the ram allocated to two large processes may have a gap between them that is too small to be used by any process.

Paging: all memory is divided into fixed-size chunks called pages. The pages in ram used by a process do not have to be contiguous (next to one another), so there is no problem of holes.

Memory Management Unit: hardware that sits between the cpu and the system buses, translating the virtual addresses used by programs into physical addresses. The mmu is connected as shown in figure 1. The mmu and cpu organisation fix the size of the pages, the page tables and various other aspects of paging.

Figure 1: The memory management unit is shown here converting virtual addresses from the cpu to physical addresses.

Virtual addresses: the addresses used by a user's program. The programmer can write the program as if it has access to as much memory as required, without worrying about whether the addresses are used by other processes. All addresses in the program are virtual addresses, and are translated by the mmu to physical addresses.

Figure 2: A single-level paging system. Virtual memory addresses are 32-bit; pages are 4 kB each.

Figure 3: A multi-level paging system. On the Intel platform, virtual memory addresses are 32-bit and pages are 4 kB each. The page tables are themselves pages, also of 4 kB each. Since each page table entry is 4 bytes in size on the Intel platform, there are 1024 = 2^10 entries in each of the page tables, so ten bits are required for the page number part of the virtual address. The page directory is itself a 4 kB page, and each entry is 4 bytes, so there are 1024 entries, one for each page table, and ten bits are required for the directory part of the virtual address.

What is going on in those diagrams? To make sense of figure 3, let's look at some of the main points. First, this is a representation of an Intel memory management system; other architectures have different page sizes and different numbers of levels of page tables (e.g., the Compaq Alpha has three levels; see later).

• The virtual address is the address used by any user program running on the computer.
• The page contains any data used by user programs, including the executable programs themselves.
• The directory table and page table are both pages themselves, and are the same size.
• On the Intel architecture, each entry in the page and directory tables is four bytes in size.
• The most significant 10 bits of the virtual address are an index into the directory table, and are in the range 0 to 2^10 − 1 = 1023. Let's call this the directory number.
• Similarly, the ten bits for the page number are also an index in the range 0 to 1023.
• The cr3 register is a cpu register that stores the address of the directory table.
• To determine the address of the current directory table entry, the mmu adds the directory number × the size of a directory table entry (4 bytes) to the contents of cr3.
• To determine the address of the current page table, the mmu reads the content of that directory table entry.
• To determine the address of the current page table entry, the mmu adds the page number × the size of a page table entry (4 bytes) to the address of the current page table.
• To determine the address of the current page, the mmu reads the content of that page table entry.
• Finally, to determine the physical address of the data (the content of the virtual address), the mmu adds the offset from the virtual address to the address of the current page.
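To make these steps concrete, here is a minimal runnable sketch in C (my addition, not part of the tutorial). It follows the tutorial's simplified model, ignoring the flag bits that real Intel directory and page table entries keep in their low 12 bits; read_phys(), write_phys(), the toy table addresses and the example virtual address are all invented for illustration.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A toy "physical memory" so the sketch can actually run. */
static uint8_t phys_mem[64 * 1024];

/* Hypothetical helper standing in for the mmu reading a 4-byte entry. */
static uint32_t read_phys(uint32_t paddr)
{
    uint32_t value;
    memcpy(&value, &phys_mem[paddr], sizeof value);
    return value;
}

static void write_phys(uint32_t paddr, uint32_t value)
{
    memcpy(&phys_mem[paddr], &value, sizeof value);
}

static uint32_t translate(uint32_t cr3, uint32_t vaddr)
{
    uint32_t dir    = (vaddr >> 22) & 0x3ff;   /* most significant 10 bits  */
    uint32_t page   = (vaddr >> 12) & 0x3ff;   /* next 10 bits              */
    uint32_t offset =  vaddr        & 0xfff;   /* least significant 12 bits */

    uint32_t dir_entry  = cr3 + dir * 4;          /* directory entry address */
    uint32_t page_table = read_phys(dir_entry);   /* -> page table address   */
    uint32_t pt_entry   = page_table + page * 4;  /* page table entry address*/
    uint32_t page_addr  = read_phys(pt_entry);    /* -> page address         */

    return page_addr + offset;                    /* the physical address    */
}

int main(void)
{
    /* Toy setup: directory at 0x1000, one page table at 0x2000, and the
       page holding our data at 0x3000 (all invented values). */
    uint32_t cr3 = 0x1000, vaddr = 0x00400123;    /* dir 1, page 0, offset 0x123 */

    write_phys(cr3 + 1 * 4, 0x2000);              /* directory entry 1  */
    write_phys(0x2000 + 0 * 4, 0x3000);           /* page table entry 0 */

    /* prints: 0x400123 -> 0x3123 */
    printf("0x%x -> 0x%x\n", (unsigned)vaddr, (unsigned)translate(cr3, vaddr));
    return 0;
}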
Why we need multilevel paging (single level paging won't do): Many people seemed unable to understand why the single-level page table shown in the lectures needs 2^20 × 4 bytes of ram, and why multilevel paging does not. The key is that for virtual memory to work, there must be page table entries for the entire virtual address space. If the virtual addresses are 32 bits long, and each page is 4 kilobytes (2^12 bytes) in size, then there must be 2^32 ÷ 2^12 = 2^(32−12) = 2^20 entries, one for each page. On the Intel platform, each entry is 4 bytes, so the total size of all page table entries is 4 × 2^20 bytes = 4 MB.

The single-level paging scheme shown in figure 2 keeps the page table in one piece, so it must all be in ram. It is silly to keep all this in memory at once, since most page table entries are never used; for example, the virtual memory in use on today's computers is unlikely to reach 4 GB unless the computer is a very busy server. The solution is to keep only the necessary page table entries in ram, and to move any that have not been used for some time out to the hard disk. In other words, it is more sensible to split the page table into pages that can themselves be paged in and out of memory. This is what multilevel paging achieves.

Explaining the "Example of paging: Intel x86": The example in the notes has confused many people, since many of you do not know the meaning of 0x20000000 (or what 0xnnnn means). The prefix 0x indicates a hexadecimal value in the C programming language. You can have statements such as:

int i = 0x123;

which assigns the value 123 hexadecimal (291 decimal) to the variable i. All that I am showing here is what the components of the virtual address are. The example simply shows the components of the virtual address 0x20021406 (i.e., 20021406 in hexadecimal). The offset within the page is the least significant 12 bits, i.e., 0x406. The page number is the next 10 bits, i.e., 0x21. Note that the only allowable page numbers are 0 to 0x3F, since the process has been allocated pages in the range 0x20000000–0x2003FFFF. If any address used by this process were outside that range, the os would terminate the process with a segmentation fault.

The most significant ten bits give the directory. Writing 0x20021406 digit by digit in binary:

    hex:     2    0    0    2    1    4    0    6
    binary:  0010 0000 0000 0010 0001 0100 0000 0110

If you take the ten most significant bits, you get 0010000000 in binary, i.e., 0x080.
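As a quick check of the arithmetic above (my addition, not part of the original notes), this small C program splits 0x20021406 into its three fields with the shifts and masks implied by the 10/10/12 bit layout:

#include <stdio.h>

int main(void)
{
    unsigned vaddr  = 0x20021406;
    unsigned offset =  vaddr        & 0xfff;   /* least significant 12 bits */
    unsigned page   = (vaddr >> 12) & 0x3ff;   /* next 10 bits              */
    unsigned dir    =  vaddr >> 22;            /* most significant 10 bits  */

    /* prints: directory=0x80 page=0x21 offset=0x406 */
    printf("directory=0x%x page=0x%x offset=0x%x\n", dir, page, offset);
    return 0;
}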
The Compaq Alpha architecture uses a three-level paging system: The Compaq Alpha processor has 64-bit virtual addresses and a 64-bit data bus. It also has 8 kB pages. There are a lot of 8 kB pages in 2^64 bytes: 8 kB = 2^13 bytes, so we have 2^64 ÷ 2^13 = 2^(64−13) = 2^51 pages. Each page must have one page table entry, so we would need 2^51 = 2,251,799,813,685,248 page table entries. This is too many to fit in a two-level paging system, so the Alpha memory management system uses three-level paging.

Figure 4: A three-level paging system.

Linux uses a three-level paging system: Since some architectures, such as the Compaq Alpha, use three-level paging, Linux implements a three-level paging system to ensure portability.
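As a rough cross-check (my addition, derived only from the figures given above), the same numbers show how many virtual-address bits each level of table can translate, assuming 8 kB pages and 8-byte entries throughout:

    entries per 8 kB table = 2^13 ÷ 2^3 = 2^10, i.e. 10 index bits per level
    two levels:    10 + 10 + 13 (offset) = 33 bits of virtual address
    three levels:  10 + 10 + 10 + 13     = 43 bits of virtual address

So even three levels cover only part of a 64-bit virtual address space, but far more than two levels could.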
1.2 Types of Operating System

Types of OS: There are four main types of operating system (according to Andrew Tanenbaum, Modern Operating Systems):

Monolithic OS is an OS where there is one level of privilege, and where all kernel functions execute in the same address space. Andrew described this organisation as "the big mess", which would be very hard to port to new architectures.

Layered Kernel is an organisation where the kernel is divided into different layers, and a special system call is required to communicate between each layer. Each layer may have a different level of privilege. However, there is some overhead in communicating between the layers, and dividing operating system functions into layers that minimise communication between the layers is hard.

Microkernel client-server is an organisation where as much of the OS's function as possible is done in "user space" servers rather than at the privileged level of the kernel. The kernel mainly routes requests from client programs to these user-space servers. This should result in a much more portable OS with a very small kernel.

Virtual Machine is another organisation for operating systems; the most famous example is the IBM 390 mainframe. It now runs Linux on hundreds or thousands of virtual machines, each with its own virtual hardware. If one virtual machine crashes, the others are unaffected: the crashed virtual machine can simply be rebooted. The main advantage is that the peak demands on all the servers can be averaged out. Normally, a smaller server must be made powerful enough to handle its peak demand, and most of the time this extra capacity is unused. With hundreds of virtual servers, some virtual servers can meet peak demands (and so get more processing power from the mainframe), while others that have less than average demand use less cpu power. In this way, the total cpu power of the mainframe can be less than the total cpu power of many racks of smaller servers (and require much less electrical power, less air-conditioning and cooling, and less physical space), but still meet all the peak demands placed on the individual virtual servers. VMware provides a software equivalent of a virtual machine; the IBM 390 has hardware support.

Andy was wrong!

Linux has a monolithic organisation, whereas Windows NT/2000 has a microkernel organisation. Andrew Tanenbaum argued with Linus Torvalds about the design of Linux for a long time. However, history has shown that Andy was wrong in some respects. The Linux kernel is much smaller and simpler than the Windows kernel, and is far easier to understand. The Linux kernel divides the hardware control into dynamically loadable kernel modules. Linux runs on a huge number of hardware platforms; Windows has reduced the number of platforms it can run on to one.

2 Questions

2.1 Memory Management

1. What is virtual memory?
Solution: Virtual memory is a method of managing memory automatically by the operating system, so that the combined size of memory available to all the processes running on the computer may be more than the physical memory available, by using the hard disk to hold what is not in physical memory.

2. Why do we need memory management?
Solution: Because programs grow as fast as ram becomes cheaper, and because we sometimes need to run programs that need more ram than we actually have, we need virtual memory. Virtual memory also removes the need for the programmer to be aware of the details of what memory a particular system will provide to each program.

3. What is a Memory Management Unit (mmu)?
Solution: A memory management unit is hardware that maps virtual addresses onto physical memory. The mmu is connected as shown in figure 1.

4. Paging in a virtual memory system allows nearly all contents of ram to be paged to and from the hard disk. However, there must be some pages that are never swapped to disk. List one example, and give reasons.
Solution: One example is the kernel code and data that implement paging itself, such as the page-fault handler and the page tables currently in use: if these were ever paged out, there would be no way to page them back in.

5. For a hypothetical computer architecture, the following assumptions are made:
• we are using two-level paging;
• the virtual addresses are 64 bits in size;
• the size of a page is 8 kB;
• the size of both a page table entry and a directory table entry is 8 bytes;
• the size of the directory table and the page table are as close to each other as possible.
Calculate the size of the page directory and the page table, in bytes. Note: they will need to be bigger than a single page.
Solution:
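One possible calculation (my addition; the handout's own worked answer is not shown in this extract), taking "page table" to mean one second-level table:

    page size 8 kB = 2^13 bytes, so the offset takes 13 bits
    remaining index bits = 64 − 13 = 51, split as evenly as possible: 26 + 25
    directory table = 2^26 entries × 8 bytes = 2^29 bytes = 512 MB
    each page table = 2^25 entries × 8 bytes = 2^28 bytes = 256 MB

(or 256 MB and 512 MB respectively, if the smaller index is given to the directory). Either way, both tables are far bigger than a single 8 kB page, as the note in the question says.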
Figure 5: An example of an Intel two-level paging system. The virtual address is split into directory bits 00 1000 0000, page number bits 00 0010 0001 and offset bits 0100 0000 0110; the cr3 register holds 0x00af0000, the coloured directory entry points to a page table at 0x110ff000, the coloured page table entry points to a page at 0x11a00000, and the coloured entry in that page contains 0x2a.

6. Figure 5 shows an example of an Intel two-level memory management system, with both page directory entries and page table entries occupying four bytes. Some values are shown. If this shows a snapshot of the system while a program accesses a character variable:

(a) What is the address of the variable as the program sees it?
The virtual address, 0x20021406.

(b) What is the address of the coloured entry in the directory table?
The cpu register cr3 + directory index × size of a page directory entry = 0xaf0000 + 0x80 × 4 = 0xaf0200.

(c) What is the address of the coloured entry in the page table?
The content of the page directory entry + page index × size of a page table entry = 0x110ff000 + 0x21 × 4 = 0x110ff084.

(d) What is the physical address of the variable?
The content of the page table entry + offset = 0x11a00000 + 0x406 = 0x11a00406.

(e) What is the content of the variable?
The content of that location in the page, 0x2a = 42 decimal.

2.2 Deadlock

1. What is deadlock in the context of a computer operating system?
Solution: Deadlock is a condition where a set of processes are each waiting for a resource that is being exclusively held by another process in the set.

2. List four conditions for deadlock to occur in a computer's operating system.
Solution:
Mutual exclusion: only one process can use a resource at any one time.
Hold and wait: processes holding resources granted earlier can request new resources.
No preemption: resources given to a process cannot be forcibly taken away by the operating system.
Circular wait: each process is waiting for a resource held by another.

Figure 6: Two processes that can deadlock.
Process P: Get A; Get B; Release A; Release B.
Process Q: Get B; Get A; Release B; Release A.

3. Two processes are shown in figure 6.

(a) Show some steps that each process can take so that deadlock must happen.
Solution: There are two main scenarios in which deadlock can occur. Here is one:
1. Process P gets resource A
2. Process Q gets resource B
3. Process P requests resource B
4. Process Q requests resource A
and here is the other:
1. Process Q gets resource B
2. Process P gets resource A
3. Process Q requests resource A
4. Process P requests resource B
In fact, steps 3 and 4 could take place in either order; it doesn't affect the fact that we have deadlock.

(b) Explain one method that you could use to avoid any possibility of these two processes being deadlocked.
Solution: There are many methods (see the lecture notes). They all depend on removing one or more of the conditions for deadlock described in question 2. I include a number of methods here:
Prevent circular wait: order the allocation of resources so that they are always allocated in the same order. So if both P and Q Get A, then Get B, then Release A, then Release B, circular wait is eliminated, and there is no deadlock.
Prevent mutual exclusion: here we prevent the first condition. In most cases this is not a choice; a resource can either be shared or it cannot. However, if process P and process Q can both hold A, or both hold B, at the same time, then there is no deadlock.
Prevent hold and wait: here we do not allow either process P or process Q to proceed until it has all the resources it needs. Alternatively, we could modify either P or Q so that it does not hold two resources at a time; then deadlock is impossible. We could modify P as shown in figure 7. Note that in this example, Q is not changed at all.
Allow preemption: if the os can forcibly remove A from process P, then process Q can continue. Similarly, if the os removes B from Q, then process P can continue, and there is no deadlock.

Figure 7: Two processes that cannot deadlock, because one of the processes does not hold one resource while waiting for another.
Process P: Get A; Release A; Get B; Release B.
Process Q: Get B; Get A; Release B; Release A.
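To make the first of these methods concrete, here is a minimal sketch in C (my addition, not from the tutorial), using two POSIX mutexes as stand-ins for resources A and B and two threads standing in for processes P and Q. Both now acquire A before B, so circular wait, and therefore deadlock, cannot occur; swapping the two lock calls in thread Q back to the original order of figure 6 would reintroduce the risk. Compile with: cc demo.c -pthread

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;   /* resource A */
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;   /* resource B */

static void *process_P(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&A);          /* Get A */
    pthread_mutex_lock(&B);          /* Get B */
    puts("P holds A and B");
    pthread_mutex_unlock(&A);        /* Release A */
    pthread_mutex_unlock(&B);        /* Release B */
    return NULL;
}

static void *process_Q(void *arg)
{
    (void)arg;
    /* Originally Q did Get B then Get A; acquiring in the same order
       as P (A before B) removes the possibility of deadlock. */
    pthread_mutex_lock(&A);
    pthread_mutex_lock(&B);
    puts("Q holds A and B");
    pthread_mutex_unlock(&B);
    pthread_mutex_unlock(&A);
    return NULL;
}

int main(void)
{
    pthread_t p, q;
    pthread_create(&p, NULL, process_P, NULL);
    pthread_create(&q, NULL, process_Q, NULL);
    pthread_join(p, NULL);
    pthread_join(q, NULL);
    return 0;
}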
4. A client-server application has at least two processes, each on a different computer. The client makes requests of the server; the server responds by transferring data over the network to the client. Generally, the client initiates the requests; the server does not initiate any requests. However, both the client and the server perform two-way communication over the network. Describe how a client-server application can be deadlocked.
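The tutorial's solution to this question is not included in this extract. As an illustration only, the sketch below (my addition, assuming blocking Unix-domain stream sockets) shows one classic way such a deadlock arises: the client has sent its whole request and blocks reading the reply, while the server blocks reading more request data that the client will never send, so each side waits for the other. Running the program simply hangs, which is the point.

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sv[2];
    char buf[64];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 1;

    if (fork() == 0) {                    /* child: the "server" */
        close(sv[0]);
        read(sv[1], buf, sizeof buf);     /* read the request ...           */
        read(sv[1], buf, sizeof buf);     /* ... then wait (forever) for
                                             more data before replying      */
        write(sv[1], "reply", 5);
        return 0;
    }

    /* parent: the "client" */
    close(sv[1]);
    write(sv[0], "request", 7);           /* send the whole request         */
    read(sv[0], buf, sizeof buf);         /* block waiting for a reply that
                                             will never come                */
    puts("client got a reply");           /* never reached: deadlock        */
    return 0;
}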