
The Mechanics of Memory-Related Software Aging

Autran Macêdo, Taís B. Ferreira, Rivalino Matias Jr.
School of Computer Science
Federal University of Uberlândia
Uberlândia, Brazil
[email protected], [email protected], [email protected]

Abstract— Software aging is a phenomenon defined as the continuing degradation of software systems during runtime, being particularly noticeable in long-running applications. Memory-related aging effects are one of the most important problems in this research field. Therefore, understanding their causes and how they work is a major requirement in designing dependable software systems. In this paper we go deep into how memory management works inside an application process, focusing on two memory problems that cause software aging: fragmentation and leakage. We explain the mechanics of memory-related software aging effects by dissecting a real and widely adopted memory allocator. Along with the theoretical explanation, we present an experimental study that illustrates how memory fragmentation and leakage occur and how they accumulate over time to cause aging-related system failures.

Keywords—Software aging; memory leak; memory fragmentation; memory allocator
I. INTRODUCTION

In the past fifteen years, the software aging phenomenon has been systematically researched and recognized as an important obstacle to achieving dependable software systems. One of its main effects is the depletion of operating system resources, causing system performance degradation or crash/hang failures in running applications. Aging in a software system, as in human beings, is an accumulative process. The accumulating effects of successive error occurrences directly influence the aging-related failure manifestation. Software aging effects are the practical consequences of errors caused by aging-related fault activations. These faults gradually lead the system towards an erroneous state [1]. This gradual shifting is a consequence of the accumulation of aging effects, which is the fundamental nature of the software aging phenomenon.

Memory-related aging effects are among the most frequently cited in the literature (e.g., [2],[3],[4],[5],[6]). Basically, memory-related aging effects are caused by memory leak and memory fragmentation problems. Memory leak [7] is a well-known software defect. The common cause of memory leaks is the incorrect use of memory management routines (e.g., unbalanced use of malloc/free) by programmers. On the other hand, memory fragmentation [8] is a consequence of the system runtime dynamics, being classified as a natural aging effect [2].

In operating systems implementing virtual memory, memory fragmentation is hard to observe because the physically sparse memory can easily be arranged through page tables [9]. However, for specific memory allocation requests, physically contiguous memory pages are required, and in such cases memory fragmentation is a major concern. It is important to note that memory allocation operations can happen either at user level or at kernel level [9]. In both scenarios memory leak and memory fragmentation may occur. At user level, a memory leak increases the size of the affected process with consequences for the entire system, since it reduces the system memory availability. As soon as the affected (aged) process exits, the memory reserved for the process (leaked and unleaked) returns to the operating system. Differently, memory fragmentation at user level affects only the aged application process, since it is restricted to the process heap [9]. At kernel level, memory leak and memory fragmentation problems also occur, but in a more critical way. An operating system kernel, like user applications, also has an internal memory allocator that is used to meet kernel requests for memory blocks. However, while at user level the allocator problems affect only the user applications, at kernel level the whole system is affected.

In this work we describe the mechanics of memory-related aging effects, focusing on memory leak and memory fragmentation at user level. To do so, we dissect the internals of memory management inside an application process, providing a detailed view of how it works.

The rest of this paper is organized as follows. Section II discusses important aspects related to memory allocation operations, emphasizing the specifics of memory allocators. Section III presents two experiments that we conducted to show the occurrence of memory leaks and memory fragmentation, as well as the negative impacts they cause on the affected process. Section IV presents our final remarks.

II. MEMORY ALLOCATION

Memory allocations are among the most ubiquitous operations in computer programs. A memory allocator [10], or simply allocator, is the code responsible for implementing memory allocation operations. The purpose of a memory allocator is heap management [9]. The heap is a portion of main memory located in the data region of a process. This memory is used to meet the requests of a process for dynamic memory allocation.

These requests are handled by functions such as malloc(), realloc(), and free(), which are implemented as part of the memory allocator. When a memory request exceeds the available memory in the heap, the allocator may request more memory from the operating system. This additional memory is then linked into the heap of the process and managed by its memory allocator.

In terms of operating system (OS) architecture, memory allocation operations exist at two levels: kernel level and user level. At kernel level they are implemented by the KMA (kernel memory allocator) [9], which is responsible for providing the memory management needed to satisfy requests for memory areas from the OS subsystems. At user level, memory allocation operations are implemented by a UMA (user-level memory allocator) [10], which is part of the application process. Both classes of allocators are described in the next two subsections.

A. Kernel-level Memory Allocator (KMA)

As in user-space application processes, the OS kernel code also requires dynamic memory allocations to support its operations in areas such as the filesystem, process management, I/O, and so on. Structurally, there is no difference between allocators created to handle kernel memory allocations and those developed for user-space applications. Their core components (metadata and operations) are basically the same. The allocator metadata is used to keep track of allocated and freed memory areas, and the operations are used to manage these areas. It is not uncommon for modern operating systems to implement two or more KMAs, because memory allocation requirements differ among kernel subsystems. For example, the current Linux kernel (2.6.35) brings four main allocators: SLAB [11], SLOB [12], SLUB [12], and SLQB [13].

B. User-level Memory Allocator (UMA)

As introduced above, a UMA is an integral part of user-space applications. It is usually implemented in the standard library that is automatically linked to applications, either statically or dynamically. Therefore, it is up to the programmer to choose the memory allocator for his/her application. However, due to the transparent use of the standard library, it is common for many programmers not to get involved with the details of selecting the memory allocator to be used. Hence, these programmers do not know that plenty of heap management algorithms are available and can be used instead of the default memory allocator. Examples of memory allocators currently available are Hoard [10], jemalloc [14], nedmalloc [15], TCMalloc [16], and TLSF [17], among others. Each allocator algorithm has its specifics, and several research works (e.g., [10], [17], [18]) have investigated the most used implementations.
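Swapping in one of these alternative allocators usually requires no source changes: on Linux, a dynamically linked program can be started against a different allocator by preloading its shared library, for example LD_PRELOAD=/usr/lib/libtcmalloc.so ./app (the library name and path here are illustrative and depend on the installed package).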

All of these algorithms are more or less vulnerable to problems like memory fragmentation and memory leak, which are the main causes of memory-related software aging. In order to understand deeply the circumstances under which these memory-related aging problems occur, it is essential to know the internals of a UMA implementation. Therefore, next we describe in detail how a widely used UMA, named ptmalloc, works.

The ptmalloc allocator [19] is the current default memory allocator embedded in glibc [20], the standard C library of Linux operating systems. This allocator is based on another popular allocator called DLMalloc [21], and incorporates features aimed at multiprocessors running multithreaded programs. The ptmalloc allocator implements multiple heap areas to reduce contention in multithreaded programs. Currently, there are three versions of ptmalloc, and version 2 (ptmallocv2) is the default allocator implemented by glibc on Linux. In this work we chose ptmallocv2 for our experimental study. Figure 1 shows the ptmallocv2 core components.

Figure 1. ptmallocv2’s heap management data structures

These components represent the process heap and its sub-heaps (also known as arenas). An arena keeps the free memory blocks (chunks) of the process heap. Each process has its own heap. There are two classes of blocks: Small Bins and Large Bins. The first class manages memory blocks whose sizes vary from 8 bytes to 512 bytes, and the second deals with blocks of up to 128 kilobytes. Usually, a process's threads share arenas, which can lead to heap contention. However, in ptmallocv2 this problem is avoided by creating new arenas on demand. Whenever a thread requests a memory block and all arenas are in use (locked by other threads), a new arena is created to serve the thread's request. As soon as the request is served, the other threads can also share the recently created arena.

The ptmallocv2 allocator provides memory blocks whose sizes are powers of two, starting from 8 (2^3) bytes. For example, if an application requests a memory block of 20 bytes, ptmallocv2 provides a block of 32 bytes (2^5), the smallest power-of-two block size greater than the requested one. The natural consequence of this approach is internal memory fragmentation. Observe that for a 20-byte request, 12 bytes of the allocated block will be unused, since ptmallocv2 provided a 32-byte block.
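This per-allocation surplus can be observed from inside a process. The following sketch is only an illustration (not part of the original study); it uses malloc_usable_size(), a glibc extension, to compare the size requested by the application with the size actually reserved by the allocator. The exact surplus depends on the allocator version and platform.

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* malloc_usable_size() is a glibc extension */

int main(void) {
    size_t requested = 20;
    void *p = malloc(requested);
    if (p == NULL)
        return 1;
    /* Bytes actually usable in the block returned by the allocator;
       the difference from 'requested' is internal fragmentation. */
    size_t usable = malloc_usable_size(p);
    printf("requested = %zu, usable = %zu, surplus = %zu bytes\n",
           requested, usable, usable - requested);
    free(p);
    return 0;
}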

It is worth noting that this kind of problem is common to many other memory allocation algorithms. Internal memory fragmentation is a particular case of the memory fragmentation problem mentioned in the previous section. Another class of memory fragmentation is known as external fragmentation. External memory fragmentation happens when an application requests a block size that cannot be provided by the memory allocator, although the heap area has enough free space to satisfy the request. This situation happens because the free chunks are non-contiguous and none of them, individually, can satisfy the requested block size. Figure 2 illustrates a hypothetical heap memory of an application.

Figure 2. Example of a fragmented heap

In the scenario illustrated in Figure 2, if the application requests 55 bytes (e.g., malloc(55)), ptmallocv2 cannot provide this block of memory immediately from the heap. Although there are 160 bytes of free blocks, the request cannot be served from the heap because i) none of the blocks is large enough, and ii) the blocks are non-contiguous, so they cannot be coalesced [9]. Coalescing is a feature of ptmallocv2 by which two contiguous free blocks are merged to form a new, bigger block [19]. Therefore, in order to provide a requested block under a fragmented heap, ptmallocv2 may need to request new memory from the operating system, which is significantly slower than satisfying the allocation request from the process's heap area.
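How much free-but-unusable space a fragmented heap holds can also be measured from inside the process. The fragment below is an illustrative sketch (not taken from the paper); it uses mallinfo(), a glibc extension, to report the heap size and the number of free bytes the allocator is holding after every other block in a row of small allocations has been released.

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* mallinfo() is a glibc extension */

int main(void) {
    enum { N = 64 };
    void *p[N];
    int i;

    /* Allocate a row of small blocks, then free every other one so the
       free space is split into non-contiguous chunks. */
    for (i = 0; i < N; i++)
        p[i] = malloc(128);
    for (i = 1; i < N; i += 2) {
        free(p[i]);
        p[i] = NULL;
    }

    struct mallinfo mi = mallinfo();
    printf("heap size: %d bytes, free inside the heap: %d bytes\n",
           mi.arena, mi.fordblks);

    for (i = 0; i < N; i += 2)
        free(p[i]);
    return 0;
}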

III. EXPERIMENTAL STUDY

In this section we present the results of controlled experiments executed to deliberately cause memory fragmentation and memory leak. Our purpose is to show how these two memory-related aging effects occur internally, exposing the mechanics of these effects inside the memory allocator. The test bed is composed of an Intel Pentium Dual Core, 2 GHz, 1 GB RAM, running Linux (kernel 2.6.31-19) with glibc 2.10.1. We created two programs (Mfrag and Mleak) to implement test cases related to memory fragmentation and memory leak, respectively.

A. Memory Fragmentation Experiment

In Linux, when a process is created the kernel allocates an area of 132 kilobytes for the process's heap. At the beginning of the process's life, ptmallocv2 organizes the heap area as a single block of free memory pointed to by a variable known as topchunk. This variable controls the size of the heap and is decremented as the process receives memory blocks that cannot be found in one of the free lists (see Figure 2). The released blocks are put back into one of the free lists. Therefore, the value of topchunk always decreases. When topchunk is less than 16 bytes, ptmallocv2 requests a new arena from the operating system.

To observe this behavior we created the Mfrag program, which consumes the entire heap memory in a controlled way (topchunk equal to 16 bytes). Next, Mfrag releases several blocks in a non-contiguous way (fragmenting the heap), and then asks for a new block of memory. As none of the available (freed) blocks is large enough to accommodate the latter request, since they are not contiguous, we expect to see the process (allocator) issuing a request to the operating system through the sbrk() system call [19]. This system call is responsible for providing new arenas for the caller process. The Mfrag source code is presented in Figure 3. The program excerpts, labeled from 1 to 5, request 142 blocks of different memory sizes and put these blocks in an array, as follows: 64 blocks varying from 8 to 512 bytes (Small Bins class); 31 blocks from 576 to 2048 bytes; 12 blocks; 34 blocks from 8 to 288 bytes; and one block of 200 bytes. Next, the program releases the array's odd positions (excerpt 6) and immediately requests 8 kilobytes (excerpt 7).

We monitor the program execution using strace, a system call tracer that records the system calls invoked by a process. We observe that when the program asks for 8 kilobytes (excerpt 7), the sbrk() system call is issued although there are 65,440 free bytes in the heap. It is important to highlight that although the total free memory (65,440 bytes) in the heap is larger than the amount requested (8,192 bytes), a new arena is requested because the heap is suffering from external memory fragmentation. Requesting a new arena from the operating system implies entering kernel mode, which introduces an additional overhead that penalizes the process performance.

Figure 3. The Mfrag source code
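The listing in Figure 3 is truncated here, so the following is a reconstruction based only on the description above: 142 blocks allocated in the five groups, the odd array positions released, and then an 8-kilobyte request. The sizes used inside each group are assumptions for illustration and are not the authors' original values.

#include <stdio.h>
#include <stdlib.h>

#define MAX 142

int main(void) {
    void *p[MAX];
    int i, n = 0;

    /* (1) 64 blocks from 8 to 512 bytes, in 8-byte steps (Small Bins class). */
    for (i = 0; i < 64; i++)
        p[n++] = malloc(8 + 8 * i);

    /* (2) 31 blocks in the 576-to-2048-byte range (step size assumed). */
    for (i = 0; i < 31; i++)
        p[n++] = malloc(576 + 49 * i);

    /* (3) 12 blocks; sizes are not given in the text, 5120 bytes assumed. */
    for (i = 0; i < 12; i++)
        p[n++] = malloc(5120);

    /* (4) 34 blocks in the 8-to-288-byte range (8-byte steps assumed). */
    for (i = 0; i < 34; i++)
        p[n++] = malloc(8 + 8 * i);

    /* (5) One block of 200 bytes. */
    p[n++] = malloc(200);

    /* (6) Release the array's odd positions, leaving non-contiguous holes
       separated by blocks still in use. */
    for (i = 1; i < MAX; i += 2) {
        free(p[i]);
        p[i] = NULL;
    }

    /* (7) Request 8 kilobytes. In the original experiment the heap is fully
       consumed and fragmented at this point, so the request cannot be served
       from the freed chunks and the allocator grows the heap via brk/sbrk;
       with the illustrative sizes above the exact trigger point may differ. */
    void *big = malloc(8 * 1024);
    printf("8 KB block at %p\n", big);
    return 0;
}

Running such a program under strace (e.g., strace -e trace=brk ./mfrag) makes the additional brk/sbrk request triggered by excerpt 7 visible, which is how the behavior described above was observed.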
