A Framework for Automatic Memory Leak Detection Designed for Identifying and Analyzing Memory Leaks with its Statistics
S. Poornima1,∗, C. V. Guru Rao2 and S. P. Anandaraj3
2 Principal, 3 Sr. Assistant Professor, SR Engineering College, Warangal, India.
Abstract. For C/C++ applications, memory leaks persist as a significant challenge. Leaky applications slow down over time as their working sets and paging activity grow, and they can eventually stop responding. At the same time, leaks are hard to debug, and advanced applications can contain a large number of them. Existing techniques rely on traditional error-detection mechanisms that detect only objects that were never deallocated. They impose runtime and space overheads on the application and can cause legitimate C/C++ software-intensive applications to fail by retaining memory inappropriately. This paper discusses LeakDetect, a tool that provides dynamic analysis for C/C++ software-intensive systems so that they execute more effectively in the presence of both reachable and unreachable memory leaks. It detects memory leaks via inter-procedural analysis that segregates leaked objects from non-leaked objects, allowing the leaked objects to be paged out to disk completely. The main aim of LeakDetect is to reduce the space consumed by leaks by identifying trends that classify various memory access errors. LeakDetect therefore reduces physical memory use, classifies memory leaks through context flow analysis of the input source code, and provides memory leak tolerance together with memory statistics.
Keywords: Memory leak detection, LeakTraces, Memory statistics, Memory leak report.
1. Introduction
In general, testing software and detecting and resolving defects in an application or its source code saves time and money and enhances software quality. This paper discusses solutions for identifying memory leaks during application execution. These help the developer write reliable source code and build software quality by addressing the following challenges:
• Inability to understand flaws in an application architecture before code is even written, leading to fragile architectures and applications.
• Manually following programming guidelines, industry standards or corporate policies, leading to error-prone code.
• Poor performance or leaked memory in applications due to the inability to understand problems with code at runtime.
• Lack of time and resources to check an application manually.
The LeakDetect tool provides solutions for analyzing the software architecture, static and run-time analysis, memory leak detection and performance profiling of C/C++ code, allowing:
• Confident deployment of applications that are thoroughly tested with automated testing tools at the model, code, and executable level.
• Visibility into which requirements are impacted by problems with the code or architecture, through requirements-to-code traceability.
• Applications that can be tested across heterogeneous and embedded systems.
∗ Corresponding author
© Elsevier Publications 2014.
S. Poornima, et al.
Figure 1. Framework for automatic leak detection.
LeakDetect offers the following tools for architectural, static and runtime analysis. Architectures may allow or disallow unaligned memory access [2]. While no special guidelines are required when unaligned memory access is allowed, the programmer must be careful when it is disallowed. To improve performance, the compiler needs to know the alignment of pointers and arrays in the program. In most cases, the compiler simply cannot tell the alignment. It therefore assumes the worst-case scenario and, as a consequence, avoids memory access optimization. To overcome this lack of information, advanced compilers offer a user interface for specifying the alignment of a given pointer. The compiler then uses this information when considering memory access optimization for the pointer [13].
This paper presents LeakDetect, a tool designed to help software developers and testers detect memory leaks and memory access errors and measure memory utilization. If the input source code reads or writes freed memory, accesses memory beyond array bounds, or reads uninitialized memory, LeakDetect identifies the error at that point. Additionally, it includes a garbage detector that finds and lists existing memory leaks with their statistics [4].
2. Framework for Automatic Leak Detection
When a software application does not release data and therefore consumes more and more memory, the application is said to have a memory leak. That is, the application may be consuming memory and, intentionally or unintentionally, failing to release the memory resources [10]. As a result of the increasing memory usage, resources are taken away from other software applications, the operating system, hardware devices, etc. The memory leak detector automatically determines whether a software application has a memory leak.
In one example, the memory leak detector is a standalone software application that is executed by the operating system [2]. In another example, the memory leak detector is program code that may be embedded in the software application. The detector monitors the memory usage of the software application in order to determine whether it has a memory leak.
2.1 Leak analyzer
Figure 2 is a flow diagram that represents a memory leak detector. The detector may include a memory usage data collector, a memory usage modeler, a leak determination engine, and a leak report generator, and it monitors a software application. The memory usage data collector is responsible for collecting memory usage data for a software application; it queries the operating system for this data [15]. The data collector stores the memory usage data in a data set to be analyzed for an indication that the software application has a memory leak. The memory usage data set includes a plurality of data points collected by the memory usage data collector. The memory usage data collector collects an initial memory usage data point $(t_0, u_0)$, where $u_0$ is the memory usage of the software application and $t_0$ is the time when $u_0$ was collected [15]. The memory usage data collector periodically queries the
Figure 2. Flow representation of leak analysis and generation of memory leak report.
operating system for additional memory usage data over a period of time to obtain an optimal data set for the memory usage modeler. The memory usage data set includes at least two memory usage data points. However, a plurality of data points (e.g., one hundred, one thousand, etc.) may be collected by the memory usage data collector, based on the declaration of the sample constraints below:

Type (T)  = T* | void* | ...
Obj (o)   = ρ | ...
Deref (m) = (*ρ) · f1 ...
Expr (e)  = null | &o | &m | ...
Stmt (s)  = load(m, 0) | store(m, e) | newloc(ρ) | ...

The memory usage modeler is responsible for analyzing the memory usage data set collected by the memory usage data collector. The modeler performs linear regression analysis of the memory usage data set in order to generate a model for the memory usage data [15]. Although linear regression analysis is discussed herein, other modeling methods, such as non-linear regression or least squares regression, may be applied to model the memory usage data. The memory usage modeler accesses the memory usage data set $(t_0, u_0), (t_1, u_1), \ldots, (t_n, u_n)$ for a software application over the period of time $t_1$ through $t_n$, based on the following:

$$T_1 = C_1 \cdot 1 + C_2 \cdot 1 + \cdots + C_n \cdot 1 = C_1 + C_2 + \cdots + C_n = O(1) \tag{1}$$

Since $T_1$ through $T_n$ are all $O(1)$, they represent the runtime of memory accesses. The memory usage modeler then performs linear regression analysis on the data set $(t_i, u_i)$ with $n$ data points to obtain the linear function for memory usage of the software application over time:

$$u = mt + b \tag{2}$$

where $m$ is the slope of the function and $b$ is the vertical axis intercept in equation (2).
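A line of the form in equation (2) can be fitted to the collected samples by ordinary least squares. The following is a minimal sketch in C under stated assumptions, not the authors' implementation: the type names `usage_sample` and `usage_fit` and the function `fit_usage_line` are hypothetical, and `n` is taken to be the number of samples passed in.

```c
#include <assert.h>
#include <stddef.h>

/* One memory usage sample: time t (e.g. seconds) and usage u (e.g. bytes).
   All names in this sketch are illustrative, not part of LeakDetect. */
typedef struct {
    double t;
    double u;
} usage_sample;

/* Result of fitting u = m*t + b. */
typedef struct {
    double m; /* slope: growth in usage per unit of time */
    double b; /* intercept: estimated usage at t = 0 */
} usage_fit;

/* Fit a straight line through n samples by ordinary least squares.
   Returns 0 on success, -1 if the fit is undefined (fewer than two
   samples, or all samples taken at the same time). */
int fit_usage_line(const usage_sample *s, size_t n, usage_fit *out)
{
    assert(s != NULL && out != NULL);
    if (n < 2)
        return -1;

    double sum_t = 0.0, sum_u = 0.0, sum_tu = 0.0, sum_tt = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum_t += s[i].t;
        sum_u += s[i].u;
        sum_tu += s[i].t * s[i].u;
        sum_tt += s[i].t * s[i].t;
    }

    /* Denominator is zero when every sample has the same time stamp. */
    double denom = (double)n * sum_tt - sum_t * sum_t;
    if (denom == 0.0)
        return -1;

    out->m = ((double)n * sum_tu - sum_t * sum_u) / denom;
    out->b = (sum_u - out->m * sum_t) / (double)n;
    return 0;
}
```

A slope that stays positive over many samples suggests growing memory usage. Which slope should count as a leak is a policy decision that the text does not specify, so the sketch leaves it to the caller.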
Table 1. Active-C object invocation for data structures.

Type        Object    Symbol  Size
HashTable   allOrder  H       256
Queue       OrderQ    N       256
Pointer     OrderP    B       256
Array       ArrayL    L       64
Allocation  allocM    A       256
Given the set of data points $(t_i, u_i)$ for $i$ from 0 to $n$, $m$ and $b$ are determined by the memory usage modeler by calculating:

$$m = \frac{n \sum t_i u_i - \sum t_i \sum u_i}{n \sum t_i^2 - \left(\sum t_i\right)^2} \tag{3}$$

$$b = \frac{\sum u_i - m \sum t_i}{n} \tag{4}$$
where the limits of the summations are from 0 to n. Once m and b are obtained by the memory usage modeler, the leak determination engine may use the resulting function (2) to determine whether a software application has a memory leak.
3. Memory Leak Detection
Memory leaks are blocks of allocated memory that the program no longer references. Leaks cause overhead: they waste space by keeping inaccessible data in memory, and they waste time during paging. For C/C++ software-intensive systems that use malloc, memory leaks are errors and should be fixed [6]. In object-oriented software that invokes Active-C objects, the compiler allocates and deallocates objects for the system and thereby avoids memory leaks. However, object-oriented software that invokes both Active-C objects and C-based structures must manage the ownership of objects more directly to ensure that the objects are not leaked [1]. Table 1 presents the processing of sample data structures via Active-C objects. The malloc library provides the routines that allocate memory and reclaims it when free is called. Leaks occur when memory held by a typical data structure is not deallocated. In LeakDetect, the leaks command-line tool searches the virtual memory space of a process for buffers that were allocated by malloc but are no longer referenced. For each leaked buffer it finds, it displays the following information:
• the address of the leaked memory
• the size of the leak (in bytes)
• the contents of the leaked buffer
If leaks can determine that the object is an instance of an Objective-C or Core Foundation object, it also displays the name of the object. A user who does not want to view the contents of each leaked buffer can specify the no context option when calling leaks. If the MallocStackLogging environment variable is set and the application is running in gdb, leaks displays a stack trace describing where the buffer was allocated.
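To make the per-buffer report concrete, the following is a minimal sketch in C of a registry that records the address and size of each live allocation. It is an illustration only, not how LeakDetect or the leaks tool works: the names `tracked_malloc`, `tracked_free` and `report_leaks` are hypothetical, the registry has a fixed capacity, and it reports blocks that were never freed rather than scanning memory for unreferenced buffers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 128 /* capacity of this toy registry */

/* Address and size of one live heap block. */
typedef struct {
    void *addr;
    size_t size;
} block_record;

static block_record records[MAX_TRACKED];
static size_t record_count = 0;

/* malloc wrapper that remembers each successful allocation. */
void *tracked_malloc(size_t size)
{
    if (record_count == MAX_TRACKED)
        return NULL; /* registry full: refuse rather than lose track */
    void *p = malloc(size);
    if (p != NULL) {
        records[record_count].addr = p;
        records[record_count].size = size;
        record_count++;
    }
    return p;
}

/* free wrapper that forgets the block; NULL is ignored, as with free(). */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < record_count; i++) {
        if (records[i].addr == p) {
            records[i] = records[record_count - 1]; /* swap-remove */
            record_count--;
            free(p);
            return;
        }
    }
    assert(!"tracked_free: pointer was not allocated by tracked_malloc");
}

/* Print address and size of every block still live and return the
   total number of bytes that have not been freed. */
size_t report_leaks(FILE *out)
{
    size_t total = 0;
    for (size_t i = 0; i < record_count; i++) {
        fprintf(out, "leak: address %p, %zu bytes\n",
                records[i].addr, records[i].size);
        total += records[i].size;
    }
    return total;
}
```

Calling `report_leaks(stderr)` just before the program exits lists every block that was allocated through the wrapper and never released.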
All these conditions arise during the execution of user-defined source code, which occupies memory while it runs. Figure 3 shows an example of heap allocation during runtime, analyzed as per the SPECjbb2000 benchmark [9].
3.1 Enabling the Malloc debugging features
Detecting and analyzing memory leaks and memory access errors is time consuming. The debugging options, along with their functionality, are defined in Table 2, where MLR stands for Memory Leak Report and MS for Memory Statistics [12]. LeakDetect provides more choices for detecting memory-related problems and also monitors those problems more closely when they actually happen.
4. Implementation
For a tool such as LeakDetect to be worth using, a user must first be using a language, such as C or C++, that is susceptible to the kinds of memory errors LeakDetect finds. Then, the benefits of use must outweigh the costs.
Figure 3. Example of heap allocation graph running as per SPECjbb2000.

Table 2. Call invocation for leak identification for printing in status reports.

Call to Process                  Declared in Library  Functionality                                 Status
Alloc(), Calloc()                Malloc               Memory Allocation                             MLR
Free                             Malloc               Freeing Memory                                MS
Malloc Safeguard                 Malloc               Debugs Memory errors                          MLR
MallocException                  Stderr               Detects Leaks                                 MLR
MemUtility                       Stdio                Define amount of memory used                  MS
MemLog                           Stdio                Defining allocation of Memory logged          MS
MemStackLog                      Memstd               Defines the log file by stack data structure  MLR & MS
MemLeakExe                       Malloc               Defines Leak Trends                           MLR
MallocHeapCheck, MallocHeapEach  Malloc               Defines Heap Corruption                       MS
MallocData                       Malloc               Defines Data Structure Used                   MLR
MallocHeapStart                  Malloc               Defines Heap Corruption at an instance        MLR
Malloc-History                   Malloc Safeguard     Backtrace data                                MS
shm_open, shm_unlink             Memstd               Shared memory functions                       MS
This section first considers the costs of using LeakDetect in terms of its ease of use and performance. It then considers the benefits, that is, its effectiveness in helping programmers find real bugs.
Ease of use. Ease of use has a number of facets: how easy LeakDetect is to obtain, how easy it is to run, and how easy it is to act upon its results.
Running LeakDetect. LeakDetect could hardly be easier to run. For the runtime process, further tokens and objects are designed and declared, as shown in Figure 4. Running a program under it is similar to running a C/C++ program. Typically, the only further effort a user must make is to compile the program with debugging information, to ensure error messages are as informative as possible.
Custom Allocators. There are some cases where the user needs to expend a little more effort. If a program uses custom memory management rather than malloc(), new and new[], LeakDetect can miss some errors it would otherwise find. The problem can be avoided by embedding a small number of client requests into the code. These are special assembly code sequences, encoded as C macros for easy use, that LeakDetect recognizes but which perform a cheap
Figure 4. Slice report of values according to the fop benchmark.
Figure 5. Memory allocation and deallocation report.
Figure 6. Slice memory leak report.
no-op if the client is not running on this tool. They provide a way for the client to pass information to the tool. Figure 5 shows a slice report of the custom allocators implemented.
Self-Modifying code. Extra effort is also needed to handle self-modifying code. Dynamically generated code is not a problem, but if code that has executed is modified and re-executed, the tool will not realize this. Auto-detection is possible but expensive, and the cost is not justified by the small number of programs that use self-modifying code.
Acting upon LeakDetect's Results. In general, LeakDetect's addressability checking, deallocation checking, and overlap checking do quite well here, in that LeakDetect's report of the problem is usually close to the root cause. The tool also provides another client request that can be inserted into the C or C++ code. It instructs LeakDetect to check whether a particular variable or memory address range is defined, issuing an error message if not. Ease of interpreting error messages is also important. Figure 6 shows a slice memory leak report (MLR) produced for a heap access error. Ten survey respondents complained that the messages are confusing, but 7 praised them. One issue in particular is the depth of stack traces: the default is four, but many users immediately adjust that to a much higher number. This gives more information but also makes the error messages longer.
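The client-request idea described under Custom Allocators can be illustrated with a small sketch in C. It swaps in a plain function-pointer hook for the special assembly sequences mentioned in the text, so it shows the shape of the mechanism rather than LeakDetect's actual interface. Every name here (`LD_NOTE_ALLOC`, `LD_NOTE_FREE`, `ld_set_hook`, `pool_alloc`, `pool_reset`, `ld_counting_hook`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hook a hypothetical tool would install; NULL means no tool is attached. */
typedef void (*ld_hook)(const void *addr, size_t size, int is_alloc);
static ld_hook ld_current_hook = NULL;

void ld_set_hook(ld_hook h) { ld_current_hook = h; }

/* Client requests: cost is a single pointer test when no tool is attached. */
#define LD_NOTE_ALLOC(addr, size) \
    do { if (ld_current_hook) ld_current_hook((addr), (size), 1); } while (0)
#define LD_NOTE_FREE(addr, size) \
    do { if (ld_current_hook) ld_current_hook((addr), (size), 0); } while (0)

/* A tiny bump allocator standing in for custom memory management.
   No alignment handling: suitable for byte buffers only in this sketch. */
#define POOL_BYTES 1024
static unsigned char pool[POOL_BYTES];
static size_t pool_used = 0;

void *pool_alloc(size_t size)
{
    if (size == 0 || size > POOL_BYTES - pool_used)
        return NULL;
    void *p = &pool[pool_used];
    pool_used += size;
    LD_NOTE_ALLOC(p, size); /* report memory that malloc never saw */
    return p;
}

/* Release the whole pool at once, as arena allocators typically do. */
void pool_reset(void)
{
    LD_NOTE_FREE(pool, pool_used);
    pool_used = 0;
}

/* Example tool-side hook: tracks how many bytes the client reports live. */
static size_t ld_live_bytes = 0;

void ld_counting_hook(const void *addr, size_t size, int is_alloc)
{
    (void)addr; /* a real tool would record the address range */
    if (is_alloc)
        ld_live_bytes += size;
    else
        ld_live_bytes -= size;
}

size_t ld_reported_live_bytes(void) { return ld_live_bytes; }
```

Without a hook installed the allocator behaves exactly as before, which mirrors the cheap no-op behaviour described above. Installing `ld_counting_hook` lets the tool side see allocations that bypass malloc.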
Figure 7. Memory leak detection rate mapped with various criteria according to the SPECjbb2000 benchmark.
Figure 8. Memory leak rate against allocation and deallocation of data structures according to SPECjbb2000 Benchmark.
5. Experimentation
The following figure shows the results for the serial (nonparallel) C and C++ test cases. Most test cases of the LeakDetect test suites pass; however, there is a small number of test cases where the current implementation still fails. These failures are mostly due to unhandled special cases that have not been implemented yet. One class of errors is not detected by the runtime error checking system at all: floating point errors. These errors occur on a division by zero or on an underflow or overflow in a floating point variable. They are deliberately not handled by this tool because good methods to detect and fix them already exist [8]. They are detected by the floating point unit of the processor, and the program can be configured such that a signal is sent to the process in this case. These program instructions may be provided to a processor to produce a machine, such that the instructions, which execute on the processor, create means for implementing the actions specified in the flowchart block or blocks. Figure 7 shows the graph plotted between memory trends and rate of access.
6. Conclusion
This paper describes LeakDetect, a tool built on a dynamic binary instrumentation framework that can detect several kinds of memory errors common in imperative programs. The tool is easy to install, fits the UNIX paradigm, and needs only one command added to the link line of the makefile to be used on an existing application. In particular, the tool is dedicated to detecting undefined value errors with bit precision. During testing, it reported 102 bugs found in a C application by one user, and more generally by the thousands of programmers who develop application software in C/C++. More importantly, it produced leak reports and executables fast enough
during its development and test process. LeakDetect is a precise leak detector that detects and analyzes unbounded heap growth leaks in data structures in three popular benchmarks: fop, _202_jess and SPECjbb2000. It also reduces space and runtime overheads while executing C/C++ software-intensive systems. It is also accurate, reporting various leaks and access errors according to the benchmark programs.
Acknowledgement
Author S. Poornima would like to acknowledge and thank the Women Scientist Scheme (SR/WOS-A/ET-24/2012), Department of Science and Technology, for the grant-in-aid provided for this research work. This work is supported by SR Engineering College, India. She also expresses her sincere gratitude to the Management and Principal of SR Engineering College for the financial support and encouragement that helped her carry out the research work, and thanks Dr. C. V. Guru Rao for his valuable suggestions.
References
[1] B. Alpern, et al., Implementing Jalapeño in Java. In The Conference on Object-Oriented Programming, Systems, Languages, and Applications, Denver, CO, pp. 314–324, November 1999.
[2] T. M. Chilimbi and M. Hauswirth, Low-overhead memory leak detection using adaptive statistical profiling. In The International Conference on Architectural Support for Programming Languages and Operating Systems, Boston, Massachusetts, pp. 156–164, October 2004.
[3] D. L. Heine and M. S. Lam, A practical flow-sensitive and context-sensitive C and C++ memory leak detector. In The Conference on Programming Language Design and Implementation, pp. 168–181, 2003.
[4] M. Hertz, S. M. Blackburn, J. E. B. Moss, K. S. McKinley and D. Stefanović, Error free garbage collection traces: How to cheat and not get caught. In Proceedings of the International Conference on Measurement and Modeling of Computer Systems, Marina Del Rey, CA, pp. 140–151, June 2002.
[5] Y. Xie and A. Aiken, Context- and path-sensitive memory leak detection. In ACM SIGSOFT Symposium on the Foundations of Software Engineering, Lisbon, Portugal, 2005.
[6] L. O. Andersen, Program Analysis and Specialization for the C Programming Language. PhD thesis, DIKU, University of Copenhagen, 1994.
[7] Andersen, SPEC CPU2006 benchmark suite, 2006, http://www.spec.org/osg/cpu2006/.
[8] D. Bruening, Efficient, transparent, and comprehensive runtime code manipulation. Ph.D. dissertation, M.I.T., September 2004.
[9] Standard Performance Evaluation Corporation, SPEC CPU2000 benchmark suite, 2000, http://www.spec.org/osg/cpu2000/.
[10] S. Cherem, L. Princehouse and R. Rugina, Practical memory leak detection using guarded value-flow analysis. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '07), pp. 480–491, 2007.
[11] M. D. Bond and K. S. McKinley, Bell: bit-encoding online memory leak detection. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '06), San Jose, CA, pp. 61–72, October 2006.
[12] P. R. Wilson, M. S. Johnstone, M. Neely and D. Boles, Dynamic storage allocation: A survey and critical review. In Proceedings of the International Workshop on Memory Management, Kinross, Scotland, Springer, Volume 986, pp. 1–116, September 1995.
[13] D. Liang and M. Harrold, Efficient computation of parameterized pointer information for interprocedural analysis. In Proceedings of the 8th Static Analysis Symposium, 2001.
[14] R. Hastings and B. Joyce, Purify: Fast detection of memory leaks and access errors. In Proceedings of the Winter USENIX Conference, December 1992.
[15] Pavel Macik and Martin, Method and System for Automatic Leak Detection, U.S. patent, August 2012.