Valgrind Software Quick Start Manual

Valgrind Documentation

Release 3.8.0 10 August 2012

AUTHORS

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled The GNU Free Documentation License.

This is the top level of Valgrind's documentation tree. The documentation is contained in six logically separate documents, as listed in the following Table of Contents. To get started quickly, read the Valgrind Quick Start Guide. For full documentation on Valgrind, read the Valgrind User Manual.


Summary of Contents for Valgrind Software

Page 2 Valgrind Documentation
Table of Contents
  The Valgrind Quick Start Guide
  Valgrind User Manual
  Valgrind FAQ
  Valgrind Technical Documentation
  Valgrind Distribution Documents
  GNU Licenses...
Page 4: Table Of Contents
The Valgrind Quick Start Guide
Table of Contents
  1. Introduction
  2. Preparing your program
  3. Running your program under Memcheck
  4. Interpreting Memcheck’s output
  5. Caveats
  6. More information...
Page 5: Preparing Your Program
1. Introduction

The Valgrind tool suite provides a number of debugging and profiling tools that help you make your programs faster and more correct. The most popular of these tools is called Memcheck. It can detect many memory-related errors that are common in C and C++ programs and that can lead to crashes and unpredictable behaviour.
Page 6

    #include <stdlib.h>

    void f(void)
    {
        int* x = malloc(10 * sizeof(int));
        x[10] = 0;        // problem 1: heap block overrun
    }                     // problem 2: memory leak -- x not freed

    int main(void)
    {
        f();
        return 0;
    }

Most error messages look like the following, which describes problem 1, the heap block overrun:...
Page 7: Valgrind Faq
OpenOffice.org and Firefox are Memcheck-clean, or very close to it.

6. More information

Please consult the Valgrind FAQ and the Valgrind User Manual, which have much more information. Note that the other tools in the Valgrind distribution can be invoked with the --tool option.
Page 9
  2.11. Limitations
  2.12. An Example Run
  2.13. Warning Messages You Might See
  3. Using and understanding the Valgrind core: Advanced Topics
  3.1. The Client Request mechanism
  3.2. Debugging your program using Valgrind gdbserver and GDB
  3.2.1. Quick Start: debugging in 3 steps
  3.2.2.
Page 10
  4.5.3. Putting it all together
  4.6. Memcheck Monitor Commands
  4.7. Client Requests
  4.8. Memory Pools: describing and working with custom allocators
  4.9. Debugging MPI Parallel Programs with Valgrind
  4.9.1. Building and installing the wrappers
  4.9.2. Getting started
  4.9.3. Controlling the wrapper library
  4.9.4.
Page 11
  6.2.3. Counting global bus events
  6.2.4. Avoiding cycles
  6.2.5. Forking Programs
  6.3. Callgrind Command-line Options
  6.3.1. Dump creation options
  6.3.2. Activity options
  6.3.3. Data collection options
  6.3.4. Cost entity separation options
  6.3.5. Simulation options
  6.3.6. Cache simulation options
  6.4.
Page 12
  12.2. Using Basic Block Vectors to create SimPoints
  12.3. BBV Command-line Options
  12.4. Basic Block Vector File Format
  12.5. Implementation
  12.6. Threaded Executable Support
  12.7. Validation
  12.8. Performance
  13. Lackey: an example tool
  13.1. Overview
  13.2. Lackey Command-line Options
  14. Nulgrind: the minimal Valgrind tool
  14.1. Overview...
Page 13: Introduction
Valgrind is closely tied to details of the CPU and operating system, and to a lesser extent, the compiler and basic C libraries. Nonetheless, it supports a number of widely-used platforms, listed in full at http://www.valgrind.org/.
Page 14: How To Navigate This Manual
1.2. How to navigate this manual This manual’s structure reﬂects the structure of Valgrind itself. First, we describe the Valgrind core, how to use it, and the options it supports. Then, each tool has its own chapter in this manual. You only need to read the documentation for the core and for the tool(s) you actually use, although you may ﬁnd it helpful to be at least a little bit familiar with...
Page 15: Using And Understanding The Valgrind Core
Your program is then run on a synthetic CPU provided by the Valgrind core. As new code is executed for the ﬁrst time, the core hands the code to the selected tool. The tool adds its own instrumentation code to this and hands the result back to the core, which coordinates the continued execution of this instrumented code.
Page 16: Getting Started
First off, consider whether it might be beneﬁcial to recompile your application and supporting libraries with debugging info enabled (the -g option). Without debugging info, the best Valgrind tools will be able to do is guess which function a particular piece of code belongs to, which makes both error messages and proﬁling output nearly useless. With -g, you’ll get messages which point directly to the relevant source code lines.
Page 17 • portnumber: changes the port it listens on from the default (1500). The speciﬁed port must be in the range 1024 to 65535. The same restriction applies to port numbers speciﬁed by a --log-socket to Valgrind itself. If a Valgrinded process fails to connect to a listener, for whatever reason (the listener isn’t running, invalid or unreachable host or port, etc), Valgrind switches back to writing the commentary to stderr.
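The port rule above is easy to check before a listener or a --log-socket argument is built. The following is a minimal sketch of that check, not code taken from Valgrind; the helper name parse_log_port and its return convention are assumptions made for illustration, while the default (1500) and the range (1024 to 65535) come from the text.

```c
#include <errno.h>
#include <stdlib.h>

/* Default and limits as stated in the text above. */
enum { LOG_PORT_DEFAULT = 1500, LOG_PORT_MIN = 1024, LOG_PORT_MAX = 65535 };

/* Hypothetical helper: parse a port number given as text.
   Returns the port on success, or -1 if the text is malformed or the
   value is out of range. A NULL or empty string selects the default. */
int parse_log_port(const char *text)
{
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0')
        return LOG_PORT_DEFAULT;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;                  /* not a plain decimal number */
    if (value < LOG_PORT_MIN || value > LOG_PORT_MAX)
        return -1;                  /* outside the permitted range */
    return (int)value;
}
```

A caller would reject the argument when the result is -1, for example parse_log_port("80"), and otherwise use the returned port.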
Page 18: Reporting Of Errors
(current or freed) heap block, for example reading freed memory, Valgrind reports not only the location where the error happened, but also where the associated heap block was allocated/freed. Valgrind remembers all error reports. When an error is detected, it is compared against old reports, to see if it is a duplicate.
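The duplicate check described above can be pictured with a small model. This is a simplified sketch and not Valgrind's actual data structures: the ErrorRecord and ErrorLog types, the record_error function and the choice to compare four stack frames are all assumptions. The cap of 1000 distinct errors mirrors the limit the manual mentions under "Warning Messages You Might See".

```c
#include <stddef.h>
#include <string.h>

#define MAX_DISTINCT_ERRORS 1000  /* limit quoted later in the manual */
#define FRAMES_COMPARED 4         /* assumption: frames that identify an error */

typedef struct {
    int kind;                               /* hypothetical error-kind code */
    unsigned long frames[FRAMES_COMPARED];  /* innermost call-stack addresses */
    unsigned count;                         /* times this error was seen */
} ErrorRecord;

typedef struct {
    ErrorRecord records[MAX_DISTINCT_ERRORS];
    size_t used;
} ErrorLog;

/* Returns 1 if the error is new and should be reported, 0 if it is a
   duplicate of an earlier report or the log is already full. */
int record_error(ErrorLog *errlog, int kind,
                 const unsigned long frames[FRAMES_COMPARED])
{
    size_t i;

    for (i = 0; i < errlog->used; i++) {
        ErrorRecord *r = &errlog->records[i];
        if (r->kind == kind &&
            memcmp(r->frames, frames, sizeof r->frames) == 0) {
            r->count++;             /* duplicate: only bump the counter */
            return 0;
        }
    }
    if (errlog->used == MAX_DISTINCT_ERRORS)
        return 0;                   /* too many distinct errors: ignore */

    errlog->records[errlog->used].kind = kind;
    memcpy(errlog->records[errlog->used].frames, frames,
           sizeof errlog->records[0].frames);
    errlog->records[errlog->used].count = 1;
    errlog->used++;
    return 1;
}
```

The linear scan also shows why a cap is useful: every new error is compared against all stored ones, so the cost grows with the size of the collection.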
Page 19 flexible specification of errors to suppress. If you use the -v option, at the end of execution, Valgrind prints out one line for each used suppression, giving its name and the number of times it got used. Here are the suppressions used by a run of valgrind --tool=memcheck ls -l:

  --27579-- supp: 1 socketcall.connect(serv_addr)/__libc_connect/__nscd_getgrgid_r...
Page 20
• Next line: a small number of suppression types have extra information after the second line (eg. the Param suppression for Memcheck)
• Remaining lines: This is the calling context for the error -- the chain of function calls that led to it. There can be up to 24 of these lines.
Page 21: Core Command-Line Options
2.6. Core Command-line Options As mentioned above, Valgrind’s core accepts a common set of options. The tools also accept tool-speciﬁc options, which are documented separately for each tool.
Page 22 Note that Valgrind does trace into the child of a fork (it would be difﬁcult not to, since fork makes an identical copy of a process), so this option is arguably badly named. However, most children of fork calls immediately call exec anyway.
Page 23: Error-Related Options
--log-file=<filename> Speciﬁes that Valgrind should send all of its messages to the speciﬁed ﬁle. If the ﬁle name is empty, it causes an abort. There are three special format speciﬁers that can be used in the ﬁle name.
Page 24 So this doesn’t affect the total number of errors reported. The maximum value for this is 500. Note that higher settings will make Valgrind run a bit more slowly and take a bit more memory, but can be useful when working with programs with deeply-nested call chains.
Page 25: Caveats
Valgrind detects any errors. This is useful for using Valgrind as part of an automated test suite, since it makes it easy to detect test cases for which Valgrind has reported errors, just by inspecting return codes.
Page 26 The prompt’s behaviour is the same as for the --db-attach option (see below). If you choose to, Valgrind will print out a suppression for this error. You can then cut and paste it into a suppression ﬁle if you don’t want to hear about the error in the future.
Page 27 This option allows you to change the threshold to a different value. You should only consider use of this option if Valgrind’s debug output directs you to do so. In that case it will tell you the new threshold you should specify.
Page 28: Malloc-Related Options
This is only really of signiﬁcance on 32-bit machines. On Linux, you may request a stack of size up to 2GB. Valgrind will stop with a diagnostic message if the stack cannot be allocated. --main-stacksize only affects the stack size for the program’s initial thread. It has no bearing on the size of thread stacks, as Valgrind does not allocate those.
Page 29 When enabled, Valgrind will read information about variable types and locations from DWARF3 debug info. This slows Valgrind down and makes it use more memory, but for the tools that can take advantage of it (Memcheck, Helgrind, DRD) it can result in more precise error messages. For example, here are some standard errors issued by...
Page 30 5000] As part of its main loop, the Valgrind scheduler will poll to check if some activity (such as an external command or some input from a gdb) has to be handled by gdbserver. This activity poll will be done after having run the given number of basic blocks (or slightly more than the given number of basic blocks).
Page 31 Enable special handling for certain system calls that may block in a FUSE ﬁle-system. This may be necessary when running Valgrind on a multi-threaded program that uses one thread to manage a FUSE ﬁle-system and another thread to access that ﬁle-system.
Page 32 * and ? wildcards. --soname-synonyms=syn1=pattern1,syn2=pattern2,... When a shared library is loaded, Valgrind checks for functions in the library that must be replaced or wrapped. For example, Memcheck replaces all malloc related functions (malloc, free, calloc, ...) with its own versions. Such replacements are done by default only in shared libraries whose soname matches a predeﬁned soname pattern (e.g.
Page 33: Debugging Options
The actual thread scheduling remains under control of the OS kernel. What this does mean, though, is that your program will see very different scheduling when run on Valgrind than it does when running normally. This is both because Valgrind is serialising the threads, and because the code runs so much slower than normal.
Page 34: Scheduling And Multi-Thread Performance
Depending on your Linux distribution, CPU frequency scaling may be controlled using a graphical interface or using command line such as cpufreq-selector or cpufreq-set. An alternative way to avoid these problems is to tell the OS scheduler to tie a Valgrind process to a speciﬁc (ﬁxed) CPU using the taskset command.
Page 35: Handling Of Signals
• --enable-inner This builds Valgrind with some special magic hacks which make it possible to run it on a standard build of Valgrind (what the developers call "self-hosting"). Ordinarily you should not use this option as various kinds of safety checks are disabled.
Page 36: If You Have Problems
Limitations for the known limitations of Valgrind, and for a list of programs which are known not to work on it. All parts of the system make heavy use of assertions and internal self-checks. They are permanently enabled, and we have no plans to disable them.
Page 37 If you regenerate code over the top of old code (i.e. at the same memory addresses) and the code is on the stack, Valgrind will realise the code has changed and work correctly. This is necessary to handle the trampolines GCC uses to implement nested functions. If you regenerate code somewhere other than the stack, and you are running on a 32- or 64-bit x86 CPU, you will need to use the --smc-check=all option, and Valgrind will run more slowly than normal.
Page 38: An Example Run
Essentially the same: no exceptions, and limited observance of rounding mode. Also, switching the VFP unit into vector mode will cause Valgrind to abort the program -- it has no way to emulate vector uses of VFP at a reasonable performance level.
Page 39: Warning Messages You Might See
After 1000 different errors have been detected, Valgrind ignores any more. It seems unlikely that collecting even more different ones would be of practical help to anybody, and it avoids the danger that Valgrind spends more and more of its time comparing new errors against an ever-growing collection. As above, the 1000 number is a...
Page 40 Valgrind spotted such a large change in the stack pointer that it guesses the client is switching to a different stack. At this point it makes a kludgey guess where the base of the new stack is, and sets memory permissions accordingly.
Page 41: Using And Understanding The Valgrind Core: Advanced Topics
RUNNING_ON_VALGRIND: Returns 1 if running on Valgrind, 0 if running on the real CPU. If you are running Valgrind on itself, returns the number of layers of Valgrind emulation you’re running on.
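One common use of this request is to relax timing assumptions when the program is being emulated. The sketch below assumes the client-request header is installed as <valgrind/valgrind.h>; when it is not found, a stub that always yields 0 is defined so the code still compiles. The stub, the helper name scaled_timeout_ms and the factor of 20 per layer are illustrative assumptions, not figures from the manual.

```c
/* Use the real header when available; the fallback is an assumption of
   this example and simply behaves like a run on the real CPU. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef RUNNING_ON_VALGRIND
#  define RUNNING_ON_VALGRIND 0
#endif

/* Hypothetical helper: lengthen a timeout for each layer of emulation. */
unsigned scaled_timeout_ms(unsigned base_ms)
{
    unsigned layers = (unsigned)RUNNING_ON_VALGRIND;  /* 0 on the real CPU */
    unsigned factor = 1;

    while (layers-- > 0)
        factor *= 20;   /* arbitrary illustrative slowdown per layer */
    return base_ms * factor;
}
```

On the real CPU the function returns base_ms unchanged, so the check costs almost nothing in normal runs.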
Page 42: More Information
VALGRIND_NON_SIMD_CALL[0123]: Executes a function in the client program on the real CPU, not the virtual CPU that Valgrind normally runs code on. The function must take an integer (holding a thread ID) as the ﬁrst argument and then 0, 1, 2 or 3 more arguments (depending on which client request is used).
Page 43: Debugging Your Program Using Valgrind Gdbserver And Gdb
VALGRIND_STACK_* calls. Valgrind will use this information to determine if a change to the stack pointer is an item pushed onto the stack or a change over to a new stack. Use this if you’re using a user-level thread package and are noticing spurious errors from Valgrind about uninitialized memory reads.
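A user-level thread package would typically register each stack when it is created and deregister it when it is released. The following is a minimal sketch under stated assumptions: the CoroutineStack type and both functions are hypothetical, the header is assumed to be installed as <valgrind/valgrind.h>, and no-op stand-ins for VALGRIND_STACK_REGISTER and VALGRIND_STACK_DEREGISTER are defined when it is missing.

```c
#include <stdlib.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef VALGRIND_STACK_REGISTER
/* Fallback stubs (an assumption of this example) so the sketch compiles
   without the Valgrind headers. */
#  define VALGRIND_STACK_REGISTER(start, end) 0u
#  define VALGRIND_STACK_DEREGISTER(id) ((void)(id))
#endif

/* Hypothetical descriptor for one user-level thread stack. */
typedef struct {
    char *base;
    size_t size;
    unsigned valgrind_id;   /* id returned by the register request */
} CoroutineStack;

/* Allocate a stack and tell Valgrind about its address range.
   Returns 0 on success, -1 if the allocation fails. */
int coroutine_stack_init(CoroutineStack *s, size_t size)
{
    s->base = malloc(size);
    if (s->base == NULL)
        return -1;
    s->size = size;
    s->valgrind_id = VALGRIND_STACK_REGISTER(s->base, s->base + size);
    return 0;
}

/* Deregister the range before the memory is returned to the heap. */
void coroutine_stack_destroy(CoroutineStack *s)
{
    VALGRIND_STACK_DEREGISTER(s->valgrind_id);
    free(s->base);
    s->base = NULL;
    s->size = 0;
}
```

Deregistering before free keeps Valgrind from treating a later, unrelated heap block at the same address as a stack.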
Page 44: Valgrind Gdbserver Overall Organisation
Valgrind’s gdbserver, communication is done via a pipe and a small helper program called vgdb, which acts as an intermediary. If no GDB is in use, vgdb can also be used to send monitor commands to the Valgrind gdbserver from a shell command line.
Page 45

  ==2418== Command: ./prog
  ==2418==
  ==2418== (action at startup) vgdb me ...

GDB (in another shell) can then be connected to the Valgrind gdbserver. For this, GDB must be started on the program prog:

  gdb ./prog

You then indicate to GDB that you want to debug a remote target:...
Page 46: Connecting To An Android Gdbserver
  [Switching to Thread 2479]
  0x001f2850 in _start () from /lib/ld-linux.so.2
  (gdb)

Once GDB is connected to the Valgrind gdbserver, it can be used in the same way as if you were debugging the program natively:
• Breakpoints can be inserted or deleted.
Page 47: Monitor Command Handling By The Valgrind Gdbserver
GDB server, but you will need to explicitly enable it using the flag --vgdb=yes or --vgdb=full. Additionally, you will need to select a temporary directory which is (a) writable by Valgrind, and (b) supports FIFOs. This is the main difficulty. Often, /sdcard satisfies requirement (a), but fails for (b) because it is a VFAT file system and VFAT does not support pipes.
Page 48 The Valgrind gdbserver will execute the monitor command itself, if it recognises it to be a Valgrind core monitor command. If it is not recognised as such, it is assumed to be tool-speciﬁc and is handed to the tool for execution. For example:...
Page 49: Valgrind Gdbserver Thread Information
3.2.6. Valgrind gdbserver thread information

Valgrind’s gdbserver enriches the output of the GDB info threads command with Valgrind-specific information. The operating system’s thread number is followed by Valgrind’s internal index for that thread ("tid") and by the Valgrind scheduler thread state:

  (gdb) info threads
    4 Thread 6239 (tid 4 VgTs_Yielding) 0x001f2832 in _dl_sysinfo_int80 () from /lib/ld-linux.so...
Page 50: Limitations Of The Valgrind Gdbserver
When Valgrind gdbserver stops on an error, on a breakpoint or when single stepping, register and flag values might not always be up to date due to the optimisations done by the Valgrind core. The default value --vex-iropt-register-updates=unwindregs-at-mem-access ensures that the registers needed to make a stack trace (typically PC/SP/FP) are up to date at each memory access (i.e.
Page 51 It also has no limitation on the length of the memory zone being watched. Using GDB version 7.4 or later allows full use of the flexibility of the Valgrind gdbserver’s simulated hardware watchpoints.
Page 52 "target" command. Debugging will not work because GDB will then not be able to fetch the registers from the Valgrind gdbserver. For ARM programs using the Thumb instruction set, you must use a GDB version of 7.1 or later, as earlier versions have problems with next/step/breakpoints in Thumb code.
Page 53 When ptrace is disabled in vgdb, a query packet sent by GDB may take signiﬁcant time to be handled by the Valgrind gdbserver. In such cases, GDB might encounter a protocol timeout. To avoid this, you can increase the value of the timeout by using the GDB command "set remotetimeout".
Page 54: Vgdb Command Line Options
Usage: vgdb [OPTION]... [[-c] COMMAND]...

vgdb ("Valgrind to GDB") is a small program that is used as an intermediary between Valgrind and GDB or a shell. Therefore, it has two usage modes:

1. As a standalone utility, it is used from a shell command line to send monitor commands to a process running under Valgrind.
Page 55: Valgrind Monitor Commands
• -l instructs a standalone vgdb to report the list of the Valgrind gdbserver processes running and then exit.
• -D instructs a standalone vgdb to show the state of the shared memory used by the Valgrind gdbserver. vgdb will exit after having shown the Valgrind gdbserver shared memory state.
Page 56 When sent from a standalone vgdb, if this is the last command, the Valgrind process will continue the execution of the guest process. The typical usage of this is to use vgdb to send a "no-op" command to a Valgrind gdbserver so as to continue the execution of the guest process.
Page 57: Function Wrapping
To become active, the wrapper merely needs to be present in a text section somewhere in the same process’ address space as the function it wraps, and for its ELF symbol name to be visible to Valgrind. In practice, this means either...
Page 58: Wrapping Speciﬁcations
Instead, the result lvalue, OrigFn and arguments are handed to one of a family of macros of the form CALL_FN_*. These cause Valgrind to call the original and avoid recursion back to the wrapper.
Page 59: Wrapping Semantics
The ability for a wrapper to replace an inﬁnite family of functions is powerful but brings complications in situations where ELF objects appear and disappear (are dlopen’d and dlclose’d) on the ﬂy. Valgrind tries to maintain sensible behaviour in such situations.
Page 60: Debugging
A second possible problem is that of conﬂicting wrappers. It is easily possible to load two or more wrappers, both of which claim to be wrappers for some third function. In such cases Valgrind will complain about conﬂicting wrappers when the second one appears, and will honour only the ﬁrst one.
Page 61: Limitations - Original Function Signatures
3.3.6. Limitations - original function signatures

As shown in the above example, to call the original you must use a macro of the form CALL_FN_*. For technical reasons it is impossible to create a single macro to deal with all argument types and numbers, so a family of macros covering the most common cases is supplied.
Page 62: Memcheck: A Memory Error Detector
4. Memcheck: a memory error detector To use this tool, you may specify --tool=memcheck on the Valgrind command line. You don’t have to, though, since Memcheck is the default tool. 4.1. Overview Memcheck is a memory error detector. It can detect the following problems that are common in C and C++ programs.
Page 63: Use Of Uninitialised Values
freed. Likewise, if it should turn out to be just off the end of a heap block, a common result of off-by-one errors in array subscripting, you’ll be informed of this fact, and also where the block was allocated. If you use the --read-var-info option Memcheck will run more slowly but may give a more detailed description of any...
Page 64: Use Of Uninitialised Or Unaddressable Values In System Calls
To see information on the sources of uninitialised data in your program, use the --track-origins=yes option. This makes Memcheck run more slowly, but can make it much easier to track down the root causes of uninitialised value errors.
Page 65: Illegal Frees
4.2.4. Illegal frees

For example:

  Invalid free()
     at 0x4004FFDF: free (vg_clientmalloc.c:577)
     by 0x80484C7: main (tests/doublefree.c:10)
   Address 0x3807F7B4 is 0 bytes inside a block of size 177 free’d
     at 0x4004FFDF: free (vg_clientmalloc.c:577)
     by 0x80484C7: main (tests/doublefree.c:10)

Memcheck keeps track of the blocks allocated by your program with malloc/new, so it knows exactly whether the argument to free/delete is legitimate.
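A report like the one above, where the same block is freed twice, is often avoided by clearing the pointer at the point of release. The sketch below shows that idiom; the helper name release_ints is hypothetical and is not part of Valgrind or of the test program named in the report.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block and clear the caller's pointer.
   A second call then passes NULL to free(), which the C standard
   defines as doing nothing, instead of freeing the block again. */
void release_ints(int **pp)
{
    if (pp == NULL)
        return;
    free(*pp);
    *pp = NULL;
}
```

This only protects the pointer that is passed in; other copies of the same address still dangle, which is exactly the case Memcheck's block tracking is designed to catch.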
Page 66: Overlapping Source And Destination Blocks
The worst thing is that on Linux apparently it doesn’t matter if you do mix these up, but the same program may then crash on a different platform, Solaris for example. So it’s best to fix it properly. According to the KDE folks "it’s amazing how many C++ programmers don’t know this".
Page 67
• It might be a pointer to an array of C++ objects (which possess destructors) allocated with new[]. In this case, some compilers store a "magic cookie" containing the array length at the start of the allocated block, and return a pointer to just past that magic cookie, i.e.
Page 68
• "Possibly lost". This covers cases 5--8 (for the BBB blocks) above. This means that a chain of one or more pointers to the block has been found, but at least one of the pointers is an interior-pointer. This could just be a random value in memory that happens to point into a block, and so you shouldn’t consider this ok unless you know you have interior-pointers.
Page 69: Memcheck Command-Line Options
  64 bytes in 4 blocks are still reachable in loss record 2 of 4
     at 0x..: malloc (vg_replace_malloc.c:177)
     by 0x..: mk (leak-cases.c:52)
     by 0x..: main (leak-cases.c:74)

  32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
     at 0x..: malloc (vg_replace_malloc.c:177)
     by 0x..: mk (leak-cases.c:52)
     by 0x..: main (leak-cases.c:80)
Page 70

--undef-value-errors=<yes|no> [default: yes]
    Controls whether Memcheck reports uses of undefined value errors. Set this to no if you don’t want to see undefined value errors. It also has the side effect of speeding up Memcheck somewhat.

--track-origins=<yes|no>...
Page 71: Writing Suppression Files
--freelist-big-blocks=<number> [default: 1000000]
    When making blocks from the queue of freed blocks available for re-allocation, Memcheck preferentially re-circulates the blocks with a size greater than or equal to --freelist-big-blocks. This ensures that freeing big blocks (in particular freeing blocks bigger than --freelist-vol) does not immediately lead to a re-circulation of all (or a lot of) the small blocks in the free list.
Page 72: Details Of Memcheck's Checking Machinery
• Addr1, Addr2, Addr4, Addr8, Addr16, meaning an invalid address during a memory access of 1, 2, 4, 8 or 16 bytes respectively.
• Jump, meaning a jump to an unaddressable location error.
• Param, meaning an invalid system call parameter error.
•...
Page 73

    int i, j;
    int a[10], b[10];
    for ( i = 0; i < 10; i++ ) {
        j = a[i];
        b[i] = j;
    }

Memcheck emits no complaints about this, since it merely copies uninitialised values from a[] into b[], and doesn’t use them in a way which could affect the behaviour of the program.
Page 74: Valid-Address (A) Bits
So s1 occupies 8 bytes, yet only 5 of them will be initialised. For the assignment s2 = s1, GCC generates code to copy all 8 bytes wholesale into s2 without regard for their meaning. If Memcheck simply checked values as they came out of memory, it would yelp every time a structure assignment like this happened.
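The 8-versus-5 byte arithmetic can be reproduced with sizeof and offsetof. The sketch below assumes a struct made of an int followed by a char, which gives those figures on common ABIs with a 4-byte int; the struct definition and the function names are assumptions for illustration, and the exact sizes are platform-dependent.

```c
#include <stddef.h>

/* Assumed layout: an int followed by a char. */
struct S { int i; char c; };

/* Bytes written when both members are assigned. */
size_t initialised_bytes(void)
{
    return sizeof(int) + sizeof(char);
}

/* Padding bytes the compiler adds so that arrays of S stay aligned. */
size_t padding_bytes(void)
{
    return sizeof(struct S) - initialised_bytes();
}

/* Offset of the first trailing padding byte within the struct. */
size_t first_padding_offset(void)
{
    return offsetof(struct S, c) + sizeof(char);
}

/* Structure assignment: the compiler may copy every byte of s1,
   including the padding that was never written. */
struct S copy_struct(int i, char c)
{
    struct S s1, s2;

    s1.i = i;
    s1.c = c;
    s2 = s1;
    return s2;
}
```

With a 4-byte int this yields 5 initialised bytes and 3 padding bytes, and copy_struct performs exactly the kind of wholesale copy that must not be reported on its own.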
Page 75: Memcheck Monitor Commands
Until that happens, all attempts to access it will elicit an invalid-address error, as you would hope. 4.6. Memcheck Monitor Commands The Memcheck tool provides monitor commands handled by Valgrind’s built-in gdbserver (see Monitor command handling by the Valgrind...
Page 76 (gdb) The command get_vbits cannot be used with registers. To get the validity bits of a register, you must start Valgrind with the option --vgdb-shadow-registers=yes. The validity bits of a register can be obtained by printing the ’shadow 1’ corresponding register. In the below x86 example, the register eax has all its bits undeﬁned, while the register ebx is fully deﬁned.
Page 77 The ﬁrst command outputs one entry having an increase in the leaked bytes. The second command is the same as the ﬁrst command, but uses the abbreviated forms accepted by GDB and the Valgrind gdbserver. It only outputs the summary information, as there was no increase since the previous leak search.
Page 78 ==19520== (gdb) Note that when using Valgrind’s gdbserver, it is not necessary to rerun with --leak-check=full --show-reachable=yes to see the reachable blocks. You can obtain the same information without rerunning by using the GDB command monitor leak_check full reachable any (or, using abbreviation: mo l f r a).
Page 79 The second call shows the pointers (start and interior pointers) to block G. The block G (0x40281A8) is reachable via block C (0x40280a8) and register ECX of tid 1 (tid is the Valgrind thread id). It is "interior reachable"...
Page 80: Client Requests
ﬁrst byte for which the property is not true. Always returns 0 when not run on Valgrind. • VALGRIND_CHECK_VALUE_IS_DEFINED: a quick and easy way to ﬁnd out whether Valgrind thinks a particular value (lvalue, to be precise) is addressable and deﬁned.
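A range check of this kind is usually placed just before data leaves the program or feeds a computation. The sketch below uses VALGRIND_CHECK_MEM_IS_DEFINED and assumes the header is installed as <valgrind/memcheck.h>; when it is missing, a stand-in that always yields 0 is defined, matching the "returns 0 when not run on Valgrind" behaviour described above. The helper checksum_checked and its checksum formula are hypothetical.

```c
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif
#ifndef VALGRIND_CHECK_MEM_IS_DEFINED
/* Fallback stub (an assumption of this example): report "no problem". */
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) 0
#endif

/* Hypothetical helper: checksum a buffer, first asking Memcheck whether
   every byte in it is defined. *fully_defined is set to 1 when the
   request returns 0, which is always the case outside Valgrind. */
unsigned checksum_checked(const unsigned char *buf, size_t len,
                          int *fully_defined)
{
    unsigned sum = 0;
    size_t i;

    *fully_defined = (VALGRIND_CHECK_MEM_IS_DEFINED(buf, len) == 0);
    for (i = 0; i < len; i++)
        sum = sum * 31u + buf[i];
    return sum;
}
```

Because the request evaluates to 0 outside Valgrind, the flag cannot be used to detect problems in a normal run; it is only informative under Memcheck.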
Page 81: Memory Pools: Describing And Working With Custom Allocators
VALGRIND_CREATE_BLOCK returns a "block handle", which is a C int value. You can pass this block handle to VALGRIND_DISCARD. After doing so, Valgrind will no longer relate addressing errors in the specified range to the block. Passing invalid handles to VALGRIND_DISCARD is harmless.
Page 82 Keep in mind that the last two points above say "typically": the Valgrind mempool client request API is intentionally vague about the exact structure of a mempool. There is no specific mention made of headers or superblocks.
Page 83
• VALGRIND_MEMPOOL_ALLOC(pool, addr, size): This request informs Memcheck that a size-byte chunk has been allocated at addr, and associates the chunk with the specified pool. If the pool was created with nonzero rzB redzones, Memcheck will mark the rzB bytes before and after the chunk as NOACCESS. If the pool was created with the is_zeroed argument set, Memcheck will mark the chunk as DEFINED, otherwise Memcheck will mark the chunk as UNDEFINED.
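To show where such a request sits in a custom allocator, here is a minimal bump-allocator sketch. The Arena type, its functions, the 1024-byte size and the 8-byte redzone are assumptions for illustration. The client requests are assumed to come from <valgrind/memcheck.h>; when that header is missing, no-op stand-ins are defined so the sketch compiles anyway.

```c
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif
#ifndef VALGRIND_CREATE_MEMPOOL
/* Fallback stubs (an assumption of this example). */
#  define VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed) ((void)0)
#  define VALGRIND_DESTROY_MEMPOOL(pool)                ((void)0)
#  define VALGRIND_MEMPOOL_ALLOC(pool, addr, size)      ((void)0)
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len)         ((void)0)
#  define VALGRIND_MAKE_MEM_UNDEFINED(addr, len)        ((void)0)
#endif

#define ARENA_SIZE 1024
#define REDZONE 8           /* rzB passed when the pool is created */

typedef struct {
    unsigned char storage[ARENA_SIZE];
    size_t next;            /* offset of the first unused byte */
} Arena;

void arena_init(Arena *a)
{
    a->next = 0;
    /* The arena's address serves as the pool handle; chunks are not zeroed. */
    VALGRIND_CREATE_MEMPOOL(a, REDZONE, 0);
    /* Nothing has been handed out yet, so forbid access to the storage. */
    (void)VALGRIND_MAKE_MEM_NOACCESS(a->storage, ARENA_SIZE);
}

/* Returns a chunk of at least size bytes, or NULL when the arena is full. */
void *arena_alloc(Arena *a, size_t size)
{
    size_t start = a->next + REDZONE;           /* gap before the chunk */
    size_t rounded;

    if (size > ARENA_SIZE)
        return NULL;
    rounded = (size + 7u) & ~(size_t)7;         /* keep offsets 8-aligned */
    if (start + rounded + REDZONE > ARENA_SIZE)
        return NULL;
    a->next = start + rounded;
    /* Tell Memcheck about the chunk using the size the caller asked for. */
    VALGRIND_MEMPOOL_ALLOC(a, a->storage + start, size);
    return a->storage + start;
}

void arena_destroy(Arena *a)
{
    VALGRIND_DESTROY_MEMPOOL(a);
    /* Make the storage usable again for whoever owns the Arena object. */
    (void)VALGRIND_MAKE_MEM_UNDEFINED(a->storage, ARENA_SIZE);
    a->next = 0;
}
```

Passing the requested size rather than the rounded size lets Memcheck flag accesses to the rounding slack as well as to the redzones.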
Page 84: Debugging Mpi Parallel Programs With Valgrind
PMPI_Send, or receiving data into a buffer which is too small. Unlike most of the rest of Valgrind, the wrapper library is subject to a BSD-style license, so you can link it into any code base you like. See the top of mpi/libmpiwrap.c for license details.
Page 85: Getting Started
4.9.2. Getting started

Compile your MPI application as usual, taking care to link it using the same mpicc that your Valgrind build was configured with. Use the following basic scheme to run your application on Valgrind with the wrappers engaged:

  MPIWRAP_DEBUG=[wrapper-args] LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so...
Page 86: Functions
If you want to use Valgrind’s XML output facility (--xml=yes), you should pass quiet in MPIWRAP_DEBUG so as to get rid of any extraneous printing from the wrappers.

4.9.4. Functions

All MPI2 functions except MPI_Wtick, MPI_Wtime and MPI_Pcontrol have wrappers. The first two are not wrapped because they return a double, which Valgrind’s function-wrap mechanism cannot handle (but it could easily...
Page 87: Writing New Wrappers
Some effort is made to mark/check memory ranges corresponding to arrays of values in a single pass. This is important for performance since asking Valgrind to mark/check any range, no matter how small, carries quite a large constant cost. This optimisation is applied to arrays of primitive types (double, float, int, long, long long, short, char, and long double on platforms where sizeof(long double) == 8).
Page 88 A known source of potential false errors are the PMPI_Reduce family of functions, when using a custom (user-defined) reduction function. In a reduction operation, each node notionally sends data to a "central point" which uses the specified reduction function to merge the data items into a single item.
Page 89: Cachegrind: A Cache And Branch-Prediction Proﬁler
5. Cachegrind: a cache and branch-prediction proﬁler To use this tool, you must specify --tool=cachegrind on the Valgrind command line. 5.1. Overview Cachegrind simulates how your program interacts with a machine’s cache hierarchy and (optionally) branch predictor. It simulates a machine with independent ﬁrst-level instruction and data caches (I1 and D1), backed by a uniﬁed second-level cache (L2).
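The idea of simulating a cache can be illustrated with a deliberately simplified model. The sketch below is a direct-mapped cache that only counts hits and misses; it is not Cachegrind's simulator, and the line size, number of lines, type and function names are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64u    /* bytes per cache line (assumed) */
#define NUM_LINES 512u   /* 512 lines of 64 bytes = 32 KiB (assumed) */

typedef struct {
    uint64_t tags[NUM_LINES];
    unsigned char valid[NUM_LINES];
    unsigned long accesses;
    unsigned long misses;
} Cache;

void cache_reset(Cache *c)
{
    memset(c, 0, sizeof *c);
}

/* Simulate one memory access. Returns 1 on a miss, 0 on a hit. */
int cache_access(Cache *c, uint64_t addr)
{
    uint64_t line = addr / LINE_SIZE;            /* memory line number */
    size_t index = (size_t)(line % NUM_LINES);   /* slot the line maps to */
    uint64_t tag = line / NUM_LINES;             /* identifies the line */

    c->accesses++;
    if (c->valid[index] && c->tags[index] == tag)
        return 0;                                /* hit */
    c->valid[index] = 1;                         /* miss: fill the slot */
    c->tags[index] = tag;
    c->misses++;
    return 1;
}
```

Feeding every instruction fetch into one such structure and every data access into another, with a shared structure behind both, gives the general shape of the I1/D1/L2 arrangement described above.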
Page 90: Running Cachegrind
First off, as for normal Valgrind use, you probably want to compile with debugging info (the -g option). But by contrast with normal Valgrind use, you probably do want to turn optimisation on, since you should profile your program as it will be normally run.
Page 91: Running Cg_Annotate
can be changed with the --cachegrind-out-file option. This file is human-readable, but is intended to be interpreted by the accompanying program cg_annotate, described in the next section. The default .<pid> suffix on the output file name serves two purposes. Firstly, it means you don’t have to rename old log files that you don’t want to overwrite.
Page 92: The Global And Function-Level Counts
• Event sort order: the sort order in which functions are shown. For example, in this case the functions are sorted from highest Ir counts to lowest. If two functions have identical Ir counts, they will then be sorted by I1mr counts, and so on.
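The sort order described above is a multi-key descending sort. The following is a minimal sketch using qsort; the FnCounts type and function names are hypothetical, and only the first two keys (Ir, then I1mr) are modelled, whereas the text says the comparison continues through the remaining events.

```c
#include <stdlib.h>

/* Hypothetical per-function record holding the first two event counts. */
typedef struct {
    const char *name;
    unsigned long long ir;     /* instructions executed */
    unsigned long long i1mr;   /* I1 cache read misses */
} FnCounts;

/* Highest Ir first; ties broken by highest I1mr. */
static int by_ir_then_i1mr(const void *pa, const void *pb)
{
    const FnCounts *a = pa;
    const FnCounts *b = pb;

    if (a->ir != b->ir)
        return a->ir < b->ir ? 1 : -1;
    if (a->i1mr != b->i1mr)
        return a->i1mr < b->i1mr ? 1 : -1;
    return 0;
}

void sort_functions(FnCounts *fns, size_t n)
{
    qsort(fns, n, sizeof fns[0], by_ir_then_i1mr);
}
```

Explicit comparisons are used instead of subtraction so that large 64-bit counts cannot overflow the int return value.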
Page 93: Line-By-Line Counts
--------------------------------------------------------------------------------
I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw file:function
--------------------------------------------------------------------------------
8,821,482 5 2,242,702 1,621 73 1,794,230 0 getc.c:_IO_getc
5,222,023 4 2,276,334 875,959 1 concord.c:get_word
2,649,248 2 1,344,810 7,326 1,385 . vg_main.c:strcmp
2,521,927 591,215 179,398 0 concord.c:hash
2,242,740...
Page 94
--------------------------------------------------------------------------------
-- User-annotated source: concord.c
--------------------------------------------------------------------------------
I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw
. void init_hash_table(char * file_name, Word_N
FILE * file_ptr;
Word_Info * data;
int line = 1, i;
data = (Word_Info * ) create(sizeof(Word_In
4,991 1,995 for (i = 0;...
Page 95: Annotating Assembly Code Programs
ﬁles are often not present on a system. If a ﬁle is chosen for annotation both manually and automatically, it is marked as User-annotated source. Use the -I/--include option to tell Valgrind where to look for source ﬁles if the ﬁlenames found from the debugging information aren’t speciﬁc enough.
Page 96: Unusual Annotation Cases
This is because the line number in the struct nlist deﬁned in a.out.h under Linux is only a 16-bit value. Valgrind can handle some ﬁles with more than 65,535 lines correctly by making some guesses to identify line number overﬂows. But some cases are beyond it, in which case you’ll get a warning message explaining that annotations for the ﬁle might...
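The effect of a 16-bit line number field can be shown with a few lines of C. The sketch below is illustrative only: stored_line shows the truncation itself, while guess_line is a hypothetical heuristic (assume one wrap-around whenever the number appears to go backwards) and is not claimed to be the guessing that Valgrind performs.

```c
#include <stdint.h>

/* What a 16-bit field keeps of a real line number: the low 16 bits only. */
uint16_t stored_line(unsigned long real_line)
{
    return (uint16_t)real_line;
}

/* Hypothetical recovery heuristic: given the previous (already recovered)
   line number, rebuild the current one from its 16-bit stored value,
   assuming a single wrap-around if the number seems to have decreased. */
unsigned long guess_line(uint16_t stored, unsigned long prev_guess)
{
    unsigned long base = prev_guess & ~0xFFFFUL;  /* multiple of 65,536 */
    unsigned long cand = base + stored;
    if (cand < prev_guess)
        cand += 0x10000UL;                        /* assume one overflow */
    return cand;
}
```

A heuristic of this kind fails as soon as line numbers legitimately decrease, which is one way to see why some cases are beyond recovery.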
Page 97: Merging Proﬁles With Cg_Merge
Cachegrind: a cache and branch-prediction proﬁler • If you compile some ﬁles with -g and some without, some events that take place in a ﬁle without debug info could be attributed to the last line of a ﬁle with debug info (whichever one gets placed before the non-debug-info ﬁle in the executable).
Page 98: Cachegrind Command-Line Options
Cachegrind: a cache and branch-prediction proﬁler be negative; this indicates that the counts for the relevant function are fewer in the second version than those in the ﬁrst version. cg_diff does not attempt to check that the input ﬁles come from runs of the same executable. It will happily merge together proﬁle ﬁles from completely unrelated programs.
Page 99: Cg_Annotate Command-Line Options
Cachegrind: a cache and branch-prediction proﬁler --cachegrind-out-file=<file> Write the proﬁle data to file rather than to the default output ﬁle, cachegrind.out.<pid>. The %p and %q format speciﬁers can be used to embed the process ID and/or the contents of an environment variable in the name, as is the case for the core option --log-file.
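To make the effect of the format specifiers concrete, here is a rough sketch of such an expansion in C. It assumes the environment variable form is written %q{VAR}, as for --log-file; the function name expand_out_name, the choice to treat an unset variable as an error, and the 256-byte limit on each piece are all assumptions of this sketch, not Valgrind behaviour.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Expand "%p" to pid and "%q{VAR}" to getenv("VAR"), writing into out.
   Returns 0 on success, -1 if the pattern is malformed, the variable is
   unset, or the result does not fit in outsz bytes. */
int expand_out_name(const char *fmt, long pid, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    while (*fmt) {
        char piece[256];
        size_t plen;

        if (fmt[0] == '%' && fmt[1] == 'p') {
            snprintf(piece, sizeof piece, "%ld", pid);
            fmt += 2;
        } else if (fmt[0] == '%' && fmt[1] == 'q' && fmt[2] == '{') {
            const char *end = strchr(fmt + 3, '}');
            const char *val;
            size_t len;

            if (end == NULL)
                return -1;                     /* missing closing brace */
            len = (size_t)(end - (fmt + 3));
            if (len == 0 || len >= sizeof piece)
                return -1;
            memcpy(piece, fmt + 3, len);       /* variable name */
            piece[len] = '\0';
            val = getenv(piece);
            if (val == NULL || strlen(val) >= sizeof piece)
                return -1;
            strcpy(piece, val);                /* variable contents */
            fmt = end + 1;
        } else {
            piece[0] = *fmt++;                 /* ordinary character */
            piece[1] = '\0';
        }

        plen = strlen(piece);
        if (used + plen >= outsz)
            return -1;
        memcpy(out + used, piece, plen + 1);
        used += plen;
    }
    return 0;
}
```

Called with the pattern cachegrind.out.%p and a process ID of 1234, this sketch produces cachegrind.out.1234.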
Page 100: Acting On Cachegrind's Information
Cachegrind: a cache and branch-prediction proﬁler --mod-filename=<expr> [default: none] Speciﬁes a Perl search-and-replace expression that is applied to all ﬁlenames. Useful for removing minor differences in paths between two different versions of a program that are sitting in different directories. --mod-funcname=<expr>...
Page 101: Simulation Details
enum E { A, B, C };
enum E e;
enum E table[] = { 1, 2, 3 };
int i;
i += table[e];

This is obviously a contrived example, but the basic principle applies in a wide variety of situations. In short, Cachegrind can tell you where some of the bottlenecks in your code are, but it can’t tell you how to fix them.
Page 102: Branch Simulation Speciﬁcs
Section 2.3 (pages 80-89) for background on modern branch predictors. 5.7.3. Accuracy Valgrind’s cache proﬁling has a number of shortcomings: • It doesn’t account for kernel activity -- the effect of system calls on the cache and branch predictor contents is ignored.
Page 103: Implementation Details
• It doesn’t account for cache misses not visible at the instruction level, e.g. those arising from TLB misses, or speculative execution. • Valgrind will schedule threads differently from how they would be when running natively. This could warp the results for threaded programs.
Page 104 Cachegrind: a cache and branch-prediction proﬁler file ::= desc_line * cmd_line events_line data_line+ summary_line desc_line ::= "desc:" ws? non_nl_string cmd_line ::= "cmd:" ws? cmd events_line ::= "events:" ws? (event ws)+ data_line ::= file_line | fn_line | count_line file_line ::= "fl=" filename fn_line ::= "fn="...
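A reader of this file format can decide what kind of line it is looking at from the literal prefixes in the grammar. The sketch below classifies a line using only the prefixes visible in the excerpt above ("desc:", "cmd:", "events:", "fl=", "fn="); everything else, including count lines and the summary line, is lumped into LINE_OTHER. The enum and function names are hypothetical.

```c
#include <string.h>

typedef enum {
    LINE_DESC, LINE_CMD, LINE_EVENTS, LINE_FILE, LINE_FN, LINE_OTHER
} LineKind;

/* Return non-zero if s begins with prefix. */
static int has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Classify one line of a profile file by its leading keyword. */
LineKind classify_line(const char *line)
{
    if (has_prefix(line, "desc:"))   return LINE_DESC;
    if (has_prefix(line, "cmd:"))    return LINE_CMD;
    if (has_prefix(line, "events:")) return LINE_EVENTS;
    if (has_prefix(line, "fl="))     return LINE_FILE;
    if (has_prefix(line, "fn="))     return LINE_FN;
    return LINE_OTHER;
}
```

A full parser would go on to split the remainder of each line according to the rest of the grammar; this sketch stops at the first dispatch step.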
Page 105: Callgrind: A Call-Graph Generating Cache And Branch Prediction Proﬁler
6. Callgrind: a call-graph generating cache and branch prediction proﬁler To use this tool, you must specify --tool=callgrind on the Valgrind command line. 6.1. Overview Callgrind is a proﬁling tool that records the call history among functions in a program’s run as a call-graph. By default, the collected data consists of the number of instructions executed, their relationship to source lines, the caller/callee relationship between functions, and the numbers of such calls.
Page 106: Basic Usage
As with Cachegrind, you probably want to compile with debugging info (the -g option) and with optimization turned on. To start a profile run for a program, execute:
    valgrind --tool=callgrind [callgrind options] your-program [program options]
While the simulation is running, you can observe execution with:
    callgrind_control -b
This will print out the current backtrace.
Page 107: Advanced Usage
Callgrind: a call-graph generating cache and branch prediction proﬁler Use --auto=yes to get annotated source code for all relevant functions for which the source can be found. In addition to source annotation as produced by cg_annotate, you will see the annotated call sites with call counts. For all other options, consult the (Cachegrind) documentation for cg_annotate.
Page 108: Limiting The Range Of Collected Events
Event collection is only possible if instrumentation for program code is enabled. This is the default, but for faster execution (identical to valgrind --tool=none), it can be disabled until the program reaches a state in which you want to start collecting proﬁling data. Callgrind can start without instrumentation by specifying option --instr-atstart=no.
Page 109: Counting Global Bus Events
misses which would not have happened in reality. If you do not want to see these, start event collection a few million instructions after you have enabled instrumentation.
6.2.3. Counting global bus events
For access to shared data among threads in a multithreaded code, synchronization is required to avoid race conditions.
Page 110: Forking Programs
Callgrind: a call-graph generating cache and branch prediction proﬁler quite capable of avoiding cycles, it has to be used carefully to not cause symbol explosion. The latter imposes large memory requirement for Callgrind with possible out-of-memory conditions, and big proﬁle data ﬁles. A further possibility to avoid cycles in Callgrind’s proﬁle data output is to simply leave out given functions in the call graph.
Page 111: Activity Options
0, never] Dump proﬁle data every count basic blocks. Whether a dump is needed is only checked when Valgrind’s internal scheduler is run. Therefore, the minimum setting useful is about 100000. The count is a 64-bit value to make long dump periods possible.
Page 112: Cost Entity Separation Options
Specify if you want Callgrind to start simulation and proﬁling from the beginning of the program. When set to no, Callgrind will not be able to collect any information, including calls, but it will have at most a slowdown of around 4, which is the minimum Valgrind overhead. Instrumentation can be interactively enabled via callgrind_control -i on.
Page 113: Simulation Options
Callgrind: a call-graph generating cache and branch prediction proﬁler --separate-threads=<no|yes> [default: This option speciﬁes whether proﬁle data should be generated separately for every thread. If yes, the ﬁle names get "-threadID" appended. --separate-callers=<callers> [default: Separate contexts by at most <callers> functions in the call chain. See Avoiding cycles.
Page 114: Callgrind Monitor Commands
Specify the size, associativity and line size of the level 1 data cache. --LL=<size>,<associativity>,<line size> Specify the size, associativity and line size of the last-level cache. 6.4. Callgrind Monitor Commands The Callgrind tool provides monitor commands handled by the Valgrind gdbserver (see Monitor command handling by the Valgrind gdbserver).
Page 115: Callgrind Speciﬁc Client Requests
Callgrind: a call-graph generating cache and branch prediction proﬁler • instrumentation [on|off] requests to set (if parameter on/off is given) or get the current instrumentation state. • status requests to print out some status information. 6.5. Callgrind speciﬁc client requests Callgrind provides the following speciﬁc client requests in callgrind.h.
Page 116: Callgrind_Control Command-Line Options
Callgrind: a call-graph generating cache and branch prediction proﬁler --auto=<yes|no> [default: Annotate all source ﬁles containing functions that helped reach the event count threshold. --context=N [default: Print N lines of context before and after annotated lines. --inclusive=<yes|no> [default: Add subroutine costs to functions calls. --tree=<none|caller|calling|both>...
Page 117 Switch instrumentation mode on or off. If a Callgrind run has instrumentation disabled, no simulation is done and no events are counted. This is useful to skip uninteresting program parts, as there is much less slowdown (same as with the Valgrind tool "none"). See also the Callgrind option --instr-atstart. -w=<dir>...
Page 118: Helgrind: A Thread Error Detector
To use this tool, you must specify --tool=helgrind on the Valgrind command line. 7.1. Overview Helgrind is a Valgrind tool for detecting synchronisation errors in C, C++ and Fortran programs that use the POSIX pthreads threading primitives. The main abstractions in POSIX pthreads are: a set of threads sharing a common address space, thread creation, thread joining, thread exit, mutexes (locks), condition variables (inter-thread event notiﬁcations), reader-writer locks,...
Page 119: Detected Errors: Inconsistent Lock Orderings
Helgrind: a thread error detector • destroying an invalid or a locked mutex • recursively locking a non-recursive mutex • deallocation of memory that contains a locked mutex • passing mutex arguments to functions expecting reader-writer lock arguments, and vice versa •...
Page 120 Helgrind: a thread error detector In this section, and in general, to "acquire" a lock simply means to lock that lock, and to "release" a lock means to unlock it. Helgrind monitors the order in which threads acquire locks. This allows it to detect potential deadlocks which could arise from the formation of cycles of locks.
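The idea of watching acquisition order and looking for cycles can be sketched with a small directed graph: an edge from lock A to lock B means "A was held while B was acquired", and a new acquisition is suspicious if the reverse direction is already reachable. The code below is a minimal illustration under that assumption, with a fixed maximum of eight locks identified by small integers; the names are hypothetical and this is not a description of Helgrind's internal data structures.

```c
#include <string.h>

#define MAX_LOCKS 8

/* before[a][b] != 0 means lock a has been held while lock b was acquired. */
typedef struct {
    unsigned char before[MAX_LOCKS][MAX_LOCKS];
} LockGraph;

void lock_graph_init(LockGraph *g)
{
    memset(g, 0, sizeof *g);
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static int reachable(const LockGraph *g, int from, int to, unsigned char *seen)
{
    int k;
    if (from == to)
        return 1;
    seen[from] = 1;
    for (k = 0; k < MAX_LOCKS; k++)
        if (g->before[from][k] && !seen[k] && reachable(g, k, to, seen))
            return 1;
    return 0;
}

/* Record that 'acquired' was taken while 'held' was held.
   Returns 1 if an earlier observation already implies the opposite order,
   that is, if adding this edge closes a cycle; otherwise returns 0. */
int lock_graph_acquire(LockGraph *g, int held, int acquired)
{
    unsigned char seen[MAX_LOCKS] = {0};
    int cycle = reachable(g, acquired, held, seen);
    g->before[held][acquired] = 1;
    return cycle;
}
```

With three locks acquired in the orders 0 then 1, 1 then 2, and finally 2 then 0, the third call reports a cycle even though no deadlock has actually occurred in that run, which is what makes this kind of check useful for finding potential deadlocks.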
Page 121: Detected Errors: Data Races
Thread #6: lock order "0x6010C0 before 0x601160" violated

Observed (incorrect) order is: acquisition of lock at 0x601160
   (stack unavailable)

 followed by a later acquisition of lock at 0x6010C0
   at 0x4C2BC62: pthread_mutex_lock (hg_intercepts.c:494)
   by 0x4007DE: dine (tc14_laog_dinphils.c:19)
   by 0x4C2CBE7: mythread_wrapper (hg_intercepts.c:219)
   by 0x4E369C9: start_thread (pthread_create.c:300)

7.4.
Page 122 Helgrind: a thread error detector Thread #1 is the program’s root thread Thread #2 was created at 0x511C08E: clone (in /lib64/libc-2.8.so) by 0x4E333A4: do_clone (in /lib64/libpthread-2.8.so) by 0x4E33A30: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.8.so) by 0x4C299D4: pthread_create@ * (hg_intercepts.c:214) by 0x400605: main (simple_race.c:12) Possible data race during read of size 4 at 0x601038 by thread #1 Locks held: none at 0x400606: main (simple_race.c:13)
Page 123: Helgrind's Race Detection Algorithm
Helgrind: a thread error detector The following section explains Helgrind’s race detection algorithm in more detail. 7.4.2. Helgrind’s Race Detection Algorithm Most programmers think about threaded programming in terms of the basic functionality provided by the threading library (POSIX Pthreads): thread creation, thread joining, locks, condition variables, semaphores and barriers. The effect of using these functions is to impose constraints upon the order in which memory accesses can happen.
Page 124
Parent thread:                         Child thread:

int var;

// create child thread
pthread_create(...)
var = 20;
// send message to child
                                       // wait for message to arrive
                                       var = 10;
                                       exit

// wait for child
pthread_join(...)
printf("%d\n", var);

Now the program reliably prints "10", regardless of the speed of the threads.
Page 125: Interpreting Race Error Messages
Helgrind: a thread error detector • When a condition variable (CV) is signalled on by thread T1 and some other thread T2 is thereby released from a wait on the same CV, then the memory accesses in T1 prior to the signalling must happen-before those in T2 after it returns from the wait.
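One textbook way to represent a happens-before relation is with vector clocks, where each thread carries one counter per thread and a synchronisation event (such as waking from a wait that another thread signalled) merges the sender's clock into the receiver's. The sketch below uses that technique purely as an illustration; it assumes three threads, the type and function names are hypothetical, and it is not presented as the data structure Helgrind itself uses.

```c
#define NTHREADS 3

/* One logical counter per thread. */
typedef struct {
    unsigned c[NTHREADS];
} VClock;

/* Returns 1 if a happens-before b: every component of a is <= the
   corresponding component of b, and at least one is strictly smaller. */
int vc_happens_before(const VClock *a, const VClock *b)
{
    int i, strictly = 0;
    for (i = 0; i < NTHREADS; i++) {
        if (a->c[i] > b->c[i])
            return 0;
        if (a->c[i] < b->c[i])
            strictly = 1;
    }
    return strictly;
}

/* Returns 1 if neither clock is ordered before the other, which is the
   situation in which two conflicting accesses would count as a race. */
int vc_concurrent(const VClock *a, const VClock *b)
{
    return !vc_happens_before(a, b) && !vc_happens_before(b, a);
}

/* Receiver synchronises with sender: take the component-wise maximum. */
void vc_join(VClock *recv, const VClock *send)
{
    int i;
    for (i = 0; i < NTHREADS; i++)
        if (send->c[i] > recv->c[i])
            recv->c[i] = send->c[i];
}
```

Before the join, accesses stamped with the two clocks are unordered; after the waiting thread merges in the signalling thread's clock, the signaller's earlier accesses are ordered before the waiter's later ones.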
Page 126 Helgrind: a thread error detector Thread #2 was created at 0x511C08E: clone (in /lib64/libc-2.8.so) by 0x4E333A4: do_clone (in /lib64/libpthread-2.8.so) by 0x4E33A30: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.8.so) by 0x4C299D4: pthread_create@ * (hg_intercepts.c:214) by 0x4008F2: main (tc21_pthonce.c:86) Thread #3 was created at 0x511C08E: clone (in /lib64/libc-2.8.so) by 0x4E333A4: do_clone (in /lib64/libpthread-2.8.so) by 0x4E33A30: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.8.so) by 0x4C299D4: pthread_create@ * (hg_intercepts.c:214)
Page 127: Hints And Tips For Effective Use Of Helgrind
The first thing to do is examine the source locations referred to by each call stack. They should both show an access to the same location, or variable. Now figure out how that location should have been made thread-safe: •...
Page 128
• Qt version 4.X. Qt 3.X is harmless in that it only uses POSIX pthreads primitives. Unfortunately Qt 4.X has its own implementation of mutexes (QMutex) and thread reaping.
• Runtime support library for GNU OpenMP (part of GCC), at least for GCC versions 4.2 and 4.3. The GNU...
Page 129 2. Avoid memory recycling. If you can’t avoid it, you must tell Helgrind what is going on via the VALGRIND_HG_CLEAN_MEMORY client request (in helgrind.h). Helgrind is aware of standard heap memory allocation and deallocation that occurs via malloc/free/new/delete and from entry and exit of stack frames.
Page 130: Helgrind Command-Line Options
This functionality is new in Valgrind 3.7.0, and is regarded as experimental. It is not enabled by default because its interaction with custom memory allocators is not well understood at present. User feedback is welcomed.
Page 131 Helgrind: a thread error detector --track-lockorders=no|yes [default: yes] When enabled (the default), Helgrind performs lock order consistency checking. For some buggy programs, the large number of lock order errors reported can become annoying, particularly if you’re only interested in race errors. You may therefore ﬁnd it helpful to disable lock order checking.
Page 132: Helgrind Client Requests
Helgrind: a thread error detector --check-stack-refs=no|yes [default: yes] By default Helgrind checks all data memory accesses made by your program. This ﬂag enables you to skip checking for accesses to thread stacks (local variables). This can improve performance, but comes at the cost of missing races on stack-allocated data.
Page 133: Drd: A Thread Error Detector
8.1. Overview DRD is a Valgrind tool for detecting errors in multithreaded C and C++ programs. The tool works for any program that uses the POSIX threading primitives or that uses threading concepts built on top of the POSIX threading primitives.
Page 134: Multithreaded Programming Problems
DRD: a thread error detector • A shared address space. All threads running within the same process share the same address space. All data, whether shared or not, is identiﬁed by its address. • Regular load and store operations, which allow to read values from or to write values to the memory shared by all threads running in the same process.
Page 135: Using Drd
DRD: a thread error detector 2. Synchronization operations determine certain ordering constraints on memory operations performed by different threads. These ordering constraints are called the synchronization order. The combination of program order and synchronization order is called the happens-before relationship. This concept was ﬁrst deﬁned by S.
Page 136 • Don’t enable this option when using reference-counted objects because that will result in false positives, even when that code has been annotated properly with ANNOTATE_HAPPENS_BEFORE and ANNOTATE_HAPPENS_AFTER. See e.g. the output of the following command for an example: valgrind --tool=drd --free-is-write=yes drd/tests/annotate_smart_pointer. --report-signal-unlocked=<yes|no> [default:...
Page 137: Detected Errors: Data Races
DRD: a thread error detector --ptrace-addr=<address> [default: none] Trace all load and store activity for the speciﬁed address and keep doing that even after the memory at that address has been freed and reallocated. --trace-alloc=<yes|no> [default: Trace all memory allocations and deallocations. May produce a huge amount of output. --trace-barrier=<yes|no>...
Page 138 DRD: a thread error detector Below you can ﬁnd an example of a message printed by DRD when it detects a data race: $ valgrind --tool=drd --read-var-info=yes drd/tests/rwlock_race ==9466== Thread 3: ==9466== Conflicting load by thread 3 at 0x006020b8 size 4 ==9466== at 0x400B6C: thread_func (rwlock_race.c:29)
Page 139: Detected Errors: Lock Contention
Lock contention causes delays. Such delays should be as short as possible. The two command line options --exclusive-threshold=<n> and --shared-threshold=<n> make it possible to detect excessive lock contention by making DRD report any lock that has been held longer than the speciﬁed threshold. An example: $ valgrind --tool=drd --exclusive-threshold=10 drd/tests/hold_lock -i 500 ==10668== Acquired at: ==10668== at 0x4C267C8: pthread_mutex_lock (drd_pthread_intercepts.c:395)
Page 140: Client Requests
8.2.5. Client Requests Just as for other Valgrind tools it is possible to let a client program interact with the DRD tool through client requests. In addition to the client requests several macros have been deﬁned that allow to use the client requests in a convenient way.
Page 141 DRD: a thread error detector • The macro DRD_STOP_IGNORING_VAR(x) and the corresponding client request VG_USERREQ__DRD_FINISH_SUPPRESSIO Tell DRD to no longer ignore data races for the address range that was suppressed either via the macro DRD_IGNORE_VAR(x) or via the client request VG_USERREQ__DRD_START_SUPPRESSION. •...
Page 142 • The macro ANNOTATE_THREAD_NAME(name) tells DRD to associate the speciﬁed name with the current thread and to include this name in the error messages printed by DRD. • The macros VALGRIND_MALLOCLIKE_BLOCK and VALGRIND_FREELIKE_BLOCK from the Valgrind core are implemented; they are described in The Client Request mechanism.
Page 143: Debugging Gnome Programs
Note: if you compiled Valgrind yourself, the header ﬁle <valgrind/drd.h> will have been installed in the directory /usr/include by the command make install. If you obtained Valgrind by installing it as a package however, you will probably have to install another package with a name like valgrind-devel before Valgrind’s header ﬁles are available.
Page 144: Drd And Custom Memory Allocators
DRD tracks all memory allocation events that happen via the standard memory allocation and deallocation functions (malloc, free, new and delete), via entry and exit of stack frames or that have been annotated with Valgrind’s memory pool client requests. DRD uses memory allocation and deallocation information for two purposes:...
Page 145: Drd Versus Memcheck
DRD: a thread error detector • To know where the scope ends of POSIX objects that have not been destroyed explicitly. It is e.g. not required by the POSIX threads standard to call pthread_mutex_destroy before freeing the memory in which a mutex object resides.
Page 146: Using The Posix Threads Api Effectively
DRD: a thread error detector • Compile with option -O1 instead of -O0. This will reduce the amount of generated code, may reduce the amount of debug info and will speed up DRD’s processing of the client program. For more information, see also Getting started.
Page 147: Condition Variables
The older LinuxThreads library is not supported. 8.5. Feedback If you have any comments, suggestions, feedback or bug reports about DRD, feel free to either post a message on the Valgrind users mailing list or to ﬁle a bug report. See also http://www.valgrind.org/ for more information.
Page 148: Massif: A Heap Proﬁler
9.2. Using Massif and ms_print First off, as for the other Valgrind tools, you should compile with debugging info (the -g option). It shouldn’t matter much what optimisation level you compile your program with, as this is unlikely to affect the heap memory usage.
Page 149: Running Massif
To gather heap proﬁling information about the program prog, type: valgrind --tool=massif prog The program will execute (slowly). Upon completion, no summary statistics are printed to Valgrind’s commentary; all of Massif’s proﬁling data is written to a ﬁle. By default, this ﬁle is called massif.out.<pid>, where <pid>...
Page 150: The Output Preamble
Massif: a heap proﬁler ms_print massif.out.12345 ms_print will produce (a) a graph showing the memory consumption over the program’s execution, and (b) detailed information about the responsible allocation sites at various points in the program, including the point of peak memory allocation.
Page 151 Massif: a heap proﬁler 19.63^ 0 +----------------------------------------------------------------------->ki Number of snapshots: 25 Detailed snapshots: [9, 14 (peak), 24] Why is most of the graph empty, with only a couple of bars at the very end? By default, Massif uses "instructions executed" as the unit of time. For very short-run programs such as the example, most of the executed instructions involve the loading and dynamic linking of the program.
Page 152 [ms_print graph of memory consumption over the program's execution; vertical axis peaks at 19.63; the individual bars are not recoverable from this extract]
Page 153: The Snapshot Details
Massif: a heap proﬁler • Peak snapshots are only ever taken after a deallocation happens. This avoids lots of unnecessary peak snapshot recordings (imagine what happens if your program allocates a lot of heap blocks in succession, hitting a new peak every time).
Page 154 Massif: a heap proﬁler -------------------------------------------------------------------------------- time(B) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 1,008 1,008 1,000 2,016 2,016 2,000 3,024 3,024 3,000 4,032 4,032 4,000 5,040 5,040 5,000 6,048 6,048 6,000 7,056 7,056 7,000 8,064 8,064 8,000 Each normal snapshot records several things. •...
Page 155 Massif: a heap proﬁler The next snapshot is detailed. As well as the basic counts, it gives an allocation tree which indicates exactly which pieces of code were responsible for allocating heap memory: 9,072 9,072 9,000 99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. ->99.21% (9,000B) 0x804841A: main (example.c:20) The allocation tree can be read from the top down.
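The 99.21% figure in the detailed snapshot is consistent with dividing the entry's 9,000 bytes by the snapshot total of 9,072 bytes. That reading is inferred from the numbers shown rather than stated in the text, and the helper below (a hypothetical name, not part of Massif or ms_print) simply reproduces the arithmetic.

```c
/* Share of a snapshot's total bytes attributed to one tree entry,
   expressed as a percentage. Returns 0 for an empty snapshot. */
double percent_of_total(unsigned long entry_bytes, unsigned long total_bytes)
{
    if (total_bytes == 0)
        return 0.0;
    return 100.0 * (double)entry_bytes / (double)total_bytes;
}
```

For 9,000 out of 9,072 bytes this gives roughly 99.206, which rounds to the 99.21% shown.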
Page 156 Massif: a heap proﬁler distinct stack traces in the tree. In contrast, if B calls A repeatedly from line 15 (e.g. due to a loop), then each of those calls will be represented by the same stack trace in the tree. Note also that each tree entry with children in the example satisﬁes an invariant: the entry’s size is equal to the sum of its children’s sizes.
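The invariant mentioned above (an entry's size equals the sum of its children's sizes) is easy to state as a recursive check. The following sketch assumes a hypothetical in-memory tree representation; the TreeNode type and the function name are made up for illustration and do not correspond to Massif's own structures.

```c
#include <stddef.h>

/* Hypothetical allocation-tree entry: a byte count plus an array of
   child entries (nchildren == 0 for a leaf). */
typedef struct TreeNode {
    unsigned long bytes;
    size_t nchildren;
    const struct TreeNode *children;
} TreeNode;

/* Returns 1 if every entry that has children is exactly as large as the
   sum of its children, checked recursively; otherwise returns 0. */
int tree_invariant_holds(const TreeNode *n)
{
    unsigned long sum = 0;
    size_t i;

    if (n->nchildren == 0)
        return 1;                       /* leaves have nothing to check */

    for (i = 0; i < n->nchildren; i++) {
        if (!tree_invariant_holds(&n->children[i]))
            return 0;
        sum += n->children[i].bytes;
    }
    return sum == n->bytes;
}
```

A tree in which aggregated below-threshold entries are dropped rather than kept as a child would fail this check, so such entries need to stay in the tree as a node of their own.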
Page 157: Forking Programs
Massif: a heap proﬁler responsible for more than 1% of useful memory bytes, and ms_print likewise only prints the details for code locations responsible for more than 1%. The entries that do not meet this threshold are aggregated. This avoids ﬁlling up the output with large numbers of unimportant entries.
Page 158: Acting On Massif's Information
Massif: a heap proﬁler 9.2.9. Acting on Massif’s Information Massif’s information is generally fairly easy to act upon. The obvious place to start looking is the peak snapshot. It can also be useful to look at the overall shape of the graph, to see if memory usage climbs and falls as you expect; spikes in the graph might be worth investigating.
Page 159 Massif: a heap proﬁler --alloc-fn=<name> Functions speciﬁed with this option will be treated as though they were a heap allocation function such as malloc. This is useful for functions that are wrappers to malloc or new, which can ﬁll up the allocation trees with uninteresting information.
Page 160: Massif Monitor Commands
ID and/or the contents of an environment variable in the name, as is the case for the core option --log-file. 9.4. Massif Monitor Commands The Massif tool provides monitor commands handled by the Valgrind gdbserver (see Monitor command handling by the Valgrind gdbserver).
Page 161: Dhat: A Dynamic Heap Analysis Tool
10. DHAT: a dynamic heap analysis tool To use this tool, you must specify --tool=exp-dhat on the Valgrind command line. 10.1. Overview DHAT is a tool for examining how programs use their heap allocations. It tracks the allocated blocks, and inspects every memory access to ﬁnd which block, if any, it is to. The following data is collected and presented per allocation point (allocation stack): •...
Page 162: Understanding Dhat's Output
DHAT: a dynamic heap analysis tool As with the Massif heap proﬁler, DHAT measures program progress by counting instructions, and so presents all age/time related ﬁgures as instruction counts. This sounds a little odd at ﬁrst, but it makes runs repeatable in a way which is not possible if CPU time is used.
Page 163: Interpreting The Acc-Ratios Fields
DHAT: a dynamic heap analysis tool ======== SUMMARY STATISTICS ======== guest_insns: 418,901,537 [...] max-live: 32,512 in 254 blocks tot-alloc: 32,512 in 254 blocks (avg size 128.00) deaths: 254, at avg age 300,467,389 acc-ratios: 0.26 rd, 0.20 wr (8,756 b-read, 6,604 b-written) at 0x4C275B8: malloc (vg_replace_malloc.c:236) by 0x4C27632: realloc (vg_replace_malloc.c:525) by 0x56FF41D: QtFontStyle::pixelSize(unsigned short, bool) (qfontdatabase.cpp:269)
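The numbers in this sample are consistent with the following reading: the average size is tot-alloc bytes divided by the number of blocks, and each acc-ratio is the number of bytes read (or written) divided by tot-alloc bytes. This is an inference from the figures shown, not a statement of how DHAT computes them, and the function names below are hypothetical.

```c
/* Average block size: total allocated bytes divided by block count. */
double avg_block_size(unsigned long long total_bytes, unsigned long long blocks)
{
    if (blocks == 0)
        return 0.0;
    return (double)total_bytes / (double)blocks;
}

/* Access ratio: bytes read (or written) per byte allocated. A value
   below 1 means that, on average, not every allocated byte was touched. */
double acc_ratio(unsigned long long bytes_accessed,
                 unsigned long long bytes_allocated)
{
    if (bytes_allocated == 0)
        return 0.0;
    return (double)bytes_accessed / (double)bytes_allocated;
}
```

For the sample above, 32,512 bytes in 254 blocks gives an average of 128 bytes; 8,756 bytes read gives a ratio of about 0.269 and 6,604 bytes written gives about 0.203, matching the displayed 0.26 and 0.20 if the output truncates to two decimal places.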
Page 164: Interpreting "Aggregated Access Counts By Offset" Data
DHAT: a dynamic heap analysis tool perform such an analysis. We can see that they must have varying sizes since the average block size, 61.13, isn’t a whole number. 10.2.2.2. A more suspicious looking example max-live: 180,224 in 22 blocks tot-alloc: 180,224 in 22 blocks (avg size 8192.00) deaths:...
Page 165: Dhat Command-Line Options
DHAT: a dynamic heap analysis tool max-live: 317,408 in 5,668 blocks tot-alloc: 317,408 in 5,668 blocks (avg size 56.00) deaths: 5,668, at avg age 622,890,597 acc-ratios: 1.03 rd, 1.28 wr (327,642 b-read, 408,172 b-written) at 0x4C275B8: malloc (vg_replace_malloc.c:236) by 0x5440C16: QDesignerPropertySheetPrivate::ensureInfo (qhash.h:515) by 0x544350B: QDesignerPropertySheet::setVisible (qdesigner_propertysh...) by 0x5446232: QDesignerPropertySheet::QDesignerPropertySheet (qdesigne...) Aggregated access counts by offset:...
Page 166 --show-top-n=<number> [default: 10] At the end of the run, DHAT sorts the accumulated allocation points according to some metric, and shows the highest scoring entries. --show-top-n controls how many entries are shown. The default of 10 is quite small. For realistic applications you will probably need to set it much higher, at least several hundred.
Page 167: Sgcheck: An Experimental Stack And Global Array Overrun Detector
11. SGCheck: an experimental stack and global array overrun detector To use this tool, you must specify --tool=exp-sgcheck on the Valgrind command line. 11.1. Overview SGCheck is a tool for ﬁnding overruns of stack and global arrays. It works by using a heuristic approach derived from an observation about the likely forms of stack and global array accesses.
Page 168: Comparison With Memcheck
It is hard to see how to get around this problem. The only mitigating factor is that such constructions appear very rare, at least judging from the results using the tool so far. Such a construction appears only once in the Valgrind sources (running Valgrind on Valgrind) and perhaps two or three times for a start and exit of Firefox.
Page 169: Still To Do: User-Visible Functionality
SGCheck: an experimental stack and global array overrun detector • Coverage: Stack and global checking is fragile. If a shared object does not have debug information attached, then SGCheck will not be able to determine the bounds of any stack or global arrays deﬁned within that shared object, and so will not be able to check accesses to them.
Page 170: Bbv: An Experimental Basic Block Vector Generation Tool
12.2. Using Basic Block Vectors to create SimPoints To quickly create a basic block vector ﬁle, you will call Valgrind like this: valgrind --tool=exp-bbv /bin/ls In this case we are running on /bin/ls, but this can be any program. By default a ﬁle called bb.out.PID will be created, where PID is replaced by the process ID of the running process.
Page 171: Bbv Command-Line Options
The outputs from the SimPoint run are the results.simpts and results.weights files. The first holds the 5 most relevant intervals of the program. The second holds the weight to scale each interval by when extrapolating full-program behavior.
Page 172: Implementation
SimPoint utility ignores them. 12.5. Implementation Valgrind provides all of the information necessary to create BBV ﬁles. In the current implementation, all instructions are instrumented. This is slower (by approximately a factor of two) than a method that instruments at the basic block level, but there are some complications (especially with rep preﬁx detection) that make that method more difﬁcult.
Page 173: Performance
BBV: an experimental basic block vector generation tool Binary Instrumentation to Generate Multi-Platform SimPoints: Methodology and Accuracy" by V.M. Weaver and S.A. McKee. 12.8. Performance Using this program slows down execution by roughly a factor of 40 over native execution. This varies depending on the machine used and the benchmark being run.
Page 174: Lackey: An Example Tool
13.1. Overview Lackey is a simple Valgrind tool that does various kinds of basic program measurement. It adds quite a lot of simple instrumentation to the program’s code. It is primarily intended to be of use as an example tool, and consequently emphasises clarity of implementation over performance.
Page 175: Nulgrind: The Minimal Valgrind Tool
To use this tool, you must specify --tool=none on the Valgrind command line. 14.1. Overview Nulgrind is the simplest possible Valgrind tool. It performs no instrumentation or analysis of a program, just runs it normally. It is mainly of use for Valgrind’s developers for debugging and regression testing.
Page 177 Valgrind FAQ Table of Contents Valgrind Frequently Asked Questions...
Page 178 3.2. My (buggy) program dies like this: 3.3. My program dies, printing a message like this along the way: 3.4. I tried running a Java program (or another program that uses a just-in-time compiler) under Valgrind but something went wrong. Does Valgrind handle such programs? 4.
Page 179 "Heimdal". Keeping with the Nordic theme, Valgrind was chosen. Valgrind is the name of the main entrance to Valhalla (the Hall of the Chosen Slain in Asgard). Over this entrance there resides a wolf and over it there is the head of a boar and on it perches a huge eagle, whose eyes can see to the far regions of the nine worlds.
Page 180 Valgrind Frequently Asked Questions 3. Valgrind aborts unexpectedly 3.1. Programs run OK on Valgrind, but at exit produce a bunch of errors involving __libc_freeres and then die with a segmentation fault. When the program exits, Valgrind runs the procedure __libc_freeres in glibc.
Page 181 --smc-check=all. Apart from this, in theory Valgrind can run any Java program just ﬁne, even those that use JNI and are partially implemented in other languages like C and C++. In practice, Java implementations tend to do nasty things that most programs do not, and Valgrind sometimes falls over these corner cases.
Page 182 Also, for leak reports involving shared objects, if the shared object is unloaded before the program terminates, Valgrind will discard the debug information and the error message will be full of ??? entries. The workaround here is to avoid calling dlclose on these shared objects.
Page 183 Valgrind. There isn’t anything you can do to change this; it is simply in the nature of how Valgrind works that it cannot exactly replicate a native execution environment. In the case where your program crashes due to a memory error when run natively but not when run under Valgrind, in most cases Memcheck should identify the bad memory operation.
Page 184 Second, if your program is statically linked, most Valgrind tools won’t work as well, because they won’t be able to replace certain functions, such as malloc, with their own versions. A key indicator of this is if...
Page 185 If you think an answer in this FAQ is incomplete or inaccurate, please e-mail valgrind@valgrind.org. If you have tried all of these things and are still stuck, you can try mailing the valgrind-users mailing list. Note that an email has a better chance of being answered usefully if it is clearly written.
Page 187 Valgrind Technical Documentation Table of Contents 1. The Design and Implementation of Valgrind 2. Writing a New Valgrind Tool 2.1. Introduction 2.2. Basics 2.2.1. How tools work 2.2.2. Getting the code 2.2.3. Getting started 2.2.4. Writing the code 2.2.5. Initialisation 2.2.6.
Page 188: The Design And Implementation Of Valgrind
A number of academic publications nicely describe many aspects of Valgrind’s design and implementation. Online copies of all of them, and others, are available on the Valgrind publications page. The following paper gives a good overview of Valgrind, and explains how it differs from other dynamic binary instrumentation frameworks such as Pin and DynamoRIO.
Page 189: Writing A New Valgrind Tool
Tools must deﬁne various functions for instrumenting programs that are called by Valgrind’s core. They are then linked against Valgrind’s core to deﬁne a complete Valgrind tool which will be used when the --tool option is used to select it.
Page 190: Writing The Code
(almost any program should work; date is just an example). The output should be something like this:
  ==738== foobar-0.0.1, a foobarring tool.
  ==738== Copyright (C) 2002-2009, and GNU GPL’d, by J. Programmer.
  ==738== Using Valgrind-3.5.0.SVN and LibVEX; rerun with -h for copyright info
  ==738== Command: date
  ==738==...
Page 191: Initialisation
More information about "details", "needs" and "trackable events" can be found in include/pub_tool_tooliface.h. 2.2.6. Instrumentation instrument is the interesting one. It allows you to instrument VEX IR, which is Valgrind’s RISC-like intermediate language. VEX IR is described in the comments of the header ﬁle VEX/pub/libvex_ir.h.
Page 192: Advanced Topics
C library, details of which are in pub_tool_libc*.h. When writing a tool, in theory you shouldn’t need to look at any of the code in Valgrind’s core, but in practice it might be useful sometimes to help understand something.
Page 193: Documentation
If you are feeling conscientious and want to write some documentation for your tool, please use XML as the rest of Valgrind does. The ﬁle docs/README has more details on getting the XML toolchain to work; this can be difﬁcult, unfortunately.
Page 194: Regression Tests
3. Write the tests, .vgtest test description ﬁles, .stdout.exp and .stderr.exp expected output ﬁles. (Note that Valgrind’s output goes to stderr.) Some details on writing and running tests are given in the comments at the top of the testing script tests/vg_regtest.
Page 195: Final Words
2.4. Final Words Writing a new Valgrind tool is not easy, but the tools you can write with Valgrind are among the most powerful programming tools there are. Happy programming!
Page 196: Callgrind Format Speciﬁcation
The event names in the following example are quite arbitrary, and are not related to event names used by Callgrind. In particular, cycle counts matching real processors will probably never be generated by any Valgrind tool, as these tools are bound to simulations of simple machine models to keep the slowdown acceptable. However, any profiling tool could use the format described in this chapter.
Page 197: Associations
line 16 in file file.f, taking 20 CPU cycles. If a cost line specifies fewer event counts than given in the "events" line, the rest are assumed to be zero, i.e. no floating point instruction was executed relating to line 16. Note that regular cost lines always give self (also called exclusive) cost of code at a given position.
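The zero-fill rule for short cost lines can be sketched as follows. This is a minimal illustration, not Callgrind's own parser: the function name parse_cost_line, the event names Cycles, Instructions and Flops, and the instruction count of 12 are assumptions made for the example; only the 20 cycles on line 16 and the missing floating point count come from the text. It also assumes a single line-number subposition at the start of each cost line.

```python
def parse_cost_line(line, events):
    """Split a cost line into (position, {event: count}).

    Counts missing at the end of the line are taken to be zero.
    Assumes one subposition (the line number) before the counts.
    """
    fields = line.split()
    position = int(fields[0])
    counts = [int(f) for f in fields[1:]]
    if len(counts) > len(events):
        raise ValueError("more counts than declared events")
    # Pad with zeros for every event the cost line did not mention.
    counts += [0] * (len(events) - len(counts))
    return position, dict(zip(events, counts))


events = ["Cycles", "Instructions", "Flops"]  # assumed event names
position, cost = parse_cost_line("16 20 12", events)
print(position, cost)
```

Here the third event receives no count on the cost line, so it is reported as zero.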
Page 198: Name Compression
In main, only code from line 16 is executed, and this is also where the other functions are called. The inclusive cost of main is 820, which is the sum of the self cost of 20 and the costs spent in the calls: 400 for the single call to func1 and 400 as the sum for the three calls to func2.
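The arithmetic behind the inclusive cost of main can be written out as a small sketch. The helper name inclusive_cost is an assumption for illustration; the figures are the ones quoted in the text.

```python
def inclusive_cost(self_cost, call_costs):
    """Inclusive cost = self cost plus the cost spent in all calls made."""
    return self_cost + sum(call_costs.values())


# Figures from the text: the self cost of main is 20, the single call to
# func1 costs 400, and the three calls to func2 cost 400 in total.
main_inclusive = inclusive_cost(20, {"func1": 400, "func2": 400})
print(main_inclusive)
```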
Page 199: Subposition Compression
events: Instructions
# define file ID mapping
fl=(1) file1.c
fl=(2) file2.c
# define function ID mapping
fn=(1) main
fn=(2) func1
fn=(3) func2
fl=(1)
fn=(1)
16 20
3.1.6. Subposition Compression If a Callgrind data file should hold costs for each assembler instruction of a program, you specify subposition "instr" in the "positions:"...
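Returning to the name compression example (fl=(1) file1.c, fn=(1) main, and so on), the ID mappings could be resolved along these lines. This is a simplified sketch under stated assumptions: it handles only the fl and fn keys seen in the example, keeps one table per key, and the function name resolve_names is hypothetical.

```python
import re

# key=(id) name   defines a mapping; key=(id) alone refers back to it.
SPEC = re.compile(r"^(fl|fn)=(?:\((\d+)\))?\s*(.*)$")


def resolve_names(lines):
    """Return (key, full_name) pairs with compressed IDs expanded."""
    tables = {"fl": {}, "fn": {}}  # simplification: one table per key
    resolved = []
    for line in lines:
        match = SPEC.match(line)
        if not match:
            continue  # comments, header lines and cost lines are skipped
        key, number, name = match.group(1), match.group(2), match.group(3).strip()
        if number is not None:
            if name:
                tables[key][int(number)] = name  # definition
            else:
                name = tables[key][int(number)]  # back-reference
        resolved.append((key, name))
    return resolved


sample = [
    "# define file ID mapping",
    "fl=(1) file1.c",
    "fl=(2) file2.c",
    "# define function ID mapping",
    "fn=(1) main",
    "fn=(2) func1",
    "fn=(3) func2",
    "fl=(1)",
    "fn=(1)",
    "16 20",
]
resolved = resolve_names(sample)
print(resolved[-2:])
```

The last two specifications carry only an ID, and the sketch expands them to the names defined earlier in the file.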
Page 200: Miscellaneous
Remark: For assembler annotation to work, instruction addresses have to be corrected to correspond to addresses found in the original binary. I.e. for relocatable shared objects, often a load offset has to be subtracted. 3.1.7. Miscellaneous 3.1.7.1. Cost Summary Information For the visualization to be able to show cost percentage, a sum of the cost of the full run has to be known.
Page 201
PartDetail := TargetCommand | TargetID
TargetCommand := "cmd:" Space* NoNewLineChar*
TargetID := ("pid"|"thread"|"part") ":" Space* Number
Description := "desc:" Space* Name Space* ":" NoNewLineChar*
EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?
InheritedDef := "="...
Page 202: Description Of Header Lines
CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"
CalledPosition := "cob" | "cfl" | "cfn"
PositionName := ( "(" Number ")" )? (Space* NoNewLineChar*)?
AssociationSpecification := CallSpecification | JumpSpecification
CallSpecification := CallLine "\n" CostLine
CallLine := "calls="...
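As an illustration of how grammar productions such as TargetCommand and TargetID translate into code, the sketch below turns those two rules into regular expressions. It assumes that Space means a blank or a tab and that Number is a run of decimal digits, since neither definition appears in the excerpt; the function name classify_part_detail is hypothetical.

```python
import re

# TargetCommand := "cmd:" Space* NoNewLineChar*
TARGET_COMMAND = re.compile(r"^cmd:[ \t]*(.*)$")
# TargetID := ("pid"|"thread"|"part") ":" Space* Number
# Assumption: Number is written in decimal digits.
TARGET_ID = re.compile(r"^(pid|thread|part):[ \t]*(\d+)$")


def classify_part_detail(line):
    """Return ("cmd", text), (kind, number) or None for a header line."""
    match = TARGET_COMMAND.match(line)
    if match:
        return ("cmd", match.group(1))
    match = TARGET_ID.match(line)
    if match:
        return (match.group(1), int(match.group(2)))
    return None


print(classify_part_detail("cmd: date"))
print(classify_part_detail("pid: 738"))
```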
Page 203 • version: number [Callgrind] This is used to distinguish future profile data formats. A major version of 0 or 1 is supposed to be upwards compatible with Cachegrind’s format. It is optional; if it does not appear, version 1 is assumed. Otherwise, this has to be the first header line.
Page 204: Description Of Body Lines
• summary: costs [Callgrind], totals: costs [Cachegrind] The value or the total number of events covered by this trace file. Both keys have the same meaning, but the "totals:" line happens to be at the end of the file, while "summary:" appears in the header. This was added to allow postprocessing tools to know the total cost in advance.
Page 205 • jump=count target position [Callgrind] Unconditional jump, executed count times, to the given target position.
• jcnd=exe.count jumpcount target position [Callgrind] Conditional jump, executed exe.count times with jumpcount jumps to the given target position.
Page 207 Valgrind Distribution Documents Table of Contents 1. AUTHORS 2. NEWS 3. OLDER NEWS 4. README 5. README_MISSING_SYSCALL_OR_IOCTL 6. README_DEVELOPERS 7. README_PACKAGERS 8. README.S390 9. README.android 10. README.android_emulator 11. README.mips...
Page 208: Authors
1. AUTHORS Julian Seward was the original founder, designer and author of Valgrind, created the dynamic translation frameworks, wrote Memcheck, the 3.X versions of Helgrind, SGCheck, DHAT, and did lots of other things. Nicholas Nethercote did the core/tool generalisation, wrote Cachegrind and Massif, and tons of other stuff.
Page 209 Jakub Jelinek helped out with the AVX support. Many, many people sent bug reports, patches, and helpful feedback. Development of Valgrind was supported in part by the Tri-Lab Partners (Lawrence Livermore National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratories) of the U.S. Department...
Page 210: News
There is initial support for MacOSX 10.8, but it is not usable for serious work at present. * ================== PLATFORM CHANGES ================= * Support for MIPS32 platforms running Linux. Valgrind has been tested on MIPS32 and MIPS32r2 platforms running different Debian Squeeze and MeeGo distributions. Both little-endian and big-endian cores are supported.
Page 211 - The leak_check GDB server monitor command now can control the maximum nr of loss records to output. - Reduction of memory use for applications allocating many blocks and/or having many partially defined bytes. - Addition of GDB server monitor command ’block_list’ that lists the addresses/sizes of the blocks of a leak search loss record.
Page 212 and DRD. * For tool developers: support for running Valgrind on Valgrind has been improved. We can now routinely run Valgrind on Helgrind or Memcheck. * gdbserver now shows the float shadow registers as integer rather than float values, as the shadow values are mostly used as bit patterns.
Page 213 293755 == 293754 (No tests for PCMPxSTRx on 16-bit characters) 293808 CLFLUSH not supported by latest VEX for amd64 294047 valgrind does not correctly emulate prlimit64(..., RLIMIT_NOFILE, ...) 294048 MPSADBW instruction not implemented 294055 regtest none/tests/shell fails when locale is not set to C...
Page 214 298421 accept4() syscall (366) support is missing for ARM 298718 vex amd64->IR: 0xF 0xB1 0xCB 0x9C 0x8F 0x45 298732 valgrind installation problem in ubuntu with kernel version 3.x 298862 POWER Processor DFP instruction support missing, part 4 298864 DWARF reader mis-parses DW_FORM_ref_addr 298943 massif asserts with --pages-as-heap=yes when brk is changing [..]...
Page 215 302578 Unrecognized isntruction 0xc5 0x32 0xc2 0xca 0x09 vcmpngess 302656 == 273475 (Add support for AVX instructions) 302709 valgrind for ARM needs extra tls support for android emulator [..] 302827 add wrapper for CDROM_GET_CAPABILITY 302901 Valgrind crashes with dwz optimized debuginfo...
Page 216 This release also supports MacOSX 10.6, but drops support for 10.5. * Preliminary support for Android (on ARM). Valgrind can now run large applications (eg, Firefox) on (eg) a Samsung Nexus S. See README.android for more details, plus instructions on how to get started.
Page 217 ("Stack and Global Array Checking"). * ==================== OTHER CHANGES ==================== * GDB server: Valgrind now has an embedded GDB server. That means it is possible to control a Valgrind run from GDB, doing all the usual things that GDB can do (single stepping, breakpoints, examining data, etc).
Page 218 268621 s390x: improve IR generation for XC 268715 s390x: FLOGR is not universally available 268792 == 267997 (valgrind seg faults on startup when compiled with Xcode 4) 268930 s390x: MHY is not universally available 269078 arm->IR: unhandled instruction SUB (SP minus immediate/register) 269079 Support ptrace system call on ARM 269144 missing "Bad option"...
Page 219 270959 s390x: invalid use of R0 as base register 271042 VSX conﬁgure check fails when it should not 271043 Valgrind build fails with assembler error on ppc64 with binutils 2.21 271259 s390x: ﬁx code confusion 271337 == 267997 (Valgrind segfaults on MacOS X)
Page 220 280290 vex amd64->IR: 0x66 0xF 0x38 0x28 0xC1 0x66 0xF 0x6F 280710 s390x: conﬁg ﬁles for nightly builds 280757 /tmp dir still used by valgrind even if TMPDIR is speciﬁed 280965 Valgrind breaks fcntl locks when program does mmap 281138 WARNING: unhandled syscall: 340 281241 == 275168 (valgrind useless on Macos 10.7.1 Lion)
Page 221 XXXXXX is the bug number as listed below. 188572 Valgrind on Mac should suppress setenv() mem leak 194402 vex amd64->IR: 0x48 0xF 0xAE 0x4 (proper FX{SAVE,RSTOR} support) 210481 vex amd64->IR: Assertion ‘sz == 2 || sz == 4’ failed (REX.W POPQ)
Page 222 258870 (SSE4.x) Add support for EXTRACTPS SSE 4.1 instruction 261966 (SSE4.x) support for CRC32B and CRC32Q is lacking (also CRC32{W,L}) 262985 VEX regression in valgrind 3.6.0 in handling PowerPC VMX 262995 (SSE4.x) crash when trying to valgrind gcc-snapshot (PCMPxSTRx $0) 263099 callgrind_annotate counts Ir improperly [...]...
Page 223 * Support for ARM/Linux. Valgrind now runs on ARMv7 capable CPUs running Linux. It is known to work on Ubuntu 10.04, Ubuntu 10.10, and Maemo 5, so you can run Valgrind on your Nokia N900 if you want. This requires a CPU capable of running the ARMv7-A instruction set (Cortex A5, A8 and A9).
Page 224 useful for giving a general idea about a program’s locality. * Massif has a new option, --pages-as-heap, which is disabled by default. When enabled, instead of tracking allocations at the level of heap blocks (as allocated with malloc/new/new[]), it instead tracks memory allocations at the level of memory pages (as mapped by mmap, brk, etc).
Page 225 * Improved support for the Valkyrie GUI, version 2.0.0. GUI output and control of Valgrind is now available for the tools Memcheck and Helgrind. XML output from Valgrind is available for Memcheck, Helgrind and exp-Ptrcheck.
Page 226 205241 Snow Leopard 10.6 support (partial ﬁx) 206600 Leak checker fails to upgrade indirect blocks when their parent becomes reachable 210935 port valgrind.h (not valgrind) to win32 so apps run under wine can make client requests 211410 vex amd64->IR: 0x15 0xFF 0xFF 0x0 0x0 0x89...
Page 227 245535 print full path names in plain text reports 245925 x86-64 red zone handling problem 246258 Valgrind not catching integer underruns + new [] s 246311 reg/reg cmpxchg doesn’t work on amd64 246549 unhandled syscall unix:277 while testing 32-bit Darwin app 246888 Improve Makeﬁle.vex.am...
Page 228 Release 3.5.0 (19 August 2009) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.5.0 is a feature release with many signiﬁcant improvements and the usual collection of bug ﬁxes. The main improvement is that Valgrind now works on Mac OS X. This release supports X86/Linux, AMD64/Linux, PPC32/Linux, PPC64/Linux and X86/Darwin.
Page 229 * Valgrind now runs on Mac OS X. (Note that Mac OS X is sometimes called "Darwin" because that is the name of the OS core, which is the level that Valgrind works at.) Supported systems: - It requires OS 10.5.x (Leopard). Porting to 10.4.x is not planned because it would require work and 10.4 is only becoming less common.
Page 230 - Documentation for the leak checker has been improved. * Various aspects of Valgrind’s text output have changed. - Valgrind’s start-up message has changed. It is shorter but also includes the command being run, which makes it easier to use --trace-children=yes.
Page 231 --xml-file=, --xml-fd= or --xml-socket= to select the XML destination, one of --log-file=, --log-fd= or --log-socket= to select the destination for any remaining text messages, and, importantly, -q. -q makes Valgrind completely silent on the text channel, except in the case of critical failures, such as Valgrind...
Page 232 itself segfaulting, or failing to read debugging information. Hence, in this scenario, it suffices to check whether or not any output appeared on the text channel. If yes, then it is likely to be a critical error which should be brought to the attention of the user.
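A front-end following this scheme might look roughly like the sketch below: XML goes to one file, remaining text goes to another with -q, and any text output is treated as a probable critical failure. The helper names and file paths are assumptions for illustration; the options used (--xml=yes, --xml-file=, --log-file=, -q) are existing Valgrind options.

```python
import os


def xml_run_command(program_args, xml_path, log_path):
    """Build a Valgrind command line that separates XML from text output."""
    return [
        "valgrind",
        "--xml=yes",
        f"--xml-file={xml_path}",   # XML destination
        f"--log-file={log_path}",   # destination for remaining text messages
        "-q",                       # silent unless something critical happens
        *program_args,
    ]


def text_channel_has_output(log_path):
    """True if anything was written to the text channel's log file."""
    return os.path.exists(log_path) and os.path.getsize(log_path) > 0


command = xml_run_command(["./myprog"], "out.xml", "out.log")
print(" ".join(command))
```

After running the command (for example with subprocess.run), a non-empty log file would be the signal to show its contents to the user.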
Page 233 - The error messages printed by DRD are now easier to interpret. Instead of using two different numbers to identify each thread (Valgrind thread ID and DRD thread ID), DRD does now identify threads via a single number (the DRD thread ID). Furthermore "ﬁrst observed at"...
Page 234 * Something that happened in 3.4.0, but wasn’t clearly announced: the option --read-var-info=yes can be used by some tools (Memcheck, Helgrind and DRD). When enabled, it causes Valgrind to read DWARF3 variable type and location information. This makes those tools...
Page 235 - The location of some install ﬁles has changed. This should not affect most users. Those who might be affected: * For people who use Valgrind with MPI programs, the installed libmpiwrap.so library has moved from $(INSTALL)/<platform>/libmpiwrap.so to $(INSTALL)/libmpiwrap-<platform>.so.
Page 236 108528 NPTL pthread cleanup handlers not called 110126 Valgrind 2.4.1 conﬁgure.in tramples CFLAGS 110128 mallinfo is not implemented... 110770 VEX: Generated ﬁles not always updated when making valgrind 111102 Memcheck: problems with large (memory footprint) applications 115673 Vex’s decoder should never assert...
Page 237 Assertion ’!already_present’ failed. 185359 exp-ptrcheck: unhandled syscall getresuid() 185794 "WARNING: unhandled syscall: 285" (fallocate) on x86_64 185816 Valgrind is unable to handle debug info for ﬁles with split debug info that are prelinked afterwards 185980 [darwin] unhandled syscall: sem_open 186238 bbToIR_AMD64: disInstr miscalculated next %rip 186507 exp-ptrcheck unhandled syscalls prctl, etc.
Page 238 198624 Missing syscalls on Darwin: 82, 167, 281, 347 198649 callgrind_annotate doesn’t cumulate counters 199338 callgrind_annotate sorting/thresholds are broken for all but Ir 199977 Valgrind complains about an unrecognized instruction in the atomic_incs test program 200029 valgrind isn’t able to read Fedora 12 debuginfo 200760 darwin unhandled syscall: unix:284 200827 DRD doesn’t work on Mac OS X...
Page 239 179624 helgrind: false positive races with pthread_create and recv/open/close/read 134207 pkg-conﬁg output contains @VG_PLATFORM@ 176926 ﬂoating point exception at valgrind startup with PPC 440EPX 181594 Bogus warning for empty text segment 173751 amd64->IR: 0x48 0xF 0x6F 0x45 (even more redundant rex preﬁxes) 181707 Dwarf3 doesn’t require enumerations to have name...
Page 240 Ptrcheck currently works only on x86-linux and amd64-linux. To use it, use --tool=exp-ptrcheck. A simple manual is provided, as part of the main Valgrind documentation. As this is an experimental tool, we would be particularly interested in hearing about your...
Page 241 * 3.4.0 adds support on x86/amd64 for the SSSE3 instruction set. * Very basic support for IBM Power6 has been added (64-bit processes only). * Valgrind is now cross-compilable. For example, it is possible to cross compile Valgrind on an x86/amd64-linux host, so that it runs on a ppc32/64-linux target.
Page 242 It can now correctly establish the addresses for ELF data symbols, which is something that has never worked properly before now. Also, Valgrind can now read DWARF3 type and location information for stack and global variables. This makes it possible to use the framework to build tools that rely on knowing the type and locations of stack and global variables, for example exp-Ptrcheck.
Page 243: Older News
156960 ==155901 155528 support Core2/SSSE3 insns on x86/amd64 155929 ms_print fails on massif outputs containing long lines 157665 valgrind fails on shmdt(0) after shmat to 0 157748 support x86 PUSHFW/POPFW 158212 helgrind: handle pthread_rwlock_try{rd,wr}lock. 158425 sys_poll incorrectly emulated when RES==0 158744 vex amd64->IR: 0xF0 0x41 0xF 0xC0 (xaddb)
Page 244 162386 ms_print typo in milliseconds time unit for massif 161036 exp-drd: client allocated memory was never freed 162663 signalfd_wrapper fails on 64bit linux (3.3.1.RC1: 2 June 2008, vex r1854, valgrind r8169). (3.3.1: 4 June 2008, vex r1854, valgrind r8180). Release 3.3.0 (7 December 2007) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.3.0 is a feature release with many signiﬁcant improvements and the...
Page 245 - There is experimental support for AIX 5.3, both 32-bit and 64-bit processes. You need to be running a 64-bit kernel to use Valgrind on a 64-bit executable. - There have been some changes to command line options, which may affect you: * --log-file-exactly and...
Page 246 Cachegrind, Callgrind and Massif. They accept the same %p and %q format specifiers that --log-file accepts. --callgrind-out-file replaces Callgrind’s old --base option. * Cachegrind’s ’cg_annotate’ script no longer uses the --<pid> option to specify the output file. Instead, the first non-option argument is taken to be the name of the output file, and any subsequent non-option arguments are taken to be the names of source files to be annotated.
Page 247 143062 massif crashes on app exit with signal 8 SIGFPE 144453 (get_XCon): Assertion ’xpt->max_children != 0’ failed. 145559 valgrind aborts when malloc_stats is called 145609 valgrind aborts all runs with ’repeated section!’ 145622 --db-attach broken again on x86-64 145837 ==149519...
Page 248 - Internally, the code base has been further factorised and abstractiﬁed, particularly with respect to support for non-Linux OSs. (3.3.0.RC1: 2 Dec 2007, vex r1803, valgrind r7268). (3.3.0.RC2: 5 Dec 2007, vex r1804, valgrind r7282). (3.3.0.RC3: 9 Dec 2007, vex r1804, valgrind r7288).
Page 249 ==133054 132998 startup fails in when running on UML 134207 pkg-conﬁg output contains @VG_PLATFORM@ 134727 valgrind exits with "Value too large for deﬁned data type" n-i-bz ppc32/64: support mcrfs n-i-bz Cachegrind/Callgrind: Update cache parameter detection 135012 x86->IR: 0xD7 0x8A 0xE0 0xD0 (xlat)
Page 250 (3.2.2: 22 Jan 2007, vex r1729, valgrind r6545). Release 3.2.1 (16 Sept 2006) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.2.1 adds x86/amd64 support for all SSE3 instructions except monitor and mwait, further reduces memcheck’s false error rate on all platforms, adds support for recent binutils (in OpenSUSE 10.2 and...
Page 251 3.2.X: 133154 crash when using client requests to register/deregister stack (3.2.1: 16 Sept 2006, vex r1658, valgrind r6070). Release 3.2.0 (7 June 2006) ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.2.0 is a feature release with many signiﬁcant improvements and the usual collection of bug ﬁxes.
Page 252 You can get it from http://www.valgrind.org/downloads/guis.html. - Valgrind now works on PPC64/Linux. As with the AMD64/Linux port, this supports programs using up to 32G of address space. On 64-bit capable PPC64/Linux setups, you get a dual architecture build so that both 32-bit and 64-bit executables can be run.
Page 253 - A new ﬂag, --error-exitcode=, has been added. This allows changing the exit code in runs where Valgrind reported errors, which is useful when using Valgrind as part of an automated test suite. - Various segfaults when reading old-style "stabs" debug information have been ﬁxed.
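In an automated test suite, the --error-exitcode= flag could be used roughly as sketched below. The helper names and the exit code value 99 are assumptions for illustration; note that the check is ambiguous if the program under test can itself exit with the chosen code.

```python
import subprocess


def valgrind_check_command(program_args, error_exitcode=99):
    """Build a command that exits with error_exitcode if Valgrind reports errors."""
    return ["valgrind", f"--error-exitcode={error_exitcode}", *program_args]


def run_check(program_args, error_exitcode=99):
    """Run the program under Valgrind; return True if no errors were reported.

    Requires valgrind to be installed, so it is not called at import time.
    """
    result = subprocess.run(valgrind_check_command(program_args, error_exitcode))
    return result.returncode != error_exitcode


print(valgrind_check_command(["./tests/unit"], 42))
```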
Page 254 - The way client requests are encoded in the instruction stream has changed. Unfortunately, this means 3.2.0 will not honour client requests compiled into binaries using headers from earlier versions of Valgrind. We will try to keep the client request encodings more stable in future. BUGS FIXED:...
Page 255 CDROMREADRAW ioctl and CDROMREADTOCENTRY ﬁx 126722 assertion: segment_is_sane at m_aspacemgr/aspacemgr.c:1624 126938 bad checking for syscalls linkat, renameat, symlinkat (3.2.0RC1: 27 May 2006, vex r1626, valgrind r5947). (3.2.0: 7 June 2006, vex r1628, valgrind r5957). Release 3.1.1 (15 March 2006) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.1.1 ﬁxes a bunch of bugs reported in 3.1.0.
Page 256 On 64-bit machines up to 32GB of space is usable; when using Memcheck that means your program can use up to about 14GB. A side effect of this change is that Valgrind is no longer protected against wild writes by the client. This feature was nice but relied on the x86 segment registers and so wasn’t portable.
Page 257 - Core dumping has been reinstated (it was disabled in 3.0.0 and 3.0.1). If your program crashes while running under Valgrind, a core ﬁle with the name "vgcore.<pid>" will be created (if your settings allow core ﬁle creation). Note that the ﬂoating point information is not all there.
Page 258 110831 Would like to be able to run against both 32 and 64 bit binaries on AMD64 110829 == 110831 111781 compile of valgrind-3.0.0 fails on my linux (gcc 2.X prob) 112670 Cachegrind: cg_main.c:486 (handleOneStatement ... 112941 vex x86: 0xD9 0xF4 (fxtract) 110201 == 112941 113015 vex amd64->IR: 0xE3 0x14 0x48 0x83 (jrcxz)
Page 259 114289 Memcheck fails to intercept malloc when used in an uclibc environment 114756 mbind syscall support 114757 Valgrind dies with assertion: Assertion ’noLargerThan > 0’ failed 114563 stack tracking module not informed when valgrind switches threads 114564 clone() and stacks...
Page 260 - Address space may be limited; see the point about position-independent executables below. - If Valgrind is built on an AMD64 machine, it will only run 64-bit executables. If you want to run 32-bit x86 executables under Valgrind on an AMD64, you will need to build Valgrind on an x86 machine and copy it to the AMD64 machine.
Page 261 - Output can now be printed in XML format. This should make it easier for tools such as GUI front-ends and automated error-processing schemes to use Valgrind output as input. The --xml ﬂag controls this. As part of this change, ELF directory information is read from executables, so absolute source ﬁle paths are available if needed.
Page 262 flags are --log-file-exactly= and --log-file-qualifier=. - As part of adding AMD64 support, DWARF2 CFI-based stack unwinding support was added. In principle this means Valgrind can produce meaningful backtraces on x86 code compiled with -fomit-frame-pointer providing you also compile your code with -fasynchronous-unwind-tables.
Page 263 There is a small performance improvement, and a large stability improvement. * On the downside, Valgrind can no longer report misuses of the POSIX PThreads API. It also means that Helgrind currently does not work. We hope to ﬁx these problems in a future release.
Page 264 * Signal handling is much improved and should be very close to what you get when running natively. One result of this is that Valgrind observes changes to sigcontexts passed to signal handlers. Such modiﬁcations will take effect when the signal returns. You will need to run with --single-step=yes to make this useful.
Page 265 93117 Tool and core interface versions do not match 93128 Can’t run valgrind --tool=memcheck because of unimplement... 93174 Valgrind can crash if passed bad args to certain syscalls 93309 Stack frame in new thread is badly aligned 93328 Wrong types used with sys_sigprocmask() 93763 /usr/include/asm/msr.h is missing...
Page 266 - Blocking system calls behave exactly as they do when running natively (not on valgrind). That is, if a syscall blocks only the calling thread when running natively, then it behaves the same on valgrind.
Page 267 Draws pretty .ps pictures of memory use against time. A potentially powerful tool for making sense of your program’s space use. * File descriptor leakage checks. When enabled, Valgrind will print out a list of open ﬁle descriptors on exit.
Page 268 SSE code. * Add support for the POSIX message queue system calls. * Fix to allow 32-bit Valgrind to run on AMD64 boxes. Note: this does NOT allow Valgrind to work with 64-bit executables - only with 32-bit executables on an AMD64 box.
Page 269 2.0.0, might also want to try this release. The following bugs, and probably many more, have been fixed. These are listed at http://bugs.kde.org. Reporting a bug for valgrind at http://bugs.kde.org is much more likely to get you a fix than mailing developers directly, so please continue to keep sending bugs there.
Page 270 AFAICS: * Rearranged address space layout relative to 2.1.1, so that Valgrind/tools will run out of memory later than currently in many circumstances. This is good news esp. for Calltree. It should be possible for client programs to allocate over 800MB of memory when using memcheck now.
Page 271 long-term future. These don’t affect end-users. Most notable user-visible changes are: * Greater isolation between Valgrind and the program being run, so the program is less likely to inadvertently kill Valgrind by doing wild writes. * Massif: a new space profiling tool. Try it! It’s cool, and it’ll tell you in detail where and when your C/C++ code is allocating heap.
Page 272 Specifically: - Blocking system calls behave exactly as they do when running natively (not on valgrind). That is, if a syscall blocks only the calling thread when running natively, then it behaves the same on valgrind.
Page 273 68525: CVS head doesn’t compile on C90 compilers 68566: pkgconﬁg support (wishlist) 68588: Assertion ‘sz == 4’ failed in vg_to_ucode.c (disInstr) 69140: valgrind not able to explicitly specify a path to a binary. 69432: helgrind asserts encountering a MutexErr when there are EraserErr suppressions - Increase the max size of the translation cache from 200k average bbs to 300k average bbs.
Page 274 - Don’t fail silently if the executable is statically linked, or is setuid/setgid. Print an error message instead. - Support for old DWARF-1 format line number info. Snapshot 20031012 (12 October 2003) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Three months worth of bug fixes, roughly. Most significant single change is improved SSE/SSE2 support, mostly thanks to Dirk Mueller.
Page 275 This may have caused confusing error messages. Snapshot 20030716 (16 July 2003) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 20030716 is a snapshot of our current CVS head (development) branch. This is the branch which will become valgrind-2.0. It contains signiﬁcant enhancements over the 1.9.X branch.
Page 276 Despite this being a snapshot of the CVS head, it is believed to be quite stable -- at least as stable as 1.9.6 or 1.0.4, if not more so -- and therefore suitable for widespread use. Please let us know asap if it causes problems for you.
Page 277 - Fix assertion failure in pthread_once(). - Fix this: valgrind: vg_intercept.c:598 (vgAllRoadsLeadToRome_select): Assertion ‘ms_end >= ms_now’ failed. - Implement pthread_mutexattr_setpshared. - Understand Pentium 4 branch hints. Also implemented a couple more obscure x86 instructions. - Lots of other minor bug fixes.
Page 278 - Try and avoid assertion failures in mash_LD_PRELOAD_and_LD_LIBRARY_PATH. - Minor bug ﬁxes in cg_annotate. Version 1.9.5 (7 April 2003) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It occurs to me that it would be helpful for valgrind users to record in the source distribution the changes in each release. So I now...
Page 279 - Add support for the munlock system call (124). Some comments about future releases: 1.9.5 is, we hope, the most stable Valgrind so far. It pretty much supersedes the 1.0.X branch. If you are a valgrind packager, please consider making 1.9.5 available to your users. You can regard the...
Page 280 are no plans at all for further releases of the 1.0.X branch. If you want a leading-edge valgrind, consider building the cvs head (from SourceForge), or getting a snapshot of it. Current cool stuff going in includes MMX support (done); SSE/SSE2 support (in progress), a significant (10-20%) performance improvement (done), and the usual...
Page 281: Readme
SimPoint basic block vector generator. Valgrind is closely tied to details of the CPU, operating system and to a lesser extent, compiler and basic C libraries. This makes it difﬁcult to make it portable. Nonetheless, it is available for the following...
Page 282 6. Run "make install", possibly as root if the destination permissions require that. 7. See if it works. Try "valgrind ls -l". Either this works, or it bombs out with some complaint. In that case, please let us know (see www.valgrind.org).
Page 283: Readme_Missing_Syscall_Or_Ioctl
What are syscall/ioctl wrappers? What do they do? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Valgrind does what it does, in part, by keeping track of everything your program does. When a system call happens, for example a request to read part of a ﬁle, control passes to the Linux kernel, which fulﬁlls the request, and returns control to your program.
Page 284
PRE(sys_time)
{
   /* time_t time(time_t *t); */
   PRINT("sys_time ( %p )",ARG1);
   PRE_REG_READ1(long, "time", int *, t);
   if (ARG1 != 0) {
      PRE_MEM_WRITE( "time(t)", ARG1, sizeof(vki_time_t) );
   }
}

POST(sys_time)
{
   if (ARG1 != 0) {
      POST_MEM_WRITE( ARG1, sizeof(vki_time_t) );
   }
}
The first thing we do happens before the syscall occurs, in the PRE() function. The PRE() function typically starts by invoking the PRINT() macro.
Page 285 README_MISSING_SYSCALL_OR_IOCTL If Valgrind tells you that system call NNN is unimplemented, do the following:
1. Find out the name of the system call:
   grep NNN /usr/include/asm/unistd*.h
   This should tell you something like __NR_mysyscallname. Copy this entry to include/vki/vki-scnums-$(VG_PLATFORM).h.
2. Do ’man 2 mysyscallname’ to get some idea of what the syscall does.
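The number-to-name lookup in step 1 can also be done from C, since the same headers that grep searches are reachable through <sys/syscall.h>. The following is only a sketch under that assumption (a Linux-style system whose <sys/syscall.h> defines SYS_* constants); the table and the toy_syscall_name helper are hypothetical and cover just a handful of calls.

```c
/* Hypothetical helper, not part of Valgrind: maps a few syscall numbers
   to names using the SYS_* constants from <sys/syscall.h>. */
#include <stddef.h>
#include <sys/syscall.h>

struct toy_syscall_entry {
    long nr;
    const char *name;
};

/* Each entry is guarded because not every platform defines every call. */
static const struct toy_syscall_entry toy_syscall_table[] = {
#ifdef SYS_read
    { SYS_read, "read" },
#endif
#ifdef SYS_getpid
    { SYS_getpid, "getpid" },
#endif
#ifdef SYS_munlock
    { SYS_munlock, "munlock" },
#endif
#ifdef SYS_time
    { SYS_time, "time" },
#endif
    { -1, NULL }                         /* sentinel: end of table */
};

/* Returns the name for syscall number nr, or NULL if it is not in the table. */
static const char *toy_syscall_name(long nr)
{
    for (size_t i = 0; toy_syscall_table[i].name != NULL; i++)
        if (toy_syscall_table[i].nr == nr)
            return toy_syscall_table[i].name;
    return NULL;
}
```

A NULL result only means the number is missing from this small table. The grep over the unistd headers remains the complete lookup.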
Page 286 PRE(ioctl) and POST(ioctl). There’s a default case, but sometimes it isn’t correct and you have to write a more specific case to get the right behaviour. As above, please create a bug report and attach the patch as described on http://www.valgrind.org.
Page 287: Readme_Developers
6. README_DEVELOPERS

Building and not installing it
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To run Valgrind without having to install it, run coregrind/valgrind with the VALGRIND_LIB environment variable set to <dir>/.in_place, where <dir> is the root of the source tree (and must be an absolute path). Eg:
  VALGRIND_LIB=~/grind/head4/.in_place ~/grind/head4/coregrind/valgrind...
Page 288 To debug the valgrind launcher program (<prefix>/bin/valgrind) just run it under gdb in the normal way. Debugging the main body of the valgrind code (and/or the code for a particular tool) requires a bit more trickery but can be achieved without too much problem by following these steps: (1) Set VALGRIND_LAUNCHER to point to the valgrind executable.
Page 289 (ie. not an environment variable). A different and possibly easier way is as follows: (1) Run Valgrind as normal, but add the flag --wait-for-gdb=yes. This puts the tool executable into a wait loop soon after it gains control.
Page 290 Memcheck can be used to find leaks and use after free in an Inner Valgrind. The Valgrind "big lock" is annotated with helgrind client requests so helgrind and drd can be used to find race conditions in an Inner Valgrind.
Page 291 When an outer valgrind runs an inner valgrind, a regression test produces one additional file <testname>.outer.log which contains the errors detected by the outer valgrind. E.g. for an outer memcheck, it contains the leaks found in the inner, for an outer helgrind or drd, it contains the detected race conditions.
Page 292 README_DEVELOPERS callgrind.out.inner_trunk.me.many-loss-records.22916 callgrind.outer.log.inner_trunk.me.many-loss-records.22916

Printing out problematic blocks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you want to print out a disassembly of a particular block that causes a crash, do the following. Try running with "--vex-guest-chase-thresh=0 --trace-flags=10000000 --trace-notbelow=999999". This should print one line for each block translated, and that includes the address.
Page 293: Readme_Packagers
So you can’t build a relocatable RPM / whatever from Valgrind. -- Don’t strip the debug info off lib/valgrind/$platform/vgpreload*.so in the installation tree. Either Valgrind won’t work at all, or it will still work if you do, but will generate less helpful error messages.
Page 294 -- Please test the final installation works by running it on something huge. I suggest checking that it can start and exit successfully both Firefox and OpenOffice.org. I use these as test programs, and I know they fairly thoroughly exercise Valgrind. The command lines to use are:
  valgrind -v --trace-children=yes firefox
  valgrind -v --trace-children=yes soffice...
Page 295: Readme.s390
------------
- You need GCC 3.4 or later to compile the s390 port.
- A working combination of autotools is required.
- To run valgrind a z900 machine or any later model is needed.

Limitations
-----------
- 31-bit client programs are not supported.
Page 296: Readme.android
9. README.android How to cross-compile for Android. These notes were last updated on 17 Feb 2012, for Valgrind SVN revision 12390/2257. This is known to work at least for : ARM: Android 4.0.3 running on a (rooted, AOSP build) Nexus S.
Page 297 README.android
# Then cd to the root of your Valgrind source tree.
cd /path/to/valgrind/source/tree
# After this point, you don’t need to modify anything; just copy and
# paste the commands below.
# Set up toolchain paths.
# For ARM
export AR=$NDKROOT/toolchains/arm-linux-androideabi-4.4.3/prebuilt/linux-x86/bin/arm-linux-androideabi-ar...
Page 298
# where ’mq’ is an alias for ’make --quiet’.
# One common cause of runs failing at startup is the inability of
# Valgrind to find a suitable temporary directory. On the device,
# there doesn’t seem to be any one location which we always have
# permission to write to.
Page 299: Readme.android_Emulator
10. README.android_emulator

How to install and run an android emulator.

mkdir android # or any other place you prefer
cd android
# download java JDK
# http://www.oracle.com/technetwork/java/javase/downloads/index.html
# download android SDK
# http://developer.android.com/sdk/index.html
# download android NDK
# http://developer.android.com/sdk/ndk/index.html
# versions I used:
# jdk-7u4-linux-i586.tar.gz
# android-ndk-r8-linux-x86.tar.bz2
# android-sdk_r18-linux.tgz...
Page 300
# and see it is working. Note that I usually get
# one or two time out from adb shell before it works
adb shell
# Once the emulator is ready, push your Valgrind to the emulator:
adb push Inst /
# if you need to debug:...
Page 301: Readme.mips
* --with-pagesize option is used to set default PAGE SIZE. If option is not used, PAGE SIZE is set to the default value for the platform on which Valgrind is built. Possible values are 4, 16 or 64 and represent size in kilobytes.
Page 302 README.mips based on newer GCC versions, if possible.
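Since the --with-pagesize option described above defaults to the page size of the build platform, it can help to check what that value is before configuring. The sketch below assumes a POSIX system where sysconf(_SC_PAGESIZE) is available; the two toy_* helpers are hypothetical and are not part of Valgrind or its build system.

```c
/* Hypothetical helpers for choosing a --with-pagesize value; not part of
   Valgrind. Assumes a POSIX system providing sysconf(_SC_PAGESIZE). */
#include <stdbool.h>
#include <unistd.h>

/* True if kb is one of the values the README lists: 4, 16 or 64. */
static bool toy_pagesize_kb_supported(long kb)
{
    return kb == 4 || kb == 16 || kb == 64;
}

/* Page size of the running system in kilobytes, or -1 if unavailable. */
static long toy_system_pagesize_kb(void)
{
    long bytes = sysconf(_SC_PAGESIZE);
    return bytes > 0 ? bytes / 1024 : -1;
}
```

If toy_system_pagesize_kb() on the target differs from the build machine's value, that is the situation in which passing --with-pagesize explicitly would matter.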
Page 303: Gnu Licenses
GNU Licenses...
Page 304 GNU Licenses Table of Contents 1. The GNU General Public License 2. The GNU Free Documentation License...
Page 305 1. The GNU General Public License GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Page 306 The GNU General Public License patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone’s free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow.
Page 307 The GNU General Public License interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License.
Page 308 The GNU General Public License control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
Page 309 The GNU General Public License circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices.
Page 310 The GNU General Public License PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED...
Page 311 The GNU General Public License The hypothetical commands ‘show w’ and ‘show c’ should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than ‘show w’ and ‘show c’; they could even be mouse-clicks or menu items--whatever suits your program.
Page 312 2. The GNU Free Documentation License GNU Free Documentation License Version 1.2, November 2002 Copyright (C) 2000,2001,2002 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. 0.
Page 313 The GNU Free Documentation License modifications and/or translated into another language. A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject.
Page 314 The GNU Free Documentation License the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text. A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language.
Page 315 The GNU Free Documentation License If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material.
Page 316 The GNU Free Documentation License given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence. J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on.
Page 317 The GNU Free Documentation License versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy.
Page 318 The GNU Free Documentation License 8. TRANSLATION Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections.
Page 319 The GNU Free Documentation License the License in the document and put the following copyright and license notices just after the title page: Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation;...

This manual is also suitable for:

Memcheck Cachegrind Callgrind Helgrind Drd Massif ...

Valgrind Software Quick Start Manual

The Valgrind Quick Start Guide

Valgrind User Manual

1 Introduction

2 Using and Understanding the Valgrind Core

3 Using and Understanding the Valgrind Core: Advanced Topics

4 Memcheck: a Memory Error Detector

5 Cachegrind: a Cache and Branch-Prediction Profiler

6 Callgrind: a Call-Graph Generating Cache and Branch Prediction Profiler

7 Helgrind: a Thread Error Detector

8 DRD: a Thread Error Detector

9 Massif: a Heap Profiler

10 DHAT: a Dynamic Heap Analysis Tool

11 Sgcheck: an Experimental Stack and Global Array Overrun Detector

12 BBV: an Experimental Basic Block Vector Generation Tool

13 Lackey: an Example Tool

14 Nulgrind: the Minimal Valgrind Tool

Valgrind Technical Documentation

1 The Design and Implementation of Valgrind

2 Writing a New Valgrind Tool

3 Callgrind Format Specification

Valgrind Distribution Documents
