Summary of Contents for Valgrind Software
This is the top level of Valgrind’s documentation tree. The documentation is contained in six logically separate documents, as listed in the following Table of Contents. To get started quickly, read the Valgrind Quick Start Guide. For full documentation on Valgrind, read the Valgrind User Manual.
Valgrind Documentation
Table of Contents
The Valgrind Quick Start Guide
Valgrind User Manual
Valgrind FAQ
Valgrind Technical Documentation
Valgrind Distribution Documents
GNU Licenses
The Valgrind Quick Start Guide
Table of Contents
1. Introduction
2. Preparing your program
3. Running your program under Memcheck
4. Interpreting Memcheck’s output
5. Caveats
6. More information
1. Introduction
The Valgrind tool suite provides a number of debugging and profiling tools that help you make your programs faster and more correct. The most popular of these tools is called Memcheck. It can detect many memory-related errors that are common in C and C++ programs and that can lead to crashes and unpredictable behaviour.
    #include <stdlib.h>

    void f(void)
    {
       int* x = malloc(10 * sizeof(int));
       x[10] = 0;        // problem 1: heap block overrun
    }                    // problem 2: memory leak -- x not freed

    int main(void)
    {
       f();
       return 0;
    }

Most error messages look like the following, which describes problem 1, the heap block overrun:...
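For comparison, here is a minimal sketch of how the two problems in the example might be repaired. The name f_fixed and its return value are assumptions made for illustration; they are not part of the Quick Start Guide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical corrected variant of f(): it stays inside the block
   and releases the block before returning. */
int f_fixed(void)
{
    int *x = malloc(10 * sizeof(int));
    if (x == NULL)
        return -1;          /* allocation failed */

    x[9] = 0;               /* index 9 is the last valid element of 10 */
    int last = x[9];
    assert(last == 0);

    free(x);                /* no leak: the block is freed */
    return last;
}
```

Calling f_fixed() from main in place of f() should leave Memcheck with nothing to report for this function.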
OpenOffice.org and Firefox are Memcheck-clean, or very close to it.
6. More information
Please consult the Valgrind FAQ and the Valgrind User Manual, which have much more information. Note that the other tools in the Valgrind distribution can be invoked with the --tool option.
2.11. Limitations 2.12. An Example Run 2.13. Warning Messages You Might See 3. Using and understanding the Valgrind core: Advanced Topics 3.1. The Client Request mechanism 3.2. Debugging your program using Valgrind gdbserver and GDB 3.2.1. Quick Start: debugging in 3 steps 3.2.2.
4.5.3. Putting it all together 4.6. Memcheck Monitor Commands 4.7. Client Requests 4.8. Memory Pools: describing and working with custom allocators 4.9. Debugging MPI Parallel Programs with Valgrind 4.9.1. Building and installing the wrappers 4.9.2. Getting started 4.9.3. Controlling the wrapper library 4.9.4.
Valgrind is closely tied to details of the CPU and operating system, and to a lesser extent, the compiler and basic C libraries. Nonetheless, it supports a number of widely-used platforms, listed in full at http://www.valgrind.org/.
1.2. How to navigate this manual
This manual’s structure reflects the structure of Valgrind itself. First, we describe the Valgrind core, how to use it, and the options it supports. Then, each tool has its own chapter in this manual. You only need to read the documentation for the core and for the tool(s) you actually use, although you may find it helpful to be at least a little bit familiar with...
Your program is then run on a synthetic CPU provided by the Valgrind core. As new code is executed for the first time, the core hands the code to the selected tool. The tool adds its own instrumentation code to this and hands the result back to the core, which coordinates the continued execution of this instrumented code.
First off, consider whether it might be beneficial to recompile your application and supporting libraries with debugging info enabled (the -g option). Without debugging info, the best Valgrind tools will be able to do is guess which function a particular piece of code belongs to, which makes both error messages and profiling output nearly useless. With -g, you’ll get messages which point directly to the relevant source code lines.
• portnumber: changes the port it listens on from the default (1500). The specified port must be in the range 1024 to 65535. The same restriction applies to port numbers specified by a --log-socket to Valgrind itself. If a Valgrinded process fails to connect to a listener, for whatever reason (the listener isn’t running, invalid or unreachable host or port, etc), Valgrind switches back to writing the commentary to stderr.
(current or freed) heap block, for example reading freed memory, Valgrind reports not only the location where the error happened, but also where the associated heap block was allocated/freed. Valgrind remembers all error reports. When an error is detected, it is compared against old reports, to see if it is a duplicate.
flexible specification of errors to suppress. If you use the -v option, at the end of execution, Valgrind prints out one line for each used suppression, giving its name and the number of times it got used. Here are the suppressions used by a run of valgrind --tool=memcheck ls -l:
    --27579-- supp: 1 socketcall.connect(serv_addr)/__libc_connect/__nscd_getgrgid_r...
Using and understanding the Valgrind core
• Next line: a small number of suppression types have extra information after the second line (e.g. the Param suppression for Memcheck).
• Remaining lines: This is the calling context for the error -- the chain of function calls that led to it. There can be up to 24 of these lines.
2.6. Core Command-line Options
As mentioned above, Valgrind’s core accepts a common set of options. The tools also accept tool-specific options, which are documented separately for each tool.
Note that Valgrind does trace into the child of a fork (it would be difficult not to, since fork makes an identical copy of a process), so this option is arguably badly named. However, most children of fork calls immediately call exec anyway.
--log-file=<filename> Specifies that Valgrind should send all of its messages to the specified file. If the file name is empty, it causes an abort. There are three special format specifiers that can be used in the file name.
So this doesn’t affect the total number of errors reported. The maximum value for this is 500. Note that higher settings will make Valgrind run a bit more slowly and take a bit more memory, but can be useful when working with programs with deeply-nested call chains.
Valgrind detects any errors. This is useful for using Valgrind as part of an automated test suite, since it makes it easy to detect test cases for which Valgrind has reported errors, just by inspecting return codes.
The prompt’s behaviour is the same as for the --db-attach option (see below). If you choose to, Valgrind will print out a suppression for this error. You can then cut and paste it into a suppression file if you don’t want to hear about the error in the future.
This option allows you to change the threshold to a different value. You should only consider use of this option if Valgrind’s debug output directs you to do so. In that case it will tell you the new threshold you should specify.
This is only really of significance on 32-bit machines. On Linux, you may request a stack of size up to 2GB. Valgrind will stop with a diagnostic message if the stack cannot be allocated. --main-stacksize only affects the stack size for the program’s initial thread. It has no bearing on the size of thread stacks, as Valgrind does not allocate those.
When enabled, Valgrind will read information about variable types and locations from DWARF3 debug info. This slows Valgrind down and makes it use more memory, but for the tools that can take advantage of it (Memcheck, Helgrind, DRD) it can result in more precise error messages. For example, here are some standard errors issued by...
5000] As part of its main loop, the Valgrind scheduler will poll to check if some activity (such as an external command or some input from GDB) has to be handled by gdbserver. This activity poll will be done after having run the given number of basic blocks (or slightly more than the given number of basic blocks).
Enable special handling for certain system calls that may block in a FUSE file-system. This may be necessary when running Valgrind on a multi-threaded program that uses one thread to manage a FUSE file-system and another thread to access that file-system.
* and ? wildcards.
--soname-synonyms=syn1=pattern1,syn2=pattern2,...
When a shared library is loaded, Valgrind checks for functions in the library that must be replaced or wrapped. For example, Memcheck replaces all malloc related functions (malloc, free, calloc, ...) with its own versions. Such replacements are done by default only in shared libraries whose soname matches a predefined soname pattern (e.g.
The actual thread scheduling remains under control of the OS kernel. What this does mean, though, is that your program will see very different scheduling when run on Valgrind than it does when running normally. This is both because Valgrind is serialising the threads, and because the code runs so much slower than normal.
Depending on your Linux distribution, CPU frequency scaling may be controlled using a graphical interface or using command line such as cpufreq-selector or cpufreq-set. An alternative way to avoid these problems is to tell the OS scheduler to tie a Valgrind process to a specific (fixed) CPU using the taskset command.
• --enable-inner This builds Valgrind with some special magic hacks which make it possible to run it on a standard build of Valgrind (what the developers call "self-hosting"). Ordinarily you should not use this option as various kinds of safety checks are disabled.
Limitations for the known limitations of Valgrind, and for a list of programs which are known not to work on it. All parts of the system make heavy use of assertions and internal self-checks. They are permanently enabled, and we have no plans to disable them.
If you regenerate code over the top of old code (i.e. at the same memory addresses), and the code is on the stack, Valgrind will realise the code has changed, and work correctly. This is necessary to handle the trampolines GCC uses to implement nested functions. If you regenerate code somewhere other than the stack, and you are running on a 32- or 64-bit x86 CPU, you will need to use the --smc-check=all option, and Valgrind will run more slowly than normal.
Essentially the same: no exceptions, and limited observance of rounding mode. Also, switching the VFP unit into vector mode will cause Valgrind to abort the program -- it has no way to emulate vector uses of VFP at a reasonable performance level.
After 1000 different errors have been detected, Valgrind ignores any more. It seems unlikely that collecting even more different ones would be of practical help to anybody, and it avoids the danger that Valgrind spends more and more of its time comparing new errors against an ever-growing collection. As above, the 1000 number is a...
Valgrind spotted such a large change in the stack pointer that it guesses the client is switching to a different stack. At this point it makes a kludgey guess where the base of the new stack is, and sets memory permissions accordingly.
RUNNING_ON_VALGRIND: Returns 1 if running on Valgrind, 0 if running on the real CPU. If you are running Valgrind on itself, returns the number of layers of Valgrind emulation you’re running on.
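As a rough illustration of how a program might react to RUNNING_ON_VALGRIND, consider the sketch below. The helper choose_iterations and its workload sizes are hypothetical. The __has_include probe and the fallback stub are also assumptions of this example, added only so that the file still builds on a machine without Valgrind's headers installed.

```c
#include <assert.h>

/* Use the real header when the compiler can find it; otherwise fall
   back to a stub (the stub is an assumption of this sketch). */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#    define HAVE_VALGRIND_H 1
#  endif
#endif
#ifndef HAVE_VALGRIND_H
#  define RUNNING_ON_VALGRIND 0
#endif

/* Hypothetical helper: choose a smaller workload when emulated,
   since code runs much more slowly on the synthetic CPU. */
unsigned long choose_iterations(void)
{
    int depth = (int)RUNNING_ON_VALGRIND;  /* 0 natively, >= 1 on Valgrind */
    assert(depth >= 0);
    return depth > 0 ? 1000UL : 1000000UL;
}
```

A test suite could use a helper of this kind to keep long-running cases tolerable when the whole suite is run on Valgrind.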
VALGRIND_NON_SIMD_CALL[0123]: Executes a function in the client program on the real CPU, not the virtual CPU that Valgrind normally runs code on. The function must take an integer (holding a thread ID) as the first argument and then 0, 1, 2 or 3 more arguments (depending on which client request is used).
VALGRIND_STACK_* calls. Valgrind will use this information to determine if a change to the stack pointer is an item pushed onto the stack or a change over to a new stack. Use this if you’re using a user-level thread package and are noticing spurious errors from Valgrind about uninitialized memory reads.
Valgrind’s gdbserver, communication is done via a pipe and a small helper program called vgdb, which acts as an intermediary. If no GDB is in use, vgdb can also be used to send monitor commands to the Valgrind gdbserver from a shell command line.
    ==2418== Command: ./prog
    ==2418==
    ==2418== (action at startup) vgdb me ...
GDB (in another shell) can then be connected to the Valgrind gdbserver. For this, GDB must be started on the program prog:
    gdb ./prog
You then indicate to GDB that you want to debug a remote target:...
    [Switching to Thread 2479]
    0x001f2850 in _start () from /lib/ld-linux.so.2
    (gdb)
Once GDB is connected to the Valgrind gdbserver, it can be used in the same way as if you were debugging the program natively:
• Breakpoints can be inserted or deleted.
GDB server, but you will need to explicitly enable it using the flag --vgdb=yes or --vgdb=full. Additionally, you will need to select a temporary directory which (a) is writable by Valgrind, and (b) supports FIFOs. This is the main difficulty. Often, /sdcard satisfies requirement (a), but fails for (b) because it is a VFAT file system and VFAT does not support pipes.
The Valgrind gdbserver will execute the monitor command itself, if it recognises it to be a Valgrind core monitor command. If it is not recognised as such, it is assumed to be tool-specific and is handed to the tool for execution. For example:...
3.2.6. Valgrind gdbserver thread information
Valgrind’s gdbserver enriches the output of the GDB info threads command with Valgrind-specific information. The operating system’s thread number is followed by Valgrind’s internal index for that thread ("tid") and by the Valgrind scheduler thread state:
    (gdb) info threads
    4 Thread 6239 (tid 4 VgTs_Yielding) 0x001f2832 in _dl_sysinfo_int80 () from /lib/ld-linux.so...
When Valgrind gdbserver stops on an error, on a breakpoint or when single stepping, registers and flags values might not always be up to date due to the optimisations done by the Valgrind core. The default value --vex-iropt-register-updates=unwindregs-at-mem-access ensures that the registers needed to make a stack trace (typically PC/SP/FP) are up to date at each memory access (i.e.
It also has no limitation on the length of the memory zone being watched. Using GDB version 7.4 or later allows full use of the flexibility of the Valgrind gdbserver’s simulated hardware watchpoints.
"target" command. Debugging will not work because GDB will then not be able to fetch the registers from the Valgrind gdbserver. For ARM programs using the Thumb instruction set, you must use a GDB version of 7.1 or later, as earlier versions have problems with next/step/breakpoints in Thumb code.
When ptrace is disabled in vgdb, a query packet sent by GDB may take significant time to be handled by the Valgrind gdbserver. In such cases, GDB might encounter a protocol timeout. To avoid this, you can increase the value of the timeout by using the GDB command "set remotetimeout".
Usage: vgdb [OPTION]... [[-c] COMMAND]...
vgdb ("Valgrind to GDB") is a small program that is used as an intermediary between Valgrind and GDB or a shell. Therefore, it has two usage modes:
1. As a standalone utility, it is used from a shell command line to send monitor commands to a process running under Valgrind.
• -l instructs a standalone vgdb to report the list of the Valgrind gdbserver processes running and then exit.
• -D instructs a standalone vgdb to show the state of the shared memory used by the Valgrind gdbserver. vgdb will exit after having shown the Valgrind gdbserver shared memory state.
When sent from a standalone vgdb, if this is the last command, the Valgrind process will continue the execution of the guest process. The typical usage of this is to use vgdb to send a "no-op" command to a Valgrind gdbserver so as to continue the execution of the guest process.
To become active, the wrapper merely needs to be present in a text section somewhere in the same process’ address space as the function it wraps, and for its ELF symbol name to be visible to Valgrind. In practice, this means either...
Instead, the result lvalue, OrigFn and arguments are handed to one of a family of macros of the form CALL_FN_*. These cause Valgrind to call the original and avoid recursion back to the wrapper.
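The wrapping scheme just described can be sketched as follows. The wrapped function foo, its arithmetic and the helper demo_call are illustrative assumptions. The wrapper is compiled only when valgrind.h can be found, and that __has_include probe is an assumption of this sketch rather than something the manual prescribes. I_WRAP_SONAME_FNNAME_ZU, OrigFn, VALGRIND_GET_ORIG_FN and CALL_FN_W_WW are taken from valgrind.h.

```c
#include <assert.h>
#include <stdio.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#    define HAVE_VALGRIND_H 1
#  endif
#endif

/* The function to be wrapped (illustrative). */
int foo(int x, int y)
{
    return x + y;
}

#ifdef HAVE_VALGRIND_H
/* Wrapper for foo in an object with no soname (NONE). It is only
   ever entered when the program runs on Valgrind. */
int I_WRAP_SONAME_FNNAME_ZU(NONE, foo)(int x, int y)
{
    int    result;
    OrigFn fn;
    VALGRIND_GET_ORIG_FN(fn);              /* fetch the original's address */
    printf("foo's wrapper: args %d %d\n", x, y);
    CALL_FN_W_WW(result, fn, x, y);        /* call the original, no recursion */
    printf("foo's wrapper: result %d\n", result);
    return result;
}
#endif

/* Hypothetical caller: behaves the same with or without the wrapper. */
int demo_call(void)
{
    int r = foo(2, 3);
    assert(r == 5);
    return r;
}
```

Run natively, the wrapper is never called. On Valgrind, calls to foo should be routed through it, provided the compiler has not inlined foo into its callers.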
The ability for a wrapper to replace an infinite family of functions is powerful but brings complications in situations where ELF objects appear and disappear (are dlopen’d and dlclose’d) on the fly. Valgrind tries to maintain sensible behaviour in such situations.
A second possible problem is that of conflicting wrappers. It is easily possible to load two or more wrappers, both of which claim to be wrappers for some third function. In such cases Valgrind will complain about conflicting wrappers when the second one appears, and will honour only the first one.
3.3.6. Limitations - original function signatures
As shown in the above example, to call the original you must use a macro of the form CALL_FN_*. For technical reasons it is impossible to create a single macro to deal with all argument types and numbers, so a family of macros covering the most common cases is supplied.
4. Memcheck: a memory error detector
To use this tool, you may specify --tool=memcheck on the Valgrind command line. You don’t have to, though, since Memcheck is the default tool.
4.1. Overview
Memcheck is a memory error detector. It can detect the following problems that are common in C and C++ programs.
freed. Likewise, if it should turn out to be just off the end of a heap block, a common result of off-by-one errors in array subscripting, you’ll be informed of this fact, and also where the block was allocated. If you use the --read-var-info option, Memcheck will run more slowly but may give a more detailed description of any...
To see information on the sources of uninitialised data in your program, use the --track-origins=yes option. This makes Memcheck run more slowly, but can make it much easier to track down the root causes of uninitialised value errors.
4.2.4. Illegal frees
For example:
    Invalid free()
       at 0x4004FFDF: free (vg_clientmalloc.c:577)
       by 0x80484C7: main (tests/doublefree.c:10)
     Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
       at 0x4004FFDF: free (vg_clientmalloc.c:577)
       by 0x80484C7: main (tests/doublefree.c:10)
Memcheck keeps track of the blocks allocated by your program with malloc/new, so it knows exactly whether the argument to free/delete is legitimate.
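One common defensive idiom against double frees is sketched below. The helpers release and demo_release are hypothetical names, and the 177-byte size merely echoes the report above. The idiom relies on free(NULL) being defined as a no-op by the C standard. It does not replace finding the logic error that led to the second free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: free a block and clear the caller's pointer,
   so that a repeated call degenerates into free(NULL). */
void release(char **p)
{
    assert(p != NULL);
    free(*p);
    *p = NULL;
}

int demo_release(void)
{
    char *buf = malloc(177);
    if (buf == NULL)
        return -1;
    memset(buf, 0, 177);

    release(&buf);   /* frees the block */
    release(&buf);   /* harmless: buf is NULL, so this is free(NULL) */
    return buf == NULL ? 0 : 1;
}
```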
The worst thing is that on Linux apparently it doesn’t matter if you do mix these up, but the same program may then crash on a different platform, Solaris for example. So it’s best to fix it properly. According to the KDE folks "it’s amazing how many C++ programmers don’t know this".
• It might be a pointer to an array of C++ objects (which possess destructors) allocated with new[]. In this case, some compilers store a "magic cookie" containing the array length at the start of the allocated block, and return a pointer to just past that magic cookie, i.e.
• "Possibly lost". This covers cases 5--8 (for the BBB blocks) above. This means that a chain of one or more pointers to the block has been found, but at least one of the pointers is an interior-pointer. This could just be a random value in memory that happens to point into a block, and so you shouldn’t consider this ok unless you know you have interior-pointers.
    64 bytes in 4 blocks are still reachable in loss record 2 of 4
       at 0x..: malloc (vg_replace_malloc.c:177)
       by 0x..: mk (leak-cases.c:52)
       by 0x..: main (leak-cases.c:74)

    32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
       at 0x..: malloc (vg_replace_malloc.c:177)
       by 0x..: mk (leak-cases.c:52)
       by 0x..: main (leak-cases.c:80)
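A small program that could give rise to records of these kinds is sketched below. It is not the leak-cases.c test named in the report: the node type, the helper mk and the function make_leaks are assumptions chosen to mirror it. The classification Memcheck actually reports can vary with compiler optimisation, so building without optimisation is advisable when experimenting.

```c
#include <assert.h>
#include <stdlib.h>

struct node { struct node *next; };

/* A global start-pointer: the leak search finds it at exit. */
static struct node *still_reachable_root;

/* Hypothetical helper, named after the mk() frame in the report. */
static struct node *mk(struct node *next)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->next = next;
    return n;
}

int make_leaks(void)
{
    /* Still reachable: a pointer to the block start survives in a global. */
    still_reachable_root = mk(NULL);

    /* The only pointer to the head is dropped, so the head would be
       definitely lost; the tail is reachable only through the lost
       head, so it would be indirectly lost. */
    struct node *head = mk(mk(NULL));
    head = NULL;

    return head == NULL && still_reachable_root != NULL;
}
```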
--undef-value-errors=<yes|no> [default: yes]
Controls whether Memcheck reports uses of undefined value errors. Set this to no if you don’t want to see undefined value errors. It also has the side effect of speeding up Memcheck somewhat.
--track-origins=<yes|no>...
--freelist-big-blocks=<number> [default: 1000000]
When making blocks from the queue of freed blocks available for re-allocation, Memcheck will preferentially re-circulate blocks whose size is greater than or equal to --freelist-big-blocks. This ensures that freeing big blocks (in particular freeing blocks bigger than --freelist-vol) does not immediately lead to a re-circulation of all (or a lot of) the small blocks in the free list.
• Addr1, Addr2, Addr4, Addr8, Addr16, meaning an invalid address during a memory access of 1, 2, 4, 8 or 16 bytes respectively.
• Jump, meaning a jump to an unaddressable location error.
• Param, meaning an invalid system call parameter error.
•...
    int i, j;
    int a[10], b[10];
    for ( i = 0; i < 10; i++ ) {
       j = a[i];
       b[i] = j;
    }
Memcheck emits no complaints about this, since it merely copies uninitialised values from a[] into b[], and doesn’t use them in a way which could affect the behaviour of the program.
So s1 occupies 8 bytes, yet only 5 of them will be initialised. For the assignment s2 = s1, GCC generates code to copy all 8 bytes wholesale into s2 without regard for their meaning. If Memcheck simply checked values as they came out of memory, it would yelp every time a structure assignment like this happened.
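The padding argument can be made concrete with a short sketch. The struct layout below, an int followed by a char, is an assumption inferred from the 8-byte and 5-byte figures in the text. The helper names are hypothetical, and the exact sizes depend on the platform's ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed layout for the s1/s2 example: an int followed by a char. */
struct S { int x; char c; };

/* Number of bytes in struct S not covered by either member. */
size_t padding_bytes(void)
{
    size_t used = sizeof(int) + sizeof(char);
    assert(sizeof(struct S) >= used);
    return sizeof(struct S) - used;
}

void show_layout(void)
{
    struct S s1, s2;
    s1.x = 42;
    s1.c = 'z';
    s2 = s1;    /* copies the whole struct, padding included */
    printf("sizeof(struct S) = %zu, padding = %zu, s2.x = %d, s2.c = %c\n",
           sizeof(struct S), padding_bytes(), s2.x, s2.c);
}
```

The padding bytes are copied by the assignment but are never written by the program, which is why complaining at copy time would be unhelpful.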
Until that happens, all attempts to access it will elicit an invalid-address error, as you would hope.
4.6. Memcheck Monitor Commands
The Memcheck tool provides monitor commands handled by Valgrind’s built-in gdbserver (see Monitor command handling by the Valgrind...
(gdb) The command get_vbits cannot be used with registers. To get the validity bits of a register, you must start Valgrind with the option --vgdb-shadow-registers=yes. The validity bits of a register can be obtained by printing the ’shadow 1’ corresponding register. In the below x86 example, the register eax has all its bits undefined, while the register ebx is fully defined.
The first command outputs one entry having an increase in the leaked bytes. The second command is the same as the first command, but uses the abbreviated forms accepted by GDB and the Valgrind gdbserver. It only outputs the summary information, as there was no increase since the previous leak search.
==19520== (gdb) Note that when using Valgrind’s gdbserver, it is not necessary to rerun with --leak-check=full --show-reachable=yes to see the reachable blocks. You can obtain the same information without rerunning by using the GDB command monitor leak_check full reachable any (or, using abbreviation: mo l f r a).
The second call shows the pointers (start and interior pointers) to block G. The block G (0x40281A8) is reachable via block C (0x40280a8) and register ECX of tid 1 (tid is the Valgrind thread id). It is "interior reachable"...
first byte for which the property is not true. Always returns 0 when not run on Valgrind. • VALGRIND_CHECK_VALUE_IS_DEFINED: a quick and easy way to find out whether Valgrind thinks a particular value (lvalue, to be precise) is addressable and defined.
VALGRIND_CREATE_BLOCK returns a "block handle", which is a C int value. You can pass this block handle to VALGRIND_DISCARD. After doing so, Valgrind will no longer relate addressing errors in the specified range to the block. Passing invalid handles to VALGRIND_DISCARD is harmless.
Keep in mind that the last two points above say "typically": the Valgrind mempool client request API is intentionally vague about the exact structure of a mempool. There is no specific mention made of headers or superblocks.
• VALGRIND_MEMPOOL_ALLOC(pool, addr, size): This request informs Memcheck that a size-byte chunk has been allocated at addr, and associates the chunk with the specified pool. If the pool was created with nonzero rzB redzones, Memcheck will mark the rzB bytes before and after the chunk as NOACCESS. If the pool was created with the is_zeroed argument set, Memcheck will mark the chunk as DEFINED, otherwise Memcheck will mark the chunk as UNDEFINED.
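To show where such requests might sit in a custom allocator, here is a sketch of a very small bump allocator. The struct pool type, the pool_* functions and the sizes are hypothetical. The __has_include probe and the no-op stubs are assumptions added so that the file builds without Valgrind's headers. Alignment of returned chunks is not handled beyond what the chosen sizes happen to give.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>   /* also pulls in valgrind.h */
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
/* No-op stubs (an assumption of this sketch). */
#  define VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed) ((void)0)
#  define VALGRIND_DESTROY_MEMPOOL(pool)                ((void)0)
#  define VALGRIND_MEMPOOL_ALLOC(pool, addr, size)      ((void)0)
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len)         (0)
#endif

#define POOL_BYTES 1024
#define REDZONE    8    /* rzB: bytes kept NOACCESS around each chunk */

/* Hypothetical bump allocator; the struct address is the pool anchor. */
struct pool {
    size_t used;
    unsigned char bytes[POOL_BYTES];
};

void pool_init(struct pool *p)
{
    p->used = 0;
    VALGRIND_CREATE_MEMPOOL(p, REDZONE, 0);          /* is_zeroed = 0 */
    (void)VALGRIND_MAKE_MEM_NOACCESS(p->bytes, POOL_BYTES);
}

void *pool_alloc(struct pool *p, size_t size)
{
    assert(size > 0);
    /* Reserve a redzone before and after the chunk. */
    if (size > POOL_BYTES || p->used + size + 2 * REDZONE > POOL_BYTES)
        return NULL;
    unsigned char *addr = p->bytes + p->used + REDZONE;
    p->used += size + 2 * REDZONE;
    VALGRIND_MEMPOOL_ALLOC(p, addr, size);           /* chunk: UNDEFINED */
    return addr;
}

void pool_destroy(struct pool *p)
{
    VALGRIND_DESTROY_MEMPOOL(p);
}
```

With this arrangement, a write one byte past a chunk lands in a redzone and should be reported by Memcheck as an invalid access, even though it stays inside the allocator's own buffer.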
PMPI_Send, or receiving data into a buffer which is too small. Unlike most of the rest of Valgrind, the wrapper library is subject to a BSD-style license, so you can link it into any code base you like. See the top of mpi/libmpiwrap.c for license details.
4.9.2. Getting started
Compile your MPI application as usual, taking care to link it using the same mpicc that your Valgrind build was configured with. Use the following basic scheme to run your application on Valgrind with the wrappers engaged:
    MPIWRAP_DEBUG=[wrapper-args] LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so...
If you want to use Valgrind’s XML output facility (--xml=yes), you should pass quiet in MPIWRAP_DEBUG so as to get rid of any extraneous printing from the wrappers.
4.9.4. Functions
All MPI2 functions except MPI_Wtick, MPI_Wtime and MPI_Pcontrol have wrappers. The first two are not wrapped because they return a double, which Valgrind’s function-wrap mechanism cannot handle (but it could easily...
Some effort is made to mark/check memory ranges corresponding to arrays of values in a single pass. This is important for performance since asking Valgrind to mark/check any range, no matter how small, carries quite a large constant cost. This optimisation is applied to arrays of primitive types (double, float, int, long, long long, short, char, and long double on platforms where sizeof(long double) == 8).
A known source of potential false errors is the PMPI_Reduce family of functions, when using a custom (user-defined) reduction function. In a reduction operation, each node notionally sends data to a "central point" which uses the specified reduction function to merge the data items into a single item.
5. Cachegrind: a cache and branch-prediction profiler
To use this tool, you must specify --tool=cachegrind on the Valgrind command line.
5.1. Overview
Cachegrind simulates how your program interacts with a machine’s cache hierarchy and (optionally) branch predictor. It simulates a machine with independent first-level instruction and data caches (I1 and D1), backed by a unified second-level cache (L2).
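A small test subject for such a simulation is sketched below: two traversals of the same array that compute the same sum with different access patterns. The array size and all function names are assumptions of this example. An optimising compiler may reorder the loops, so results are easiest to interpret at a low optimisation level.

```c
#include <assert.h>
#include <stddef.h>

#define N 256

static int grid[N][N];

void fill_grid(void)
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            grid[i][j] = (int)(i + j);
}

/* Row-major traversal: consecutive accesses touch adjacent addresses. */
long sum_row_major(void)
{
    long total = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            total += grid[i][j];
    return total;
}

/* Column-major traversal: each access jumps N * sizeof(int) bytes. */
long sum_col_major(void)
{
    long total = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            total += grid[i][j];
    return total;
}

/* Both traversals must agree on the result. */
long check_sums(void)
{
    fill_grid();
    long by_rows = sum_row_major();
    long by_cols = sum_col_major();
    assert(by_rows == by_cols);
    return by_rows;
}
```

Profiled with --tool=cachegrind, the column-major function would be expected to show more D1 read misses than the row-major one, although the exact counts depend on the simulated cache configuration.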
First off, as for normal Valgrind use, you probably want to compile with debugging info (the -g option). By contrast with normal Valgrind use, you probably do want to turn optimisation on, since you should profile your program as it will be normally run.
can be changed with the --cachegrind-out-file option. This file is human-readable, but is intended to be interpreted by the accompanying program cg_annotate, described in the next section. The default .<pid> suffix on the output file name serves two purposes. Firstly, it means you don’t have to rename old log files that you don’t want to overwrite.
• Event sort order: the sort order in which functions are shown. For example, in this case the functions are sorted from highest Ir counts to lowest. If two functions have identical Ir counts, they will then be sorted by I1mr counts, and so on.
files are often not present on a system. If a file is chosen for annotation both manually and automatically, it is marked as User-annotated source. Use the -I/--include option to tell Valgrind where to look for source files if the filenames found from the debugging information aren’t specific enough.
This is because the line number in the struct nlist defined in a.out.h under Linux is only a 16-bit value. Valgrind can handle some files with more than 65,535 lines correctly by making some guesses to identify line number overflows. But some cases are beyond it, in which case you’ll get a warning message explaining that annotations for the file might...
• If you compile some files with -g and some without, some events that take place in a file without debug info could be attributed to the last line of a file with debug info (whichever one gets placed before the non-debug-info file in the executable).
be negative; this indicates that the counts for the relevant function are fewer in the second version than those in the first version. cg_diff does not attempt to check that the input files come from runs of the same executable. It will happily merge together profile files from completely unrelated programs.
--cachegrind-out-file=<file>
Write the profile data to file rather than to the default output file, cachegrind.out.<pid>. The %p and %q format specifiers can be used to embed the process ID and/or the contents of an environment variable in the name, as is the case for the core option --log-file.
--mod-filename=<expr> [default: none]
Specifies a Perl search-and-replace expression that is applied to all filenames. Useful for removing minor differences in paths between two different versions of a program that are sitting in different directories.
--mod-funcname=<expr>...
    enum E { A, B, C };
    enum E e;
    enum E table[] = { 1, 2, 3 };
    int i;
    i += table[e];
This is obviously a contrived example, but the basic principle applies in a wide variety of situations. In short, Cachegrind can tell you where some of the bottlenecks in your code are, but it can’t tell you how to fix them.
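To make the comparison concrete, the sketch below puts a branching version next to a table-driven one. The function names are hypothetical, and the table is declared as an array of int rather than enum E, which is a deliberate simplification of the fragment above.

```c
#include <assert.h>

enum E { A, B, C };

/* Branching version: the path taken depends on the value of e. */
int add_switch(int i, enum E e)
{
    switch (e) {
    case A: i += 1; break;
    case B: i += 2; break;
    case C: i += 3; break;
    }
    return i;
}

/* Table version: a data lookup takes the place of the branches. */
int add_table(int i, enum E e)
{
    static const int table[] = { 1, 2, 3 };
    assert((int)e >= 0 && (int)e < 3);
    return i + table[e];
}
```

Both functions return the same value for every valid e. Whether the table form actually runs faster is something to confirm with a profile run, not something to assume.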
Section 2.3 (pages 80-89) for background on modern branch predictors.
5.7.3. Accuracy
Valgrind’s cache profiling has a number of shortcomings:
• It doesn’t account for kernel activity -- the effect of system calls on the cache and branch predictor contents is ignored.
• It doesn’t account for cache misses not visible at the instruction level, e.g. those arising from TLB misses, or speculative execution.
• Valgrind will schedule threads differently from how they would be when running natively. This could warp the results for threaded programs.
6. Callgrind: a call-graph generating cache and branch prediction profiler
To use this tool, you must specify --tool=callgrind on the Valgrind command line.
6.1. Overview
Callgrind is a profiling tool that records the call history among functions in a program’s run as a call-graph. By default, the collected data consists of the number of instructions executed, their relationship to source lines, the caller/callee relationship between functions, and the numbers of such calls.
As with Cachegrind, you probably want to compile with debugging info (the -g option) and with optimization turned on.
To start a profile run for a program, execute:
    valgrind --tool=callgrind [callgrind options] your-program [program options]
While the simulation is running, you can observe execution with:
    callgrind_control -b
This will print out the current backtrace.
Use --auto=yes to get annotated source code for all relevant functions for which the source can be found. In addition to source annotation as produced by cg_annotate, you will see the annotated call sites with call counts. For all other options, consult the (Cachegrind) documentation for cg_annotate.
Event collection is only possible if instrumentation for program code is enabled. This is the default, but for faster execution (identical to valgrind --tool=none), it can be disabled until the program reaches a state in which you want to start collecting profiling data. Callgrind can start without instrumentation by specifying option --instr-atstart=no.
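Instrumentation can also be switched on from inside the program, using client requests from callgrind.h. The sketch below assumes a run started with --instr-atstart=no. The functions setup, hot_loop and run_profiled are hypothetical, and the __has_include probe with its no-op stubs is an assumption so that the file builds without Valgrind's headers.

```c
#include <assert.h>

#if defined(__has_include)
#  if __has_include(<valgrind/callgrind.h>)
#    include <valgrind/callgrind.h>
#    define HAVE_CALLGRIND_H 1
#  endif
#endif
#ifndef HAVE_CALLGRIND_H
/* No-op stubs (an assumption of this sketch). */
#  define CALLGRIND_START_INSTRUMENTATION ((void)0)
#  define CALLGRIND_STOP_INSTRUMENTATION  ((void)0)
#endif

/* Hypothetical start-up work that is not worth profiling. */
static unsigned long setup(void)
{
    return 0;
}

/* Hypothetical region of interest: sums 0 .. n-1. */
static unsigned long hot_loop(unsigned long n)
{
    unsigned long total = 0;
    for (unsigned long k = 0; k < n; k++)
        total += k;
    return total;
}

unsigned long run_profiled(unsigned long n)
{
    unsigned long warm = setup();     /* runs uninstrumented */
    assert(warm == 0);

    CALLGRIND_START_INSTRUMENTATION;  /* begin instrumenting code */
    unsigned long result = hot_loop(n);
    CALLGRIND_STOP_INSTRUMENTATION;   /* back to fast execution */

    return result;
}
```

Run natively, the requests cost almost nothing and the function simply returns the sum.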
misses which would not have happened in reality. If you do not want to see these, start event collection a few million instructions after you have enabled instrumentation. 6.2.3. Counting global bus events For access to shared data among threads in multithreaded code, synchronization is required to avoid race conditions.
quite capable of avoiding cycles, it has to be used carefully to avoid a symbol explosion. The latter imposes large memory requirements on Callgrind, with possible out-of-memory conditions, and produces big profile data files. A further possibility to avoid cycles in Callgrind’s profile data output is to simply leave out given functions in the call graph.
0, never] Dump profile data every count basic blocks. Whether a dump is needed is only checked when Valgrind’s internal scheduler is run. Therefore, the minimum setting useful is about 100000. The count is a 64-bit value to make long dump periods possible.
Specify if you want Callgrind to start simulation and profiling from the beginning of the program. When set to no, Callgrind will not be able to collect any information, including calls, but it will have at most a slowdown of around 4, which is the minimum Valgrind overhead. Instrumentation can be interactively enabled via callgrind_control -i on.
--separate-threads=<no|yes> [default: This option specifies whether profile data should be generated separately for every thread. If yes, the file names get "-threadID" appended. --separate-callers=<callers> [default: Separate contexts by at most <callers> functions in the call chain. See Avoiding cycles.
Specify the size, associativity and line size of the level 1 data cache. --LL=<size>,<associativity>,<line size> Specify the size, associativity and line size of the last-level cache. 6.4. Callgrind Monitor Commands The Callgrind tool provides monitor commands handled by the Valgrind gdbserver (see Monitor command handling by the Valgrind gdbserver).
• instrumentation [on|off] requests to set (if parameter on/off is given) or get the current instrumentation state. • status requests to print out some status information. 6.5. Callgrind specific client requests Callgrind provides the following specific client requests in callgrind.h.
--auto=<yes|no> [default: Annotate all source files containing functions that helped reach the event count threshold. --context=N [default: Print N lines of context before and after annotated lines. --inclusive=<yes|no> [default: Add subroutine costs to function calls. --tree=<none|caller|calling|both>...
Switch instrumentation mode on or off. If a Callgrind run has instrumentation disabled, no simulation is done and no events are counted. This is useful to skip uninteresting program parts, as there is much less slowdown (same as with the Valgrind tool "none"). See also the Callgrind option --instr-atstart. -w=<dir>...
To use this tool, you must specify --tool=helgrind on the Valgrind command line. 7.1. Overview Helgrind is a Valgrind tool for detecting synchronisation errors in C, C++ and Fortran programs that use the POSIX pthreads threading primitives. The main abstractions in POSIX pthreads are: a set of threads sharing a common address space, thread creation, thread joining, thread exit, mutexes (locks), condition variables (inter-thread event notifications), reader-writer locks,...
Helgrind: a thread error detector • destroying an invalid or a locked mutex • recursively locking a non-recursive mutex • deallocation of memory that contains a locked mutex • passing mutex arguments to functions expecting reader-writer lock arguments, and vice versa •...
In this section, and in general, to "acquire" a lock simply means to lock that lock, and to "release" a lock means to unlock it. Helgrind monitors the order in which threads acquire locks. This allows it to detect potential deadlocks which could arise from the formation of cycles of locks.
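One common way to keep lock acquisition order consistent, and so avoid the cycles of locks described above, is to take any pair of locks in a fixed global order. The sketch below orders by address. lock_pair and unlock_pair are hypothetical helper names, not part of pthreads or Helgrind.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Lock two distinct mutexes, always lowest address first, so that every
   thread acquires them in the same order. Returns 0 on success. */
int lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_t *first = a;
    pthread_mutex_t *second = b;
    int rc;

    assert(a != NULL && b != NULL && a != b);

    if ((uintptr_t)a > (uintptr_t)b) {
        first = b;
        second = a;
    }

    rc = pthread_mutex_lock(first);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_lock(second);
    if (rc != 0)
        pthread_mutex_unlock(first);   /* do not keep a half-acquired pair */
    return rc;
}

/* Release both mutexes; the release order does not affect lock ordering. */
int unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    int rc_a = pthread_mutex_unlock(a);
    int rc_b = pthread_mutex_unlock(b);
    return rc_a != 0 ? rc_a : rc_b;
}
```

Because both callers of lock_pair(&x, &y) and lock_pair(&y, &x) end up locking in the same order, no cycle can form between these two locks.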
Thread #6: lock order "0x6010C0 before 0x601160" violated

Observed (incorrect) order is: acquisition of lock at 0x601160
   (stack unavailable)

 followed by a later acquisition of lock at 0x6010C0
   at 0x4C2BC62: pthread_mutex_lock (hg_intercepts.c:494)
   by 0x4007DE: dine (tc14_laog_dinphils.c:19)
   by 0x4C2CBE7: mythread_wrapper (hg_intercepts.c:219)
   by 0x4E369C9: start_thread (pthread_create.c:300)
7.4.
Thread #1 is the program’s root thread

Thread #2 was created
   at 0x511C08E: clone (in /lib64/libc-2.8.so)
   by 0x4E333A4: do_clone (in /lib64/libpthread-2.8.so)
   by 0x4E33A30: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.8.so)
   by 0x4C299D4: pthread_create@* (hg_intercepts.c:214)
   by 0x400605: main (simple_race.c:12)

Possible data race during read of size 4 at 0x601038 by thread #1
Locks held: none
   at 0x400606: main (simple_race.c:13)
The following section explains Helgrind’s race detection algorithm in more detail. 7.4.2. Helgrind’s Race Detection Algorithm Most programmers think about threaded programming in terms of the basic functionality provided by the threading library (POSIX Pthreads): thread creation, thread joining, locks, condition variables, semaphores and barriers. The effect of using these functions is to impose constraints upon the order in which memory accesses can happen.
Parent thread:                         Child thread:

int var;

// create child thread
pthread_create(...)
var = 20;
// send message to child
                                       // wait for message to arrive
                                       var = 10;
                                       exit

// wait for child
pthread_join(...)
printf("%d\n", var);

Now the program reliably prints "10", regardless of the speed of the threads.
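The parent/child exchange above can be sketched as compilable C. This version assumes the "message" is a flag protected by a mutex and signalled through a condition variable; the original does not say which mechanism is used, and the names run_ordered, message_sent, mu and cv are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static int var;                 /* shared variable from the example */
static int message_sent;        /* the "message", guarded by mu */
static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;

static void *child(void *arg)
{
    (void)arg;

    /* wait for message to arrive */
    pthread_mutex_lock(&mu);
    while (!message_sent)
        pthread_cond_wait(&cv, &mu);
    pthread_mutex_unlock(&mu);

    var = 10;                   /* ordered after the parent's var = 20 */
    return NULL;
}

/* Runs the exchange once and returns the final value of var. */
int run_ordered(void)
{
    pthread_t t;
    int rc;

    message_sent = 0;
    rc = pthread_create(&t, NULL, child, NULL);
    assert(rc == 0);

    var = 20;

    /* send message to child */
    pthread_mutex_lock(&mu);
    message_sent = 1;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&mu);

    /* wait for child */
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;

    return var;
}
```

The unlock in the parent and the later lock in the child order the two writes to var, and pthread_join orders the child's write before the parent's final read, so the result does not depend on thread speed.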
• When a condition variable (CV) is signalled on by thread T1 and some other thread T2 is thereby released from a wait on the same CV, then the memory accesses in T1 prior to the signalling must happen-before those in T2 after it returns from the wait.
Thread #2 was created
   at 0x511C08E: clone (in /lib64/libc-2.8.so)
   by 0x4E333A4: do_clone (in /lib64/libpthread-2.8.so)
   by 0x4E33A30: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.8.so)
   by 0x4C299D4: pthread_create@* (hg_intercepts.c:214)
   by 0x4008F2: main (tc21_pthonce.c:86)

Thread #3 was created
   at 0x511C08E: clone (in /lib64/libc-2.8.so)
   by 0x4E333A4: do_clone (in /lib64/libpthread-2.8.so)
   by 0x4E33A30: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.8.so)
   by 0x4C299D4: pthread_create@* (hg_intercepts.c:214)
The first thing to do is examine the source locations referred to by each call stack. They should both show an access to the same location, or variable. Now figure out how that location should have been made thread-safe: •...
• Qt version 4.X. Qt 3.X is harmless in that it only uses POSIX pthreads primitives. Unfortunately Qt 4.X has its own implementation of mutexes (QMutex) and thread reaping.
• Runtime support library for GNU OpenMP (part of GCC), at least for GCC versions 4.2 and 4.3. The GNU...
2. Avoid memory recycling. If you can’t avoid it, you must tell Helgrind what is going on via the VALGRIND_HG_CLEAN_MEMORY client request (in helgrind.h). Helgrind is aware of standard heap memory allocation and deallocation that occurs via malloc/free/new/delete and from entry and exit of stack frames.
This functionality is new in Valgrind 3.7.0, and is regarded as experimental. It is not enabled by default because its interaction with custom memory allocators is not well understood at present. User feedback is welcomed.
--track-lockorders=no|yes [default: yes] When enabled (the default), Helgrind performs lock order consistency checking. For some buggy programs, the large number of lock order errors reported can become annoying, particularly if you’re only interested in race errors. You may therefore find it helpful to disable lock order checking.
--check-stack-refs=no|yes [default: yes] By default Helgrind checks all data memory accesses made by your program. This flag enables you to skip checking for accesses to thread stacks (local variables). This can improve performance, but comes at the cost of missing races on stack-allocated data.
8.1. Overview DRD is a Valgrind tool for detecting errors in multithreaded C and C++ programs. The tool works for any program that uses the POSIX threading primitives or that uses threading concepts built on top of the POSIX threading primitives.
DRD: a thread error detector • A shared address space. All threads running within the same process share the same address space. All data, whether shared or not, is identified by its address. • Regular load and store operations, which allow values to be read from or written to the memory shared by all threads running in the same process.
2. Synchronization operations determine certain ordering constraints on memory operations performed by different threads. These ordering constraints are called the synchronization order. The combination of program order and synchronization order is called the happens-before relationship. This concept was first defined by S.
• Don’t enable this option when using reference-counted objects because that will result in false positives, even when that code has been annotated properly with ANNOTATE_HAPPENS_BEFORE and ANNOTATE_HAPPENS_AFTER. See e.g. the output of the following command for an example: valgrind --tool=drd --free-is-write=yes drd/tests/annotate_smart_pointer. --report-signal-unlocked=<yes|no> [default:...
--ptrace-addr=<address> [default: none] Trace all load and store activity for the specified address and keep doing that even after the memory at that address has been freed and reallocated. --trace-alloc=<yes|no> [default: Trace all memory allocations and deallocations. May produce a huge amount of output. --trace-barrier=<yes|no>...
Below you can find an example of a message printed by DRD when it detects a data race:
$ valgrind --tool=drd --read-var-info=yes drd/tests/rwlock_race
==9466== Thread 3:
==9466== Conflicting load by thread 3 at 0x006020b8 size 4
==9466==    at 0x400B6C: thread_func (rwlock_race.c:29)
Lock contention causes delays. Such delays should be as short as possible. The two command line options --exclusive-threshold=<n> and --shared-threshold=<n> make it possible to detect excessive lock contention by making DRD report any lock that has been held longer than the specified threshold. An example:
$ valgrind --tool=drd --exclusive-threshold=10 drd/tests/hold_lock -i 500
==10668== Acquired at:
==10668==    at 0x4C267C8: pthread_mutex_lock (drd_pthread_intercepts.c:395)
8.2.5. Client Requests Just as for other Valgrind tools, it is possible to let a client program interact with the DRD tool through client requests. In addition to the client requests, several macros have been defined that make the client requests convenient to use.
• The macro DRD_STOP_IGNORING_VAR(x) and the corresponding client request VG_USERREQ__DRD_FINISH_SUPPRESSION. Tell DRD to no longer ignore data races for the address range that was suppressed either via the macro DRD_IGNORE_VAR(x) or via the client request VG_USERREQ__DRD_START_SUPPRESSION. •...
• The macro ANNOTATE_THREAD_NAME(name) tells DRD to associate the specified name with the current thread and to include this name in the error messages printed by DRD. • The macros VALGRIND_MALLOCLIKE_BLOCK and VALGRIND_FREELIKE_BLOCK from the Valgrind core are implemented; they are described in The Client Request mechanism.
Note: if you compiled Valgrind yourself, the header file <valgrind/drd.h> will have been installed in the directory /usr/include by the command make install. If you obtained Valgrind by installing it as a package however, you will probably have to install another package with a name like valgrind-devel before Valgrind’s header files are available.
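Since the header may or may not be installed, a program can guard its use of the DRD macros. The sketch below names a worker thread with ANNOTATE_THREAD_NAME when <valgrind/drd.h> is found. The no-op fallback macro and the names worker and run_named_worker are assumptions made for illustration only.

```c
#include <assert.h>
#include <pthread.h>

/* Pull in the real macro only when the header is installed. */
#if defined(__has_include)
#  if __has_include(<valgrind/drd.h>)
#    include <valgrind/drd.h>
#  endif
#endif

/* Assumption: without the header, the annotation becomes a no-op. */
#ifndef ANNOTATE_THREAD_NAME
#  define ANNOTATE_THREAD_NAME(name) ((void)(name))
#endif

static void *worker(void *arg)
{
    int *result = arg;

    ANNOTATE_THREAD_NAME("worker");  /* name used in DRD's error messages */
    *result = 42;
    return NULL;
}

/* Starts one named worker, waits for it and returns its result. */
int run_named_worker(void)
{
    pthread_t t;
    int result = 0;
    int rc;

    rc = pthread_create(&t, NULL, worker, &result);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;

    return result;
}
```

The write to result in the worker is ordered before the read in run_named_worker by pthread_join, so the sketch itself should not trigger a race report.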
DRD tracks all memory allocation events that happen via the standard memory allocation and deallocation functions (malloc, free, new and delete), via entry and exit of stack frames or that have been annotated with Valgrind’s memory pool client requests. DRD uses memory allocation and deallocation information for two purposes:...
• To know where the scope ends of POSIX objects that have not been destroyed explicitly. For example, the POSIX threads standard does not require calling pthread_mutex_destroy before freeing the memory in which a mutex object resides.
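Although explicit destruction is not required, as noted above, destroying a mutex before freeing its memory makes the end of its lifetime unambiguous. The following is a minimal sketch of a heap-allocated object that embeds a mutex; counter_t and its functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical heap object with an embedded mutex. */
typedef struct {
    pthread_mutex_t lock;
    long value;
} counter_t;

counter_t *counter_new(void)
{
    counter_t *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    if (pthread_mutex_init(&c->lock, NULL) != 0) {
        free(c);
        return NULL;
    }
    c->value = 0;
    return c;
}

/* Increments under the lock and returns the new value. */
long counter_increment(counter_t *c)
{
    long v;

    assert(c != NULL);
    pthread_mutex_lock(&c->lock);
    v = ++c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}

void counter_free(counter_t *c)
{
    if (c == NULL)
        return;
    pthread_mutex_destroy(&c->lock);  /* end the mutex's lifetime explicitly */
    free(c);                          /* only then release the memory */
}
```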
• Compile with option -O1 instead of -O0. This will reduce the amount of generated code, may reduce the amount of debug info and will speed up DRD’s processing of the client program. For more information, see also Getting started.
The older LinuxThreads library is not supported. 8.5. Feedback If you have any comments, suggestions, feedback or bug reports about DRD, feel free to either post a message on the Valgrind users mailing list or to file a bug report. See also http://www.valgrind.org/ for more information.
9.2. Using Massif and ms_print First off, as for the other Valgrind tools, you should compile with debugging info (the -g option). It shouldn’t matter much what optimisation level you compile your program with, as this is unlikely to affect the heap memory usage.
To gather heap profiling information about the program prog, type: valgrind --tool=massif prog The program will execute (slowly). Upon completion, no summary statistics are printed to Valgrind’s commentary; all of Massif’s profiling data is written to a file. By default, this file is called massif.out.<pid>, where <pid>...
Massif: a heap profiler ms_print massif.out.12345 ms_print will produce (a) a graph showing the memory consumption over the program’s execution, and (b) detailed information about the responsible allocation sites at various points in the program, including the point of peak memory allocation.
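To have something to profile, a small allocation-heavy function is enough. The sketch below is not the manual's own example program; run_workload and MAX_BLOCKS are hypothetical names. It allocates a number of equal-sized blocks, holds them all at once so that there is a clear peak, and then frees them.

```c
#include <assert.h>
#include <stdlib.h>

enum { MAX_BLOCKS = 64 };   /* hypothetical upper bound for the sketch */

/* Allocates count blocks of block_size bytes, then frees them all.
   Returns the number of requested bytes that were live at the peak. */
size_t run_workload(size_t count, size_t block_size)
{
    void *blocks[MAX_BLOCKS];
    size_t live = 0;

    assert(count <= MAX_BLOCKS && block_size > 0);

    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(block_size);
        if (blocks[i] != NULL)
            live += block_size;
    }

    /* Heap usage peaks here, just before the first free. */

    for (size_t i = 0; i < count; i++)
        free(blocks[i]);        /* free(NULL) is harmless */

    return live;
}
```

A program that calls run_workload(10, 1000) from main can be profiled with valgrind --tool=massif prog, and the resulting massif.out.<pid> file viewed with ms_print.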
[Graph: memory consumption, peaking at 19.63 on the vertical axis, plotted against time in ki on the horizontal axis]
Number of snapshots: 25
Detailed snapshots: [9, 14 (peak), 24]
Why is most of the graph empty, with only a couple of bars at the very end? By default, Massif uses "instructions executed" as the unit of time. For very short-run programs such as the example, most of the executed instructions involve the loading and dynamic linking of the program.
• Peak snapshots are only ever taken after a deallocation happens. This avoids lots of unnecessary peak snapshot recordings (imagine what happens if your program allocates a lot of heap blocks in succession, hitting a new peak every time).
--------------------------------------------------------------------------------
   time(B)   total(B)   useful-heap(B)   extra-heap(B)   stacks(B)
--------------------------------------------------------------------------------
     1,008      1,008            1,000
     2,016      2,016            2,000
     3,024      3,024            3,000
     4,032      4,032            4,000
     5,040      5,040            5,000
     6,048      6,048            6,000
     7,056      7,056            7,000
     8,064      8,064            8,000
Each normal snapshot records several things.
•...
The next snapshot is detailed. As well as the basic counts, it gives an allocation tree which indicates exactly which pieces of code were responsible for allocating heap memory:
9,072 9,072 9,000
99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.21% (9,000B) 0x804841A: main (example.c:20)
The allocation tree can be read from the top down.
distinct stack traces in the tree. In contrast, if B calls A repeatedly from line 15 (e.g. due to a loop), then each of those calls will be represented by the same stack trace in the tree. Note also that each tree entry with children in the example satisfies an invariant: the entry’s size is equal to the sum of its children’s sizes.
responsible for more than 1% of useful memory bytes, and ms_print likewise only prints the details for code locations responsible for more than 1%. The entries that do not meet this threshold are aggregated. This avoids filling up the output with large numbers of unimportant entries.
9.2.9. Acting on Massif’s Information Massif’s information is generally fairly easy to act upon. The obvious place to start looking is the peak snapshot. It can also be useful to look at the overall shape of the graph, to see if memory usage climbs and falls as you expect; spikes in the graph might be worth investigating.
--alloc-fn=<name> Functions specified with this option will be treated as though they were a heap allocation function such as malloc. This is useful for functions that are wrappers to malloc or new, which can fill up the allocation trees with uninteresting information.
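A typical candidate for --alloc-fn is a checked allocation wrapper. The sketch below assumes a project-specific wrapper called xmalloc; the name is hypothetical and is not something Massif knows about by default.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical malloc wrapper: never returns NULL, aborts on failure.
   A zero-size request is treated as a caller bug in this sketch. */
void *xmalloc(size_t size)
{
    void *p;

    assert(size > 0);

    p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        abort();
    }
    return p;
}
```

Running valgrind --tool=massif --alloc-fn=xmalloc prog would then attribute allocations to the callers of xmalloc rather than to the wrapper itself.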
ID and/or the contents of an environment variable in the name, as is the case for the core option --log-file. 9.4. Massif Monitor Commands The Massif tool provides monitor commands handled by the Valgrind gdbserver (see Monitor command handling by the Valgrind gdbserver).
10. DHAT: a dynamic heap analysis tool To use this tool, you must specify --tool=exp-dhat on the Valgrind command line. 10.1. Overview DHAT is a tool for examining how programs use their heap allocations. It tracks the allocated blocks, and inspects every memory access to find which block, if any, it is to. The following data is collected and presented per allocation point (allocation stack): •...
As with the Massif heap profiler, DHAT measures program progress by counting instructions, and so presents all age/time related figures as instruction counts. This sounds a little odd at first, but it makes runs repeatable in a way which is not possible if CPU time is used.
perform such an analysis. We can see that they must have varying sizes since the average block size, 61.13, isn’t a whole number.
10.2.2.2. A more suspicious looking example
max-live:    180,224 in 22 blocks
tot-alloc:   180,224 in 22 blocks (avg size 8192.00)
deaths:...
max-live:    317,408 in 5,668 blocks
tot-alloc:   317,408 in 5,668 blocks (avg size 56.00)
deaths:      5,668, at avg age 622,890,597
acc-ratios:  1.03 rd, 1.28 wr (327,642 b-read, 408,172 b-written)
   at 0x4C275B8: malloc (vg_replace_malloc.c:236)
   by 0x5440C16: QDesignerPropertySheetPrivate::ensureInfo (qhash.h:515)
   by 0x544350B: QDesignerPropertySheet::setVisible (qdesigner_propertysh...)
   by 0x5446232: QDesignerPropertySheet::QDesignerPropertySheet (qdesigne...)
Aggregated access counts by offset:...
--show-top-n=<number> [default: At the end of the run, DHAT sorts the accumulated allocation points according to some metric, and shows the highest scoring entries. --show-top-n controls how many entries are shown. The default of 10 is quite small. For realistic applications you will probably need to set it much higher, at least several hundred.
11. SGCheck: an experimental stack and global array overrun detector To use this tool, you must specify --tool=exp-sgcheck on the Valgrind command line. 11.1. Overview SGCheck is a tool for finding overruns of stack and global arrays. It works by using a heuristic approach derived from an observation about the likely forms of stack and global array accesses.
It is hard to see how to get around this problem. The only mitigating factor is that such constructions appear very rare, at least judging from the results using the tool so far. Such a construction appears only once in the Valgrind sources (running Valgrind on Valgrind) and perhaps two or three times for a start and exit of Firefox.
• Coverage: Stack and global checking is fragile. If a shared object does not have debug information attached, then SGCheck will not be able to determine the bounds of any stack or global arrays defined within that shared object, and so will not be able to check accesses to them.
12.2. Using Basic Block Vectors to create SimPoints To quickly create a basic block vector file, you will call Valgrind like this: valgrind --tool=exp-bbv /bin/ls In this case we are running on /bin/ls, but this can be any program. By default a file called bb.out.PID will be created, where PID is replaced by the process ID of the running process.
BBV: an experimental basic block vector generation tool The outputs from the SimPoint run are the results.simpts and results.weights files. The first holds the 5 most relevant intervals of the program. The second holds the weight to scale each interval by when extrapolating full-program behavior.
SimPoint utility ignores them. 12.5. Implementation Valgrind provides all of the information necessary to create BBV files. In the current implementation, all instructions are instrumented. This is slower (by approximately a factor of two) than a method that instruments at the basic block level, but there are some complications (especially with rep prefix detection) that make that method more difficult.
Binary Instrumentation to Generate Multi-Platform SimPoints: Methodology and Accuracy" by V.M. Weaver and S.A. McKee. 12.8. Performance Using this program slows down execution by roughly a factor of 40 over native execution. This varies depending on the machine used and the benchmark being run.
13.1. Overview Lackey is a simple Valgrind tool that does various kinds of basic program measurement. It adds quite a lot of simple instrumentation to the program’s code. It is primarily intended to be of use as an example tool, and consequently emphasises clarity of implementation over performance.
To use this tool, you must specify --tool=none on the Valgrind command line. 14.1. Overview Nulgrind is the simplest possible Valgrind tool. It performs no instrumentation or analysis of a program, just runs it normally. It is mainly of use for Valgrind’s developers for debugging and regression testing.
3.2. My (buggy) program dies like this: 3.3. My program dies, printing a message like this along the way: 3.4. I tried running a Java program (or another program that uses a just-in-time compiler) under Valgrind but something went wrong. Does Valgrind handle such programs? 4.
"Heimdal". Keeping with the Nordic theme, Valgrind was chosen. Valgrind is the name of the main entrance to Valhalla (the Hall of the Chosen Slain in Asgard). Over this entrance there resides a wolf and over it there is the head of a boar and on it perches a huge eagle, whose eyes can see to the far regions of the nine worlds.
Valgrind Frequently Asked Questions 3. Valgrind aborts unexpectedly 3.1. Programs run OK on Valgrind, but at exit produce a bunch of errors involving __libc_freeres and then die with a segmentation fault. When the program exits, Valgrind runs the procedure __libc_freeres in glibc.
--smc-check=all. Apart from this, in theory Valgrind can run any Java program just fine, even those that use JNI and are partially implemented in other languages like C and C++. In practice, Java implementations tend to do nasty things that most programs do not, and Valgrind sometimes falls over these corner cases.
Also, for leak reports involving shared objects, if the shared object is unloaded before the program terminates, Valgrind will discard the debug information and the error message will be full of ??? entries. The workaround here is to avoid calling dlclose on these shared objects.
Valgrind. There isn’t anything you can do to change this, it’s just the nature of the way Valgrind works that it cannot exactly replicate a native execution environment. In the case where your program crashes due to a memory error when run natively but not when run under Valgrind, in most cases Memcheck should identify the bad memory operation.
Second, if your program is statically linked, most Valgrind tools won’t work as well, because they won’t be able to replace certain functions, such as malloc, with their own versions. A key indicator of this is if...
If you think an answer in this FAQ is incomplete or inaccurate, please e-mail valgrind@valgrind.org. If you have tried all of these things and are still stuck, you can try mailing the valgrind-users mailing list. Note that an email has a better chance of being answered usefully if it is clearly written.
Valgrind Technical Documentation Table of Contents 1. The Design and Implementation of Valgrind 2. Writing a New Valgrind Tool 2.1. Introduction 2.2. Basics 2.2.1. How tools work 2.2.2. Getting the code 2.2.3. Getting started 2.2.4. Writing the code 2.2.5. Initialisation 2.2.6.
A number of academic publications nicely describe many aspects of Valgrind’s design and implementation. Online copies of all of them, and others, are available on the Valgrind publications page. The following paper gives a good overview of Valgrind, and explains how it differs from other dynamic binary instrumentation frameworks such as Pin and DynamoRIO.
Tools must define various functions for instrumenting programs that are called by Valgrind’s core. They are then linked against Valgrind’s core to define a complete Valgrind tool which will be used when the --tool option is used to select it.
(almost any program should work; date is just an example). The output should be something like this:
==738== foobar-0.0.1, a foobarring tool.
==738== Copyright (C) 2002-2009, and GNU GPL’d, by J. Programmer.
==738== Using Valgrind-3.5.0.SVN and LibVEX; rerun with -h for copyright info
==738== Command: date
==738==...
More information about "details", "needs" and "trackable events" can be found in include/pub_tool_tooliface.h. 2.2.6. Instrumentation instrument is the interesting one. It allows you to instrument VEX IR, which is Valgrind’s RISC-like intermediate language. VEX IR is described in the comments of the header file VEX/pub/libvex_ir.h.
C library, details of which are in pub_tool_libc*.h. When writing a tool, in theory you shouldn’t need to look at any of the code in Valgrind’s core, but in practice it might be useful sometimes to help understand something.
If you are feeling conscientious and want to write some documentation for your tool, please use XML as the rest of Valgrind does. The file docs/README has more details on getting the XML toolchain to work; this can be difficult, unfortunately.
3. Write the tests, .vgtest test description files, .stdout.exp and .stderr.exp expected output files. (Note that Valgrind’s output goes to stderr.) Some details on writing and running tests are given in the comments at the top of the testing script tests/vg_regtest.
Writing a New Valgrind Tool 2.4. Final Words Writing a new Valgrind tool is not easy, but the tools you can write with Valgrind are among the most powerful programming tools there are. Happy programming!
The event names in the following example are quite arbitrary, and are not related to event names used by Callgrind. Especially, cycle counts matching real processors probably will never be generated by any Valgrind tools, as these are bound to simulations of simple machine models for acceptable slowdown. However, any profiling tool could use the format described in this chapter.
Callgrind Format Specification line 16 in file file.f, taking 20 CPU cycles. If a cost line specifies fewer event counts than given in the "events" line, the rest are assumed to be zero, i.e. there was no floating point instruction executed relating to line 16. Note that regular cost lines always give self (also called exclusive) cost of code at a given position.
One can see that in main only code from line 16 is executed where also the other functions are called. Inclusive cost of main is 820, which is the sum of self cost 20 and costs spent in the calls: 400 for the single call to func1 and 400 as sum for the three calls to func2.
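The arithmetic above can be written down as a small helper: inclusive cost is the self cost plus the cost attributed to each group of calls. The function name inclusive_cost is hypothetical and is not part of any Callgrind tool.

```c
#include <assert.h>
#include <stddef.h>

/* Inclusive cost = self cost + sum of the costs spent in calls.
   call_costs holds one entry per call line (n entries). */
unsigned long inclusive_cost(unsigned long self_cost,
                             const unsigned long *call_costs, size_t n)
{
    unsigned long total = self_cost;

    assert(call_costs != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        total += call_costs[i];
    return total;
}
```

With a self cost of 20 and call costs of 400 (func1) and 400 (the three calls to func2 combined), the helper yields the 820 quoted for main.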
events: Instructions

# define file ID mapping
fl=(1) file1.c
fl=(2) file2.c
# define function ID mapping
fn=(1) main
fn=(2) func1
fn=(3) func2

fl=(1)
fn=(1)
16 20
3.1.6. Subposition Compression If a Callgrind data file should hold costs for each assembler instruction of a program, you specify subposition "instr" in the "positions:"...
Remark: For assembler annotation to work, instruction addresses have to be corrected to correspond to addresses found in the original binary. I.e. for relocatable shared objects, often a load offset has to be subtracted. 3.1.7. Miscellaneous 3.1.7.1. Cost Summary Information For the visualization to be able to show cost percentage, a sum of the cost of the full run has to be known.
PartDetail := TargetCommand | TargetID
TargetCommand := "cmd:" Space* NoNewLineChar*
TargetID := ("pid"|"thread"|"part") ":" Space* Number
Description := "desc:" Space* Name Space* ":" NoNewLineChar*
EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?
InheritedDef := "="...
• version: number [Callgrind] This is used to distinguish future profile data formats. A major version of 0 or 1 is supposed to be upwards compatible with Cachegrind’s format. It is optional; if it does not appear, version 1 is assumed. Otherwise, this has to be the first header line.
• summary: costs [Callgrind]
  totals: costs [Cachegrind]
  The value or the total number of events covered by this trace file. Both keys have the same meaning, but the "totals:" line happens to be at the end of the file, while "summary:" appears in the header. This was added to allow postprocessing tools to know the total cost in advance.
• jump=count target position [Callgrind] Unconditional jump, executed count times, to the given target position. • jcnd=exe.count jumpcount target position [Callgrind] Conditional jump, executed exe.count times with jumpcount jumps to the given target position.
1. AUTHORS Julian Seward was the original founder, designer and author of Valgrind, created the dynamic translation frameworks, wrote Memcheck, the 3.X versions of Helgrind, SGCheck, DHAT, and did lots of other things. Nicholas Nethercote did the core/tool generalisation, wrote Cachegrind and Massif, and tons of other stuff.
Jakub Jelinek helped out with the AVX support. Many, many people sent bug reports, patches, and helpful feedback. Development of Valgrind was supported in part by the Tri-Lab Partners (Lawrence Livermore National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratories) of the U.S. Department...
There is initial support for MacOSX 10.8, but it is not usable for serious work at present. * ================== PLATFORM CHANGES ================= * Support for MIPS32 platforms running Linux. Valgrind has been tested on MIPS32 and MIPS32r2 platforms running different Debian Squeeze and MeeGo distributions. Both little-endian and big-endian cores are supported.
NEWS - The leak_check GDB server monitor command now can control the maximum nr of loss records to output. - Reduction of memory use for applications allocating many blocks and/or having many partially defined bytes. - Addition of GDB server monitor command ’block_list’ that lists the addresses/sizes of the blocks of a leak search loss record.
Page 212
NEWS and DRD. * For tool developers: support to run Valgrind on Valgrind has been improved. We can now routinely run Valgrind on Helgrind or Memcheck. * gdbserver now shows the float shadow registers as integer rather than float values, as the shadow values are mostly used as bit patterns.
Page 213
293755 == 293754 (No tests for PCMPxSTRx on 16-bit characters) 293808 CLFLUSH not supported by latest VEX for amd64 294047 valgrind does not correctly emulate prlimit64(..., RLIMIT_NOFILE, ...) 294048 MPSADBW instruction not implemented 294055 regtest none/tests/shell fails when locale is not set to C...
Page 214
298421 accept4() syscall (366) support is missing for ARM 298718 vex amd64->IR: 0xF 0xB1 0xCB 0x9C 0x8F 0x45 298732 valgrind installation problem in ubuntu with kernel version 3.x 298862 POWER Processor DFP instruction support missing, part 4 298864 DWARF reader mis-parses DW_FORM_ref_addr 298943 massif asserts with --pages-as-heap=yes when brk is changing [..]...
Page 215
302578 Unrecognized isntruction 0xc5 0x32 0xc2 0xca 0x09 vcmpngess 302656 == 273475 (Add support for AVX instructions) 302709 valgrind for ARM needs extra tls support for android emulator [..] 302827 add wrapper for CDROM_GET_CAPABILITY 302901 Valgrind crashes with dwz optimized debuginfo...
Page 216
This release also supports MacOSX 10.6, but drops support for 10.5. * Preliminary support for Android (on ARM). Valgrind can now run large applications (eg, Firefox) on (eg) a Samsung Nexus S. See README.android for more details, plus instructions on how to get started.
Page 217
("Stack and Global Array Checking"). * ==================== OTHER CHANGES ==================== * GDB server: Valgrind now has an embedded GDB server. That means it is possible to control a Valgrind run from GDB, doing all the usual things that GDB can do (single stepping, breakpoints, examining data, etc).
Page 218
268621 s390x: improve IR generation for XC 268715 s390x: FLOGR is not universally available 268792 == 267997 (valgrind seg faults on startup when compiled with Xcode 4) 268930 s390x: MHY is not universally available 269078 arm->IR: unhandled instruction SUB (SP minus immediate/register) 269079 Support ptrace system call on ARM 269144 missing "Bad option"...
Page 219
270959 s390x: invalid use of R0 as base register 271042 VSX configure check fails when it should not 271043 Valgrind build fails with assembler error on ppc64 with binutils 2.21 271259 s390x: fix code confusion 271337 == 267997 (Valgrind segfaults on MacOS X)
Page 220
280290 vex amd64->IR: 0x66 0xF 0x38 0x28 0xC1 0x66 0xF 0x6F 280710 s390x: config files for nightly builds 280757 /tmp dir still used by valgrind even if TMPDIR is specified 280965 Valgrind breaks fcntl locks when program does mmap 281138 WARNING: unhandled syscall: 340 281241 == 275168 (valgrind useless on Macos 10.7.1 Lion)
Page 221
XXXXXX is the bug number as listed below. 188572 Valgrind on Mac should suppress setenv() mem leak 194402 vex amd64->IR: 0x48 0xF 0xAE 0x4 (proper FX{SAVE,RSTOR} support) 210481 vex amd64->IR: Assertion ‘sz == 2 || sz == 4’ failed (REX.W POPQ)
Page 222
258870 (SSE4.x) Add support for EXTRACTPS SSE 4.1 instruction 261966 (SSE4.x) support for CRC32B and CRC32Q is lacking (also CRC32{W,L}) 262985 VEX regression in valgrind 3.6.0 in handling PowerPC VMX 262995 (SSE4.x) crash when trying to valgrind gcc-snapshot (PCMPxSTRx $0) 263099 callgrind_annotate counts Ir improperly [...]...
Page 223
* Support for ARM/Linux. Valgrind now runs on ARMv7 capable CPUs running Linux. It is known to work on Ubuntu 10.04, Ubuntu 10.10, and Maemo 5, so you can run Valgrind on your Nokia N900 if you want. This requires a CPU capable of running the ARMv7-A instruction set (Cortex A5, A8 and A9).
Page 224
NEWS useful for giving a general idea about a program’s locality. * Massif has a new option, --pages-as-heap, which is disabled by default. When enabled, instead of tracking allocations at the level of heap blocks (as allocated with malloc/new/new[]), it instead tracks memory allocations at the level of memory pages (as mapped by mmap, brk, etc).
Page 225
* Improved support for the Valkyrie GUI, version 2.0.0. GUI output and control of Valgrind is now available for the tools Memcheck and Helgrind. XML output from Valgrind is available for Memcheck, Helgrind and exp-Ptrcheck.
Page 226
205241 Snow Leopard 10.6 support (partial fix) 206600 Leak checker fails to upgrade indirect blocks when their parent becomes reachable 210935 port valgrind.h (not valgrind) to win32 so apps run under wine can make client requests 211410 vex amd64->IR: 0x15 0xFF 0xFF 0x0 0x0 0x89...
Page 227
245535 print full path names in plain text reports 245925 x86-64 red zone handling problem 246258 Valgrind not catching integer underruns + new [] s 246311 reg/reg cmpxchg doesn’t work on amd64 246549 unhandled syscall unix:277 while testing 32-bit Darwin app 246888 Improve Makefile.vex.am...
Page 228
Release 3.5.0 (19 August 2009) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.5.0 is a feature release with many significant improvements and the usual collection of bug fixes. The main improvement is that Valgrind now works on Mac OS X. This release supports X86/Linux, AMD64/Linux, PPC32/Linux, PPC64/Linux and X86/Darwin.
Page 229
NEWS * Valgrind now runs on Mac OS X. (Note that Mac OS X is sometimes called "Darwin" because that is the name of the OS core, which is the level that Valgrind works at.) Supported systems: - It requires OS 10.5.x (Leopard). Porting to 10.4.x is not planned because it would require work and 10.4 is only becoming less common.
Page 230
- Documentation for the leak checker has been improved. * Various aspects of Valgrind’s text output have changed. - Valgrind’s start-up message has changed. It is shorter but also includes the command being run, which makes it easier to use --trace-children=yes.
Page 231
--xml-file=, --xml-fd= or --xml-socket= to select the XML destination, one of --log-file=, --log-fd= or --log-socket= to select the destination for any remaining text messages, and, importantly, -q. -q makes Valgrind completely silent on the text channel, except in the case of critical failures, such as Valgrind...
Page 232
NEWS itself segfaulting, or failing to read debugging information. Hence, in this scenario, it suffices to check whether or not any output appeared on the text channel. If yes, then it is likely to be a critical error which should be brought to the attention of the user.
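The check described above, whether anything at all appeared on the text channel, can be sketched in C as follows. This is a minimal sketch that assumes the text channel was directed to a file with --log-file=; the function name text_channel_has_output is hypothetical and not part of Valgrind.

```c
#include <stdio.h>

/* Hypothetical helper, not part of Valgrind.
 * Returns 1 if the log file contains at least one byte, 0 if it is empty,
 * and -1 if it cannot be opened. With -q and XML output enabled, a result
 * of 1 suggests a critical error worth showing to the user. */
int text_channel_has_output(const char *log_path)
{
    FILE *f = fopen(log_path, "rb");
    int c;

    if (f == NULL)
        return -1;
    c = fgetc(f);          /* one byte is enough to decide */
    fclose(f);
    return c != EOF;
}
```

A GUI front-end might call this after the Valgrind run has finished and, on a result of 1, display the file's contents instead of trying to parse the XML.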
Page 233
- The error messages printed by DRD are now easier to interpret. Instead of using two different numbers to identify each thread (Valgrind thread ID and DRD thread ID), DRD now identifies threads via a single number (the DRD thread ID). Furthermore "first observed at"...
Page 234
* Something that happened in 3.4.0, but wasn’t clearly announced: the option --read-var-info=yes can be used by some tools (Memcheck, Helgrind and DRD). When enabled, it causes Valgrind to read DWARF3 variable type and location information. This makes those tools...
Page 235
- The location of some install files has changed. This should not affect most users. Those who might be affected: * For people who use Valgrind with MPI programs, the installed libmpiwrap.so library has moved from $(INSTALL)/<platform>/libmpiwrap.so to $(INSTALL)/libmpiwrap-<platform>.so.
Page 236
108528 NPTL pthread cleanup handlers not called 110126 Valgrind 2.4.1 configure.in tramples CFLAGS 110128 mallinfo is not implemented... 110770 VEX: Generated files not always updated when making valgrind 111102 Memcheck: problems with large (memory footprint) applications 115673 Vex’s decoder should never assert...
Page 237
Assertion ’!already_present’ failed. 185359 exp-ptrcheck: unhandled syscall getresuid() 185794 "WARNING: unhandled syscall: 285" (fallocate) on x86_64 185816 Valgrind is unable to handle debug info for files with split debug info that are prelinked afterwards 185980 [darwin] unhandled syscall: sem_open 186238 bbToIR_AMD64: disInstr miscalculated next %rip 186507 exp-ptrcheck unhandled syscalls prctl, etc.
Page 238
198624 Missing syscalls on Darwin: 82, 167, 281, 347 198649 callgrind_annotate doesn’t cumulate counters 199338 callgrind_annotate sorting/thresholds are broken for all but Ir 199977 Valgrind complains about an unrecognized instruction in the atomic_incs test program 200029 valgrind isn’t able to read Fedora 12 debuginfo 200760 darwin unhandled syscall: unix:284 200827 DRD doesn’t work on Mac OS X...
Page 239
179624 helgrind: false positive races with pthread_create and recv/open/close/read 134207 pkg-config output contains @VG_PLATFORM@ 176926 floating point exception at valgrind startup with PPC 440EPX 181594 Bogus warning for empty text segment 173751 amd64->IR: 0x48 0xF 0x6F 0x45 (even more redundant rex prefixes) 181707 Dwarf3 doesn’t require enumerations to have name...
Page 240
Ptrcheck currently works only on x86-linux and amd64-linux. To use it, use --tool=exp-ptrcheck. A simple manual is provided, as part of the main Valgrind documentation. As this is an experimental tool, we would be particularly interested in hearing about your...
Page 241
* 3.4.0 adds support on x86/amd64 for the SSSE3 instruction set. * Very basic support for IBM Power6 has been added (64-bit processes only). * Valgrind is now cross-compilable. For example, it is possible to cross compile Valgrind on an x86/amd64-linux host, so that it runs on a ppc32/64-linux target.
Page 242
It can now correctly establish the addresses for ELF data symbols, which is something that has never worked properly before now. Also, Valgrind can now read DWARF3 type and location information for stack and global variables. This makes it possible to use the framework to build tools that rely on knowing the type and locations of stack and global variables, for example exp-Ptrcheck.
156960 ==155901 155528 support Core2/SSSE3 insns on x86/amd64 155929 ms_print fails on massif outputs containing long lines 157665 valgrind fails on shmdt(0) after shmat to 0 157748 support x86 PUSHFW/POPFW 158212 helgrind: handle pthread_rwlock_try{rd,wr}lock. 158425 sys_poll incorrectly emulated when RES==0 158744 vex amd64->IR: 0xF0 0x41 0xF 0xC0 (xaddb)
Page 244
162386 ms_print typo in milliseconds time unit for massif 161036 exp-drd: client allocated memory was never freed 162663 signalfd_wrapper fails on 64bit linux (3.3.1.RC1: 2 June 2008, vex r1854, valgrind r8169). (3.3.1: 4 June 2008, vex r1854, valgrind r8180). Release 3.3.0 (7 December 2007) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.3.0 is a feature release with many significant improvements and the...
Page 245
- There is experimental support for AIX 5.3, both 32-bit and 64-bit processes. You need to be running a 64-bit kernel to use Valgrind on a 64-bit executable. - There have been some changes to command line options, which may affect you: * --log-file-exactly and...
Page 246
OLDER NEWS Cachegrind, Callgrind and Massif. They accept the same %p and %q format specifiers that --log-file accepts. --callgrind-out-file replaces Callgrind’s old --base option. * Cachegrind’s ’cg_annotate’ script no longer uses the --<pid> option to specify the output file. Instead, the first non-option argument is taken to be the name of the output file, and any subsequent non-option arguments are taken to be the names of source files to be annotated.
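To make the %p and %q specifiers mentioned above more concrete, here is a small C sketch of the expansion as the User Manual describes it for --log-file: %p becomes the process ID, %q{VAR} becomes the contents of the environment variable VAR, and %% yields a literal percent sign. The function expand_out_file_pattern is an illustration written for this summary, not Valgrind's own code, and its handling of malformed patterns (returning -1) is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only; not Valgrind's implementation.
 * Expand "%p", "%q{VAR}" and "%%" in pattern into out (capacity out_size).
 * Returns 0 on success, -1 on a malformed pattern, an unset environment
 * variable, or insufficient space in out. */
int expand_out_file_pattern(const char *pattern, long pid,
                            char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;

    while (*pattern != '\0') {
        char piece[256];
        const char *text = piece;
        size_t len;

        if (pattern[0] != '%') {
            /* Ordinary character: copy it through unchanged. */
            piece[0] = *pattern++;
            piece[1] = '\0';
        } else if (pattern[1] == '%') {
            piece[0] = '%';
            piece[1] = '\0';
            pattern += 2;
        } else if (pattern[1] == 'p') {
            snprintf(piece, sizeof piece, "%ld", pid);
            pattern += 2;
        } else if (pattern[1] == 'q' && pattern[2] == '{') {
            const char *close = strchr(pattern + 3, '}');
            size_t name_len;

            if (close == NULL)
                return -1;              /* missing closing brace */
            name_len = (size_t)(close - (pattern + 3));
            if (name_len == 0 || name_len >= sizeof piece)
                return -1;
            memcpy(piece, pattern + 3, name_len);
            piece[name_len] = '\0';
            text = getenv(piece);       /* value of the named variable */
            if (text == NULL)
                return -1;
            pattern = close + 1;
        } else {
            return -1;                  /* unknown specifier */
        }

        len = strlen(text);
        if (used + len >= out_size)
            return -1;                  /* keep room for the terminator */
        memcpy(out + used, text, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

For instance, with a process ID of 1234 the pattern callgrind.out.%p would expand to callgrind.out.1234, which is the kind of name one would expect from --callgrind-out-file=callgrind.out.%p.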
Page 247
143062 massif crashes on app exit with signal 8 SIGFPE 144453 (get_XCon): Assertion ’xpt->max_children != 0’ failed. 145559 valgrind aborts when malloc_stats is called 145609 valgrind aborts all runs with ’repeated section!’ 145622 --db-attach broken again on x86-64 145837 ==149519...
Page 248
- Internally, the code base has been further factorised and abstractified, particularly with respect to support for non-Linux OSs. (3.3.0.RC1: 2 Dec 2007, vex r1803, valgrind r7268). (3.3.0.RC2: 5 Dec 2007, vex r1804, valgrind r7282). (3.3.0.RC3: 9 Dec 2007, vex r1804, valgrind r7288).
Page 249
==133054 132998 startup fails in when running on UML 134207 pkg-config output contains @VG_PLATFORM@ 134727 valgrind exits with "Value too large for defined data type" n-i-bz ppc32/64: support mcrfs n-i-bz Cachegrind/Callgrind: Update cache parameter detection 135012 x86->IR: 0xD7 0x8A 0xE0 0xD0 (xlat)
Page 250
(3.2.2: 22 Jan 2007, vex r1729, valgrind r6545). Release 3.2.1 (16 Sept 2006) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.2.1 adds x86/amd64 support for all SSE3 instructions except monitor and mwait, further reduces memcheck’s false error rate on all platforms, adds support for recent binutils (in OpenSUSE 10.2 and...
Page 251
3.2.X: 133154 crash when using client requests to register/deregister stack (3.2.1: 16 Sept 2006, vex r1658, valgrind r6070). Release 3.2.0 (7 June 2006) ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.2.0 is a feature release with many significant improvements and the usual collection of bug fixes.
Page 252
You can get it from http://www.valgrind.org/downloads/guis.html. - Valgrind now works on PPC64/Linux. As with the AMD64/Linux port, this supports programs using up to 32G of address space. On 64-bit capable PPC64/Linux setups, you get a dual architecture build so that both 32-bit and 64-bit executables can be run.
Page 253
- A new flag, --error-exitcode=, has been added. This allows changing the exit code in runs where Valgrind reported errors, which is useful when using Valgrind as part of an automated test suite. - Various segfaults when reading old-style "stabs" debug information have been fixed.
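One way an automated test suite might use --error-exitcode= is sketched below in C: the harness runs the program under Valgrind and then classifies the wait status it gets back. The enum and the function classify_run are hypothetical names invented for this example, and the sketch assumes a POSIX system where system() or waitpid() supplies the status.

```c
#include <sys/wait.h>

/* Hypothetical test-harness types, not part of Valgrind. */
enum run_result { RUN_CLEAN, RUN_VALGRIND_ERRORS, RUN_OTHER_FAILURE };

/* Classify the wait status of a command that was started as
 * "valgrind --error-exitcode=<valgrind_code> ...". */
enum run_result classify_run(int wait_status, int valgrind_code)
{
    if (!WIFEXITED(wait_status))
        return RUN_OTHER_FAILURE;       /* e.g. killed by a signal */
    if (WEXITSTATUS(wait_status) == 0)
        return RUN_CLEAN;
    if (WEXITSTATUS(wait_status) == valgrind_code)
        return RUN_VALGRIND_ERRORS;     /* Valgrind reported errors */
    return RUN_OTHER_FAILURE;           /* the program's own non-zero exit */
}
```

A harness could pass the return value of system("valgrind --error-exitcode=99 ./mytest") as wait_status, where ./mytest is a placeholder. The chosen code should be one the program under test never returns itself, otherwise the two cases cannot be told apart.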
Page 254
- The way client requests are encoded in the instruction stream has changed. Unfortunately, this means 3.2.0 will not honour client requests compiled into binaries using headers from earlier versions of Valgrind. We will try to keep the client request encodings more stable in future. BUGS FIXED:...
Page 255
CDROMREADRAW ioctl and CDROMREADTOCENTRY fix 126722 assertion: segment_is_sane at m_aspacemgr/aspacemgr.c:1624 126938 bad checking for syscalls linkat, renameat, symlinkat (3.2.0RC1: 27 May 2006, vex r1626, valgrind r5947). (3.2.0: 7 June 2006, vex r1628, valgrind r5957). Release 3.1.1 (15 March 2006) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.1.1 fixes a bunch of bugs reported in 3.1.0.
Page 256
On 64-bit machines up to 32GB of space is usable; when using Memcheck that means your program can use up to about 14GB. A side effect of this change is that Valgrind is no longer protected against wild writes by the client. This feature was nice but relied on the x86 segment registers and so wasn’t portable.
Page 257
- Core dumping has been reinstated (it was disabled in 3.0.0 and 3.0.1). If your program crashes while running under Valgrind, a core file with the name "vgcore.<pid>" will be created (if your settings allow core file creation). Note that the floating point information is not all there.
Page 258
110831 Would like to be able to run against both 32 and 64 bit binaries on AMD64 110829 == 110831 111781 compile of valgrind-3.0.0 fails on my linux (gcc 2.X prob) 112670 Cachegrind: cg_main.c:486 (handleOneStatement ... 112941 vex x86: 0xD9 0xF4 (fxtract) 110201 == 112941 113015 vex amd64->IR: 0xE3 0x14 0x48 0x83 (jrcxz)
Page 259
114289 Memcheck fails to intercept malloc when used in an uclibc environment 114756 mbind syscall support 114757 Valgrind dies with assertion: Assertion ’noLargerThan > 0’ failed 114563 stack tracking module not informed when valgrind switches threads 114564 clone() and stacks...
Page 260
- Address space may be limited; see the point about position-independent executables below. - If Valgrind is built on an AMD64 machine, it will only run 64-bit executables. If you want to run 32-bit x86 executables under Valgrind on an AMD64, you will need to build Valgrind on an x86 machine and copy it to the AMD64 machine.
Page 261
- Output can now be printed in XML format. This should make it easier for tools such as GUI front-ends and automated error-processing schemes to use Valgrind output as input. The --xml flag controls this. As part of this change, ELF directory information is read from executables, so absolute source file paths are available if needed.
Page 262
flags are --log-file-exactly= and --log-file-qualifier=. - As part of adding AMD64 support, DWARF2 CFI-based stack unwinding support was added. In principle this means Valgrind can produce meaningful backtraces on x86 code compiled with -fomit-frame-pointer providing you also compile your code with -fasynchronous-unwind-tables.
Page 263
There is a small performance improvement, and a large stability improvement. * On the downside, Valgrind can no longer report misuses of the POSIX PThreads API. It also means that Helgrind currently does not work. We hope to fix these problems in a future release.
Page 264
* Signal handling is much improved and should be very close to what you get when running natively. One result of this is that Valgrind observes changes to sigcontexts passed to signal handlers. Such modifications will take effect when the signal returns. You will need to run with --single-step=yes to make this useful.
Page 265
93117 Tool and core interface versions do not match 93128 Can’t run valgrind --tool=memcheck because of unimplement... 93174 Valgrind can crash if passed bad args to certain syscalls 93309 Stack frame in new thread is badly aligned 93328 Wrong types used with sys_sigprocmask() 93763 /usr/include/asm/msr.h is missing...
Page 266
- Blocking system calls behave exactly as they do when running natively (not on valgrind). That is, if a syscall blocks only the calling thread when running natively, then it behaves the same on valgrind.
Page 267
Draws pretty .ps pictures of memory use against time. A potentially powerful tool for making sense of your program’s space use. * File descriptor leakage checks. When enabled, Valgrind will print out a list of open file descriptors on exit.
Page 268
SSE code. * Add support for the POSIX message queue system calls. * Fix to allow 32-bit Valgrind to run on AMD64 boxes. Note: this does NOT allow Valgrind to work with 64-bit executables - only with 32-bit executables on an AMD64 box.
Page 269
2.0.0, might also want to try this release. The following bugs, and probably many more, have been fixed. These are listed at http://bugs.kde.org. Reporting a bug for valgrind at http://bugs.kde.org is much more likely to get you a fix than mailing developers directly, so please keep sending bugs there.
Page 270
AFAICS: * Rearranged address space layout relative to 2.1.1, so that Valgrind/tools will run out of memory later than currently in many circumstances. This is good news esp. for Calltree. It should be possible for client programs to allocate over 800MB of memory when using memcheck now.
Page 271
OLDER NEWS long-term future. These don’t affect end-users. Most notable user-visible changes are: * Greater isolation between Valgrind and the program being run, so the program is less likely to inadvertently kill Valgrind by doing wild writes. * Massif: a new space profiling tool. Try it! It’s cool, and it’ll tell you in detail where and when your C/C++ code is allocating heap.
Page 272
Specifically: - Blocking system calls behave exactly as they do when running natively (not on valgrind). That is, if a syscall blocks only the calling thread when running natively, then it behaves the same on valgrind.
Page 273
68525: CVS head doesn’t compile on C90 compilers 68566: pkgconfig support (wishlist) 68588: Assertion ‘sz == 4’ failed in vg_to_ucode.c (disInstr) 69140: valgrind not able to explicitly specify a path to a binary. 69432: helgrind asserts encountering a MutexErr when there are EraserErr suppressions - Increase the max size of the translation cache from 200k average bbs to 300k average bbs.
Page 274
OLDER NEWS - Don’t fail silently if the executable is statically linked, or is setuid/setgid. Print an error message instead. - Support for old DWARF-1 format line number info. Snapshot 20031012 (12 October 2003) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Three months’ worth of bug fixes, roughly. Most significant single change is improved SSE/SSE2 support, mostly thanks to Dirk Mueller.
Page 275
This may have caused confusing error messages. Snapshot 20030716 (16 July 2003) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 20030716 is a snapshot of our current CVS head (development) branch. This is the branch which will become valgrind-2.0. It contains significant enhancements over the 1.9.X branch.
Page 276
OLDER NEWS Despite this being a snapshot of the CVS head, it is believed to be quite stable -- at least as stable as 1.9.6 or 1.0.4, if not more so -- and therefore suitable for widespread use. Please let us know asap if it causes problems for you.
Page 277
OLDER NEWS - Fix assertion failure in pthread_once(). - Fix this: valgrind: vg_intercept.c:598 (vgAllRoadsLeadToRome_select): Assertion ‘ms_end >= ms_now’ failed. - Implement pthread_mutexattr_setpshared. - Understand Pentium 4 branch hints. Also implemented a couple more obscure x86 instructions. - Lots of other minor bug fixes.
Page 278
- Try and avoid assertion failures in mash_LD_PRELOAD_and_LD_LIBRARY_PATH. - Minor bug fixes in cg_annotate. Version 1.9.5 (7 April 2003) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It occurs to me that it would be helpful for valgrind users to record in the source distribution the changes in each release. So I now...
Page 279
- Add support for the munlock system call (124). Some comments about future releases: 1.9.5 is, we hope, the most stable Valgrind so far. It pretty much supersedes the 1.0.X branch. If you are a valgrind packager, please consider making 1.9.5 available to your users. You can regard the...
Page 280
OLDER NEWS are no plans at all for further releases of the 1.0.X branch. If you want a leading-edge valgrind, consider building the cvs head (from SourceForge), or getting a snapshot of it. Current cool stuff going in includes MMX support (done); SSE/SSE2 support (in progress), a significant (10-20%) performance improvement (done), and the usual...
SimPoint basic block vector generator. Valgrind is closely tied to details of the CPU, operating system and to a lesser extent, compiler and basic C libraries. This makes it difficult to make it portable. Nonetheless, it is available for the following...
Page 282
6. Run "make install", possibly as root if the destination permissions require that. 7. See if it works. Try "valgrind ls -l". Either this works, or it bombs out with some complaint. In that case, please let us know (see www.valgrind.org).
What are syscall/ioctl wrappers? What do they do? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Valgrind does what it does, in part, by keeping track of everything your program does. When a system call happens, for example a request to read part of a file, control passes to the Linux kernel, which fulfills the request, and returns control to your program.
Page 284
README_MISSING_SYSCALL_OR_IOCTL
PRE(sys_time)
{
   /* time_t time(time_t *t); */
   PRINT("sys_time ( %p )",ARG1);
   PRE_REG_READ1(long, "time", int *, t);
   if (ARG1 != 0) {
      PRE_MEM_WRITE( "time(t)", ARG1, sizeof(vki_time_t) );
   }
}
POST(sys_time)
{
   if (ARG1 != 0) {
      POST_MEM_WRITE( ARG1, sizeof(vki_time_t) );
   }
}
The first thing we do happens before the syscall occurs, in the PRE() function. The PRE() function typically starts by invoking the PRINT() macro.
Page 285
README_MISSING_SYSCALL_OR_IOCTL If Valgrind tells you that system call NNN is unimplemented, do the following: 1. Find out the name of the system call: grep NNN /usr/include/asm/unistd*.h This should tell you something like __NR_mysyscallname. Copy this entry to include/vki/vki-scnums-$(VG_PLATFORM).h. 2. Do ’man 2 mysyscallname’ to get some idea of what the syscall does.
Page 286
PRE(ioctl) and POST(ioctl). There’s a default case, sometimes it isn’t correct and you have to write a more specific case to get the right behaviour. As above, please create a bug report and attach the patch as described on http://www.valgrind.org.
6. README_DEVELOPERS Building and not installing it ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To run Valgrind without having to install it, run coregrind/valgrind with the VALGRIND_LIB environment variable set to <dir>/.in_place, where <dir> is the root of the source tree (and must be an absolute path). Eg: VALGRIND_LIB=~/grind/head4/.in_place ~/grind/head4/coregrind/valgrind...
Page 288
To debug the valgrind launcher program (<prefix>/bin/valgrind) just run it under gdb in the normal way. Debugging the main body of the valgrind code (and/or the code for a particular tool) requires a bit more trickery but can be achieved without too much problem by following these steps: (1) Set VALGRIND_LAUNCHER to point to the valgrind executable.
Page 289
(ie. not an environment variable). A different and possibly easier way is as follows: (1) Run Valgrind as normal, but add the flag --wait-for-gdb=yes. This puts the tool executable into a wait loop soon after it gains control.
Page 290
Memcheck can be used to find leaks and use after free in an Inner Valgrind. The Valgrind "big lock" is annotated with helgrind client requests so helgrind and drd can be used to find race conditions in an Inner Valgrind.
Page 291
When an outer valgrind runs an inner valgrind, a regression test produces one additional file <testname>.outer.log which contains the errors detected by the outer valgrind. E.g. for an outer memcheck, it contains the leaks found in the inner, for an outer helgrind or drd, it contains the detected race conditions.
Page 292
README_DEVELOPERS callgrind.out.inner_trunk.me.many-loss-records.22916 callgrind.outer.log.inner_trunk.me.many-loss-records.22916 Printing out problematic blocks ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If you want to print out a disassembly of a particular block that causes a crash, do the following. Try running with "--vex-guest-chase-thresh=0 --trace-flags=10000000 --trace-notbelow=999999". This should print one line for each block translated, and that includes the address.
So you can’t build a relocatable RPM / whatever from Valgrind. -- Don’t strip the debug info off lib/valgrind/$platform/vgpreload*.so in the installation tree. Either Valgrind won’t work at all, or it will still work if you do, but will generate less helpful error messages.
Page 294
-- Please test the final installation works by running it on something huge. I suggest checking that it can start and exit successfully both Firefox and OpenOffice.org. I use these as test programs, and I know they fairly thoroughly exercise Valgrind. The command lines to use are: valgrind -v --trace-children=yes firefox valgrind -v --trace-children=yes soffice...
------------ - You need GCC 3.4 or later to compile the s390 port. - A working combination of autotools is required. - To run valgrind a z900 machine or any later model is needed. Limitations ----------- - 31-bit client programs are not supported.
9. README.android How to cross-compile for Android. These notes were last updated on 17 Feb 2012, for Valgrind SVN revision 12390/2257. This is known to work at least for: ARM: Android 4.0.3 running on a (rooted, AOSP build) Nexus S.
Page 297
README.android # Then cd to the root of your Valgrind source tree. cd /path/to/valgrind/source/tree # After this point, you don’t need to modify anything; just copy and # paste the commands below. # Set up toolchain paths. # For ARM export AR=$NDKROOT/toolchains/arm-linux-androideabi-4.4.3/prebuilt/linux-x86/bin/arm-linux-androideabi-ar...
Page 298
# where ’mq’ is an alias for ’make --quiet’. # One common cause of runs failing at startup is the inability of # Valgrind to find a suitable temporary directory. On the device, # there doesn’t seem to be any one location which we always have # permission to write to.
10. README.android_emulator How to install and run an android emulator. mkdir android # or any other place you prefer cd android # download java JDK # http://www.oracle.com/technetwork/java/javase/downloads/index.html # download android SDK # http://developer.android.com/sdk/index.html # download android NDK # http://developer.android.com/sdk/ndk/index.html # versions I used: # jdk-7u4-linux-i586.tar.gz # android-ndk-r8-linux-x86.tar.bz2 # android-sdk_r18-linux.tgz...
Page 300
# and see it is working. Note that I usually get # one or two timeouts from adb shell before it works adb shell # Once the emulator is ready, push your Valgrind to the emulator: adb push Inst / # if you need to debug:...
* The --with-pagesize option is used to set the default PAGE SIZE. If the option is not used, PAGE SIZE is set to the default value for the platform on which Valgrind is built. Possible values are 4, 16 or 64 and represent the size in kilobytes.
Page 302
README.mips based on newer GCC versions, if possible.
Page 304
GNU Licenses Table of Contents 1. The GNU General Public License 2. The GNU Free Documentation License...
Page 305
1. The GNU General Public License GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Page 306
The GNU General Public License patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone’s free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow.
Page 307
The GNU General Public License interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License.
Page 308
The GNU General Public License control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
Page 309
The GNU General Public License circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices.
Page 310
The GNU General Public License PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED...
Page 311
The GNU General Public License The hypothetical commands ‘show w’ and ‘show c’ should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than ‘show w’ and ‘show c’; they could even be mouse-clicks or menu items--whatever suits your program.
Page 312
2. The GNU Free Documentation License GNU Free Documentation License Version 1.2, November 2002 Copyright (C) 2000,2001,2002 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. 0.
Page 313
The GNU Free Documentation License modifications and/or translated into another language. A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject.
Page 314
the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text. A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language.
Page 315
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material.
Page 316
given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence. J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on.
Page 317
versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy.
Page 318
8. TRANSLATION Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections.
Page 319
the License in the document and put the following copyright and license notices just after the title page: Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation;...