1. SystemTap overview
1.1. About this guide
1.2. Reasons to use SystemTap
1.3. Event-action language
1.4. Sample SystemTap scripts
1.4.1. Basic SystemTap syntax and control structures
1.4.2. Primes between 0 and 49
1.4.3.
Chapter 1. SystemTap overview

1.1. About this guide

This guide is a comprehensive reference of SystemTap's language constructs and syntax. The contents borrow heavily from existing SystemTap documentation found in manual pages and the tutorial. The presentation of information here provides the reader with a single place to find language syntax and recommended usage.
    probe begin {
        # "no" and "ne" are local integers
        for (i = 0; i < 10; i++) {
            if (i % 2) odds [no++] = i
            else evens [ne++] = i
        }
        delete odds[2]
        delete evens[3]
        exit()
    }

    probe end {
        foreach (x+ in odds)
resulting kernel module into a running Linux kernel to perform the requested system trace or probe functions. You can supply the script in a named file, from standard input, or from the command line. The program runs until it is interrupted by the user, until a sufficient number of soft errors occur, or until the script voluntarily invokes the exit() function.
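The three ways of supplying a script can be sketched as follows. This is a minimal illustration, not text from the guide: the file name hello.stp is hypothetical, and the invocations shown in the comments assume the standard stap command-line forms (a file argument, "-" for standard input, and -e for an inline script).

```systemtap
# hello.stp is a hypothetical file name used only for this sketch.
#   From a named file:      stap hello.stp
#   From standard input:    stap - < hello.stp
#   From the command line:  stap -e 'probe begin { println("hello") exit() }'
probe begin {
    println("hello")
    exit()    # the script voluntarily ends the session
}
```

Without the exit() call, the session would keep running until the user interrupts it.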
Safety and security

If something goes wrong with stap or staprun after a probe has started running, you may safely kill both user processes, and remove the active probe kernel module with the rmmod command. Any pending trace messages may be lost.
Chapter 2. Types of SystemTap scripts

2.1. Probe scripts

Probe scripts are analogous to programs; these scripts identify probe points and associated handlers.

2.2. Tapset scripts

Tapset scripts are libraries of probe aliases and auxiliary functions. The /usr/share/systemtap/tapset directory contains tapset scripts. While these scripts look like regular SystemTap scripts, they cannot be run directly.
Chapter 3. Components of a SystemTap script

The main construct in the scripting language identifies probes. Probes associate abstract events with a statement block, or probe handler, that is to be executed when any of those events occur. The following example shows how to trace entry and exit from a function using two probes.

    probe kernel.function("sys_mkdir") { log ("enter") }
    probe kernel.function("sys_mkdir").return { log ("exit") }

To list the probe-able functions in the kernel, use the last-pass option to the translator.
    probe socket.sendmsg = kernel.function ("sock_sendmsg") { ... }
    probe socket.do_write = kernel.function ("do_sock_write") { ... }
    probe socket.send = socket.sendmsg, socket.do_write { ... }

There are two types of aliases, the prologue style and the epilogue style, which are identified by the equal sign (=) and "+="...
    probe syscall.read += kernel.function("sys_read") {
        if (traceme) println ("tracing me")
    }

3.2.3. Probe alias usage

A probe alias is used the same way as any built-in probe type, by naming it:

    probe syscall.read {
        printf("reading fd=%d\n", fildes)
    }

3.2.4. Unused alias variables

An unused alias variable is a variable defined in a probe alias, usually as one of a group of var = $var assignments, which is not actually used by the script probe that instantiates the alias.
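Returning to the two alias styles, the following sketch shows how a prologue-style (=) and an epilogue-style (+=) alias interact with the handler that uses them. The alias names demo.start and demo.finish and the variables greeting and verbose are hypothetical, chosen only for illustration; the aliases wrap begin and end so the script does not depend on any particular kernel function.

```systemtap
# Prologue style (=): the alias body runs before the user's handler,
# so "greeting" is already set when the handler reads it.
probe demo.start = begin {
    greeting = "hello from the prologue"
}

# Epilogue style (+=): the alias body runs after the user's handler,
# so it can react to a variable that the handler sets.
probe demo.finish += end {
    if (verbose) println("epilogue ran after the handler")
}

probe demo.start {
    println(greeting)
    exit()
}

probe demo.finish {
    verbose = 1
}
```

The aliases are then used by name, exactly like built-in probe types.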
    function <name>[:<type>] ( <arg1>[:<type>], ... ) { <stmts> }

SystemTap scripts may define subroutines to factor out common work. Functions may take any number of scalar arguments, and must return a single scalar value. Scalars in this context are described in Section 3.3, “Variables”...
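As an illustration of the function syntax above, here is a minimal sketch with explicit type annotations. The function name add_offset and its parameters are hypothetical.

```systemtap
# A function taking two scalar (long) arguments and returning a long.
function add_offset:long (base:long, offset:long) {
    return base + offset
}

probe begin {
    printf("%d\n", add_offset(40, 2))    # prints 42
    exit()
}
```

The type annotations are optional in the syntax shown above; they are written out here to make the scalar types visible.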
Embedded C functions

that could potentially be invalid or dangerous. If you are unsure, err on the side of caution and use kread(). The kread() macro is one of the safety mechanisms used in code generated by embedded C. It protects against pointer accesses that could crash the system. For example, to access the pointer chain name = skb->dev->name in embedded C, use the following code.
Chapter 4. Probe points

4.1. General syntax

The general probe point syntax is a dotted-symbol sequence. This divides the event namespace into parts, analogous to the style of the Domain Name System. Each component identifier is parameterized by a string or number literal, with a syntax analogous to a function call. The following are all syntactically valid probe points.
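As a hedged illustration of this syntax, the short script below uses three probe points: two single-component points (begin and end) and one dotted point whose second component takes a number literal (timer.ms(5000)). These particular points were chosen for the sketch and are not taken from the list that this excerpt omits.

```systemtap
probe begin { println("session started") }

# "timer" and "ms" are components; ms is parameterized by the number 5000.
probe timer.ms(5000) { exit() }

probe end { println("session ended") }
```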
The following is the general syntax.

    kernel.function("no_such_function") ?

4.2. Built-in probe point types (DWARF probes)

This family of probe points uses symbolic debugging information for the target kernel or module, as may be found in executables that have not been stripped, or in the separate debuginfo packages. They allow logical placement of probes into the execution path of the target by specifying a set of points in the source or object code.
kernel.function, module().function

In the above probe descriptions, MPATTERN stands for a string literal that identifies the loaded kernel module of interest and LPATTERN stands for a source program label. Both MPATTERN and LPATTERN may include asterisk (*), square brackets "[]", and question mark (?) wildcards. PATTERN stands for a string literal that identifies a point in the program.
    kernel.function("func[@file]")
    module("modname").function("func[@file]")

Examples:

    # Refers to all kernel functions with "init" or "exit"
    # in the name:
    kernel.function("*init*"), kernel.function("*exit*")

    # Refers to any functions within the "kernel/sched.c"
    # file that span line 240:
    kernel.function("*@kernel/sched.c:240")

    # Refers to all functions in the ext3 module:
    module("ext3").function("*")

4.2.2.
Marker probes

count)

You can obtain the values of fd, buf, and count, respectively, as uint_arg(1), pointer_arg(2), and ulong_arg(3). In this case, your probe code must first call asmlinkage(), because on some architectures the asmlinkage attribute affects how the function's arguments are passed.
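The argument access described above might look like the following sketch. The probe point kprobe.function("sys_read") is an assumption: this excerpt does not show which probe point the text refers to, and the function name may not exist under that name on every kernel.

```systemtap
# Assumption: kprobe.function("sys_read") resolves on the target kernel.
probe kprobe.function("sys_read") {
    asmlinkage()    # must be called first, as the text above requires
    printf("fd=%d buf=%p count=%d\n",
           uint_arg(1), pointer_arg(2), ulong_arg(3))
}
```

Each accessor is chosen to match the declared C type of the argument: unsigned int, pointer, and an unsigned long-sized value.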
    timer.jiffies(N)
    timer.jiffies(N).randomize(M)

The probe handler runs every N jiffies. If the randomize component is given, a linearly distributed random value in the range [-M … +M] is added to N every time the handler executes. N is restricted to a reasonable range (1 to approximately 1,000,000), and M is restricted to be less than N.
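A minimal sketch of a randomized timer probe follows. The values 100 and 10 are arbitrary choices that satisfy the M < N restriction, the global ticks is a hypothetical counter, and timer.ms is used only to end the session after five seconds.

```systemtap
global ticks

# Fires roughly every 100 jiffies, with a random offset in [-10 ... +10].
probe timer.jiffies(100).randomize(10) { ticks++ }

probe timer.ms(5000) { exit() }

probe end { printf("handler ran %d times\n", ticks) }
```

The reported count depends on the kernel's jiffy length, so it varies between systems.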
the context of the return probe, though their values may have been changed by the function. Inline functions do not have an identifiable return point, so .return is not supported on .inline probes.

4.7. Special probe points

The probe points begin and end are defined by the translator to refer to the time of session startup and shutdown.
implementation, except for generic cycle and instructions events, which are available on all processors. The probe perfmon.counter(event) starts a counter on the processor which counts the number of events that occur on that processor. For more details about the performance monitoring events available on a specific processor, see the help text returned by typing the perfmon2 command pfmon -l.
Chapter 5. Language elements

5.1. Identifiers

Identifiers are used to name variables and functions. They are an alphanumeric sequence that may include the underscore (_) and dollar sign ($) characters. They have the same syntax as C identifiers, except that the dollar sign is also a legal character. Identifiers that begin with a dollar sign are interpreted as references to variables in the target software, rather than to SystemTap script variables.
    # ... shell style, to the end of line
    // ... C++ style, to the end of line
    /* ... C style ... */

5.5. Whitespace

As in C, spaces, tabs, returns, newlines, and comments are treated as whitespace. Whitespace is ignored by the parser.
5.6.9. Function call

General syntax:

    fn ([ arg1, arg2, ... ])

5.6.10. $ptr->member

ptr is a kernel pointer available in a probed context.

5.6.11. <value> in <array_name>

This expression evaluates to true if the array contains an element with the specified index.

5.6.12.
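The membership test from Section 5.6.11 can be sketched as follows. The array name seen and its keys are hypothetical.

```systemtap
global seen

probe begin {
    seen["alpha"] = 1

    if ("alpha" in seen)
        println("alpha is present")
    if (!("beta" in seen))
        println("beta is absent")

    exit()
}
```

Both messages are printed, because only the index "alpha" has been set.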
10, mystring

5.8. Conditional compilation

5.8.1. Conditions

One of the steps of parsing is a simple conditional preprocessing stage. The general form of this is similar to the ternary operator (Section 5.6.7, “Ternary operator”).

    %( CONDITION %? TRUE-TOKENS %)
    %( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %)

The CONDITION is a limited expression whose format is determined by its first keyword.
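A small sketch of the two-branch form follows. It assumes the arch keyword as the condition and "x86_64" as one possible architecture string; both messages are made up for illustration.

```systemtap
probe begin {
    # Only one of the two println statements survives preprocessing.
    %( arch == "x86_64" %?
        println("translated for x86_64")
    %:
        println("translated for another architecture")
    %)
    exit()
}
```

Because the choice is made during parsing, the discarded branch is never seen by later translation passes.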
True and False Tokens

The following code adapts to hypothetical kernel version drift.

    probe kernel.function (
        %( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
            %( kernel_vr == "2.6.13-1.8273FC3smp" %? "do_page_fault" %: UNSUPPORTED %)
        %)) { /* ... */ }

    %( arch == "ia64" %?
        probe syscall.vliw = kernel.function("vliw_widget") {}...
Chapter 6. Statement types

Statements enable procedural control flow within functions and probe handlers. The total number of statements executed in response to any single probe event is limited to MAXACTION, which defaults to 1000. See Section 1.6, “Safety and security”.

6.1.
    for (EXP1; EXP2; EXP3) STMT

The for statement is similar to the for statement in C. The for expression executes EXP1 as initialization. While EXP2 is non-zero, it executes STMT, then the iteration expression EXP3.

6.6. foreach

General syntax:

    foreach (VAR in ARRAY) STMT

The foreach statement loops over each element of a named global array, assigning the current key...
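The two loop forms above can be combined in a short sketch. The array name squares is hypothetical.

```systemtap
global squares

probe begin {
    # for: EXP1 initializes, EXP2 is the loop test, EXP3 iterates.
    for (i = 1; i <= 5; i++)
        squares[i] = i * i

    # foreach: n takes each key of the global array in turn.
    foreach (n in squares)
        printf("%d squared is %d\n", n, squares[n])

    exit()
}
```

This plain foreach form does not request any particular iteration order.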
    statement1 ; statement2

The semicolon represents the null statement, or do nothing. It is useful as an optional separator between statements to improve syntax error detection and to handle certain grammar ambiguities.

6.10. return

General syntax:

    return EXP

The return statement returns the EXP value from the enclosing function. If the function does not return a value, a return statement is not needed, and the function has a special unknown type with no return value.
Chapter 7. Associative arrays

Associative arrays are implemented as hash tables with a maximum size set at startup. Associative arrays are too large to be created dynamically for individual probe handler runs, so they must be declared as global. The basic operations for arrays are setting and looking up elements. These operations are expressed in awk syntax: the array name followed by an opening bracket ([), a comma-separated list of up to five index expressions, and a closing bracket (]).
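Setting and looking up elements might look like this minimal sketch, which uses a two-part index. The array name hits and its keys are hypothetical.

```systemtap
global hits    # arrays must be declared global

probe begin {
    hits["read", 3] = 10                # set an element keyed by (string, integer)
    hits["read", 3] += 5                # update it in place

    printf("%d\n", hits["read", 3])     # prints 15
    printf("%d\n", hits["write", 4])    # an unset integer element reads as 0

    exit()
}
```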
7.4. Iteration, foreach

Like awk, SystemTap's foreach creates a loop that iterates over key tuples of an array, not only values. The iteration may be sorted by any single key or a value by adding an extra plus symbol (+) or minus symbol (-) to the code.
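The sorting suffixes can be sketched as follows: a symbol after the array name sorts by value, and a symbol after a key variable sorts by that key. The array name counts and its contents are hypothetical.

```systemtap
global counts

probe begin {
    counts["a"] = 3
    counts["b"] = 1
    counts["c"] = 2

    # Sorted by value, descending: a, c, b
    foreach (k in counts-)
        printf("%s %d\n", k, counts[k])

    # Sorted by key, ascending: a, b, c
    foreach (k+ in counts)
        printf("%s %d\n", k, counts[k])

    exit()
}
```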
Chapter 8. Statistics (aggregates)

Aggregate instances are used to collect statistics on numerical values, when it is important to accumulate new data quickly and in large volume. These instances operate without exclusive locks, and store only aggregated stream statistics. Aggregates make sense only for global variables. They are stored individually or as elements of an array.
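A minimal sketch of accumulating into an aggregate and reading it back follows. It assumes the <<< accumulation operator and the @count, @sum, @min, @max, and @avg extractors, which this excerpt does not describe; the global name latency and the sample values are hypothetical.

```systemtap
global latency    # aggregates only make sense as globals

probe begin {
    latency <<< 10
    latency <<< 20
    latency <<< 60

    # prints: count=3 sum=90 min=10 max=60 avg=30
    printf("count=%d sum=%d min=%d max=%d avg=%d\n",
           @count(latency), @sum(latency), @min(latency),
           @max(latency), @avg(latency))
    exit()
}
```

Only the summary statistics are kept; the individual samples cannot be recovered afterwards.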
8.4. Histogram extractors

The following functions provide methods to extract histogram information. Printing a histogram with the print family of functions renders a histogram object as a tabular "ASCII art" bar chart.

8.4.1. @hist_linear

The statement @hist_linear(v,L,H,W) represents a linear histogram v, where L and H represent the lower and upper end of a range of values and W represents the width (or size) of each bucket within the range.
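As a hedged illustration of @hist_linear, the sketch below fills an aggregate with synthetic values and prints a histogram over the range 0 to 100 in buckets of width 10. The global name sizes and the generated values are hypothetical.

```systemtap
global sizes

probe begin {
    # Feed in the values 0, 7, 14, ... , 98.
    for (i = 0; i < 100; i += 7)
        sizes <<< i

    # L = 0, H = 100, W = 10
    print(@hist_linear(sizes, 0, 100, 10))
    exit()
}
```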
8.4.2. @hist_log

The statement @hist_log(v) represents a base-2 logarithmic histogram. Empty buckets are replaced with a tilde (~) character in the same way as @hist_linear() (see above). The following is an example.

    global reads

    probe netdev.receive {
        reads <<< length
    }

    probe end {
        print(@hist_log(reads))
    }

This generates the following output.
Chapter 9. Predefined functions

Unlike built-in functions, predefined functions are implemented in tapsets.

9.1. Output functions

The following sections describe the functions you can use to output data.

9.1.1. error

General syntax:

    error:unknown (msg:string)

This function logs the given string to the error stream. It appends an implicit end-of-line. It blocks any further execution of statements in this probe.
The printf function takes a formatting string as an argument, and a number of values of corresponding types, and prints them all. The format must be a literal string constant. The printf formatting directives are similar to those of C, except that they are fully checked for type by the translator. The formatting string can contain tags that are defined as follows:

    %[flags][width][.precision][length]specifier

Where specifier is required and defines the type and the interpretation of the value of the...
printf

Used with o, x or X specifiers the value is preceded with 0, 0x or 0X respectively for non-zero values.

Left-pads the number with zeroes instead of spaces, where padding is specified (see width sub-specifier).

Table 9.2. printf flag values

Width       Description
(number)
printdln

    printd:unknown (delimiter:string, ...)

This function takes a string delimiter and two or more values of any type, then prints the values with the delimiter interposed. The delimiter must be a literal string constant. For example:

    printd("/", "one", "two", "three", 4, 5, 6)

prints:

    one/two/three/4/5/6

9.1.6.
This function operates like printf, but returns the formatted string rather than printing it.

9.1.10. system

General syntax:

    system (cmd:string)

The system function runs a command on the system. The specified command runs in the background once the current probe completes.
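The first function described above (the one that returns a formatted string) is assumed here to be sprintf, whose heading falls outside this excerpt. A minimal sketch of building a string and then printing it, followed by a system call with a hypothetical command:

```systemtap
probe begin {
    # Format into a string instead of printing directly.
    line = sprintf("%s has pid %d", execname(), pid())
    println(line)

    # The command is queued and runs after this handler completes.
    system("echo probe handler finished")
    exit()
}
```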
    caller_addr:long ()

Returns the address of the calling function. It works only for return probes.

9.2.4. cpu

General syntax:

    cpu:long ()

Returns the current CPU number.

9.2.5. egid

General syntax:

    egid:long ()

Returns the effective group ID of the current process.

9.2.6.
9.2.9. is_return

General syntax:

    is_return:long ()

Returns 1 if the probe point is a return probe, else it returns zero. Deprecated.

9.2.10. pexecname

General syntax:

    pexecname:string ()

Returns the name of the parent process.

9.2.11. pid

General syntax:

    pid:long ()

Returns the process ID of the current process.
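Several of the context functions in this section can be combined in one handler, as in this minimal sketch. In a begin probe the values describe whichever process happens to be current when the session starts.

```systemtap
probe begin {
    printf("pid=%d parent=%s cpu=%d egid=%d\n",
           pid(), pexecname(), cpu(), egid())
    exit()
}
```

In a probe on a kernel or user function, the same calls would describe the process that triggered the probe.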
    uid:long ()

Returns the user ID of the current task.

9.2.15. print_backtrace

General syntax:

    print_backtrace:unknown ()

This function is equivalent to print_stack(backtrace()), except that deeper stack nesting is supported. The function does not return a value.

9.2.16. print_regs

General syntax:

    print_regs:unknown ()

This function prints a register dump.
    stack_unused:long ()

Returns how many bytes are currently unused in the stack.

9.2.20. stack_used

General syntax:

    stack_used:long ()

Returns how many bytes are currently used in the stack.

9.2.21. stp_pid

    stp_pid:long ()

Returns the process ID of the staprun process.

9.2.22.
9.3.1. task_cpu

General syntax:

    task_cpu:long (task:long)

Returns the scheduled CPU for the given task.

9.3.2. task_current

General syntax:

    task_current:long ()

Returns the address of the task_struct representing the current process. This address can be passed to the various task_*() functions to extract more task-specific data.

9.3.3.
    task_gid:long (task:long)

Returns the group ID of the given task.

9.3.7. task_nice

General syntax:

    task_nice:long (task:long)

Returns the nice value of the given task.

9.3.8. task_parent

General syntax:

    task_parent:long (task:long)

Returns the address of the parent task_struct of the given task. This address can be passed to the various task_*() functions to extract more task-specific data.
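The way a task address is passed between the task_*() functions can be sketched as follows. The local variable names t and p are hypothetical.

```systemtap
probe begin {
    t = task_current()    # address of the current task_struct
    p = task_parent(t)    # address of its parent's task_struct

    printf("nice=%d gid=%d parent_nice=%d\n",
           task_nice(t), task_gid(t), task_nice(p))
    exit()
}
```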
    task_state:long (task:long)

Returns the state of the given task. Possible states are:

• TASK_RUNNING
• TASK_INTERRUPTIBLE
• TASK_UNINTERRUPTIBLE
• TASK_STOPPED
• TASK_TRACED
• EXIT_ZOMBIE
• EXIT_DEAD

9.3.12. task_tid

General syntax:

    task_tid:long (task:long)

Returns the thread ID of the given task.

9.3.13. task_uid

General syntax:

    task_uid:long (task:long)

Returns the user ID of the given task.
9.4. Accessing string data at a probe point

The following functions provide methods to access string data at a probe point.

9.4.1. kernel_string

General syntax:

    kernel_string:string (addr:long)

Copies a string from kernel space at a given address. The validation of this address is only partial.

9.4.2.
    user_string_quoted:string (addr:long)

This function copies a string from userspace at a given address. Any ASCII characters that are not printable are replaced by the corresponding escape sequence in the returned string.

9.5. Initializing queue statistics

The queue_stats tapset provides functions that, when given notification of queuing events like wait, run, or done, track averages such as queue length, service and wait times, and utilization.
    qsq_blocked:long (qname:string, scale:long)

This function returns the fraction of elapsed time during which one or more requests were on the wait queue.

9.6.2. qsq_print

General syntax:

    qsq_print:unknown (qname:string)

This function prints a line containing the following statistics for the given queue:

•...
    qsq_throughput:long (qname:string, scale:long)

This function returns the average number of requests served per microsecond.

9.6.6. qsq_utilization

General syntax:

    qsq_utilization:long (qname:string, scale:long)

This function returns the average time in microseconds that at least one request was being serviced.

9.6.7. qsq_wait_queue_length

General syntax:

    qsq_wait_queue_length:long (qname:string, scale:long)

This function returns the average length of the wait queue.
    qsq_start ("block-write")

    probe timer.ms(10000) { exit () }

    # synthesize queue work/service using three randomized "threads" for each queue.
    global tc

    function qs_doit (thread, name) {
        n = tc[thread] = (tc[thread]+1) % 3 # per-thread state counter
        if (n==1) qs_wait (name)
        else if (n==2) qs_run (name)
        else if (n==0) qs_done (name)
    }
    pp:string ()

This function returns the probe point associated with a currently running probe handler, including alias and wild-card expansion effects.

9.7.2. probefunc

General syntax:

    probefunc:string ()

This function returns the name of the function being probed.

9.7.3. probemod

General syntax:

    probemod:string ()

This function returns the name of the module containing the probe point.
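A minimal sketch of pp() follows. It uses a timer probe so that no kernel debugging information is needed; probefunc() and probemod() are left out because they are meaningful only in probes placed on functions.

```systemtap
probe timer.ms(1000) {
    # Report which probe point this handler is running for.
    printf("running in %s\n", pp())
    exit()
}
```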
    errno_str:string (err:long)

This function returns the symbolic string associated with the given error code, such as ENOENT for the number 2, or E#3333 for an out-of-range value such as 3333.

9.8.3. returnstr

General syntax:

    returnstr:string (returnp:long)

This function is used by the syscall tapset, and returns a string.
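The two cases named in the errno_str description can be tried directly, as in this minimal sketch.

```systemtap
probe begin {
    println(errno_str(2))       # ENOENT, per the description above
    println(errno_str(3333))    # E#3333, the out-of-range form
    exit()
}
```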
This function converts the string representation of a number to an integer. The base parameter indicates the number base to assume for the string (e.g. 16 for hex, 8 for octal, 2 for binary).

9.9.4. substr

General syntax:

    substr:string (str:string, start:long, stop:long)

This function returns the substring of str starting from character position start and ending at...
9.10.1. get_cycles

General syntax:

    get_cycles:long ()

This function returns the processor cycle counter value if available, else it returns zero.

9.10.2. gettimeofday_ms

General syntax:

    gettimeofday_ms:long ()

This function returns the number of milliseconds since the UNIX epoch.

9.10.3. gettimeofday_ns

General syntax:

    gettimeofday_ns:long ()

This function returns the number of nanoseconds since the UNIX epoch.
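One common use of these functions is measuring elapsed time by subtracting two readings. A minimal sketch, with a hypothetical global named start and an arbitrary 1.5 second run time:

```systemtap
global start

probe begin { start = gettimeofday_ms() }

probe timer.ms(1500) { exit() }

probe end {
    printf("session ran for about %d ms\n", gettimeofday_ms() - start)
}
```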
9.11.1. addr_to_node

General syntax:

    addr_to_node:long (addr:long)

This function accepts an address, and returns the node that the given address belongs to in a NUMA system.

9.11.2. exit

General syntax:

    exit:unknown ()

This function enqueues a request to shut down the SystemTap session. It does not unwind the current probe handler, nor block new probe handlers.
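The fact that exit() only enqueues the shutdown request can be seen in a short sketch: the statement after exit() still runs, because the current handler is not unwound.

```systemtap
probe begin {
    exit()
    println("still runs: exit() only enqueues the shutdown request")
}

probe end {
    println("end probe runs during shutdown")
}
```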
Chapter 10. For Further Reference

For more information, see:

• The SystemTap tutorial at http://sourceware.org/systemtap/tutorial/
• The SystemTap wiki at http://sourceware.org/systemtap/wiki
• The SystemTap documentation page at http://sourceware.org/systemtap/documentation.html
• From an unpacked source tarball or GIT directory, the examples in the src/examples directory, the tapsets in the src/tapset directory, and the test scripts in the src/testsuite directory.