Partec PARASTATION5 V5 Administrator's Manual

Integrated cluster management and communication solution

Administrator's Guide
Release 5.0.5
Published April 2010

Summary of Contents for Partec PARASTATION5 V5

  • Page 1 Administrator's Guide Release 5.0.5 Published April 2010...
  • Page 2 ParaStation5 Administrator's Guide Release 5.0.5 Copyright © 2002-2010 ParTec Cluster Competence Center GmbH April 2010 Printed 7 April 2010, 14:11 Reproduction in any manner whatsoever without the written permission of ParTec Cluster Competence Center GmbH is strictly forbidden. All rights reserved. ParTec and ParaStation are registered trademarks of ParTec Cluster Competence Center GmbH. The ParTec logo and the ParaStation logo are trademarks of ParTec Cluster Competence Center GmbH.
  • Page 3: Table Of Contents

    Table of Contents
    1. Introduction ... 1
    1.1. What is ParaStation ... 1
    1.2. The history of ParaStation ... 1
    1.3. About this document ... 2
    2. Technical overview ... 3
    2.1. Runtime daemon ... 3
    2.2. Libraries ... 3
    2.3.
  • Page 4
    6.2. Problem: node shown as "down" ... 29
    6.3. Problem: cannot start parallel task ... 30
    6.4. Problem: bad performance ... 30
    6.5. Problem: different groups of nodes are seen as up or down ... 30
    6.6. Problem: cannot start process on front end ... 30
    6.7.
  • Page 5: Chapter 1. Introduction

    Chapter 1. Introduction 1.1. What is ParaStation ParaStation is an integrated cluster management and communication solution. It combines unique features only found in ParaStation with common techniques, widely used in high performance computing, to deliver an integrated, easy to use and reliable compute cluster environment. Version 5 of ParaStation supports various communication technologies as the interconnect network.
  • Page 6: About This Document

    About this document In the middle of 2004, all rights on ParaStation were transferred from ParTec AG to the ParTec Cluster Competence Center GmbH. This new company takes a much more service-oriented approach to the customer. The main goal is to deliver integrated and complete software stacks for LINUX-based compute clusters by selecting state-of-the-art software components and driving software development efforts in areas where real added value can be provided.
  • Page 7: Chapter 2. Technical Overview

    Chapter 2. Technical overview Within this section, a brief technical overview of ParaStation5 will be given. The various software modules constituting ParaStation5 are explained. 2.1. Runtime daemon In order to enable ParaStation5 on a cluster, the ParaStation daemon psid(8) has to be installed on each cluster node.
  • Page 8: License

    • p4sock.o: this module implements the kernel based ParaStation5 communication protocol. • e1000_glue.o, bcm5700_glue.o: these modules enable even more efficient communication to the network drivers coming with ParaStation5 (see below). • p4tcp.o: this module provides a feature called "TCP bypass". Thus, applications using standard TCP communication channels on top of Ethernet are able to use the optimized ParaStation5 protocol and therefore achieve improved performance.
  • Page 9: Chapter 3. Installation

    Chapter 3. Installation This chapter describes the installation of ParaStation5. First, the prerequisites for using ParaStation5 are discussed. Next, the directory structure of all installed components is explained. Finally, the installation using RPM packages is described in detail. Of course, the less automated the chosen installation method is, the more room for customization the installation process offers.
  • Page 10: Kernel Version

    Software ParaStation requires an RPM-based Linux installation, as the ParaStation software is based on installable RPM packages. All current distributions from Novell and Red Hat are supported, such as
    • SuSE Linux Enterprise Server (SLES) 9 and 10
    • SuSE Professional 9.1, 9.2, 9.3 and 10.0, OpenSuSE 10.1, 10.2, 10.3
    •...
  • Page 11: Installation Via Rpm Packages

    Installation via RPM packages
    contains the manual pages describing the ParaStation daemons, utilities and configuration files after installing the documentation package. The necessary steps are described in Section 3.4, “Installing the documentation”. In order to enable the users to access these pages using the man(1) command, please consult the corresponding documentation.
    mpi2, mpi2-intel, mpi2-pgi, mpi2-psc contains an adapted version of MPICH2 after installing one of the various psmpi2 RPM files.
  • Page 12 Please note that the individual version numbers of the distinct packages building the ParaStation5 system do not necessarily have to match. Compiling the ParaStation5 packages from source To build proper RPM packages suitable for a particular setup, the source code for the ParaStation packages can be downloaded from www.parastation.com/download. Typically, it is not necessary to recompile the ParaStation packages, as the provided precompiled packages will install on all major distributions.
  • Page 13: Installing The Documentation

    Installing the documentation # rpm -Uv psmgmt.5.0.0-0.i586.rpm pscom.5.0.0-0.i586.rpm \ pscom-modules.5.0.0-0.i586.rpm This will copy all the necessary files to /opt/parastation and the kernel modules to /lib/modules/kernelversion/kernel/drivers/net/ps4. On a frontend node or file server, the pscom-modules package is only required if this node should run processes of a parallel task.
  • Page 14: Installing Mpi

    # rpm -Uv psdoc-5.0.0-1.noarch.rpm All the PDF and HTML files will be installed within the directory /opt/parastation/doc, the manual pages will reside in /opt/parastation/man. The intended starting point to browse the HTML version of the documentation is file:///opt/parastation/doc/html/index.html. The documentation is available in two PDF files called adminguide.pdf for the ParaStation5 Administrator's Guide and userguide.pdf for the ParaStation5 User's Guide.
  • Page 15: Uninstalling Parastation5

    Uninstalling ParaStation5 • testing These steps will be discussed in Chapter 4, Configuration. 3.7. Uninstalling ParaStation5 After stopping the ParaStation daemons, the corresponding packages can be removed using
    # /etc/init.d/parastation stop
    # rpm -e psmgmt pscom psdoc psmpi2
    on all nodes of the cluster.
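
Because the uninstall commands have to be run on every node, it can help to generate the per-node command lines first and review them before executing anything. The following is a minimal sketch, not part of ParaStation: the node names, the use of ssh and the root login are assumptions for illustration, and the two commands are chained with && so that the packages are only removed once the daemons have stopped.

```shell
#!/bin/sh
# Sketch: print the uninstall command line for each node (dry run).
# Node names and ssh access are assumptions; the manual only states that
# the commands must be run on all nodes of the cluster.

uninstall_cmd() {
    # Stop the daemons first, then remove the packages named in the manual.
    printf '%s\n' "/etc/init.d/parastation stop && rpm -e psmgmt pscom psdoc psmpi2"
}

print_uninstall_plan() {
    # $@: node host names (hypothetical examples are used below)
    for node in "$@"; do
        printf 'ssh root@%s "%s"\n' "$node" "$(uninstall_cmd)"
    done
}

print_uninstall_plan node01 node02
```

The printed lines can be inspected and then piped to a shell once they look right.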
  • Page 17: Chapter 4. Configuration

    Chapter 4. Configuration After installing the ParaStation software successfully, only a few modifications to the configuration file parastation.conf(5) have to be made in order to enable ParaStation on the local cluster. 4.1. Configuration of the ParaStation system Within this section the basic configuration procedure to enable ParaStation will be described. It covers the configuration of ParaStation5 using TCP/IP (Ethernet) and the optimized ParaStation5 protocol p4sock.
  • Page 18: Enable Optimized Network Drivers

    The values that might be assigned to the HWType parameter have to be defined within the parastation.conf configuration file. Have a brief look at the various Hardware sections of this file in order to find out which hardware types are actually defined. Other possible types are: mvapi, openib, gm, ipath, elan, dapl.
  • Page 19: Testing The Installation

    Testing the installation transfer application data across Ethernet, these adapted drivers should be used, too. To enable these drivers, the simplest way is to rename the original modules and recreate the module dependencies:
    # cd /lib/modules/$(uname -r)/kernel/drivers/net
    # mv e1000/e1000.o e1000/e1000-orig.o
    # mv bcm/bcm5700.o bcm/bcm5700-orig.o
    # depmod -a
    If your system uses the e1000 driver, a subsequent modinfo command for kernel version 2.4 should report...
  • Page 20 Alternatively, it is possible to use the single command form of the psiadmin command: # /opt/parastation/bin/psiadmin -s -c "list" The command should be repeated until all nodes are up. The ParaStation administration tool is described in detail in the corresponding manual page psiadmin(1). If some nodes are still marked as "down", the logfile /var/log/messages for this node should be inspected.
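
Repeating the list command by hand until all nodes are up can be replaced by a small polling loop. The sketch below is an illustration only: it assumes that the output of the list directive contains the word "down" for every node that is not yet up, which matches the wording of the text but not necessarily the exact output format, and the retry count and delay are arbitrary.

```shell
#!/bin/sh
# wait_until_quiet CMD PATTERN TRIES DELAY
# Re-runs CMD until it succeeds and its output no longer contains PATTERN
# (case-insensitive). Returns 0 on success, 1 after TRIES failed attempts.
wait_until_quiet() {
    cmd=$1 pattern=$2 tries=$3 delay=$4
    i=0
    while [ "$i" -lt "$tries" ]; do
        # A failing command (e.g. daemon not reachable) counts as "not ready".
        if out=$(sh -c "$cmd" 2>/dev/null) &&
           ! printf '%s\n' "$out" | grep -qi -- "$pattern"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Intended use on a cluster (not executed here; output format is an assumption):
#   wait_until_quiet '/opt/parastation/bin/psiadmin -s -c "list"' down 30 2
```

If the function returns 1, the logfile mentioned above is the next place to look.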
  • Page 21: Chapter 5. Insight Parastation5

    Chapter 5. Insight ParaStation5 This chapter provides more technical details and background information about ParaStation5. 5.1. ParaStation5 pscom communication library The ParaStation communication library libpscom offers secure and reliable end-to-end connectivity. It hides the actual transport and communication characteristics from the application and higher level libraries. The libpscom library supports a wide range of interconnects and protocols for data transfers.
  • Page 22: Directory /Proc/Sys/Ps4/State

    The p4sock.ko module inserts a number of entries within the /proc filesystem. All ParaStation5 entries are located within the subdirectory /proc/sys/ps4. Three different subdirectories, listed below, are available. To read a value, type, for example, # cat /proc/sys/ps4/state/connections to get the number of currently open connections. To modify a value, type, for example, # echo 10 >...
  • Page 23: Directory /Proc/Sys/Ps4/Local

    Directory /proc/sys/ps4/local
    • MaxAcksPending: maximum number of pending ACK messages until an "urgent" ACK message is sent.
    • MaxDevSendQSize: maximum number of entries of the (protocol internal) send queue to the network device.
    • MaxMTU: maximum packet size used for network packets. For sending packets, the minimum of MaxMTU and service specific MTU will be used.
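
To get an overview of all tunables in one of these directories without calling cat for each entry, a small helper can print every entry together with its current value. This is a sketch rather than a ParaStation tool; the function name is made up, and it only reads values.

```shell
#!/bin/sh
# dump_entries DIR: print "name = value" for every readable regular file in DIR.
dump_entries() {
    dir=$1
    for f in "$dir"/*; do
        # Skip subdirectories and entries that cannot be read.
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# On a node with the p4sock module loaded one might run (not executed here):
#   dump_entries /proc/sys/ps4/local
```

Saving this output before changing a value makes it easy to restore the previous setting later.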
  • Page 24: Using The Parastation5 Queuing Facility

    a predefined node list. If not defined, all currently known nodes are taken into account. Also, the variables PSI_NODES_SORT, PSI_LOOP_NODES_FIRST, PSI_EXCLUSIVE and PSI_OVERBOOK are observed. Based on these variables and the list of currently active processes, a sorted list of nodes is constructed, defining the final node list for this new task.
  • Page 25: Parastation5 Tcp Bypass

    ParaStation5 TCP bypass In order to run applications linked with one of those MPI libraries, ParaStation5 provides dedicated mpirun commands. The processes for these types of parallel tasks are spawned obeying all restrictions described in Section 5.3, “Controlling process placement”. Of course, the data transfer will be based on the communication channels supported by the particular MPI library.
  • Page 26: Authentication Within Parastation5

    PSP_SHM or PSP_SHAREDMEM Don't use shared memory for communication within the same node. PSP_P4S or PSP_P4SOCK Don't use ParaStation p4sock protocol for communication. PSP_MVAPI Don't use Mellanox InfiniBand vapi for communication. PSP_OPENIB Don't use OpenIB InfiniBand vapi for communication. PSP_GM Don't use GM (Myrinet) for communication.
  • Page 27: Homogeneous User Id Space

    Homogeneous user ID space etc/passwd. Usage of common authentication schemes like NIS is not required and therefore limits user management to the frontend nodes. Authentication of users is restricted to login or frontend nodes and is outside of the scope of ParaStation. 5.10.
  • Page 28: Integration With Afs

    5.14. Integration with AFS To run parallel tasks spawned by ParaStation on clusters using AFS, ParaStation provides the scripts env2tok and tok2env. On the frontend side, calling . tok2env will create an environment variable AFS_TOKEN containing an encoded access token for AFS. This variable must be added to the list of exported variables PSI_EXPORTS="AFS_TOKEN,$PSI_EXPORTS"...
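
When PSI_EXPORTS is still empty, the assignment shown above leaves a trailing comma, and running it twice adds AFS_TOKEN twice. Whether ParaStation tolerates either case is not stated here, so the following sketch simply avoids both. The helper name is hypothetical, and the comma-separated format is taken from the example above.

```shell
#!/bin/sh
# add_export NAME CURRENT: print CURRENT with NAME prepended, without adding
# a duplicate and without a trailing comma when CURRENT is empty.
add_export() {
    name=$1 current=$2
    case ",$current," in
        *",$name,"*) printf '%s\n' "$current" ;;                  # already listed
        *) printf '%s\n' "${name}${current:+,$current}" ;;        # prepend
    esac
}

PSI_EXPORTS=$(add_export AFS_TOKEN "${PSI_EXPORTS:-}")
export PSI_EXPORTS
```

This would be placed right after the `. tok2env` call on the frontend side.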
  • Page 29: Integration With Pbs Pro

    Integration with PBS PRO If an external queuing system is used, the environment variable PSI_NODES_SORT should be set to "none", thus no sorting of any predefined node list will be done by ParaStation. ParaStation includes its own queuing facility. For more details, refer to Section 5.4, “Using the ParaStation5 queuing facility”...
  • Page 30: Copying Files In Parallel

    # UseMCast statement. If Multicast is enabled, the ParaStation daemons exchange status information using multicast messages. Thus, a Linux kernel supporting multicast on all nodes of the cluster is required. This is usually no problem, since all standard kernels from all common distributions are compiled with multicast support. If a customized kernel is used, multicast support must be enabled within the kernel configuration! In order to learn more about multicast take a look at the Multicast over TCP/IP HOWTO.
  • Page 31: Using Parastation Process Pinning

    Using ParaStation process pinning To list, sort and filter all the collected information, the command psaccview is available. See psaccounter(8) and psaccview(8) for details. 5.19. Using ParaStation process pinning ParaStation is able to pin down compute tasks to particular cores. This prevents processes from 'hopping' between different cores or CPUs during runtime under the control of the OS scheduler.
  • Page 32 Changing the default ports for psid(8) and change the default port number 888. Modify the entry port = 888 within the file /etc/xinetd.d/psidstarter to reflect the newly assigned port numbers. In addition, the ParaStation daemon psid(8) uses the UDP port 886 for RDP connections. To change this port, use the RDPPort directive within parastation.conf.
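
After editing the port entry, it is worth checking that every node carries the same value. A minimal sketch for reading the configured port back out of the psidstarter file follows. The function name is made up, and the file is passed as an argument so that no particular path is assumed; only the "port = N" entry format is taken from the text.

```shell
#!/bin/sh
# xinetd_port FILE: print the number from the first "port = N" entry in FILE.
xinetd_port() {
    sed -n 's/^[[:space:]]*port[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" |
        head -n 1
}

# Example (not executed here): compare the result on every node with the
# intended port number, e.g.
#   [ "$(xinetd_port "$PSIDSTARTER_FILE")" = "888" ] || echo "port mismatch"
# where PSIDSTARTER_FILE is a placeholder for the file named in the text.
```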
  • Page 33: Chapter 6. Troubleshooting

    Chapter 6. Troubleshooting This chapter provides some hints to common problems seen while installing or using ParaStation5. Of course, more help will be provided by <support@par-tec.com>. 6.1. Problem: psiadmin returns error When starting up the ParaStation admin command psiadmin, an error is reported: # psiadmin PSC: PSC_startDaemon: connect() fails: Connection refused Reason: the local ParaStation daemon could not be contacted.
  • Page 34: Problem: Cannot Start Parallel Task

    Or logged on to this node, run psiadmin which also starts up the ParaStation daemon psid. See Section 6.1, “ Problem: psiadmin returns error ” for more details. Check the logfile /var/log/messages on this node for error messages. Verify that all nodes have an identical configuration (/etc/parastation.conf).
  • Page 35: Warning Issued On Task Startup

    Warning issued on task startup This typically happens, if the frontend or head node is included as compute node and also acts as gateway for the compute nodes. The "external" address of the frontend is not known to the compute nodes. Use the PSP_NETWORK environment variable to re-direct all traffic to the cluster-internal network.
  • Page 36: Problem: Processes Cannot Access Files On Remote Nodes

    Problem: processes cannot access files on remote nodes Make sure no other process uses this port. Or use the RDPPort directive within parastation.conf to re-define this port for all daemons within the cluster. See also parastation.conf(5). 6.10. Problem: processes cannot access files on remote nodes Problem: processes created by ParaStation on remote nodes are not able to access files if these files are accessible only to a supplementary group the current user belongs to.
  • Page 37: Reference Pages

    Reference Pages This appendix lists all reference pages related to ParaStation5 administration tasks. For reference pages describing user related commands and information, refer to the ParaStation5 User's Guide.
  • Page 39: Parastation.conf

    parastation.conf parastation.conf — the ParaStation configuration file Description Upon execution, the ParaStation daemon psid(8) reads its configuration information from a configuration file which, by default, is /etc/parastation.conf. There are various parameters that can be modified persistently within this configuration file. The main syntax of the configuration file is one parameter per line.
  • Page 40 The following five types of parameters within the Hardware environment will get a special handling from the ParaStation daemon psid(8). These define different script files called in order to execute various operations towards the corresponding communication hardware. All these entries have the form of the parameter's name followed by the corresponding value. The value might be enclosed by single or double quotes in order to allow a space within.
  • Page 41 p4sock Use optimized communication via (Gigabit) Ethernet. The script handling this hardware type ps_p4sock is also located in the config subdirectory. It understands the following two environment variables: PS_TCP If set to an address range, e.g. 192.168.10.0-192.168.10.128, the TCP bypass feature of the p4sock protocol is enabled for the given address range.
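
Before setting PS_TCP it can be useful to check which node addresses a given range would actually cover. The sketch below converts dotted addresses to integers and tests membership. It assumes that both ends of the range are inclusive, which the text does not state, and the function names are made up for illustration.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1                      # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_range IP FROM-TO: succeed if IP lies within the range
# (both ends treated as inclusive, which is an assumption).
ip_in_range() {
    ip=$(ip_to_int "$1")
    from=$(ip_to_int "${2%-*}")
    to=$(ip_to_int "${2#*-}")
    [ "$ip" -ge "$from" ] && [ "$ip" -le "$to" ]
}

ip_in_range 192.168.10.5 192.168.10.0-192.168.10.128 && echo "covered by bypass range"
```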
  • Page 42 accounter This is actually a pseudo communication layer. It is only used for configuring nodes running the ParaStation accounting daemon and should be used only in a particular Nodes entry. NrOfNodes num Define the number of connected nodes including the frontend node. The nodes will be numbered 0 …...
  • Page 43 Node[s] hostname id [HWType-entry] [starter-entry] [runJobs-entry] [env name value] [env { name value ... }] Node[s] { {hostname id [HWType-entry] [starter-entry] [runJobs-entry] [env name value] [env { name value ... }] }... } Node[s] $GENERATE from-to/step nodestr idstr [HWType-entry] [starter-entry] [runJobs-entry] [env name value] [env { name value ...
  • Page 44 SelectTime time Set the timeout of the central select(2) of the ParaStation daemon psid(8) to time seconds. The default value is 2 seconds. This parameter can be set during runtime via the set selecttime directive within the ParaStation administration and management tool psiadmin(1). DeadInterval num The ParaStation daemon psid(8) will declare other daemons as dead after num consecutively missing multicast pings.
  • Page 45 The default port to use is 886. RLimit { Core size | CPUTime time | DataSize size | MemLock size | StackSize size | RSSize size } RLimit { { Core size | CPUTime time | DataSize size | MemLock size | StackSize size | RSSize size }...
  • Page 46 The value part of each line either is a single word or an expression enclosed by single or double quotes. The expression might contain whitespace characters. If the expression is enclosed by single quotes, it is allowed to use balanced or unbalanced double quotes within this expression and vice versa. This command might be used for example in order to set the PSP_NETWORK environment variable globally without the need of every user to adjust this parameter in his own environment.
  • Page 47 This only comes into play if the user does not define a sorting strategy explicitly via PSI_NODES_SORT. Be aware of the fact that using a batch-system like PBS or LSF *will* set the strategy explicitly, namely to NONE. overbook { true | yes | 1 | false | no | 0 } If the argument is one of yes, true or 1, all nodes may be overbooked by the user using the PSI_OVERBOOK environment variable.
  • Page 48 rdpMaxRetrans number Set the maximum number of retransmissions within the RDP facility. If more than this number of retransmissions would have been necessary to deliver the packet to the remote destination, this connection is declared to be down. See also psiadmin(1). statusBroadcasts number Set the maximum number of status broadcasts per round.
  • Page 49 ACK is sent piggyback within the next regular packet to this node or as soon as a retransmission occurred. If set to 1, each RDP packet received is acknowledged by an explicit ACK. Errors No known errors. See also psid(8), psiadmin(1)
  • Page 51: Psiadmin

    psiadmin psiadmin — the ParaStation administration and management tool Synopsis psiadmin [ -denqrsv? ] [ -c command ] [ -f program-file ] [ --usage ] Description The psiadmin command provides an administrator interface to the ParaStation system. The command reads directives from standard input in interactive mode. The syntax of each directive is checked and the appropriate request is sent to the local ParaStation daemon psid(8).
  • Page 52 --usage Display a brief usage message. Standard Input The psiadmin command reads standard input for directives until end of file is reached, or the exit or quit directive is read. Standard Output If Standard Output is connected to a terminal, a command prompt will be written to standard output when psiadmin is ready to read a directive.
  • Page 53 If nodes is empty, the node range preselected via the range command is used. The default preselected node range contains all nodes of the ParaStation cluster. The from and to parts of each range are node IDs. They might be given in decimal or hexadecimal notation and must be in the range between 0 and NumberOfNodes-1.
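
The node list syntax (a comma separated list of ranges from-to, each part in decimal or hexadecimal) can be checked before it is handed to psiadmin. The following sketch expands such a list into individual node IDs and rejects anything outside 0 to NumberOfNodes-1. It is an illustration, not psiadmin's own parser: the function name is made up, and treating a range with from greater than to as an error is an assumption.

```shell
#!/bin/sh
# expand_nodes LIST NUMBER_OF_NODES
# LIST is a comma-separated list of ranges "from-to" or single IDs, given in
# decimal or hexadecimal (0x..) notation. Prints one decimal node ID per line
# and returns 1 if an ID falls outside 0 .. NUMBER_OF_NODES-1.
expand_nodes() {
    list=$1 max=$(( $2 - 1 ))
    old_ifs=$IFS; IFS=,
    set -- $list                   # split the list at the commas
    IFS=$old_ifs
    for range in "$@"; do
        from=$(( ${range%-*} ))    # part before "-" (or the whole ID)
        to=$(( ${range#*-} ))      # part after "-" (or the whole ID)
        if [ "$from" -lt 0 ] || [ "$to" -gt "$max" ] || [ "$from" -gt "$to" ]; then
            echo "invalid range: $range" >&2
            return 1
        fi
        i=$from
        while [ "$i" -le "$to" ]; do
            echo "$i"
            i=$((i + 1))
        done
    done
}

expand_nodes "0-2,0x10" 32
```

For a 32-node cluster the call above prints the IDs 0, 1, 2 and 16, one per line.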
  • Page 54 count [hw hw] List the status of the communication system(s) on the selected node(s). Various counters are displayed. If the hw option is given, only the counters concerning the hw hardware type are displayed. The default is to display the counters of all enabled hardware types on this node. down List all nodes which are marked as "DOWN".
  • Page 55 TaskID The ParaStation task ID of the process, both as decimal and hexadecimal number. The task ID of a process is unique within the cluster and is composed out of the ParaStation ID of the node the process is running on and the local process ID of the process, i.e. the result of calling getpid(2).
  • Page 56 range {[nodes] | all } Preselect or display the default set of nodes. If nodes or all is given, this directive modifies the default set of nodes all following directives will act on. nodes is given in the same syntax as within any other directive, i.e. a comma separated list of node ranges from-to, where a range might be trivial containing only the from part.
  • Page 57 master [nodes] Show the current master on the selected node(s). The master node's task is the management and allocation of resources within the cluster. It is elected among the running nodes during runtime. Thus usually all nodes should give the same answer to this question.
  • Page 58 cpumap [nodes] Show the CPU-slot to core mapping list for the selected nodes. bindmem [nodes] Show the flag marking whether these nodes use binding as NUMA policy. adminuser [nodes] Show users allowed to start admin-tasks, i.e. unaccounted tasks. admingroup [nodes] Show groups allowed to start admin-tasks, i.e. unaccounted tasks. rl_addressspace [nodes] Show RLIMIT_AS on this node.
  • Page 59 rl_sigpending [nodes] Show RLIMIT_SIGPENDING on this node. rl_stack [nodes] Show RLIMIT_STACK on this node. supplementaryGroups [nodes] Show supplementaryGroups flag. statusBroadcasts [nodes] Show the maximum number of status broadcasts initiated by lost connections to other daemons. rdpTimeout [nodes] Show the RDP timeout configured in ms. deadLimit [nodes] Show the dead-limit of the RDP status module.
  • Page 60 hwstart [hw { hw | all } ] [nodes] Start the declared hardware on the selected nodes. Starting a specific hardware will be tried on the selected nodes regardless of whether this hardware is specified for these nodes within the parastation.conf configuration file. On the other hand, if hw all is specified or the hw option is omitted, only the hardware types specified within the configuration file are started.
  • Page 61 adminuser [ + | - ] { name | any } [nodes] Grant authorization to start admin-tasks, i.e. tasks not blocking a dedicated CPU, to a particular or any user. Name might be a user name or a numerical UID. If name is preceded by a '+' or '-', this user is added to or removed from the list of adminusers respectively.
  • Page 62
    Pattern            Name
    PSC_LOG_PART       0x0000001
    PSC_LOG_TASK       0x0000002
    PSC_LOG_VERB       0x0000004
    PSID_LOG_SIGNAL    0x0000010
    PSID_LOG_TIMER     0x0000020
    PSID_LOG_HW        0x0000040
    PSID_LOG_RESET     0x0000080
    PSID_LOG_STATUS    0x0000100
    PSID_LOG_CLIENT    0x0000200
    PSID_LOG_SPAWN     0x0000400
    PSID_LOG_TASK      0x0000800
    PSID_LOG_RDP       0x0001000
    PSID_LOG_MCAST     0x0002000
    PSID_LOG_VERB      0x0004000
    PSID_LOG_SIGDBG    0x0008000
    PSID_LOG_COMM      0x0010000
    PSID_LOG_OPTION    0x0020000
    PSID_LOG_INFO      0x0040000
    PSID_LOG_PART      0x0080000
    PSID_LOG_ECHO      0x0100000
    PSID_LOG_FILE...
  • Page 63
    Pattern         Name
    RDP_LOG_CONN    0x0001
    RDP_LOG_INIT    0x0002
    RDP_LOG_INTR    0x0004
    RDP_LOG_DROP    0x0008
    RDP_LOG_CNTR    0x0010
    RDP_LOG_EXTD    0x0020
    RDP_LOG_COMM    0x0040
    RDP_LOG_ACKS    0x0080
    Table 3. RDP debug flags
    mcastdebug mask [nodes] Set the debugging mask of the MCast protocol within the ParaStation daemon psid(8) to mask on the selected node(s).
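
Each flag in these tables occupies its own bit, so several flags can presumably be combined into one mask with a bitwise OR. The excerpt does not spell this out, so treat it as an assumption. The sketch below computes such a mask from three of the RDP values in Table 3; the helper name is made up.

```shell
#!/bin/sh
# Values copied from Table 3 (RDP debug flags).
RDP_LOG_CONN=0x0001
RDP_LOG_DROP=0x0008
RDP_LOG_ACKS=0x0080

# or_mask FLAG...: print the bitwise OR of all arguments as a hex mask.
# Combining flags by OR is an assumption based on the one-bit-per-flag values.
or_mask() {
    mask=0
    for flag in "$@"; do
        mask=$(( mask | $flag ))
    done
    printf '0x%04x\n' "$mask"
}

or_mask "$RDP_LOG_CONN" "$RDP_LOG_DROP" "$RDP_LOG_ACKS"
```

The resulting value is what would be passed as the mask argument of the corresponding debug directive.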
  • Page 64 nodesSort { PROC | LOAD_1 | LOAD_5 | LOAD_15 | PROC+LOAD | NONE } [nodes] Define the default sorting strategy for nodes when attaching them to a partition. The different possible values have the following meaning: PROC Sort by the number of processes managed by ParaStation on the corresponding nodes LOAD_1 Sort by the load average during the last minute on the corresponding nodes LOAD_5...
  • Page 65 bindmem [ 0 | 1 ] [nodes] Set the flag marking whether these nodes will use memory-binding as NUMA policy. Relevant values are 'false', 'true', 'no', 'yes', 0 or different from 0. cpumap map [nodes] Set the map used to assign CPU-slots to physical cores to map. Map is a quoted string containing a space-separated permutation of the numbers 0 to Ncore-1.
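
A cpumap that repeats or omits a core number is easy to produce by hand. The sketch below checks that a candidate map really is a permutation of 0 to Ncore-1 before it is passed to psiadmin. The function name is hypothetical, and Ncore has to be supplied by the caller.

```shell
#!/bin/sh
# is_cpumap "MAP" NCORE: succeed if MAP is a space-separated permutation
# of the numbers 0 .. NCORE-1.
is_cpumap() {
    map=$1 ncore=$2
    # The sorted map must equal the sequence 0, 1, ..., NCORE-1.
    expected=$(i=0; while [ "$i" -lt "$ncore" ]; do echo "$i"; i=$((i + 1)); done)
    actual=$(printf '%s\n' $map | sort -n)
    [ "$actual" = "$expected" ]
}

is_cpumap "0 2 1 3" 4 && echo "map is a valid permutation"
```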
  • Page 66 quiet Quiet execution. Only a short message is printed if the test was successful. normal Normal execution with some messages during runtime. This is the default. verbose Very verbose execution with many messages during runtime. Files Upon startup, psiadmin tries to find .psiadminrc in the current directory or in the user's home directory. The first file found is parsed and the directives within are executed.
  • Page 67: Psid

    psid psid — the ParaStation daemon. The organizer of the ParaStation software architecture. Synopsis psid [-v?] [-d level] [-f configfile] [-l logfile] [--usage] Description The ParaStation daemon is implemented as a Unix daemon process. It supervises allocated resources, cleans up after application shutdowns, and controls access to common resources. Thus, it takes care of tasks which are usually managed by the operating system.
  • Page 68 Options -d , --debug=level Activate the debugging mode and set the debugging level to level. If debugging is enabled, i.e. if level is larger than 0 and option -l is set to stdout, no fork(2) is made on startup, which is usually done in order to run psid as a daemon process in background.
  • Page 69: Test_Config

    test_config test_config — verify the ParaStation4 configuration file. Synopsis test_config [-vad? ] [-v ] [-a ] [-d ] [-? ] [-f filename] Description test_config reads and analyses the ParaStation4 configuration file. Any errors or anomalies are reported. By default, the configuration file /etc/parastation.conf will be used. Options -f filename Use configuration file filename.
  • Page 71: Test_Nodes

    test_nodes test_nodes — test physical connections within a cluster. Synopsis test_nodes [-np num] [-cnt count] [-map] [-type] Description Tests all or some physical (low level) connections within a cluster. Therefore the program is started on num nodes. After all processes came up correctly, each of them starts to send test packets to every other node of the cluster.
  • Page 73: Test_Pse

    test_pse test_pse — test virtual connections within a cluster. Synopsis test_pse [-np num] Description This command spawns num processes within the cluster. It's intended to test the process spawning capabilities of ParaStation. It does not test any communication facilities within ParaStation. Options -np num Spawn num processes.
  • Page 75: P4Stat

    p4stat p4stat — display information about the p4sock protocol. Synopsis p4stat [ -v ] [ -s ] [ -n ] [ -? ] [ --sock ] [ --net ] [ --version ] [ --help ] [ --usage ] Description Display information for sockets and network connections using the ParaStation4 protocol p4sock. Options -s, --sock Display information about open p4sock sockets.
  • Page 77: P4Tcp

    p4tcp p4tcp — configure the ParaStation4 TCP bypass. Synopsis p4tcp [ -v ] [ -a ] [ -d ] [ -? ] [ from [ to ]] Description p4tcp configures the ParaStation4 TCP bypass. Without an argument, the current configuration is printed. From and to are IP addresses forming an address range for which the bypass feature should be activated.
  • Page 79: Psaccounter

    psaccounter psaccounter - Write accounting information from the ParaStation psid to the accounting files. Synopsis psaccounter [ -e | --extend ] [ -d | --debug=pattern ] [ -F | --foreground ] [ -l | --logdir=dir ] [ -f | --logfile=filename ] [ -p | --logpro ] [ -c | --dumpcore ] [ --coredir=dir ] [ -v | --version ] [ -? | --help ] [--usage] Description The command psaccounter collects information about jobs from the ParaStation psid daemon and writes...
  • Page 80 Calling psaccounter with -p gzip would call the command gzip yyyymmdd and therefore compress the least recently used accounting file. -c, --dumpcore Define that a core file should be written in case of a catastrophe. By default, the core file will be written to /tmp.
  • Page 81: Psaccview

    psaccview psaccview - Print ParaStation accounting information. Synopsis psaccview [ -? | --help ] [ -h | --human ] [ -nh | --noheader ] [ -l | --logdir=dir ] [ -e | --exit=exitcode ] [ -q | --queue=queue ] [ -u | --user=user ] [ -g | --group=group ] [ -j | --jobname=jobname ] [ -lj | --ljobs ] [ -lu | --ltotuser ] [ -lg | --ltotgroup ] [ -ls | --ltotsum ] [ -st | --stotopt=optstring ] [ -sj | --sjobopt=optstring ] [ -t | --timespan=period ] [ -b | --begin=yyyymmdd ] [ -e | --end=yyyymmdd ] [ --jsort=criteria ] [ --usort=criteria ] [ --gsort=criteria ] [ -v | --version ] [--usage]...
  • Page 82: Sort Options

    Grouping jobs -lj, --ljobs Print detailed jobs list. Lists all jobs, one per line. -lu, --ltotuser Print user list. Lists job summary per user, one user per line. -lg, --ltotgroup Print group list. Lists job summary per group, one group per line. -ls, --ltotsum Print total job summary.
  • Page 83 Upon startup psaccview tries to find the file .psaccviewrc in the user's home directory. Within this file, pre-defined variables in the command may be re-defined. See the configuration section within the psaccview script. The command expects one file per day, named as yyyymmdd, where yyyy represents the year, mm the month and dd the day for the data contained.
  • Page 84 These column names may also be used for sorting lists, where applicable. Files /var/account/* , /var/account/*.gz , /var/account/*.bz2 Accounting files, one per day. $HOME/.psaccviewrc Initialization file. See also psaccounter(8).
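
Since the accounting files are named yyyymmdd and may be stored plain or compressed, locating the file for a particular day means trying up to three names. The sketch below does that lookup. The function name is made up; the directory and the suffixes are the ones listed under "Files".

```shell
#!/bin/sh
# find_acct_file DIR YYYYMMDD: print the first existing variant of the
# per-day accounting file (plain, .gz or .bz2); return 1 if none exists.
find_acct_file() {
    base=$1/$2
    for f in "$base" "$base.gz" "$base.bz2"; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Example: look up today's file in the directory listed under "Files".
find_acct_file /var/account "$(date +%Y%m%d)" || echo "no accounting file for today"
```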
  • Page 85: Mlisten

    mlisten mlisten — display multicast pings from the ParaStation daemon psid(8) Synopsis mlisten [-dv?] [-m MCAST] [-p PORT] [-n IP] [-# NODES] [--usage] Description Display the multicast pings the ParaStation daemon psid(8) is emitting continuously. These pings are displayed by spinning bars. Each ping received from node N lets the Nth bar spin around one more step.
  • Page 87: Appendix A. Quick Installation Guide

    Appendix A. Quick Installation Guide This appendix gives a brief overview of how to install ParaStation5 on a cluster. A detailed description can be found in Chapter 3, Installation and Chapter 4, Configuration. 1. Shutdown If this is an update of ParaStation, first shut down the ParaStation system. In order to do this, start up psiadmin and issue a shutdown command.
  • Page 88 Provided the ParaStation daemon is started by the xinetd, run the psiadmin(1) command located in /opt/parastation/bin and execute the add command. This will bring up the ParaStation daemon psid(8) on every node. # /opt/parastation/bin/psiadmin psiadmin> add Alternatively, you can start psiadmin(1) with the -s option. To install the ParaStation daemon as a system service, started up at boot time, use # chk_config -a /etc/init.d/parastation This step must be repeated for each node.
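Because the service registration has to be repeated on each node, a loop can save typing. The following is a minimal sketch, not part of the manual: the helper name `for_each_node`, the node names and the use of ssh as root are all assumptions. It only prints the commands (a dry run) so they can be reviewed first. The registration command itself is copied verbatim from the step above.

```shell
#!/bin/sh
# Dry run: print the ssh invocation that would run one command on each node.
# Node names and remote access via ssh as root are assumptions.
for_each_node() {
    cmd=$1
    shift
    for node in "$@"; do
        printf 'ssh root@%s %s\n' "$node" "$cmd"
    done
}

# The command string is taken as-is from the installation step.
for_each_node 'chk_config -a /etc/init.d/parastation' node01 node02 node03
```

Once the printed lines look right, they can be piped to `sh` to execute them.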
  • Page 89: Appendix B. Parastation License

    Appendix B. ParaStation license The ParaStation software may be used under the following terms and conditions only. Software and Know-how License Agreement Version 1.0 between ParTec Cluster Competence Center GmbH place of business: Possartstr. 20, 81679 München represented by: Bernhard Frohwitter - in the following referred to as ParTec - and you - in the following referred to as "Licensee"...
  • Page 90 Commercial Use means any non-consumer use that is not covered by University Use. Know-how means program documents and information which relates to Software, also in machine readable form, in particular the Base Version Code and the detailed comments on the Base Version Code, provided together with the Base Version Code.
  • Page 91 § 6 Grant-Back 1. Licensee grants ParTec for Modifications being severable improvements a nonexclusive, perpetual, irrevocable, worldwide and royalty-free license, and for Modifications being non-severable improvements an exclusive, perpetual, irrevocable, worldwide and royalty-free license to a. use, reproduce, modify, display, prepare derivative works of and distribute its Modifications and derivative works thereof, in whole or in part, in source code and object code form, as part of the Software or other technologies based in whole or in part on Base Version Code or Technology;...
  • Page 92 2. A breach by Licensee of any one of the obligations under sections §4, §5 and §6, will automatically terminate Licensee's rights under this license. § 12 Rights after Expiration of the Agreement 1. All rights of Licensee on the use of the Base Version Code end at the expiration or termination of this agreement.
  • Page 93: Appendix C. Upgrading Parastation4 To Parastation5

    Appendix C. Upgrading ParaStation4 to ParaStation5 This appendix explains how to upgrade an existing ParaStation4 installation to the current ParaStation5 version. C.1. Building and installing ParaStation5 packages Just recompile the packages: # rpmbuild --rebuild psmgmt.5.0.0-0.src.rpm # rpm -U psmgmt.5.0.0-0.i586.rpm # rpmbuild --rebuild pscom.5.0.0-0.src.rpm # rpm -U pscom.5.0.0-0.i586.rpm # rpm -U pscom-modules.5.0.0-0.i586.rpm # rpmbuild --rebuild psmpi2.5.0.0-1.src.rpm...
  • Page 94 Changes to the runtime environment Use the mpiexec command instead! Executables linked with ParaStation4 can be run using the new mpiexec command. In this case, the option -b or --bnr is required. The environment variable PSP_P4SOCK was renamed to PSP_P4S, but the old name is still recognized. Within this version of ParaStation, both names may be used.
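Job scripts carried over from ParaStation4 may still set the old variable name. The sketch below shows one way to migrate such a setting to the new name before launching. The helper name `resolve_p4s` is hypothetical, and preferring PSP_P4S when both names are set is an assumption of this sketch, since the text above does not say which one takes precedence.

```shell
#!/bin/sh
# Print the value of PSP_P4S, falling back to the legacy name PSP_P4SOCK.
# Assumption: the new name wins when both variables are set.
resolve_p4s() {
    if [ -n "${PSP_P4S:-}" ]; then
        printf '%s\n' "$PSP_P4S"
    else
        printf '%s\n' "${PSP_P4SOCK:-}"
    fi
}

# Migrate a legacy setting to the new name, if there is one.
value=$(resolve_p4s)
if [ -n "$value" ]; then
    PSP_P4S=$value
    export PSP_P4S
    unset PSP_P4SOCK
fi

# A ParaStation4-linked executable additionally needs --bnr.
# The process count and program name are placeholders:
# mpiexec --bnr -np 4 ./ps4_app
```

Normalizing to the new name in one place keeps old scripts working now and avoids surprises should a later release drop the legacy name.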
  • Page 95: Glossary

    Glossary Address Resolution Protocol Administration Network Administrative Task admin-task Data Network Direct Memory Access A sending host decides, through a protocol's routing mechanism, that it wants to transmit to a target host located some place on a connected piece of a physical network.
  • Page 96 Forwarder Logger Master Node Network Interface Card Non-Uniform memory access (NUMA) Parallel Task ParaStation Logger ParaStation Forwarder Process to store it to a given address. The rest of the job is done by this controller without placing further load on the CPU. This relieves the CPU of work that is not its primary task and leaves more processing power for the actual application.
  • Page 97 Serial Task A single process running on one of the compute nodes within the cluster. This process does not communicate with other processes using MPI. ParaStation knows about this process and where it is started from. A serial task may use multiple threads to execute, but all these threads have to share a common address space within a node.
