Summary of Contents for MACROMEDIA COLDFUSION MX 6.1 - CONFIGURING AND ADMINISTERING COLDFUSION MX
Configuring and Administering ColdFusion MX...
If you access a third-party website mentioned in this guide, then you do so at your own risk. Macromedia provides these links only as a convenience, and the inclusion of the link does not imply that Macromedia endorses or accepts any responsibility for the content on those third-party sites.
About squeezing deleted documents
About optimized Verity databases
Performance tuning options
CHAPTER 11: Searching Collections with the rcvdk Utility
Using the Verity rcvdk utility
Attaching to a collection using the rcvdk utility
INTRODUCTION
Configuring and Administering ColdFusion MX is intended for anyone who needs to configure and manage their ColdFusion development environment.
About Macromedia ColdFusion MX documentation
The ColdFusion documentation is designed to provide support for the complete spectrum of participants.
Documentation set...
PART I
Administering ColdFusion MX
This part describes how to use the ColdFusion MX Administrator to manage the ColdFusion environment, including connecting to your data sources and configuring security for your applications.
Chapter 1: Administering ColdFusion MX
Chapter 2: Basic ColdFusion MX Administration
CHAPTER 1
Administering ColdFusion MX
This chapter presents an overview of the ColdFusion MX Administrator and how you can use it to manage your development environment. For procedures, see the ColdFusion MX Administrator online Help.
Contents
About the ColdFusion MX Administrator
Accessing user assistance
Administrator layout
The home page of the ColdFusion MX Administrator includes links to Documentation, the Macromedia Servers TechNotes Knowledge Base, Release Notes, System Information, online Help, and Code Examples. The tasks that you perform in the ColdFusion MX Administrator are grouped into the following sections.
• Client Variables: Configure an external data source, the operating system registry, or web browser cookies to store client variables. Client variables store information about a client browsing your site, which you can use to provide customized page content.
• Memory Variables: Specify timeout values for Application and Session variables.
• System Probes: Manage probes that monitor your application’s status. If a potential problem is detected, a system probe can send an alert e-mail message and execute a recovery script.
• Code Analyzer: Evaluate application code for potential incompatibilities between ColdFusion MX and ColdFusion Server 5.
CHAPTER 2
Basic ColdFusion MX Administration
This chapter explains the basic ColdFusion MX administration tasks, following the structure of the ColdFusion MX Administrator sections.
Contents
Initial administration tasks
Server Settings section
Task | Description
Set up e-mail | E-mail lets ColdFusion MX and ColdFusion applications send automated mail messages. To configure an e-mail server and mail options, use the Mail Server page of the Administrator. For more information, see “Mail Server page”.
Change passwords | You might have to change the passwords that you set for the ColdFusion MX Administrator and RDS during ColdFusion MX installation.
Setting | Description
Missing Template Handler | Specify a page to execute when ColdFusion MX cannot find a requested page. This specification is relative to the web root. If the user is running Internet Explorer with "Show Friendly HTTP error messages" enabled in advanced settings (the default), Internet Explorer will only display this page if it contains more than 512 bytes.
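The 512-byte condition described for the Missing Template Handler can be checked before a handler page is deployed. The following is a minimal sketch, not part of ColdFusion MX: the helper name `exceeds_ie_friendly_threshold` is hypothetical, and it assumes that the size that matters is the encoded length of the page body that the browser receives.

```python
# Hypothetical helper (not a ColdFusion API): checks whether an error page is
# large enough for Internet Explorer to show it instead of its own message.
IE_FRIENDLY_ERROR_THRESHOLD = 512  # bytes, per the setting description above


def exceeds_ie_friendly_threshold(page_body: str, encoding: str = "utf-8") -> bool:
    """Return True if the encoded page is strictly larger than 512 bytes."""
    return len(page_body.encode(encoding)) > IE_FRIENDLY_ERROR_THRESHOLD


# A short page falls under the threshold; padding it with an HTML comment
# is one simple way to push it over.
short_page = "<html><body>Page not found.</body></html>"
padded_page = short_page + "<!-- " + "x" * 512 + " -->"
```

Padding with a comment is only one option; a handler page with real content (navigation, a search box) usually passes the threshold on its own.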
• In the operating system registry
Caution: Macromedia recommends that you do not store client variables in the registry because it can critically degrade performance of the server. If you do use the registry to store client variables, you must allocate sufficient memory and disk space.
Migrating client variable data
To migrate your client variable data to another data source, you should know the structure of the database tables that store this information. Client variables stored externally use two simple database tables, like those shown in the following tables:
CDATA Table
Column | Data type...
</cfquery>
<cfquery name="global2" datasource="#DSN#">
CREATE INDEX id2 ON CGLOBAL (cfid)
</cfquery>
<cfquery name="global2" datasource="#DSN#">
CREATE INDEX id3 ON CGLOBAL (lvisit)
</cfquery>
Memory Variables page
You use the Memory Variables page of the ColdFusion Administrator to enable application and session variables server-wide. By default, application and session variables are enabled when you install ColdFusion MX.
Mail Server page
You use the Mail Server page of the ColdFusion MX Administrator to specify a mail server to send automated e-mail messages. ColdFusion MX supports the Simple Mail Transfer Protocol (SMTP) for sending e-mail messages and the Post Office Protocol (POP) for retrieving e-mail messages from your mail server.
Setting | Description
Spool mail messages for delivery (Memory spooling available for Enterprise...) | Select this option to route outgoing mail messages to the mail spooler. If you disable this option, ColdFusion MX delivers outgoing mail messages immediately. In ColdFusion MX Enterprise Edition, you can spool messages either to disk (slower, but messages persist across shutdowns) or to memory (faster, but messages do not persist).
Setting | Description
Max number of charting threads | Specify the maximum number of chart requests that can be processed concurrently. The minimum number is 1 and the maximum is 5. Higher numbers are more memory intensive.
Disk cache location | When caching to disk, specify the directory in which to store the generated charts.
After you archive the information, you can use the Administrator to deploy your web applications to the same ColdFusion MX server or to a ColdFusion MX server running on a different computer. Additionally, you can use these features to deploy and receive any ColdFusion archive file electronically.
For more information about building search interfaces, see the chapters about the cfindex, cfsearch, and cfcollection tags in Developing ColdFusion MX Applications.
ColdFusion lets you manage your collections from the Administrator. You can index, repair, optimize, purge, or delete Verity collections that are connected to ColdFusion. You use the buttons along the bottom of the Connected Verity Collections table to perform the following actions:
Action...
This section also includes pages for managing your Log Files, Scheduled Tasks, System Probes, and the Code Compatibility Analyzer.
Debugging Settings page
The Debugging Settings page provides the following debugging options:
Setting | Description
Enable Robust Exception Information | Displays detailed information in the exceptions page, including the template’s physical path and URI, the line number and snippet, the SQL statement used (if any), the data source...
Using the cfstat utility
The cfstat command-line utility provides real-time performance metrics for ColdFusion MX. Using a socket connection to obtain metric data, the cfstat utility displays the information that ColdFusion MX writes to the System Monitor without actually using the System Monitor application.
Metric abbreviation | Metric name | Description
Bytes In/Sec | Bytes incoming per second | The number of bytes that ColdFusion MX read in the last second (not an average).
Bytes Out/Sec | Bytes outgoing per second | The number of bytes that ColdFusion MX wrote in the last second (not an average).
Logging Settings page
You use the Logging Settings page of the Administrator to change ColdFusion MX logging options. The following table describes the settings:
Setting | Description
Log directory* | Directory to which error log files are written.
Maximum file size (kb) | Set the maximum file size for log files. When a file reaches this size, it is automatically archived.
car.log | Records errors associated with Site Archive and Restore operations.
mail.log | Records errors generated by an SMTP mail server.
mailsent.log | Records messages sent by ColdFusion MX.
flash.log | Records entries for Flash Remoting.
Scheduled Tasks page
You use the Scheduled Tasks page to schedule the execution of local and remote web pages and to generate static HTML pages.
Code Compatibility Analyzer page
The Code Compatibility Analyzer evaluates your ColdFusion pages for potential incompatibilities between ColdFusion MX and ColdFusion Server 5.
Extensions section
You use the Extensions section of the Administrator to configure ColdFusion MX to work with other technologies, such as Java and CORBA. This section contains the Java Applets, CFX Tags, Custom Tag Paths, and CORBA Connectors pages.
(embedded) Note: Macromedia will provide implementations of the connectors for some of the popular ORBs. For those that are not supported, Macromedia will make the source available under NDA to a select group of third-party candidates and/or ORB vendors.
Sandbox Security page
You use the Sandbox Security page (called Resource Security in the Standard Edition) to specify security permissions for data sources, tags, functions, files, and directories. Sandbox security uses the location of your ColdFusion pages to determine functionality. A sandbox is a designated area (CFM files or directories containing CFM files) of your site to which you apply security restrictions.
CHAPTER 3
Data Source Management
This chapter describes the configuration options for ColdFusion MX data sources. For basic information on data sources and connecting to databases, see Developing ColdFusion MX Applications.
Contents
About JDBC
Adding data sources
Disadvantages The ODBC driver, and possibly the client database libraries, must reside on the ColdFusion server computer. Performance is also below par. Macromedia does not recommend this driver type unless your application requires specific features of these drivers. Native-API/partly Converts JDBC calls into database-specific calls.
Driver Type Reference Microsoft Access with Unicode “Connecting to Microsoft Access with Unicode” support on page 46 Microsoft SQL Server 7.x, 2000 “Connecting to Microsoft SQL Server 7.x, 2000” on page 47 MySQL “Connecting to MySQL” on page 48 ODBC Socket “Connecting to ODBC Socket”...
Click Add. A form for additional DSN information appears. The available fields in this form depend on the Driver that you selected. In the Database field, enter the name of the database; for example, Northwind. In the Server field, enter the network name or IP address of the server that hosts the database, and enter any required Port value;...
Connecting to DB2 Universal Database 6.x, 7.2, and OS/390
Use the settings in the following table to connect ColdFusion to DB2 Universal Database 6.x, 7.2, and OS/390 data sources:
Setting | Description
CF Data Source Name | The data source name (DSN) used by ColdFusion MX to connect to the data source.
Setting | Description
Limit Connections | Specifies whether ColdFusion MX limits the number of database connections for the data source. If you enable this option, use the Restrict Connections to field to specify the maximum.
Restrict Connections | Specifies the maximum number of database connections for the data source.
Connecting to Informix 9.x
Use the settings in the following table to connect ColdFusion MX to Informix 9.x data sources:
Setting | Description
CF Data Source Name | The data source name (DSN) used by ColdFusion MX to connect to the data source.
Database | The database to which this data source connects.
Setting | Description
CLOB | Select to return the entire contents of any CLOB/Text columns in the database for this data source. If not selected, ColdFusion MX retrieves the amount specified in the Long Text Buffer setting.
BLOB | Select to return the entire contents of any BLOB/Image columns in the database for this data source.
Setting | Description
Default Password | The password (16-character limit) that ColdFusion MX passes to the JDBC driver to connect to the data source if a ColdFusion application does not supply a password; for example, in a cfquery tag.
Return Timestamp as String | Enable this setting if your application retrieves Date/Time data and then re-uses it in SQL statements without applying formatting (using functions such as DateFormat, TimeFormat, and CreateODBCDateTime).
Connecting to Microsoft Access with Unicode
Type 2 driver. Use the settings in the following table to connect ColdFusion MX to Microsoft Access with Unicode data sources:
Setting | Description
CF Data Source Name | The data source name (DSN) used by ColdFusion MX to connect to the data source.
Setting | Description
BLOB | Select to return the entire contents of any BLOB/Image columns in the database for this data source. If not selected, ColdFusion MX retrieves the amount specified in the Blob Buffer setting.
LongText Buffer | The default buffer size, used if Enable Long Text Retrieval (CLOB) is not selected.
Setting | Description
Maintain Connections | ColdFusion MX establishes a connection to a data source for every operation that requires one. Enable this option to improve performance by caching the data source connection.
String Format | Enable this option if your application uses Unicode data in DBMS-specific Unicode datatypes such as National Character or nchar.
Setting | Description
Username | The user name that ColdFusion MX passes to the JDBC driver to connect to the data source if a ColdFusion application does not supply a user name; for example, in a cfquery tag.
Password | The password (16-character limit) that ColdFusion MX passes to the JDBC driver to connect to the data source if a ColdFusion application does not supply a password;...
Connecting to ODBC Socket
Type 3 driver. Use the settings in the following table to connect ColdFusion MX to ODBC Socket data sources:
Setting | Description
CF Data Source Name | The data source name (DSN) used by ColdFusion MX to connect to the data source.
Setting | Description
BLOB Buffer | The default buffer size, used if Enable binary large object retrieval (BLOB) is not selected. Default is 64000 bytes.
Allowed SQL | The SQL operations that can interact with the current data source.
Connecting to Oracle R3 (8.1.7), Oracle 9i
Use the settings in the following table to connect ColdFusion MX to Oracle R3 (8.1.7), Oracle 9i data sources:
Setting...
Setting | Description
Interval (min) | The time (in minutes) that the server waits between cycles to check for expired data source connections to close.
Disable Connections | If selected, suspends all client connections.
Login Timeout (sec) | The number of seconds before ColdFusion MX times out the data source connection login attempt.
Setting | Description
Maintain Connections | ColdFusion MX establishes a connection to a data source for every operation that requires one. Enable this option to improve performance by caching the data source connection.
Timeout (min) | The maximum number of minutes after the data source connection is made that you want ColdFusion MX to cache a connection after it is used.
Connecting to Sybase 11.5, 11.9, 12.0, and 12.5
Use the settings in the following table to connect ColdFusion MX to Sybase 11.5, 11.9, 12.0, and 12.5 data sources:
Setting | Description
CF Data Source Name | The data source name (DSN) used by ColdFusion MX to connect to the data source.
Setting | Description
Max Pooled Statements | Enables reuse of prepared statements (that is, stored procedures and queries that use the cfqueryparam tag). Although you tune this setting based on your application, start by setting it to the sum of the following:
•...
CHAPTER 4
Web Server Management
This chapter discusses connecting ColdFusion MX to the built-in web server and to external web servers, such as Apache, IIS, and SunONE Web Server (formerly known as iPlanet). It explores common scenarios, security, multi-hosting, and other issues that you might find helpful. The discussions in this chapter apply when running ColdFusion MX in the server configuration;...
All web servers listen on a TCP/IP port and this port can be specified in the URL. By default, web servers listen for HTTP requests on port 80 (for example, http://www.macromedia.com and http://www.macromedia.com:80 are the same). Similarly, 443 is the default port for HTTPS requests.
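The default-port rule above can be illustrated with a short sketch that uses Python's standard `urllib.parse` module. The helper names `effective_port` and `same_endpoint` are hypothetical, and the sketch only covers the two schemes and default ports that the text mentions.

```python
from urllib.parse import urlsplit

# Default ports named in the text above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


def same_endpoint(url_a: str, url_b: str) -> bool:
    """Compare scheme, host name, and effective port of two URLs."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.hostname, effective_port(url_a)) == (
        b.scheme,
        b.hostname,
        effective_port(url_b),
    )
```

With these helpers, `same_endpoint("http://www.macromedia.com", "http://www.macromedia.com:80")` evaluates to true, because the first URL falls back to port 80.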
Using an external web server
ColdFusion MX uses the JRun web server connector to forward requests from an external web server to the ColdFusion MX runtime system. When a request is made for a CFM page, the connector on the web server opens a network connection to the JRun proxy service.
Open a console window. Tip: In Windows, you can start the Web Server Configuration Tool by selecting Start > Programs > Macromedia ColdFusion MX > Web Server Configuration Tool. Change to the cf_root/runtime/lib (server configuration) or jrun_root/lib (JRun J2EE configuration) directory.
Option | Description
-site | Specifies the IIS website name. Specify All or 0 to configure the connector at a global level, which applies to all IIS websites.
-host | Specifies the ColdFusion server address. The default is localhost.
-server | Specifies the ColdFusion server name. The default is default.
-username | Specifies a username defined to the JRun server. The default is the guest account.
Using the batch files and shell scripts
ColdFusion MX ships with batch files and shell scripts that implement typical command-line connector configurations. These files are in cf_root/bin/connectors. For example, IIS_connector.bat configures all sites in IIS to site 0, which establishes a globally defined connector so that all sites inherit the filter and mappings.
Enables native OS memory allocation rather than the web server’s allocator (for use on Solaris with iPlanet at the direction of Macromedia Support staff).
Each time you run the Web Server Configuration Tool, it creates a new directory beneath cf_root/runtime/lib/wsconfig.
#JRunConfig Errorurl <optionally redirect to this URL on errors>
AddHandler jrun-handler .cfm .cfc .cfml .jsp .jws
</IfModule>
IIS configuration file
For IIS, JRun uses the jrun.ini file to initialize jrun.dll (jrun_iis6.dll on IIS 6). A typical jrun.ini file follows:
verbose=false
scriptpath=/JRunScripts/jrun.dll
serverstore=C:/CFusionMX/runtime/lib/wsconfig/1/jrunserver.store
bootstrap=127.0.0.1:51010...
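When auditing connector configurations, it can help to read the key=value settings of a jrun.ini file into a dictionary. The following is a minimal sketch under stated assumptions: the function name `parse_ini_settings` is hypothetical, the sample contains only the four settings visible in the listing above, and reading the bootstrap value as an address and port pair is an interpretation of its form rather than something the listing states.

```python
def parse_ini_settings(text: str) -> dict:
    """Parse simple key=value lines; blank lines and lines without '=' are skipped."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Only the settings shown in the listing above; a real file may contain more.
SAMPLE_JRUN_INI = """\
verbose=false
scriptpath=/JRunScripts/jrun.dll
serverstore=C:/CFusionMX/runtime/lib/wsconfig/1/jrunserver.store
bootstrap=127.0.0.1:51010
"""

settings = parse_ini_settings(SAMPLE_JRUN_INI)

# Assumption: the bootstrap value has the form address:port.
bootstrap_host, bootstrap_port = settings["bootstrap"].rsplit(":", 1)
```

Splitting on the first "=" only keeps values that themselves contain "=" intact, and splitting the bootstrap value on the last ":" keeps the drive-letter colon in other settings out of the way.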
Advanced configurations
You typically use the Web Server Configuration Tool to configure a connection between the web server and a ColdFusion server running on the same computer. However, you can use the web server connector to route requests for multiple virtual sites to a single ColdFusion server. This section also describes how to configure SSL between the web server and ColdFusion MX.
The JavaScript validation used by the cfform tag references the CFIDE/scripts/cfform.js file. However, in a multi-homed environment, each virtual website may not contain this directory and file. Either copy this file and store it in your virtual website’s web root in a CFIDE/scripts directory or modify all tags to use the attribute to specify the location of the...
Restart Apache to ensure that the virtual hosts are defined correctly. You store CFM files for each virtual host in the directory specified by the DocumentRoot directive. Test each virtual host to ensure that HTML pages are served correctly. Run the Web Server Configuration Tool, as follows: Specify Apache for the Web Server, specify the directory containing the httpd.conf file, and select the Configure web server for ColdFusion MX applications option.
For example:
keytool -genkey -dname "cn=<server name or IP address>, ou=CFEngineering, o=Macromedia, L=Newton, ST=MA, C=US" -keyalg rsa -keystore <keystore name>
When prompted, enter appropriate passwords that are six or more characters in length. Rerun keytool to add certificates to the keystore.
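If the keytool command above is run from a script, building it as an argument list avoids shell-quoting problems with the spaces and commas in the -dname value. The following is a minimal sketch: the function name `build_keytool_genkey_args` and the sample server and keystore names are hypothetical, and only the flags shown in the example above are used.

```python
from typing import List


def build_keytool_genkey_args(server: str, keystore: str) -> List[str]:
    """Build the keytool argument list from the example, one list item per argument."""
    # The distinguished name stays a single argument, so no shell quoting is needed.
    dname = f"cn={server}, ou=CFEngineering, o=Macromedia, L=Newton, ST=MA, C=US"
    return [
        "keytool",
        "-genkey",
        "-dname", dname,
        "-keyalg", "rsa",
        "-keystore", keystore,
    ]


# Hypothetical values for illustration only.
args = build_keytool_genkey_args("cfserver.example.com", "cf.keystore")
```

The resulting list could be passed to `subprocess.run(args)`; keytool then prompts for the passwords interactively, as described above.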
CHAPTER 5
Administering Security
You can secure a number of ColdFusion MX resources with password access and configure sandbox security. This chapter describes configuration options for ColdFusion security.
Contents
About ColdFusion MX security
Using sandbox security
Administrator. Password protection for accessing the Administrator helps guard against unauthorized modifications of ColdFusion MX, and Macromedia highly recommends using passwords. You can disable or change the Administrator password on the Security > CF Admin Password page. RDS password protection...
This hierarchical arrangement of security permits the configuration of personalized sandboxes for users with different security levels. For example, if you are a web hosting administrator who hosts several clients on a ColdFusion shared server, you can configure a sandbox for each customer. This prevents one customer from accessing the data sources or files of another customer.
Adding a sandbox (Enterprise Edition only)
ColdFusion MX Enterprise Edition lets you define multiple security sandboxes.
To add a sandbox:
Open the Security > Sandbox Security page in the ColdFusion MX Administrator. The Sandbox Security Permissions page appears.
In the Add Security Sandbox box, enter the name of the new sandbox. This name must be either a ColdFusion mapping (defined in the Administrator) or an absolute path.
To enable files or directories, in the File Path box, enter or browse to the files or directories; for example, C:\pix. A file path consisting of the special token <<ALL FILES>> matches any file. For information on using the \- and \* wildcard characters, see “About directories and permissions”...
CHAPTER 6
Using Multiple Server Instances
When you install ColdFusion MX Enterprise using a J2EE deployment, you can use J2EE application-server-specific functionality to create multiple server instances. Deploying ColdFusion MX on multiple server instances lets you isolate individual applications and leverage clustering functionality.
File location considerations
In the J2EE configuration, you can store CFM pages either under the external web server root or under the ColdFusion web application root. ColdFusion MX first looks for CFM files in the web application root and then looks in the external web server root. The discussions in this chapter assume that you are using an external web server and that you store your CFM pages under the external web server root.
Enabling application isolation
When you install the J2EE version of ColdFusion MX Enterprise on top of JRun, you can use the JMC to create multiple server instances and deploy ColdFusion MX on each instance. This configuration provides multiple ColdFusion MX web applications in fully independent processes, with no shared ColdFusion or J2EE server resources.
Open the ColdFusion MX Administrator on the server instance using the built-in web server (hostname:portnumber/CFIDE/administrator/index.cfm) and define the resources (such as data sources and Verity collections) required for the application. Performing this step also ensures that ColdFusion MX was deployed successfully. Using your web-server-specific method, create a virtual website (or separate website) for the application.
To configure multiple server instances for application isolation when using Apache: Run the Web Server Configuration Tool once, specifying the location of the Apache httpd.conf file and any other required information. The Web Server Configuration Tool creates a sequentially numbered subdirectory under jrun_root/lib/wsconfig.
ServerName myemployee
ErrorLog logs/error-employee.log
</VirtualHost>
For each VirtualHost directive, copy the IfModule directive from its default location outside the VirtualHost directive to the last element in the VirtualHost directive.
Delete the Apialloc and Ignoresuffixmap elements in the IfModule directive for each virtual host.
Configuring application isolation in SunONE Web Server
When using multiple virtual hosts with multiple server instances under SunONE Web Server, you create multiple SunONE Web Server instances, one for each ColdFusion server instance. This discussion assumes that you have already created server instances, as described in “Enabling application isolation”...
Open the cluster by clicking the cluster name in the left panel. Open the first server instance by clicking its name in the list. Open the Macromedia ColdFusion MX application. Specify the context path (usually /). Select Enable Session Replication.
PART II
Administering Verity
This part describes the Verity search tools and utilities that you can use for configuring the Verity K2 Server search engine, as well as creating, managing, and troubleshooting Verity collections.
Chapter 7: Introducing Verity Tools
Chapter 8: Managing Collections with the mkvdk Utility
CHAPTER 7
Introducing Verity Tools
This chapter provides an overview of the advanced Verity features included in ColdFusion MX. These include several utilities that you can use to configure, manage, and troubleshoot search functionality in your ColdFusion applications. This chapter also introduces the Verity K2 Server, which lets you provide high-performance search capabilities for your ColdFusion applications.
• 125,000 documents for ColdFusion MX Professional
• 250,000 documents for ColdFusion MX Enterprise
• 750,000 documents for Macromedia Spectra sites
Note: Each row in a database table is considered a document.
If you install a fully licensed version of Verity Server and you configure ColdFusion MX to use it, ColdFusion MX will not restrict document searches.
Verity search modes in ColdFusion MX
Your ColdFusion MX applications can search Verity collections using two modes:
• VDK mode: The default ColdFusion MX search mode. You register a collection with ColdFusion MX by using the cfcollection tag or by using the Verity Collections page in the ColdFusion MX Administrator (which also uses the cfcollection tag).
Verity information storage
All Verity configuration data and collection name registration information are stored in an XML file (neo-verity.xml), which is used solely by the ColdFusion server. This XML file, which is located in cf_root/lib, contains two collection lists. One list contains collections that are registered with ColdFusion MX;...
CHAPTER 8
Managing Collections with the mkvdk Utility
The mkvdk utility is a command-line utility installed with ColdFusion MX. You can use it to perform maintenance operations on Verity collections.
Contents
About the Verity mkvdk utility
Getting started with the Verity mkvdk utility
A new partition is created, which includes an index and an attribute table. Assist data is generated, which might include a spanning word list. When problems occur during an operation, the mkvdk utility writes error messages to the system log file (sysinfo.log). You can direct error and other messages to the console by using the mkvdk command with the option.
Getting started with the Verity mkvdk utility
The following is the basic mkvdk syntax:
mkvdk -collection path [option] [...] [filespec] [...]
Where:
• Square brackets ( [ ] ) indicate optional items.
• An ellipsis (...) indicates repetition of the previous item. Thus, [filespec] [...] indicates an optional series of filespec items.
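The syntax above maps naturally onto an argument list when mkvdk is called from a script. The following is a minimal sketch, not a Verity API: the function name `build_mkvdk_args` is hypothetical, the collection path and file names are placeholders, and the sketch only assembles the arguments without running the utility.

```python
from typing import List, Sequence


def build_mkvdk_args(
    collection_path: str,
    options: Sequence[str] = (),
    filespecs: Sequence[str] = (),
) -> List[str]:
    """Assemble 'mkvdk -collection path [option] [...] [filespec] [...]'."""
    if not collection_path:
        # The collection path is the one required part of the syntax.
        raise ValueError("a collection path is required")
    # Options and filespecs are both optional, repeatable series.
    return ["mkvdk", "-collection", collection_path, *options, *filespecs]


# Placeholder collection path and file names for illustration only.
create_args = build_mkvdk_args("coll", options=["-create"])
insert_args = build_mkvdk_args("coll", filespecs=["doc1.html", "doc2.html"])
```

Keeping options and filespecs as separate sequences mirrors the order in the syntax line: all options come before the first filespec.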
Collection setup options
The mkvdk utility has a variety of collection setup options, which the following table describes:
Option | Description
-create | Creates a collection in the specified collection directory. It creates the directory structure, determines the index contents and sets up the document’s table schema according to the style files used.
General processing options
The mkvdk utility provides a variety of general processing options, which the following table describes:
Option | Description
-collection path | Specifies the path of the collection to create or open. This option is required to execute the mkvdk utility.
Turns off file locking.
Option Description -noindex Prevents indexing by this instance of mkvdk. Documents are not inserted or deleted. Using this option turns off the service-level VdkServiceType_Index. (Service types are described under -nooptimize Specifies the name of the character set to which to map all strings for your -charmap name application.
Bulk inserting or deleting
The following command specifies bulk insertion of a list of documents:
mkvdk -collection coll -bulk -insert filespec
Where filespec is the list of files to insert. Since insert is the default, the following command is equivalent to the preceding command:
mkvdk -collection coll -bulk filespec
The following command specifies bulk deletion of a list of documents:
mkvdk -collection coll -bulk -delete filespec...
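The equivalence between the explicit and the default form of a bulk insert can be captured in a small command builder. The following is a minimal sketch: the function name `build_bulk_args` and its `explicit` parameter are hypothetical, and the sketch covers only the insert and delete commands shown above.

```python
from typing import List

# Only the two bulk actions whose commands appear in the text above.
BULK_ACTIONS = ("insert", "delete")


def build_bulk_args(
    collection: str,
    bulk_file: str,
    action: str = "insert",
    explicit: bool = False,
) -> List[str]:
    """Assemble a bulk mkvdk command; -insert is omitted unless explicit is set."""
    if action not in BULK_ACTIONS:
        raise ValueError(f"unsupported bulk action: {action!r}")
    args = ["mkvdk", "-collection", collection, "-bulk"]
    # Insert is the default, so its flag is added only when requested.
    if action != "insert" or explicit:
        args.append(f"-{action}")
    args.append(bulk_file)
    return args
```

Both `build_bulk_args("coll", "filespec")` and `build_bulk_args("coll", "filespec", explicit=True)` describe the same bulk insertion; only the second spells out the -insert flag.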
Message options
The mkvdk utility provides a variety of messaging options, as described in the following table:
Option | Description
-quiet | Displays only fatal and error messages to the console. It overrides the -outlevel setting. For a list of message types, see the table in “The mkvdk utility syntax”...
Bulk submit options
The mkvdk utility provides a variety of bulk submit options, as described in the following table:
Option | Description
-bulk | Interprets filespec as a bulk submit file. You can use this option with the -insert, -update, and -delete options.
Specifies the offset into a bulk submit file or files.
Option | Description
-persist | Services the collection repeatedly, at default intervals of 30 seconds. Use the -sleeptime option to set a different interval.
-sleeptime sec | Specifies the interval between service calls when the mkvdk utility is run with the -persist option.
-optimize spec | Performs various optimizations on the collection, depending on the value of spec.
Performs the most comprehensive housekeeping possible, and removes out-of-date collection files. Macromedia recommends this optimization only when you are preparing an isolated collection for publication. When using this type, if the collection is being searched, files sometimes get deleted too early, which can affect search results.
You can safely squeeze deleted documents for a collection at any time, because the mkvdk utility ensures that the collection is available for searching and servicing through its self-administration features. The application does not need to temporarily disable a collection to squeeze deleted documents, because when a squeeze request is made, the mkvdk utility assigns a new revision code to the collection.
CHAPTER 9
Indexing Collections with Verity Spider
This chapter contains basic Verity Spider information and explains how to index documents on your website.
Contents
About Verity Spider
About Verity Spider syntax
Web standard support
Verity Spider supports key web standards used by Internet and intranet sites. Standard HREF links and frames pointers are recognized, so that navigation through them is supported. Redirected pages are followed so that the real underlying document is indexed. Verity Spider adheres to the robots exclusion standard specified in robots.txt files, so that administrators can maintain friendly visits to remote websites.
Multithreading
Since version 3.1, Verity Spider has separated the gathering and indexing jobs into multiple threads for concurrence. Verity Spider V3.7 can create concurrent connections to web servers for fetching documents, and have concurrent indexing threads for maximum utilization. This translates to an overall improvement in throughput.
At its most basic level, a Verity Spider command consists of the following:
vspider -initialize -collection coll [options]
Where -initialize is -start or -refresh (when starting points have changed), -collection is required to provide a target for the Verity Spider, and [options] can be a near-limitless combination of the options described later in this chapter.
If an indexing task halts, you can rerun the task as-is. The persistent store for the specified collection is read, and only those candidate URLs that are in the queue but not yet processed are parsed. Candidate URLs correspond to URLs of the following status, as reported by vsdb: cand, used, inse, upda, dele, fail Repository type Starting point...
For better readability, put each option and any parameters on a single line. Verity Spider can properly parse the lines. Note: Macromedia strongly recommends that you take advantage of the abstraction offered by this option. This can greatly reduce user error in erroneously including or omitting options in subsequent indexing jobs.
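The following Python sketch shows one way to generate a command file with one option per line, as recommended above, and to build the matching vspider invocation. It is a sketch under stated assumptions: the file name job.cmd, the collection name, and the option values are hypothetical, and the argument form shown for -start (a URL) is an assumption. The sketch only builds the command; it does not run Verity Spider.

```python
import tempfile
from pathlib import Path


def write_command_file(path: Path, options: list[tuple[str, ...]]) -> None:
    """Write each option and its parameters on a single line."""
    lines = [" ".join(option) for option in options]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def spider_command(cmdfile: Path) -> list[str]:
    """Build the argument list for: vspider -cmdfile filename."""
    return ["vspider", "-cmdfile", str(cmdfile)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        cmdfile = Path(directory) / "job.cmd"  # hypothetical file name
        write_command_file(
            cmdfile,
            [
                ("-start", "http://www.example.com/"),  # assumed argument form
                ("-collection", "coll"),
                ("-indexers", "2"),
                ("-maxindmem", "4096"),
            ],
        )
        print(cmdfile.read_text(encoding="utf-8"))
        print(spider_command(cmdfile))
```

Keeping the options in one file means that a later indexing job can reuse exactly the same settings.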
-style Syntax: -style path Specifies the path to the style files to use when creating a new collection. If the -style option is not specified, Verity Spider uses the default style files in cf_root/lib/common/style. Note: You can safely omit the -style option when resubmitting an indexing job, as the style information will already be part of the collection.
Page 108
-maxindmem Syntax: -maxindmem kilobytes Specifies the maximum amount of memory, in kilobytes, used by each indexing thread. Specify the number of threads with the -indexers option. By default, each indexing thread uses as much memory as is available from the system. -maxnumdoc Syntax: -maxnumdoc num_docs...
Page 109
-nodupdetect Type: Web crawling only Disables checksum-based detection of duplicates when indexing websites. URL-based duplicate detection is still performed. By default, a document checksum is computed based on the CRC-32 algorithm. The checksum combined with the document size is used to determine if the document is a duplicate. See also -followdup -noindex...
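The checksum-based detection described above can be approximated with the following Python sketch, which pairs a CRC-32 checksum with the document size. It is an illustration only, not Verity Spider's implementation; the function names and sample data are hypothetical.

```python
import zlib


def fingerprint(content: bytes) -> tuple[int, int]:
    """Return the (CRC-32 checksum, size in bytes) pair for a document."""
    return (zlib.crc32(content), len(content))


def find_duplicates(docs: dict[str, bytes]) -> dict[str, str]:
    """Map each duplicate URL to the first URL seen with the same fingerprint."""
    first_seen: dict[tuple[int, int], str] = {}
    duplicates: dict[str, str] = {}
    for url, content in docs.items():
        key = fingerprint(content)
        if key in first_seen:
            duplicates[url] = first_seen[key]
        else:
            first_seen[key] = url
    return duplicates


if __name__ == "__main__":
    sample = {
        "http://a.example.com/home.html": b"<html>same page</html>",
        "http://b.example.com/home.html": b"<html>same page</html>",
        "http://a.example.com/other.html": b"<html>different</html>",
    }
    print(find_duplicates(sample))
```

Combining the checksum with the size reduces the chance that two different documents are treated as the same one.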
Page 110
-preferred Type: Web crawling only Syntax: -preferred exp_1 [exp_n] ... Specifies a list of hosts or domains that are preferred when retrieving documents for viewing. You can use wildcard expressions, where the asterisk (*) is for text strings and the question mark (?) is for single characters.
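To show how the asterisk and question mark wildcards described above behave, the following Python sketch uses the standard fnmatch module as an approximation. Verity Spider's own matcher may differ (for example, fnmatch also gives square brackets a special meaning), and the host names are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_any(host: str, expressions: list[str]) -> bool:
    """Return True if the host matches at least one wildcard expression.

    '*' matches any text string and '?' matches exactly one character.
    """
    return any(fnmatchcase(host, expression) for expression in expressions)


if __name__ == "__main__":
    preferred = ["*.example.com", "web?.example.org"]
    for host in ("www.example.com", "web1.example.org", "web10.example.org"):
        print(host, matches_any(host, preferred))
```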
Page 111
For example, if you want to use a script called fix_bif to add customized information to BIF files, use the following command: vspider -cmdfile filename Where filename is the text-only command file that contains the following (along with any other necessary options): -processbif 'fix_bif !*' Your command file will include other options as well.
-temp Syntax: -temp path Specifies the directory for temporary files (disk cache). By default, the temp directory is under the job directory (optionally specified with the -jobpath option). If you do not specify a value for this option, Verity Spider creates a /spider/temp directory within the collection.
Page 113
-header Type: Web crawling only Syntax: -header string Specifies an HTTP header to add to the spidering request; for example: -header "Referer: http://www.verity.com/" Verity Spider sends some predefined headers, such as Accept and User-Agent, by default. Special headers are sometimes necessary to correctly index a site. For example, earlier versions of Verity Spider did not support the Host header, which is needed for Virtual Host indexing.
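The effect of adding a header such as the Referer example above can be illustrated with Python's standard library. This sketch only constructs a request object and does not contact any server; it is not how Verity Spider builds its requests, and the target URL is hypothetical.

```python
from urllib.request import Request


def build_request(url: str, header_line: str) -> Request:
    """Attach one 'Name: value' header line to a request for the URL."""
    name, _, value = header_line.partition(":")
    request = Request(url)
    request.add_header(name.strip(), value.strip())
    return request


if __name__ == "__main__":
    request = build_request(
        "http://www.example.com/",  # hypothetical target
        "Referer: http://www.verity.com/",
    )
    print(request.header_items())
```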
Page 114
You cannot use the question mark (?) wildcard, and the -regexp option does not let you use regular expressions. In Windows, include double-quotation marks around the argument to protect the asterisk special character (*). On UNIX, use single-quotation marks. This is only required when you run the indexing job from a command line.
Path and URL options The following sections describe the Verity Spider path and URL options. -auth Syntax: -auth path_and_filename Specifies an authorization file to support authentication for secure paths. Use the -auth option to specify the authorization file. The file contains one record per line. Each line consists of server, realm, username, and password, separated by whitespace.
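The record layout described above (server, realm, username, and password, separated by whitespace, one record per line) can be checked with a short Python sketch such as the following. It assumes that no field contains embedded whitespace and that blank lines can be ignored; both are assumptions, and the sample values are hypothetical.

```python
from typing import Iterable, NamedTuple


class AuthRecord(NamedTuple):
    server: str
    realm: str
    username: str
    password: str


def parse_auth_lines(lines: Iterable[str]) -> list[AuthRecord]:
    """Parse whitespace-separated authorization records, one per line."""
    records = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 4:
            raise ValueError(
                f"line {number}: expected 4 fields, found {len(fields)}"
            )
        records.append(AuthRecord(*fields))
    return records


if __name__ == "__main__":
    sample = ["www.example.com  staff  alice  secret", ""]
    print(parse_auth_lines(sample))
```

Validating the file before an indexing job starts catches records with missing fields early.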
Page 116
-followdup Specifies that Verity Spider follows links within duplicate documents, although only the first instance of any duplicate documents is indexed. You might find this option useful if you use the same home page on multiple sites. By default, only the first instance of the document is indexed, while subsequent instances are skipped. If you have different secondary documents on the different sites, using the -followdup option lets you...
Page 117
-nofollow Type: Web crawling only Syntax: -nofollow "exp" Specifies that Verity Spider cannot follow any URLs that match the exp expression. If you do not specify an exp value for the -nofollow option, Verity Spider assumes a value of "*", where no documents are followed.
Page 118
Example For the following URL, the path length would be four: http://www.spider:80/comics/fun/funny/world.html <-1-> <2> <-3-> <---4---> For the following file system path, the path length would be three: C:\files\docs\datasheets <-1-><-2-><---3---> The default value is 100 path segments. -refreshtime Syntax: -refreshtime timeunits Specifies not to refresh any documents that have been indexed since the timeunits value began.
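Returning to the path length example above, the segment counting can be sketched in Python as follows. This is an illustration of the counting rule shown in the example (the host, port, and drive letter are not counted), not Verity Spider's own code; the function names are hypothetical.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlparse


def url_path_length(url: str) -> int:
    """Count the path segments of a URL, ignoring the host and port."""
    return len([segment for segment in urlparse(url).path.split("/") if segment])


def windows_path_length(path: str) -> int:
    """Count the segments of a Windows path, ignoring the drive root."""
    parsed = PureWindowsPath(path)
    return len(parsed.parts) - (1 if parsed.anchor else 0)


if __name__ == "__main__":
    print(url_path_length("http://www.spider:80/comics/fun/funny/world.html"))
    print(windows_path_length(r"C:\files\docs\datasheets"))
```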
Normally, when Verity Spider resolves host names, it uses DNS lookups to convert the names to canonical names, of which there can be only one per machine. This allows for the detection of duplicate documents, to prevent results from being diluted. In the case of multiple aliased hosts, however, duplication is not a barrier as documents can be referred to by more than one alias and yet remain distinct because of the different alias names.
Page 120
-include Specifies that only those files, paths, and URLs that match the specified expression or expressions will be followed. If you use backslashes, you must double them so that they are properly escaped; for example: C:\\test\\docs\\path You can use wildcard expressions, where the asterisk (*) is for text strings and the question mark (?) is for single characters;...
Page 121
Where the -exclude option prevents Verity Spider from even following anything that matches the specified expressions, the -indexclude option allows Verity Spider to follow anything while only skipping that which matches the specified expressions. For document types, use the -indmimeexclude option instead. Note: When specifying a URL, you must use full, absolute paths using the same format as appears in the HTML hyperlink.
Page 122
-indmimeexclude Syntax: -indmimeexclude mime_1 [mime_n] ... Specifies that only those MIME types that match the expressions be followed but not indexed. In Windows, include double-quotation marks around the argument to protect the special characters, such as the asterisk (*). On UNIX, use single-quotation marks. This is only required when you run the indexing job from a command line.
Page 123
-indskip Syntax: -indskip HTML_tag "exp" Type: Web crawling only Specifies that Verity Spider follow and parse links, but not index, any HTML document that contains the text of exp within the given HTML_tag. For multiple HTML_tag and exp combinations, use multiple instances of the option.
Page 124
-metafile Type: Web crawling only Syntax: -metafile path_and_filename Allows you to use a text file to map custom meta tags to valid HTTP header fields. If you use backslashes, you must double them so that they are properly escaped; for example: C:\\test\\docs\\path This means that you can use your own meta tag, in the document, to replace what is returned by the web server, or to insert it if nothing is returned.
Page 125
-mimeinclude Syntax: -mimeinclude mime_1 [mime_n] ... Specifies MIME types to be included. In Windows, include double-quotation marks around the argument to protect the special characters, such as the asterisk (*). On UNIX, use single-quotation marks. This is only required when you run the indexing job from a command line. Quotation marks are not necessary within a command file (the -cmdfile option).
Example 2 To skip all HTML documents that contain both the word "personnel" in the title and the phrase "internal use" in any paragraph element, use the following: -skip title "personnel" -skip p "*internal use*" See also -regexp Locale options The following sections describe the Verity Spider locale options. -charmap Syntax: -charmap name...
Logging options The following sections describe the Verity Spider logging options. -loglevel Syntax: -loglevel [nostdout] argument Specifies the types of messages to log. By default, messages are written to standard output and to various log files in the subdirectory named /log beneath the Verity Spider job directory. If you add nostdout to the option, messages are not written to standard output.
Loglevel arguments Description debug Includes the following message types: information, warning, error, badkey, progress, summary, skip, debug Note: Only use this argument at the direction of Verity technical support or for troubleshooting indexing problems. trace Includes the following message types: information, warning, error, badkey, progress, summary, skip, debug, trace Note: Only use this argument at the direction of Verity technical support or for troubleshooting indexing problems.
Setting MIME types You can use the MIME type criteria options, -mimeinclude, -mimeexclude, -indmimeinclude, and -indmimeexclude, to include or exclude MIME types. Syntax restrictions When you specify MIME type criteria, keep in mind the restrictions described in the following sections. Using the wildcard character (*) The asterisk (*) wildcard character does not operate as a regular expression for the value of the MIME type criteria.
You can examine the indexing job’s log files for indications that files are being skipped due to MIME types. For example, a typical ASCII file you might want indexed is a log file (filename.log). Unless the web server understands that files with .LOG extensions are ASCII text, of MIME type text/plain, you will see in the indexing job log file that .LOG files are skipped because of MIME type, even if you use the following: -mimeinclude ’text/*’...
Known MIME types for file system indexing The following table lists the MIME types that Verity Spider recognizes when indexing file systems: Format MIME type Extension HTML text/html htm, html ASCII text/plain txt, text ASCII, source files text/plain c, h, cpp, cxx application/pdf MS Word application/msword...
Page 132
CHAPTER 10 Searching Collections with K2 Server This chapter provides information about how to configure the Verity K2 Server, which is installed with ColdFusion MX. Contents Using K2 Server ............133 Stopping K2 Server .
Page 134
To edit the k2server.ini file: Open the k2server.ini file in your text editor. Tip: Use your text editor’s search function to locate the appropriate code. For example, to locate the settings for the port number, as described in the next step of this procedure, search for portNo=.
-ntservice 1 -inifile k2server.ini Note: Macromedia does not recommend running K2 Server as a Windows service. You must stop the service before you modify or delete collections registered with K2 Server. You must then remember to restart the service. You must also verify that the vdkHome information in your k2server.ini file is uncommented—that is, it has no leading pound (#) signs—and points to the correct...
100 file handles allocated for each search thread. The search engine determines default values per operating system. For large or fragmented collections, Macromedia recommends that you explicitly set a value for maxFiles.
Parameter Description portNo The TCP port number for client connections. The value of portNo is the same value assigned to portNo in the k2broker.ini file that identifies the broker referring to this server. numListeners The maximum number of clients that can connect to the server at one time. The numListeners value must be equal to or greater than the sum of all numThreads values specified by all K2 Brokers in the K2 search system.
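The sizing rule for numListeners stated above (it must be equal to or greater than the sum of all numThreads values across the K2 Brokers) amounts to simple arithmetic, sketched here in Python. The function names and the sample thread counts are hypothetical.

```python
def minimum_listeners(broker_num_threads: list[int]) -> int:
    """Return the smallest numListeners value that satisfies the rule."""
    return sum(broker_num_threads)


def listeners_sufficient(num_listeners: int, broker_num_threads: list[int]) -> bool:
    """Check numListeners against the sum of all brokers' numThreads values."""
    return num_listeners >= minimum_listeners(broker_num_threads)


if __name__ == "__main__":
    brokers = [5, 5, 3]  # hypothetical numThreads values, one per K2 Broker
    print(minimum_listeners(brokers))
    print(listeners_sufficient(10, brokers))
```

With three brokers that use 5, 5, and 3 threads, a numListeners value below 13 would not satisfy the rule.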
Keyword Description locale The name of the locale (combination of language, dialect, and character set) to use for all internal Verity engine operations. This name must correspond to a subdirectory in the common directory where the configuration file for the locale is found and where the message database and other locale-specific files are located.
Page 139
Increment the block label for each collection that you configure, starting with Coll-0. The following table describes the keywords used to configure each collection and search service: Keyword Description collPath The pathname identifying the collection home directory. collAlias An arbitrary name used to identify the collection. topicSet The pathname to a directory for the default topic set, which is an indexed set of topics.
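The numbering of collection blocks described above, starting with Coll-0, can be generated with a short Python sketch like the one below. The bracketed block header and the keyword=value line layout are assumptions about the k2server.ini format, and the paths and aliases are hypothetical; compare the output with the existing entries in your own k2server.ini file before using it.

```python
def collection_blocks(collections: list[tuple[str, str]]) -> str:
    """Render one block per (collPath, collAlias) pair, labeled Coll-0, Coll-1, ..."""
    blocks = []
    for index, (coll_path, coll_alias) in enumerate(collections):
        blocks.append(
            f"[Coll-{index}]\n"          # assumed block header layout
            f"collPath={coll_path}\n"    # assumed keyword=value layout
            f"collAlias={coll_alias}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(
        collection_blocks(
            [
                ("/collections/docs", "docs"),   # hypothetical values
                ("/collections/news", "news"),
            ]
        )
    )
```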
Keyword Description charMap A string that names the character set to use for strings that are sent into the server and generated by the server. This string must match the name of a .cs file in the root of the common directory that configures a character set and its mappings.
Page 141
rck2 command Description c <collections> The list of collections to search. Multiple collections must be specified in a space-separated list. For example: c coll1 coll2 coll3 f <fields> The list of fields to retrieve. For example: f k2dockey title date s <query text>...
Page 142
CHAPTER 11 Searching Collections with the rcvdk Utility This chapter provides information about using the rcvdk utility to search Verity collections. Contents Using the Verity rcvdk utility ..........143 Attaching to a collection using the rcvdk utility .
The help command produces the following list of available commands:
RC> help
Available commands:
search s Search documents.
results r Display search results.
clusters c Display clustered search results.
view v View document.
summarize z Summarize documents.
attach a Attach to one or more collections.
detach d Detach from one or more collections.
Viewing results of the rcvdk utility After you have attached to a collection and issued a search command successfully, you can view the results list and look at the retrieved documents. You can use the options in the following table: Option Description Displays the results list, starting with the first document.
The following table describes each of the default fields: Field name Description Number The rank of the document in the results list. The document with the highest score is ranked number 1. Score The score assigned to each retrieved document, based on its relevance to the query. For a NULL query, no scores are assigned, so the Score column in the results list is blank.
Page 147
17: Custom Zone Definitions 18: The KeyView Filter Kit RC> Displaying multiple fields You can specify multiple fields with the fields command, as shown in the following example. The field order corresponds to the order of the columns, with the first field specified appearing in the second column.
Page 148
CHAPTER 12 Troubleshooting Collections with Verity Utilities This chapter provides information about using Verity utilities to configure, maintain, and troubleshoot Verity collections. Contents Overview of Verity utilities........... 149 Using the Verity didump utility .
Using the Verity didump utility Using the didump utility, you can view key components of the word index per partition. The word list is a list of all words indexed by the Verity engine. The zone list is a list of all zones and the zone attribute list is a list of the zone attributes found by the Verity engine.
Viewing the zone list with the didump utility The zone list contains a list of the zones identified by the zone filter. You can search the zones listed using the Verity IN operator in a query. To view the contents of the zone list, use the didump utility with the -zones flag plus the pathname to a partition, like the following:...
The columns in the display indicate the following: • Size The number of bytes used by the Verity engine to store information about the zone attribute • The number of unique documents in which the zone attribute appears • Word The total number of occurrences of a zone attribute for the partition Using the Verity browse utility A documents table is built for each partition in a collection.
Displaying fields You can use several options to control the display of field information. To display all the document fields: At the Action prompt, enter ## Press Return twice to display the fields for the first document record. Press Return to view the document fields for the next sequential record. The following partial display of the results of the browse command includes internal fields, used by the Verity search engine.
Merging collections using the merge utility The following is the syntax for using the merge utility to merge multiple collections into a single collection: merge <newCollection> <srcCollection1> <srcCollection2> [srcCollectionN] The utility reads srcCollection1, srcCollection2, and so on, and merges them into a single collection with the directory name given for newCollection. If the directory name given for newCollection does not exist, it is created.
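The merge syntax above requires a new collection name and at least two source collections. The following Python sketch builds such an argument list and rejects calls with fewer than two sources. It only constructs the command and does not run the merge utility; the collection paths are hypothetical.

```python
def merge_command(new_collection: str, sources: list[str]) -> list[str]:
    """Build: merge <newCollection> <srcCollection1> <srcCollection2> [srcCollectionN]."""
    if len(sources) < 2:
        raise ValueError("the merge syntax requires at least two source collections")
    return ["merge", new_collection, *sources]


if __name__ == "__main__":
    # Hypothetical collection directories.
    print(merge_command("/collections/all", ["/collections/docs", "/collections/news"]))
```

The resulting list can be passed to a process launcher such as subprocess.run when the merge utility is on the path.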
CHAPTER 13 Verity Error Messages This chapter provides information about error messages that might occur when using Verity in either VDK mode or K2 mode. Contents VDK mode error codes ........... . . 155 K2 mode error codes .
Error code                      Description
VdkError_NestedFree (-18)       VdkSessionFree called reentrantly.
VdkError_Unsupported (-19)      Using an unsupported feature.

Runtime error codes
Error code                      Description
VdkError_NoMsgDb (-20)          Cannot find the message database.
VdkError_FatalError (-21)       Fatal error.
VdkError_OutOfMemory (-22)      Out of memory.
VdkError_DiskFull (-23)         Out of disk space.
VdkError_NoFileHandles (-24)    Out of file handles.
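When you scan logs for the numeric codes listed above, a small lookup table can translate them into names and descriptions. The following Python sketch covers only the runtime error codes shown in this table; the dictionary and function names are hypothetical, and the sketch is not part of any Verity API.

```python
# Runtime error codes copied from the table above.
VDK_RUNTIME_ERRORS = {
    -20: ("VdkError_NoMsgDb", "Cannot find the message database."),
    -21: ("VdkError_FatalError", "Fatal error."),
    -22: ("VdkError_OutOfMemory", "Out of memory."),
    -23: ("VdkError_DiskFull", "Out of disk space."),
    -24: ("VdkError_NoFileHandles", "Out of file handles."),
}


def describe(code: int) -> str:
    """Return 'Name (code): description', or a fallback for unlisted codes."""
    if code not in VDK_RUNTIME_ERRORS:
        return f"Unlisted code ({code})"
    name, description = VDK_RUNTIME_ERRORS[code]
    return f"{name} ({code}): {description}"


if __name__ == "__main__":
    print(describe(-22))
    print(describe(-999))
```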
Error code                        Description
VdkError_LicenseField (-117)      No support for field search.
VdkError_LicenseAccrue (-118)     No support for the ACCRUE operator.
VdkError_LicenseProximity (-119)  No support for the proximity operators.
VdkError_LicenseStem (-120)       No stemming.
VdkError_LicenseWildcard (-121)   No support for wildcard queries.
VdkError_LicenseTypo (-122)       No support for typo assist.
VdkError_LicenseOperator (-123)   Unlicensed operator.
Error code                        Description
VdkError_FilterLoadFailed (-144)  Error occurred during filter initialization.
VdkError_FileOpenFailed (-145)    File could not be opened.

Dispatch error codes
Error code                        Description
VdkError_CouldntLoadDLL (-200)    Cannot load DLL.
VdkError_NoSuchFunction (-201)    Function not available.

Warning error codes
Error code                        Description
VdkWarning_CollectionDown (10)    The collection was down when it was opened.
Error code                      Description
K2Error_CollPurge (-38)         Purge failed due to problems deleting from any of the following directories: pdd, work, trans
K2Error_CollPathTooBig (-39)    Collection path supplied for the path member in K2CollectionOpenArgRec is too long.
K2Error_LocaleIncompat (-101)   Collection and session locales are incompatible.
K2Error_KBNotOpened (-102)      Knowledge base cannot be opened.
Warning error codes
Error code                          Description
K2Warning_CollectionDown (10)       The collection was down when it was opened.
K2Warning_QueryComplex (11)         Too many matching words.
K2Warning_LowMemory (12)            Memory is low for indexing.
K2Warning_CollectionReadOnly (13)   The collection is read-only.
K2Warning_DriverNotFound (14)       Couldn’t locate specified driver.
K2Warning_LargeToken (15)           Returned a token greater than maxSize.
Page 164
layout 14 Mail Server page 23 Data & Services section, ColdFusion MX Mappings page 22 Administrator 26 Memory Variables page 22 data sources password 70 adding to ColdFusion MX Administrator 39 RDS Password page 34 adding to ColdFusion MX Administrator, Sandbox Security page 35 considerations 40 Security section 34, 69...
Page 165
jrun_iis6_wildcard.dll 59 jrun_nsapi35.dll 59 failover 81 JRunScripts directory 63 files and directories, security 72 custom JVM for a JRun server 77 Java and JVM Settings page 25 hosting JWS port number 76 application isolation 77 multihoming 65 httpd.conf file K2 Server application isolation 78 about 88 elements added to 59...
Page 166
Microsoft Access with Unicode, connecting to 46 defining a JRun server 76 Microsoft Access, connecting to 44 failover 81 Microsoft SQL Server, connecting to 47 load balancing 81 migrating client variable data 21 overview 75 mkvdk utility web server configuration (application isolation) 78 bulk insert and delete, using 97 MySQL, connecting to 48 getting started 91...
Page 168
web root, built-in web server 58 Web Server Configuration Tool advanced configurations 65 batch files 62 cluster 82 command-line interface 60 configuration files 62 GUI mode 60 shell scripts 62 SSL 68 using 59 web servers built-in web server 58 configuring 59 configuring for load balancing and failover 81 external 59...