JBoss Enterprise Application Platform .... 23
Oracle WebLogic Server .... 27
Installation of CloverETL Server License .... 29
Installation of CloverETL Server License using a Web Form .... 29
Installation of CloverETL Server License using a license.file Property .... 32
Separate License WAR .... 32
IBM InfoSphere MDM Plugin Installation ....
Job Config Properties .... 108
WebDAV Access to Sandboxes .... 111
WebDAV Clients .... 111
WebDAV Authentication/Authorization .... 111
16. CloverETL Server Monitoring .... 113
Standalone Server Detail .... 113
Cluster Overview .... 118
Node Detail .... 119
Server Logs .... 120
17.
CloverETL Server API Extensibility .... 193
Groovy Code API .... 193
Embedded OSGi Framework .... 194
25. Recommendations for Transformations Developers .... 196
26. Extensibility - CloverETL Engine Plugins .... 197
27. Troubleshooting .... 198
VI. Cluster .... 199
28. Clustering Features .... 200
High Availability ....
CloverETL Server into existing application portfolios and processes. The CloverETL Server is a Java application built to J2EE standards. We support a wide range of application servers including Apache Tomcat, Jetty, IBM WebSphere, Sun Glassfish, JBoss AS, and Oracle WebLogic.
Chapter 1. What is CloverETL Server?

Table 1.1. CloverETL Server and CloverETL Engine comparison

possibilities of executing graphs
CloverETL Server: by calling http (or JMX, etc.) APIs (See details in Simple HTTP API (p.
CloverEngine as executable tool: by executing external process or by

> 50 GB

This may vary depending on total number of nodes and cores in license.
Minimum value, the disk space depends on data.
Disk space for shared sandboxes is required only for CloverETL Cluster.

Software Requirements

Operating system
• Microsoft Windows Server 2003/2008/2012 32/64 bit
•...
Chapter 2. System Requirements for CloverETL Server

Table 2.2. CloverETL Server Compatibility Matrix

Columns: CloverETL 3.5 (Java 6 and 7); CloverETL 4.0 (Java 7); CloverETL 4.1, 4.2, and 4.3 (Java 7, Java 8)

Application Server
Tomcat 6
Tomcat 7
Tomcat 8
Pivotal tc Server Standard (3.1.3,...
• IMAP/POP3 (EmailReader)
• FTP/SFTP/FTPS (readers, writers)
(p. 12) includes details about further testing and production on your chosen app-container and database.

To create a fully working instance of Enterprise CloverETL Server you should:
• install an application server
• create a database dedicated to CloverETL server
•...
Chapter 3. Installing

Evaluation Server

The default installation of CloverETL Server does not need any extra database server; it uses the embedded Apache Derby DB. It also does not need any subsequent configuration, because CloverETL Server configures itself during the first startup. When started with an empty database, it automatically creates the database tables and some necessary records.
6. Check whether CloverETL Server is running on URLs:
• Web-app root
http://[host]:[port]/[contextPath]
The default Tomcat port for the http connector is 8080 and the default contextPath for CloverETL Server is "clover", thus the default URL is:
http://localhost:8080/clover/
• Web GUI...

7. CloverETL Server is now installed and prepared for basic evaluation. There are a couple of sandboxes with various demo transformations installed.
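The URL pattern from the check step above can be sketched as a small shell helper. This is only an illustration: the function name clover_root_url is hypothetical, and the defaults (localhost, port 8080, context path "clover") are the ones quoted in the text.

```shell
#!/bin/sh
# Build the web-app root URL http://[host]:[port]/[contextPath]/.
# Defaults follow the text: Tomcat http connector port 8080, contextPath "clover".
# The function name is an assumption made for this sketch.
clover_root_url() {
    host=${1:-localhost}
    port=${2:-8080}
    context=${3:-clover}
    printf 'http://%s:%s/%s/\n' "$host" "$port" "$context"
}

# Usage (hypothetical check; needs curl and a running server):
#   curl -sf "$(clover_root_url)" >/dev/null && echo "server is up"
```

Keeping the URL construction in one place makes it easy to reuse the same check for other hosts, ports or context paths.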
This section describes the installation of CloverETL Server on various app-containers in detail and the ways to configure the server. If you just need to quickly evaluate CloverETL Server features that don't need any configuration, the evaluation installation may be suitable: Evaluation Server (p.
A restart of the operating system is needed to apply the changes. If Tomcat is installed as a Windows service, CloverETL is configured through the configuration of the respective service. The configuration can be performed either with the graphical utility [tomcat_home]/bin/Tomcat8w.exe or with the command line utility [tomcat_home]/bin/Tomcat8.exe.
• JAVA_HOME or JRE_HOME environment variable has to be set.
• Apache Tomcat 6.0.x or 7.0.x or 8.0.x is installed. CloverETL Server is developed and tested with the Apache Tomcat 6.0.x, 7.0.x and 8.0.x containers (it may work unpredictably with other versions). See...
5. Check whether CloverETL Server is running on URLs:
• Web-app root
http://[host]:[port]/[contextPath]
The default Tomcat port for the http connector is 8080 and the default contextPath for CloverETL Server is "clover", thus the default URL is:
http://localhost:8080/clover/
• Web GUI...
Configuration of CloverETL Server on Jetty (p. 16)

Installation of CloverETL Server

1. Download the web archive file (clover.war) containing the CloverETL Server application which is built for Jetty.
2. Check if prerequisites are met:
• Oracle JDK or JRE (See Java Virtual Machine (p.
(p. 5) for required Java version.)

Important
To ensure reliable operation of CloverETL Server, always use the latest version of IBM Java SDK. At least SDK 7.0 SR6 (package IBM WebSphere SDK Java Technology Edition V7.0.6.1) is recommended. Using older SDKs may lead to deadlocks during execution of specific ETL graphs.
• Go to Integrated Solutions Console (http://localhost:9060/ibm/console/)
• Go to Applications → New Application → New Enterprise Application
Here select a WAR archive of the CloverETL server and deploy it to the application server, but do not start it.
4. Configure application class loading...
Note
Some CloverETL features using third-party libraries don't work properly on IBM WebSphere:
• Hadoop is guaranteed to run only on Oracle Java 1.6+, but Hadoop developers do make an effort to remove any Oracle/Sun-specific code. See Hadoop Java Versions on Hadoop Wiki.
It is accessible at http://localhost:4848/ by default.
• Go to Applications → Web Applications and click Deploy ...
• Upload WAR file with CloverETL server application or select the file from filesystem if it is present on the machine running Glassfish.
Configuration of CloverETL Server on JBoss AS (p. 22)

Installation of CloverETL Server

1. Get CloverETL Server web archive file (clover.war) that is built for JBoss AS.
2. Check if you meet prerequisites
• Oracle JDK or JRE (See Java Virtual Machine (p.
3. Create a separate JBoss server configuration
However it may be useful to use a specific JBoss server configuration, when it is necessary to run CloverETL:
• isolated from other JBoss applications
• with a different set of services
•...
Configuration of CloverETL Server on JBoss EAP (p. 25)

Installation of CloverETL Server

1. Get CloverETL Server web archive file (clover.war) which is built for JBoss EAP.
2. Check if you meet prerequisites
• Oracle JDK or JRE (See Java Virtual Machine (p.
In order to be able to connect to the database, one needs to define a global module so that the driver is available for the CloverETL web application - copying the driver to the lib/ext directory of the server will not work. Such a module is created and deployed in a few steps (the example is for MySQL and the module's name is mysql.driver...
<module name="mysql.driver" slot="main" />
</global-modules>
<spec-descriptor-property-replacement>false</spec-descriptor-property-replacement>
<jboss-descriptor-property-replacement>true</jboss-descriptor-property-replacement>
</subsystem>

6. Configure CloverETL Server according to a description in the next section (p. 25).
7. Deploy WAR file
Copy the file clover.war to [jboss-home]/standalone/deployments
8. Run [jboss-home]/bin/standalone.sh (or standalone.bat on Windows OS) to start the JBoss platform.
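The deployment step above (copying clover.war into the standalone/deployments directory) can be sketched as a shell helper. This is a minimal sketch: the function name deploy_clover_war and the example paths are hypothetical, and it only checks that the source file and the target directory exist before copying.

```shell
#!/bin/sh
# Copy the WAR into [jboss-home]/standalone/deployments.
# Arguments: path to clover.war, path to [jboss-home].
# The function name and the checks are assumptions made for this sketch.
deploy_clover_war() {
    war=$1
    deployments=$2/standalone/deployments
    if [ ! -f "$war" ]; then
        echo "WAR file not found: $war" >&2
        return 1
    fi
    if [ ! -d "$deployments" ]; then
        echo "deployments directory not found: $deployments" >&2
        return 1
    fi
    cp "$war" "$deployments/clover.war"
}

# Usage (placeholder paths):
#   deploy_clover_war /tmp/clover.war /opt/jboss-eap && /opt/jboss-eap/bin/standalone.sh
```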
). You can set the path to the license file, too.
• Alternatively, you can set "JDBC" datasource.type and configure the database connection to be managed directly by CloverETL Server (provided that you have deployed proper JDBC driver module to the server):
datasource.type=JDBC
jdbc.url=jdbc:mysql://localhost:3306/cloverServerDB...
Java Virtual Machine (p. 5) for required Java version.)
• WebLogic (CloverETL Server is tested with WebLogic Server 11g (10.3.6) and WebLogic Server 12c (12.1.2), see http://www.oracle.com/technetwork/middleware/ias/downloads/wls-main-097127.html)
WebLogic has to be running and a domain has to be configured. You can check it by connecting to...
• Set JAVA_OPTIONS variable in the WebLogic domain start script [domainHome]/startWebLogic.sh
JAVA_OPTIONS="${JAVA_OPTIONS} -Dclover_config_file=/path/to/clover-config.properties"
• This change requires restarting WebLogic.

Important
When CloverETL Server is deployed on WebLogic and a JNDI Datasource pointing to Oracle DB is used, there must be an extra config property in the config file:
quartz.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.weblogic.WebLogicOracleDelegate

Continue with: Installation of CloverETL Server License (p.
Installation of CloverETL Server License using a Web Form

If the CloverETL Server has been started without any license assigned, you can use the Add license form in the server GUI to install it. In this case the hyperlink No license available in system. Add new license is displayed on...
You can paste a license text into License key or use the Browse button to search for a license file in the filesystem. After clicking the Update button, the license is validated and saved to the database table clover_licenses. If the license is valid, a table with the license's description appears. To proceed to the CloverETL Server console, click Continue to server console.
Update of CloverETL Server License in the Configuration Section

If the license has already been installed, you can still change it by using the form in the server web GUI.
• Go to server web GUI → Configuration → CloverETL Info → License
• Click Update license.
If you assign more than one valid license, the most recent one is used.

Installation of CloverETL Server License using a license.file Property

1. Get the license.dat file.
2. Set the CloverETL Server license.file parameter to the full path to the license.dat file. See Chapter 9, List of Properties (p.
Server.
4. To verify that the plugin was loaded successfully, log in to the Server's Reporting Console and look in the Configuration > CloverETL Info > Plugins page. In the list of plugins you should see cloveretl.engine.initiate.
Possible Issues during Installation

Since CloverETL Server is considered a universal JEE application running on various application servers, databases and JVM implementations, problems may occur during the installation. These can be solved with a proper configuration of the server environment. This section contains tips for the configuration.
Apache Tomcat Context Parameters Do Not Have Any Effect

Tomcat may sometimes ignore some context parameters. This may cause unexpected CloverETL Server behaviour, since the server appears to be configured, but only partially: some parameters are accepted, some are ignored. Usually it works fine, however it may occur in some environments.
Failed to load webapp: Failed to load webapp: Context root /* is already bound. Cannot start application CloverETL

If you see this message, this is the case. The easiest way to get rid of the issue is to stop all other (sample) applications and leave only clover.war running on the server.
If you are setting environment variables like clover_license_file or clover_config_file, remember that you should not be running more than one CloverETL Server. If you ever need to run more instances at once, use other ways of setting parameters (see Part III, “Configuration” (p. 43) for a description of all possibilities). The reason is that an environment variable is shared by all applications in use, which causes them to share configurations and fail unexpectedly.
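One alternative to a shared environment variable is a per-instance JVM system property. The sketch below assumes that each instance runs in its own JVM with its own start script; the function name with_clover_config and the paths are hypothetical, while the -Dclover_config_file form is the one used in the WebLogic start script example.

```shell
#!/bin/sh
# Append a per-instance -Dclover_config_file system property to existing JVM options.
# Assumption of this sketch: every server instance is started by its own script,
# so each JVM receives its own value instead of a shared environment variable.
with_clover_config() {
    existing_opts=$1
    config_file=$2
    printf '%s\n' "${existing_opts:+$existing_opts }-Dclover_config_file=$config_file"
}

# Usage (placeholder path), e.g. in the start script of one instance:
#   JAVA_OPTIONS=$(with_clover_config "$JAVA_OPTIONS" /path/to/instance1/clover-config.properties)
```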
could not execute query
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'OPTION SQL_SELECT_LIMIT=DEFAULT' at line 1...
Thus the whole application server, together with the WARs and EARs running on it, shares one memory space. The default JVM memory settings are too low for running an application container with CloverETL Server. Some application servers, like IBM WebSphere, increase the JVM defaults themselves, however they may still be too low.
Therefore it is recommended to increase the limit for production systems. Reasonable limits vary from 10,000 to about 100,000 depending on the expected load of CloverETL Server and the complexity of your graphs. The current limit can be displayed in most UNIX-like systems using the ulimit -Hn command.
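A quick way to compare the current limit with the range suggested above is sketched here. The function name classify_fd_limit and the three labels are hypothetical; the lower bound of 10,000 is taken from the text, and no upper bound is enforced because the appropriate value depends on the expected load.

```shell
#!/bin/sh
# Classify a hard open-file limit against the lower bound suggested in the text.
# Prints "low" (below 10000), "ok" (10000 or more) or "unlimited".
# The function name and labels are assumptions made for this sketch.
classify_fd_limit() {
    limit=$1
    if [ "$limit" = "unlimited" ]; then
        echo "unlimited"
    elif [ "$limit" -lt 10000 ]; then
        echo "low"
    else
        echo "ok"
    fi
}

# Usage: classify the hard limit of the current shell.
#   classify_fd_limit "$(ulimit -Hn)"
```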
Examples of DB Connection Configuration (p. 54)
• Having a separate sandbox with a test graph that can be run anytime to verify that CloverETL Server runs correctly and allows for running jobs

Upgrade Instructions

1. Suspend all sandboxes, wait for running graphs to finish processing
2.
Chapter 5. Upgrading Server to Newer Version

9. Review that contents of all tabs in CloverETL Server Console, especially scheduling and event listeners looks
10. Update graphs to be compatible with the particular version of CloverETL Server (this should be prepared and tested in advance)
11. Resume the test sandbox and run a test graph to verify functionality...
Part III. Configuration

We recommend the default installation (without any configuration) only for evaluation purposes. For production use, we recommend configuring a dedicated database and properly configuring the SMTP server for sending notifications.
Source is a common properties file (text file with key-value pairs):
[property-key]=[property-value]
By default, CloverETL tries to find the config file [workingDir]/cloverServer.properties.

Properties File on Specified Location

The file has the same structure as in the case above, but its location is specified with a clover_config_file or clover.config.file environment variable or system property.
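To illustrate the key-value structure, the following sketch reads one value from such a properties file. It is a simplified reader, not the parser used by the server: the function name get_property is hypothetical, and line continuations and escaped characters are not handled.

```shell
#!/bin/sh
# Print the value of one key from a [property-key]=[property-value] file.
# Simplified: matches the text before the first "=" exactly and prints the rest
# of the line; line continuations and escapes are not handled.
# The function name is an assumption made for this sketch.
get_property() {
    file=$1
    key=$2
    awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$file"
}

# Usage (default file name taken from the text):
#   get_property ./cloverServer.properties jdbc.url
```

Only the first "=" separates key and value, so values that themselves contain "=" (such as JDBC URLs with parameters) are returned whole.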
Example for Apache Tomcat

On Tomcat, it is possible to specify context parameters in a context configuration file, [tomcat_home]/conf/Catalina/localhost/clover.xml, which is created automatically just after deployment of a CloverETL Server web application. You can specify a property by adding this element:
<Parameter name="[propertyName]"...
Chapter 7. Setup

CloverETL Server Setup helps you with the configuration of CloverETL server. Instead of typing the whole configuration file in a text editor, the Setup generates the content of the configuration file according to your instructions. It lets you set up License and configure Database Connection, LDAP Connection, SMTP Server Connection, Sandbox Paths, Encryption and Cluster Configuration.
See also Jetty (p. 16).

Glassfish
Add clover.config.file property in application server GUI (accessible on http://localhost:4848). The property can be added under Configuration → System Properties. See also Glassfish / Sun Java System Application Server (p. 20).

JBoss
See also JBoss Application Server (p.
Configuring Particular Items

Use Setup. Items configured in Setup are saved into a file defined with clover.config.file. If you need encryption, configure the Encryption first. Configure the connection to the database and then update the license. Later, you can configure other setup items. Some Setup items (Database and Cluster) require a restart of the application server.
Database

The Database tab lets you configure the connection to the database. You can connect via JDBC, or you can use JNDI to access the datasource on an application server level. In the latter case, choose a suitable item of the JNDI tree.

Sandboxes

The Sandboxes tab lets you configure the paths to sandboxes: shared, local, partitioned.
Encryption

The Encryption tab lets you enable encryption of sensitive items of the configuration file. You can choose an encryption provider and an encryption algorithm. An alternative encryption provider can be used; the libs have to be added to classpath.
LDAP

The LDAP tab lets you use an existing LDAP database for user authentication.
Firstly, you should specify the connection to the LDAP server. Secondly, define a pattern for user DN. The login can be validated using any user matching the pattern. See also LDAP Authentication (p. 92).

Cluster

The Cluster tab lets you configure clustering features.
Note
You can use the setup in a fresh installation of CloverETL Server, even if it has not been activated yet: log in to Server Console and use the Close button to access the menu.
Chapter 8. Examples of DB Connection Configuration

In standalone deployment (non-clustered), configuration of DB connection is optional, since embedded Apache Derby DB is used by default and it is sufficient for evaluation. However, configuration of external DB connection is strongly recommended for production deployment. It is possible to specify common JDBC DB connection attributes (URL, username, password) or JNDI location of DB DataSource.
This subdirectory will be created in the directory which is set as derby.system.home (or in the working directory if derby.system.home is not set). The value databases/cloverDb is the default; you may change it. A Derby JDBC 4 compliant driver is bundled with CloverETL Server, thus there is no need to add it to the classpath.
MySQL

CloverETL Server supports MySQL 5, up to version 5.5 included.
If you use a properties file for configuration, specify these parameters: jdbc.driverClassName, jdbc.url, jdbc.username, jdbc.password, jdbc.dialect. For example:
jdbc.driverClassName=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://127.0.0.1:3306/clover?useUnicode=true&characterEncoding=utf8
jdbc.username=root
jdbc.password=
jdbc.dialect=org.hibernate.dialect.MySQLDialect...
Database clover has to be created with a suitable PAGESIZE. DB2 has several possible values for this property: 4096, 8192, 16384 or 32768. CloverETL Server should work on a DB with PAGESIZE set to 16384 or 32768. If the PAGESIZE value is not set properly, there should be an error message in the log file after a failed CloverETL Server startup:
ERROR: DB2 SQL Error: SQLCODE=-286, SQLSTATE=42727, SQLERRMC=16384;...
DB2 does not allow ALTER TABLE which trims DB column length. This problem depends on DB2 configuration and we've experienced this only on some AS400s so far. CloverETL Server applies a set of DB patches during the first installation after application upgrade. Some of these patches may apply column modifications which trim the length of the text columns.
Please don't forget to add a JDBC 4 compliant driver to the classpath. A JDBC Driver which doesn't meet the JDBC 4 specification won't work properly. These are privileges which have to be granted to schema used by CloverETL Server:
CONNECT...
MS SQL

MS SQL requires configuration of DB server.
• Allowing of TCP/IP connection:
• execute tool SQL Server Configuration Manager
• go to Client protocols
• switch on TCP/IP (default port is 1433)
•...
PostgreSQL

If you use a properties file for configuration, specify these parameters: jdbc.driverClassName, jdbc.url, jdbc.username, jdbc.password, jdbc.dialect. For example:
jdbc.driverClassName=org.postgresql.Driver
jdbc.url=jdbc:postgresql://localhost/clover?charSet=UTF-8
jdbc.username=postgres
jdbc.password=
jdbc.dialect=org.hibernate.dialect.PostgreSQLDialect

Please don't forget to add a JDBC 4 compliant driver to the classpath. A JDBC Driver which doesn't meet the JDBC 4 specification won't work properly.
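Before starting the server, it can be useful to confirm that all five jdbc.* parameters named above are present in the properties file. The check below is a sketch only: the function name missing_jdbc_keys is hypothetical, and it tests for the presence of each key at the start of a line, not for the validity of its value.

```shell
#!/bin/sh
# Report which of the five jdbc.* properties are absent from a properties file.
# Prints one missing key per line; no output means all keys are present.
# An empty value (e.g. "jdbc.password=") still counts as present.
# The function name is an assumption made for this sketch.
missing_jdbc_keys() {
    file=$1
    for key in jdbc.driverClassName jdbc.url jdbc.username jdbc.password jdbc.dialect; do
        # Match "key=" at the start of a line; dots are escaped to match literally.
        pattern="^$(printf '%s' "$key" | sed 's/\./\\./g')="
        grep -q "$pattern" "$file" || echo "$key"
    done
}

# Usage (placeholder path):
#   missing_jdbc_keys /path/to/clover-config.properties
```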
JNDI DB DataSource

CloverETL Server can connect to a database using a JNDI DataSource, which is configured in the application server or container. However, there are some CloverETL parameters which must be set, otherwise the behaviour may be unpredictable:
datasource.type=JNDI # type of datasource;...
Encrypted JNDI on WebLogic (p. 68)

Encrypted JNDI on Tomcat

You need secure-cfg-tool to encrypt the passwords. Use the version of secure-cfg-tool corresponding to the version of CloverETL Server. Usage of the tool is described in Chapter 10, Secure Configuration Properties (p. 75).
type="javax.sql.DataSource"
driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://127.0.0.1:5432/clover410m1?charSet=UTF-8"
username="conf#Ws9IuHKo9h7hMjPllr31VxdI1A9LKIaYfGEUmLet9rA="
password="conf#Cj1v59Z5nCBHaktn6Ubgst4Iz69JLQ/q6/32Xwr/IEE="
maxActive="20"
maxIdle="10"
maxWait="-1"/>

Encrypted JNDI on Jetty 9 (9.2.6)

http://eclipse.org/jetty/documentation/current/configuring-security-secure-passwords.html

Configuration of a JNDI jdbc connection pool is stored in the plain text file, $JETTY_HOME/etc/jetty.xml.
<New id="MysqlDB" class="org.eclipse.jetty.plus.jndi.Resource">
<Arg></Arg>...
<datasources>
<local-tx-datasource>
<jndi-name>MysqlDS</jndi-name>
<connection-url>jdbc:mysql://127.0.0.1:3306/clover</connection-url>
<driver-class>com.mysql.jdbc.Driver</driver-class>
<user-name>user</user-name>
<password>password</password>
</local-tx-datasource>
</datasources>

Encrypt the data source password

Linux
java -cp client/jboss-logging.jar:lib/jbosssx.jar org.jboss.resource.security.SecureIdentityLoginModule password

Windows
java -cp client\jboss-logging.jar;lib\jbosssx.jar org.jboss.resource.security.SecureIdentityLoginModule password

NOTE: The JBoss documentation uses client/jboss-logging-spi.jar, but there is no such file in JBoss AS [6.0.0.Final "Neo"]; client/jboss-logging.jar can be used instead.
<max-pool-size>30</max-pool-size>
</pool>
<security>
<user-name>user</user-name>
<password>password</password>
</security>
</datasource>
<drivers>
<driver name="mysql" module="com.cloveretl.jdbc">
<driver-class>com.mysql.jdbc.Driver</driver-class>
</driver>
</drivers>
</datasources>

In the JBOSS_HOME directory, run the command:
java -cp modules/system/layers/base/org/picketbox/main/picketbox-4.0.19.SP2-redhat-1.jar:client/jboss-logging.jar

The command will return an encrypted password, e.g. 5dfc52b51bd35553df8592078de921bc.
Add a new security-domain to security-domains; the password value is the result of the command from the previous step.
</pool>
<security>
<security-domain>EncryptDBPassword</security-domain>
</security>
</datasource>
<drivers>
<driver name="mysql" module="com.cloveretl.jdbc">
<driver-class>com.mysql.jdbc.Driver</driver-class>
</driver>
</drivers>
</datasources>

The same mechanism can probably also be used for JMS.
http://middlewaremagic.com/jboss/?p=1026

Encrypted JNDI on Glassfish 3 (3.1.2.2)

Configuration of jdbc connection pool is stored in the plain text file, $DOMAIN/config/domain.xml.
The same mechanism can also be used for JMS connection. (Configuring an external JMS provider: https://www.ibm.com/developerworks/community/blogs/timdp/entry/using_activemq_as_a_jms_provider_in_websphere_application_server_7149?lang=en)

Encrypted JNDI on WebLogic

Password in a JNDI datasource file is encrypted by default when created by admin's web console (Service/Datasource).
Extensibility - CloverETL Engine Plugins (p. 197).

datasource.type
Set this explicitly to JNDI if you need CloverETL Server to connect to a DB using JNDI datasource. In such case, "datasource.jndiName" and "jdbc.dialect" parameters must be set properly.
Default: JDBC
Possible values: JNDI...
"datasource.type" is set to "JNDI". clover_server

jdbc.driverClassName
class name for jdbc driver name

jdbc.url
jdbc url used by CloverETL Server to store data

jdbc.username
jdbc database user name

jdbc.password
jdbc database password

jdbc.dialect
hibernate dialect to use in ORM

quartz.driverDelegateClass...
Switch whether the A1 Digest for HTTP Digest Authentication should be generated and stored or not. Since there is no CloverETL Server API using the HTTP Digest Authentication by default, it's recommended to keep it disabled. This option is not automatically enabled when any feature is specified security.digest_authentication.features_list...
Max number of records deleted in one batch. It is used for deleting of archived run records.

launch.http_header_prefix
Prefix of HTTP headers added by launch services to the HTTP response.
Default: X-cloveretl

task.archivator.
Prefix of archive files created by the archivator.
Users are strongly discouraged from modifying this property. The property name has changed since CloverETL 4.2; however, the obsolete name is still accepted to maintain backwards compatibility.
Chapter 9. List of Properties

Table 9.2. Defaults for job execution configuration - see Job Config Properties (p. 108) for details

executor.tracking_interval
An interval in milliseconds for scanning of a current status of a running graph. The shorter interval, the bigger log file.
Default: 2000

executor.log_level
Log level of graph runs.
Basic Utility Usage

1. Get a utility archive file (secure-cfg-tool.zip) and unzip it. The utility is available in the download section of your CloverETL account - at the same location as the download of CloverETL Server.
2. Execute the script given for your operating system, encrypt.bat for MS Windows, encrypt.sh for Linux.
Chapter 10. Secure Configuration Properties

Important
Values encrypted by a Secure parameter form (Chapter 13, Secure Parameters (p. 88)) cannot be used as a value of a configuration property.

Advanced Usage - Custom Settings

The encryption of configuration values described so far uses the default configuration settings (a default provider and algorithm).
Configuring an application server

CloverETL Server application needs to know how the values have been encrypted, therefore the properties must be passed to the server (see details in Part III, “Configuration” (p. 43)). For example:...
security.config_properties.encryptor.providerClassName=org.bouncycastle.jce.provider.BouncyCastleProvider
security.config_properties.encryptor.algorithm=PBEWITHSHA256AND256BITAES-CBC-BC

Important
If a third-party provider is used, its classes must be accessible for the application server. Property security.config_properties.encryptor.providerLocation will be ignored.
(p. 80)

Main Logs

The CloverETL Server uses the log4j library for logging. The WAR file contains the default log4j configuration. The log4j configuration file log4j.xml is placed in WEB-INF/classes directory. By default, log files are produced in the directory specified by system property "java.io.tmpdir" in the cloverlogs subdirectory.
By default, these log files are saved in the subdirectory cloverLogs/graph in the directory specified by "java.io.tmpdir" system property. It’s possible to specify a different location for these logs with the CloverETL "graph.logs_path" property. This property does not influence main Server logs.
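The lookup order described above can be sketched as follows. This is an illustration of the rule only, not the server's own logic: the function name graph_logs_dir is hypothetical, the first argument stands for the value of the "java.io.tmpdir" system property, and the second for the value of "graph.logs_path" (empty when the property is not set).

```shell
#!/bin/sh
# Resolve the directory for graph run logs:
# an explicit graph.logs_path value wins; otherwise the default is
# cloverLogs/graph under the java.io.tmpdir directory.
# The function name is an assumption made for this sketch.
graph_logs_dir() {
    tmpdir=$1       # value of the java.io.tmpdir system property
    logs_path=$2    # value of graph.logs_path, may be empty
    if [ -n "$logs_path" ]; then
        printf '%s\n' "$logs_path"
    else
        # Strip one trailing slash so the result has no doubled separator.
        printf '%s\n' "${tmpdir%/}/cloverLogs/graph"
    fi
}

# Usage:
#   graph_logs_dir /tmp ""
```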
Chapter 12. Temp Space Management

Many of the components available in the CloverETL Server require temporary files or directories in order to work correctly. Temp space is a physical location on the file system where these files or directories are created and maintained.
(p. 86)

Initialization

When CloverETL Server is starting, the system checks the temp space configuration: if no temp space is configured, a new default temp space is created in the directory that the java.io.tmpdir system property points to. The directory is named as follows:
•...
Figure 12.2. Newly added global temp space.

Using environment variables and system properties

Environment variables and system properties can be used in the temp space path as placeholders; they can be arbitrarily combined, and the resolved paths for each node may differ according to its configuration.

Note
The environment variables have higher priority than system properties of the same name.
Figure 12.3. Temp spaces using environment variables and system properties

Disabling Temp Space

To disable a temp space, click on the "Disable" link in the panel. Once the temp space has been disabled, no new temporary files will be created in it, but the files already created may still be used by running jobs. If there are files left from previous or current job executions, a notification is displayed.
Figure 12.4. Disable operation reports action performed

Enabling Temp Space

To enable a temp space click on "Enable" link in the panel. Enabled temp space is active, i.e. available for temporary files and directories creation.

Removing Temp Space

To remove a temp space click on "Remove"...
Figure 12.5. Remove operation asks for confirmation in case there are data present in the temp space...
Secure parameters are automatically decrypted by server in graph runtime. A parameter value can also be encrypted in the CloverETL Server Console in the Configuration > Secure Parameters page - use the Encrypt text section.

Figure 13.2. Graph parameters tab with initialized master password

If you change the master password, the secure parameters encrypted using the old master password cannot be decrypted correctly anymore.
Castle JCE provider. Another provider would be installed similarly.
1. Download Bouncy Castle provider jar (e.g. bcprov-jdk15on-150.jar) from http://bouncycastle.org/latest_releases.html
2. Add the jar to the classpath of your application container running CloverETL Server, e.g. to directory WEB-INF/lib
3. Set value security.job_parameters.encryptor.providerClassName...
Chapter 14. Users and Groups

The CloverETL Server has a built-in security module that manages users and groups. User groups control access permissions to sandboxes and operations the users can perform on the Server, including authenticated calls to Server API functions. A single user can belong to multiple groups.
Each user, even though logged in using LDAP authentication, must have his own "user" record (with related groups) in the CloverETL security module. So there must be a user with the same username and the domain set to "LDAP". Such a record has to be created by a Server administrator before the user can log in.
Basic LDAP connection properties

# Implementation of context factory
security.ldap.ctx_factory=com.sun.jndi.ldap.LdapCtxFactory
# URL of LDAP server
security.ldap.url=ldap://hostname:port
# User DN pattern that will be used to create LDAP user DN from login name.
security.ldap.user_dn_pattern=uid=${username},dc=company,dc=com

Depending on the LDAP server configuration, the property security.ldap.user_dn_pattern can be a pattern for the user's actual distinguished name in the LDAP directory, or just the login name - in such case just set the property to ${username}.
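How a login name turns into a user DN through the pattern can be illustrated with a small substitution helper. This is a sketch of the idea only, not the server's implementation: the function name ldap_user_dn is hypothetical, and it simply replaces every literal ${username} in the pattern with the login name.

```shell
#!/bin/sh
# Replace every literal ${username} placeholder in a DN pattern with a login name.
# The function name is an assumption made for this sketch.
ldap_user_dn() {
    pattern=$1
    user=$2
    placeholder='${username}'
    out=
    while :; do
        case $pattern in
            *"$placeholder"*)
                # Keep the text before the first placeholder, then append the login.
                out=$out${pattern%%"$placeholder"*}$user
                # Continue with the text after that placeholder.
                pattern=${pattern#*"$placeholder"}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$out$pattern"
}

# Usage (single quotes keep the shell from expanding the placeholder itself):
#   ldap_user_dn 'uid=${username},dc=company,dc=com' jdoe
```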
security.ldap.user_search.filter=(uid=${username})
# Scope specifies type of search in "base". There are three possible values: SUBTREE | ONELEVEL | OBJECT
# http://download.oracle.com/javase/6/docs/api/javax/naming/directory/SearchControls.html
security.ldap.user_search.scope=SUBTREE

Following properties are names of attributes from the search defined above. They are used for getting basic info about the LDAP user in case the user record has to be created/updated by Clover security module: (step [6] in the login process above)
security.ldap.user_search.attribute.firstname=fn...
First name Last name E-mail E-mail address which may be used by CloverETL administrator or by CloverETL server for automatic notifications. See Task - Send Email (p. 153) for details.
Edit user record
A user with permission "Create user" or "Edit user" can use this form to set basic user parameters.
Figure 14.2. Web GUI - edit user
Change user's password
If a user loses his password, a new one must be set. A user with permission "Change passwords" can use this form to do it.
Figure 14.4. Web GUI - groups assignment
Disabling / enabling users
Since a user record has various relations to the logs and history records, it can't be deleted, so it is disabled instead. This means that the record isn't displayed in the list and the user can't log in. However, a disabled user may be enabled again.
Every single CloverETL user is assigned to this group by default. It is possible to remove user from this group, but it is not a recommended approach. This group is useful for some permissions to sandbox or some operation, which you would like to make accessible for all users without exceptions.
Figure 14.6. Web GUI - users assignment
Groups permissions
Group permissions are structured as a tree, where permissions are inherited from the root to the leaves. Thus if some permission (tree node) is enabled (blue dot), all permissions in its subtree are automatically enabled (white dot). Permissions with a red cross are disabled.
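The inheritance rule can be modelled as a simple tree walk. The following Python sketch is not taken from the Server; the class, the function and the permission names are all hypothetical. It only mirrors the three states described above: explicitly enabled (blue dot), enabled by inheritance (white dot) and disabled (red cross).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Permission:
    """One node of the permission tree."""
    name: str
    enabled: bool = False  # explicitly enabled on this node
    children: List["Permission"] = field(default_factory=list)


def effective_state(node: Permission, inherited: bool = False,
                    out: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Map each permission name to 'explicit', 'inherited' or 'disabled'."""
    if out is None:
        out = {}
    if node.enabled:
        out[node.name] = "explicit"
    elif inherited:
        out[node.name] = "inherited"
    else:
        out[node.name] = "disabled"
    # Everything below an enabled node is enabled as well.
    for child in node.children:
        effective_state(child, inherited or node.enabled, out)
    return out


# Hypothetical permission names, for illustration only.
tree = Permission("all", children=[
    Permission("sandboxes", enabled=True, children=[
        Permission("read sandbox"),
        Permission("write sandbox"),
    ]),
    Permission("users"),
])

for name, state in effective_state(tree).items():
    print(f"{name}: {state}")
```

Enabling a node close to the root is therefore a broad grant; enabling individual leaves keeps a group's rights narrow.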
Server and accessed remotely. Nonetheless, you can do everything with Server Projects the same way as with local projects – copy and paste files, create, edit, and debug graphs, etcetera. See the CloverETL Designer manual for details on configuring a connection to the Server.
Instead of the absolute path, it's recommended to use ${sandboxes.home} placeholder, which may be configurable in the CloverETL Server configuration. So e.g. for the sandbox with ID "dataReports" the specified value of the "root path"...
(ETL graph or Jobflow). • sandbox:// URLs Sandbox URL allows user to reference the resource from different sandboxes with standalone CloverETL Server or the cluster. In cluster environment, CloverETL Server transparently manages remote streaming if the resource is accessible only on some specific cluster node.
Other users may have access according to sandbox settings.
Figure 15.2. Sandbox Permissions in CloverETL Server Web GUI
Permissions to a specific sandbox are modifiable in the Permissions tab in the sandbox detail. In this tab, selected user groups may be allowed to perform particular operations.
Chapter 15. Server Side Job Files - Sandboxes Sandbox Content Sandbox should contain jobflows, graphs, metadata, external connection and all related files. Files, especially graph or jobflow files, are identified by relative path from sandbox root. Thus you need two values to identify specific job file: sandbox and path in sandbox.
Chapter 15. Server Side Job Files - Sandboxes Figure 15.5. Web GUI - download sandbox as ZIP Upload ZIP to sandbox Select a sandbox in left panel. You must have write permission to the selected sandbox. Then select tab "Upload ZIP"...
Figure 15.7. Web GUI - upload ZIP results
Table 15.3. ZIP upload parameters
Label | Description
Encoding of packed file names | File names which contain special characters (non ASCII) are encoded. With this select box, you choose the right encoding, so filenames are decoded properly.
Download file HTTP API
It is possible to download/view a sandbox file by accessing the "download servlet" with a simple HTTP GET request:
http://[host]:[port]/[Clover Context]/downloadFile?[Parameters]
The Server requires BASIC HTTP Authentication. Thus, with the Linux command line HTTP client "wget", it would look like this:
wget --user=clover --password=clover http://localhost:8080/clover/downloadFile?sandbox=default\&file=data-out/data.dat...
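The same request can be prepared from a script. The sketch below uses only the Python standard library and reuses the host, credentials, sandbox and file from the wget example above; the function names are hypothetical. It builds the URL and the Basic Authentication header, and the actual download is left in a function that is not called here.

```python
import base64
import urllib.request
from urllib.parse import urlencode


def download_url(base: str, sandbox: str, file_path: str) -> str:
    """Build the downloadFile URL; '/' is kept readable in the query."""
    query = urlencode({"sandbox": sandbox, "file": file_path}, safe="/")
    return f"{base}/downloadFile?{query}"


def build_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Attach an HTTP Basic Authentication header to a GET request."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch(request: urllib.request.Request) -> bytes:
    """Send the request and return the file content (needs a running Server)."""
    with urllib.request.urlopen(request) as response:
        return response.read()


url = download_url("http://localhost:8080/clover", "default", "data-out/data.dat")
request = build_request(url, "clover", "clover")
print(url)
```

Building the query with urlencode avoids the shell escaping of "&" that the wget call needs.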
Chapter 15. Server Side Job Files - Sandboxes Job Config Properties Each ETL graph or Jobflow may have set of config properties, which are applied during the execution. Properties are editable in web GUI section "sandboxes". Select job file and go to tab "Config properties". The same config properties are editable even for each sandbox.
Property name Default value Description "DEFAULT_PATH_SEPARATOR_REGEX". A directory path must always end with a slash character "/", otherwise the ClassLoader doesn't recognize that it's a directory. The Server always automatically adds the "trans" subdirectory of the job's sandbox, so it doesn't have to be added explicitly.
Chapter 15. Server Side Job Files - Sandboxes Property name Default value Description graph from the server console sets the debug_mode to false. delete_obsolete_temp_files false If true, system will remove temporary files produced during previous finished runs of the respective job. This property is useful together with enabled debug mode ensuring that obsolete debug files from previous runs of a job are removed from temp...
WebDAV Authentication/Authorization CloverETL Server WebDAV API uses the HTTP Basic Authentication by default. However it may be reconfigured to use HTTP Digest Authentication. Please see Part III, “Configuration” (p. 43) for details. Digest Authentication may be useful, since some WebDAV clients can't work with HTTP Basic Authentication,...
HTTP Digest Authentication is a feature added in version 3.1. If you upgraded an older CloverETL Server distribution, users created before the upgrade cannot use HTTP Digest Authentication until they reset their passwords.
Monitoring section in the server Web GUI displays useful information about current performance of the standalone CloverETL Server or all cluster nodes if the clustering is enabled. Monitoring section of the standalone server has slightly different design from cluster environment. In case of standalone server, the server-view is the same as node detail in cluster environment.
Chapter 16. CloverETL Server Monitoring Performance The Performance panel contains a chart with two basic performance statistics: a number of running jobs and an amount of used heap memory. The graph displays values gathered within a specific interval. The interval can be set up with the combo box above the graph or it can be configured by "cluster.node.sendinfo.history.interval"...
Chapter 16. CloverETL Server Monitoring Figure 16.5. Running jobs System System panel contains info about operating system and license. Figure 16.6. System Status History Status history panel displays node statuses history since restart of the server. Figure 16.7. Status History User's Access User's Access panel lists info about activities on files performed by users.
Status panel displays current node status since last server restart. It displays current server status (ready, stopped, ...), exact Java version, exact CloverETL Server version, way of access to database, URLs for synchronous and asynchronous messaging, available heap and non-heap memory, etc.
Chapter 16. CloverETL Server Monitoring Threads Threads panel lists java threads and their states. Figure 16.12. Threads Quartz Quartz panel lists scheduled actions: their name, description, start time, end time, time of previous event, time of next event and expected final event.
Cluster Overview
Cluster overview displays info collected from all cluster nodes. The info is grouped in several panels:
• List of nodes with a toolbar - allows manipulation with selected nodes
• Status history - Displays last 10 status changes for all cluster nodes
•...
Chapter 16. CloverETL Server Monitoring Node Detail Node Detail is similar to the "Standalone server detail" mentioned above, however it displays detail info about node selected in the tree on the left. Figure 16.15. Node detail...
• CLUSTER - Only cluster-related messages are visible in this log
• LAUNCH_SERVICES - Only requests for launch services
• AUDIT - Detail logging of operations called on the CloverETL Server core. Since the full logging may affect server performance, it's disabled by default. See Server Audit Logs (p.
Chapter 17. Server Configuration Migration CloverETL Server provides means to migrate its configuration (e.g. event listeners, schedules etc.) or parts of the configuration between separate instances of the server. A typical use case is deployment from test environment to production - this involves not only deployment of CloverETL graphs, but also copying parts of configuration such as file event listeners etc.
XSD schema. The schema for a configuration XML document can be found at http://[host]:[port]/[contextPath]/schemas/clover-server-config.xsd. The XML file contains selected items of the CloverETL Server instance. The file can be modified before the import to another server instance, for example to import schedules only.
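Trimming an exported file before import can be scripted. The following Python sketch keeps only selected top-level sections of a configuration document. The namespace and the usersList element are the ones that appear in the manual's Example 17.1; the helper name keep_sections and the placeholder element hypotheticalSection are assumptions made for illustration, and the real section names should be taken from the XSD schema mentioned above.

```python
import xml.etree.ElementTree as ET

NS = "http://cloveretl.com/server/data"  # namespace used in Example 17.1
ET.register_namespace("", NS)            # keep it as the default namespace on output


def keep_sections(xml_text: str, wanted: set) -> str:
    """Drop every top-level section whose local name is not in 'wanted'."""
    root = ET.fromstring(xml_text)
    for child in list(root):
        local_name = child.tag.split("}", 1)[-1]  # strip the '{namespace}' prefix
        if local_name not in wanted:
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


# 'hypotheticalSection' is a made-up placeholder, not a real schema element.
SAMPLE = """<cloverServerConfiguration xmlns="http://cloveretl.com/server/data" timeZone="Europe/Berlin">
  <usersList><user disabled="false"><username>johnsmith</username></user></usersList>
  <hypotheticalSection/>
</cloverServerConfiguration>"""

filtered = keep_sections(SAMPLE, {"usersList"})
print(filtered)
```

A file edited this way is best checked with a dry-run import before any changes are committed.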
Configuration Import Process Uploading Configuration The first step in the configuration import is to upload the XML file to the CloverETL server. After clicking on CloverETL Configuration File button a window is opened where user can select an XML file with the configuration to import.
• Changes only button will display only items that have been either added or actually changed by the update
• All updates button will display all imported items, even those identical to already present ones
Example 17.1. Example of simple configuration defining one new server user.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cloverServerConfiguration xmlns="http://cloveretl.com/server/data" timeZone="Europe/Berlin">
<usersList>
<user disabled="false">
<username>johnsmith</username>...
<groupCode>job_managers</groupCode>
</userGroups>
</user>
</usersList>
</cloverServerConfiguration>
Figure 17.4. Outcome of the import preview for configuration from Example 17.1 (p. 124)
The Summary in the Import Log says whether the dry run was successful. Should there be any problems with the imported items, the item is displayed along with the cause of the error (see Figure 17.4 (p.
User is notified about these problems in Import Log with link to the problematic item. One should check such items in appropriate section of the CloverETL Server console and change their settings to fix the issue or remove them.
Chapter 18. Diagnostics
CloverETL Server allows you to create a thread dump or a heap dump. Thread and heap dumps are useful for investigating performance and memory issues. In the server GUI, go to Configuration → System Info → Diagnostics.
Heap Dump
A heap dump is the content of a JVM process memory stored in a binary file.
Chapter 19. Graph/Jobflow Parameters The CloverETL Server passes a set of parameters to each graph or jobflow execution. Keep in mind that ${paramName} placeholders (parameters) are resolved only during the initialization (loading of XML definition file), so if you need the parameters to be resolved for each job execution, you cannot set the job to be pooled.
Parameters by Execution Type
Additional parameters are passed to the graph depending on how the graph is executed.
Executed from Web GUI
Graphs executed from the Web GUI have no additional parameters.
Executed by Launch Service Invocation
Service parameters which have the Pass to graph attribute enabled are passed to the graph not only as "dictionary"
ALL parameters from a "source" job are passed to the executed job. This switch is implemented for backwards compatibility. Regarding the default behaviour: in the editor of a graph event listener, you can specify a list of parameters to pass.
Chapter 20. Manual Task Execution Since 3.1 Manual task execution allows you to invoke a task directly with an immediate effect, without defining and triggering an event. There are a number of task types that are usually associated with a triggering event, such as a file listener or a graph/jobflow listener.
Chapter 21. Scheduling The scheduling module allows you to create a time schedule for operations you need to trigger in a repetitive or timely manner. Similar to “cron” from Unix systems, each schedule represents a separate time schedule definition and a task to perform.
Timetable Setting
This section describes how to specify when a schedule should be triggered. Please keep in mind that exact trigger times are not guaranteed; there may be a delay of a couple of seconds. The schedule itself can be specified in different ways.
Chapter 21. Scheduling Figure 21.3. Web GUI - schedule form - calendar Periodical schedule by Interval This type of schedule is the simplest periodical type. Trigger times are specified by these attributes: Table 21.2. Periodical schedule attributes Type "periodic" Periodicity "interval"...
Chapter 21. Scheduling Figure 21.4. Web GUI - periodical schedule form Periodical schedule by timetable (Cron Expression) Timetable is specified by powerful (but a little bit tricky) cron expression. Table 21.3. Cron periodical schedule attributes Type "periodic" Periodicity "interval" Not active before date/time Date and time, specified with minutes precision.
Chapter 21. Scheduling Tasks Task basically specifies WHAT to do at trigger time. There are several tasks implemented for schedule and for graph event listener as follows: • Task - Execution of Graph (p. 138) • Task - Execution of Jobflow (p.
Table 21.4. Attributes of "Graph execution" task
Task type | "Start a graph"
Node IDs to process the task | This attribute is accessible only in the cluster environment. It's a comma-separated list of node IDs which may process the task. If it's empty, it may be any node; if nodes are specified, the task will be processed on the first node which is online and ready.
Chapter 21. Scheduling Figure 21.6. Web GUI - Graph execution task Task - Execution of Jobflow Please note that behaviour of this task type is almost the same as Task - Execution of Graph (p. 138)
Table 21.5. Attributes of "Jobflow execution" task
Task type | "Start a jobflow"
Node IDs to process the task | This attribute is accessible only in the cluster environment. It's a comma-separated list of node IDs which may process the task. If it's empty, it may be any node; if nodes are specified, the task will be processed on the first node which is online and ready.
Table 21.6. Attributes of "Abort job" task
Task type | "Abort job"
Node IDs to process the task | This attribute is accessible only in the cluster environment. It's a comma-separated list of node IDs which may process the task. If it's empty, it may be any node; if nodes are specified, the task will be processed on the first node which is online and ready.
IDs which may process the task. If it's empty, it may be any node, if there are nodes specified, the task will be processed on the first node which is online and ready. CloverETL Server contains Groovy version 2.0.0 Table 21.8. List of variables available in Groovy code...
CloverETL Server
serverFacade | com.cloveretl.server.facade.api.ServerFacade | Reference to the facade interface. Useful for calling CloverETL Server core. WAR file contains JavaDoc of facade API and it is accessible on URL: http://host:port/clover/javadoc/index.html | every time
sessionToken | String | Valid session token of the user who owns the event. | every time
Table 21.9. Attributes of "Archivator" task
Task type | "Archivator"
Node IDs to process the task | This attribute is accessible only in the cluster environment. It's a comma-separated list of node IDs which may process the task. If it's empty, it may be any node; if nodes are specified, the task will be processed on the first node which is online and ready.
Chapter 22. Viewing Job Runs - Executions History Executions History shows the history of all jobs that the Server has executed – transformation graphs, jobflows, and Data Profiler jobs. You can use it to find out why a job failed, see the parameters that were used for a specific run, and much more.
Figure 22.2. Executions History - overall perspective
Since the detail panel and especially job logs may be wide, it may be useful to hide the table on the left, so that the detail panel can expand.
Chapter 23. Listeners
Listeners can be seen as hooks. They wait for a specific event and take a user-defined action if the event occurs. The event is specific to the particular listener (Graph Event Listeners (p. 151), Jobflow Event Listeners (p.
(Jobflow Event Listeners (p. 160)) – for CloverETL Server both are simply “jobs”. In the Cluster, the event and the associated task are executed on the same node the job was executed on by default. If the graph is distributed, the task will be executed on the master worker node. However, you can override where the task will be executed by explicitly specifying Node IDs in the task definition.
graph timeout
A graph timeout event is created when a graph runs longer than a specified interval. Thus you must specify a "Job timeout interval" attribute for each listener of a graph timeout event. You can specify this interval in seconds, minutes or hours.
Chapter 23. Listeners Note: You can use task of any type for both scheduling and graph event listener. Description of task types is divided into two sections just to show the most obvious use cases. In the Cluster environment, all tasks have an additional attribute "Node IDs to process the task". It's the comma separated list of cluster nodes, which may process the task.
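How the "Node IDs to process the task" attribute narrows the choice of node can be sketched as follows. This follows the rule given in the task attribute tables (an empty list means any node; otherwise the first listed node that is online and ready). The function name and node IDs are hypothetical, and the choice made for an empty list is an assumption, since the manual does not say how the Server picks among all nodes.

```python
from typing import List, Optional


def pick_node(node_ids_attr: str, ready_nodes: List[str]) -> Optional[str]:
    """Return the node that should process the task, or None if none can."""
    allowed = [part.strip() for part in node_ids_attr.split(",") if part.strip()]
    if not allowed:
        # Empty attribute: any node may process the task.
        # Assumption: simply take the first ready node.
        return ready_nodes[0] if ready_nodes else None
    ready = set(ready_nodes)
    for node_id in allowed:
        # The first listed node that is online and ready wins.
        if node_id in ready:
            return node_id
    return None


print(pick_node("node2, node1", ["node1", "node3"]))
print(pick_node("", ["node3"]))
print(pick_node("node9", ["node1"]))
```

The order of IDs in the attribute therefore acts as a preference list.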
Chapter 23. Listeners Figure 23.2. Web GUI - send e-mail Note: Do not forget to configure connection to SMTP server (See Part III, “Configuration” (p. 43) for details). Placeholders Placeholder may be used in some fields of tasks. They are especially useful for e-mail tasks, where you can generate content of e-mail according to context variables.
Some of them may be empty depending on the type of event. E.g., if a task is processed because of a graph event, then the run and sandbox variables contain related data; otherwise they are empty.
Table 23.2. Placeholders useful in e-mail templates
Variable name Contains Current date-time...
Table 23.3. Attributes of JMS message task
Task type | "JMS message"
Initial context class name | A full class name of javax.naming.InitialContext implementation. Each JMS provider has its own implementation. E.g., for Apache ActiveMQ it is "org.apache.activemq.jndi.ActiveMQInitialContextFactory". If it is empty, the server uses the default initial context.
Chapter 23. Listeners Use Cases Possible use cases are the following: • Execute graphs in chain (p. 157) • Email notification about graph failure (p. 158) • Email notification about graph success (p. 158) • Backup of data processed by graph (p.
(p. 152)) in many ways, since ETL Graphs and Jobflows are both "jobs" from the point of view of the CloverETL Server. In the Cluster, the event and the associated task are executed on the same node the job was executed on. If the jobflow is distributed, the task will be executed on the master worker node.
jobflow timeout
A jobflow timeout event is created when a jobflow runs longer than a specified interval. Thus you have to specify a "Job timeout interval" attribute for each listener of a jobflow timeout event. You can specify this interval in seconds, minutes or hours.
Oracle website: http://docs.oracle.com/javaee/6/tutorial/doc/bncdq.html Note that the JMS implementation is dependent on the application server that the CloverETL Server is running in. In Cluster, you can either explicitly specify which node will listen to JMS or not. If unspecified, all nodes will register as listeners.
Chapter 23. Listeners Attribute Description URL of a JMS message broker Durable subscriber (only If it is false, message consumer is connected to the broker as "non-durable", so for Topics) it receives only messages which are sent while the connection is active. Other messages are lost.
ServletContext javax.jms.Message instance of a JMS message com.cloveretl.server.api.ServerFacade serverFacade instance of serverFacade usable for calling CloverETL Server core features. String sessionToken sessionToken, needed for calling serverFacade methods Message data available for further processing A JMS message is processed and the data it contains is stored into two data structures: Properties and Data.
Chapter 23. Listeners Table 23.6. Properties Elements description JMS_PROP_[property key] For each message property is created one entry, where "key" is made of a "JMS_PROP_" prefix and property key. JMS_MAP_[map entry key] If the message is instance of MapMessage, for each map entry is created one entry, where "key"...
Chapter 23. Listeners The “Data” container is passed to a task that can use it, depending on its implementation. For example, the task "execute graph" passes it to the executed graph as “dictionary entries.” In the Cluster environment, you can specify explicitly node IDs, which can execute the task. However, if the “data” payload is not serializable and the receiving and executing node differ, an error will be thrown as the Cluster cannot pass the “data”...
For example, you can continually check for essential data sources before starting a graph. Or, you can do complex checks of a running graph and, for example, decide to kill it if necessary. You can even call the CloverETL Server core functions using the ServerFacade interface, see Javadoc: http://host:port/clover/javadoc/index.html...
ServletContext com.cloveretl.server.api.ServerFacade serverFacade instance of serverFacade usable for calling CloverETL Server core features. String sessionToken sessionToken, needed for calling serverFacade methods...
Chapter 23. Listeners File event listeners Since 1.3 File Event Listeners allow you to monitor changes on a specific file system path – for example, new files appearing in a folder – and react to such an event with a predefined task. You can either specify an exact path or use a wildcard, then set a checking interval in seconds, and finally, define a task to process the event.
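The checking cycle described above (observe a path, match names against a wildcard, compare with the previous check) can be pictured with a short polling sketch. It is written with the Python standard library only and is not the Server's implementation; the function names and file names are hypothetical, and a real listener would repeat the comparison once per checking interval.

```python
import fnmatch
import os
import tempfile
from typing import Dict, List, Tuple

Snapshot = Dict[str, Tuple[int, float]]  # file name -> (size, modification time)


def take_snapshot(directory: str, pattern: str) -> Snapshot:
    """Record size and timestamp of every file matching the wildcard."""
    result: Snapshot = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and fnmatch.fnmatch(entry.name, pattern):
                info = entry.stat()
                result[entry.name] = (info.st_size, info.st_mtime)
    return result


def appeared(previous: Snapshot, current: Snapshot) -> List[str]:
    """Names present in the current check but not in the previous one."""
    return sorted(set(current) - set(previous))


# Two consecutive checks of a temporary directory standing in for the observed path.
with tempfile.TemporaryDirectory() as observed_dir:
    before = take_snapshot(observed_dir, "*.csv")
    for name in ("orders.csv", "notes.txt"):
        with open(os.path.join(observed_dir, name), "w") as handle:
            handle.write("x")
    after = take_snapshot(observed_dir, "*.csv")
    for name in appeared(before, after):
        print(f"file appeared: {name}")
```

Because the snapshot also stores size and timestamp, the same comparison could be extended to detect changed files.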
Observed File
The observed file is specified by a directory path and a file name pattern. The user may specify just one exact file name, or a file name pattern for observing several matching files in the specified directory. If more than one changed file matches the pattern, a separate event is triggered for each of these files. There are three ways to specify the file name pattern of the observed file(s) •...
CloverETL Server may detect it. File moving/renaming should be atomic operation. Event of this type does not occur when the file has been updated (change of timestamp or size) between two checks.
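One common way for a producer to satisfy the atomicity requirement is to write the file somewhere else first and then rename it into the observed directory. The sketch below assumes a staging directory on the same file system as the observed one (a rename within one file system is atomic on POSIX systems); the function name, directory layout and file name are hypothetical.

```python
import os
import tempfile


def publish_atomically(data: bytes, staging_dir: str, observed_dir: str,
                       file_name: str) -> str:
    """Write the data in a staging directory, then move it into place in one step."""
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
        tmp.flush()
        os.fsync(tmp.fileno())  # make sure the content is on disk before the move
    final_path = os.path.join(observed_dir, file_name)
    # Assumption: both directories are on the same file system,
    # so the rename is a single atomic operation.
    os.replace(tmp_path, final_path)
    return final_path


with tempfile.TemporaryDirectory() as root:
    staging = os.path.join(root, "staging")
    observed = os.path.join(root, "observed")
    os.mkdir(staging)
    os.mkdir(observed)
    path = publish_atomically(b"id;amount\n1;10\n", staging, observed, "orders.csv")
    print(os.path.basename(path), os.path.getsize(path))
```

With this approach a listener never sees a half-written file, because the name only becomes visible once the content is complete.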
Chapter 24. API
Simple HTTP API
The Simple HTTP API is a basic Server automation tool that lets you control the Server from external applications using simple HTTP calls. Most operations are accessible using the HTTP GET method and return plain text. Thus, both “request” and “response”
Chapter 24. API Operation graph_run Call this operation to start execution of the specified job. The operation is called graph_run for backward compatibility, however it may execute ETL graph, jobflow or profiler job. parameters Table 24.1. Parameters of graph_run parameter name mandatory default description...
Description is returned as plain text with a pipe as a separator, or as XML. A schema describing the format of the XML response is accessible on the CloverETL Server URL: http://[host]:[port]/clover/schemas/executions.xsd
Depending on the waitForStatus parameter, it may return the result immediately or wait for a specified status.
http://localhost:8080/clover/request_processor/graph_kill?runID=123456&returnType=DESCRIPTION
Operation server_jobs
parameters
returns
List of runIDs of currently running jobs.
example
http://localhost:8080/clover/request_processor/server_jobs
Operation sandbox_list
parameters
returns
List of all sandbox text IDs. In next versions will return only accessible ones.
example
http://localhost:8080/clover/request_processor/sandbox_list
Operation sandbox_content
parameters
Table 24.4.
Chapter 24. API Table 24.5. Parameters of executions_history parameter name mandatory default description sandbox text ID of sandbox from Lower datetime limit of start of execution. The operation will return only records after (and equal) this datetime. Format: "yyyy-MM-dd HH:mm" (must be URL encoded). Upper datetime limit of start of execution.
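The required datetime format and its URL encoding are easy to get wrong by hand, so here is a small Python sketch. The parameter names sandbox and from and the format "yyyy-MM-dd HH:mm" come from the table above; the URL path request_processor/executions_history is an assumption made by analogy with the example URLs of the other operations, and the function name is hypothetical.

```python
from datetime import datetime
from urllib.parse import urlencode

# Assumed base URL, modelled on the examples of the other operations.
BASE = "http://localhost:8080/clover/request_processor"


def executions_history_url(sandbox: str, start_from: datetime) -> str:
    """Build a request URL with the lower datetime limit URL encoded."""
    params = {
        "sandbox": sandbox,
        # "yyyy-MM-dd HH:mm" expressed as a Python strftime pattern
        "from": start_from.strftime("%Y-%m-%d %H:%M"),
    }
    return f"{BASE}/executions_history?{urlencode(params)}"


url = executions_history_url("default", datetime(2014, 3, 1, 8, 30))
print(url)
```

urlencode turns the space into "+" and the colon into "%3A", which is the encoded form the operation expects.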
For returnType==DESCRIPTION_XML returns a complex data structure describing one or more selected executions in XML format. A schema describing the format of the XML response is accessible on the CloverETL Server URL: http://[host]:[port]/clover/schemas/executions.xsd
Operation suspend
Suspends the server or a sandbox (if specified). Suspension means that no graphs may be executed on the suspended server/sandbox.
Chapter 24. API Result message Operation sandbox_create This operation creates a specified sandbox. If it is sandbox of "partitioned" or "local" type, it also creates locations by "sandbox_add_location" operation. parameters Table 24.8. Parameters of sandbox create parameter name mandatory default description sandbox Text Id of sandbox to be created.
Chapter 24. API parameters Table 24.10. Parameters of sandbox add location parameter name mandatory default description sandbox Removes specified location from its sandbox. location Location storage ID. If the specified location isn't attached to the specified sandbox, sandbox won't be changed. verbose MESSAGE MESSAGE | FULL - how verbose should possible error message be.
returns
Result message
example of request (using the curl CLI tool (http://curl.haxx.se/))
curl -u username:password -F "overwriteExisting=true" -F "zipFile=@/tmp/my-sandbox.zip" http://localhost:8080/clover/simpleHttpApi/upload_sandbox_zip
Operation cluster_status
This operation displays cluster's nodes list.
parameters
returns
Cluster's nodes list.
Operation export_server_config
This operation exports a current server configuration in XML format.
parameters
Table 24.13.
Chapter 24. API wget http://localhost:8080/clover/simpleHttpApi/export_server_config Operation import_server_config This operation imports server configuration. parameters Table 24.14. Parameters of server configuration import parameter name mandatory default description xmlFile An XML file with server's configuration. dryRun true If true, a dry run is performed with no actual changes written.
Chapter 24. API JMX mBean The CloverETL Server JMX mBean is an API that you can use for monitoring the internal status of the Server. MBean is registered with the name: com.cloveretl.server.api.jmx:name=cloverServerJmxMBean JMX Configuration Application's JMX MBeans aren't accessible outside of JVM by default. It needs some changes in an application server configuration to make JMX Beans accessible.
JMX server of JVM. Use admin/adminadmin as user/password. (admin/adminadmin are default glassfish values) How to Configure JMX on WebSphere WebSphere does not require any special configuration, but the clover MBean is registered with the name that depends on application server configuration: com.cloveretl.server.api.jmx:cell=[cellName],name=cloverServerJmxMBean,node=[nodeName], process=[instanceName]...
Java version 1.6. The solution is quite easy, just set these two system properties:
-Djava.rmi.server.hostname=[hostname address]
-Djava.net.preferIPv4Stack=true
Operations
For details about operations, please see the JavaDoc of the MBean interface. The JMX API MBean JavaDoc is accessible in the running CloverETL Server instance on URL: http://[host]:[port]/[contextPath]/javadoc-jmx/index.html...
Chapter 24. API SOAP WebService API The CloverETL Server SOAP Web Service is an advanced API that provides an automation alternative to the Simple HTTP API. While most of the HTTP API operations are available in the SOAP interface too (though not all of them), the SOAP API provides additional operations for manipulating sandboxes, monitoring, etc.
The architecture of a Launch Service is layered. It follows the basic design of multi-tiered applications utilizing a web browser. Launch services let you build a user-friendly form that the user fills in and sends to the CloverETL Server for processing.
Dictionary is a key-value temporary data interface between the running transformation and the caller. Usually, although not exclusively, Dictionary is used to pass parameters in and out of the executed transformation. For more information about Dictionary, read the “Dictionary” section in the CloverETL Designer User’s Guide.
Passing Files to Launch Services
If a Launch Service is designed to pass an input file to a graph or jobflow, the input dictionary entry has to be of type readable.channel.
Chapter 24. API Figure 24.5. Creating a new launch configuration Once you create the new Launch Service, you can set additional attributes like: 1. User and group access restrictions and additional configuration options (Edit Configuration) 2. Bind Launch Service parameters to Dictionary entries (Edit Parameters) Figure 24.6.
• Group - Restricts the configuration to a specific group of users.
• User - Restricts the configuration to a specified user.
• Sandbox - The CloverETL Sandbox where the configuration will be launched.
• Job file - Selects the job to run.
Figure 24.8. Creating new parameter
To add a new parameter binding, click on the “Add parameter” button. Every required property defined by the job needs to be created here.
Figure 24.9. Edit Parameters tab
You can set the following fields for each property: •...
(You can use a Launch Services test page, accessible from the login screen, to test drive Launch Services.)
[Clover Context]/launch/[Configuration name]?[Parameters]
• [Clover Context] is the URL to the context in which the CloverETL Server is running. Usually this is the full URL to the CloverETL Server (for example, for CloverETL Demo Server this would be http://server-demo.cloveretl.com:8080/clover).
Launch requests are recorded in the log files in the directory specified by the launch.log.dir property in the CloverETL Server configuration. For each launch configuration, one log file named [Configuration name]#[Launch ID].log is created. For each launch request, this file will contain only one line with the following tab-delimited fields: (If the property launch.log.dir is not specified, log files are created in the temp directory...
Groovy Code API
Since 3.3
The CloverETL Server Groovy Code API allows clients to execute Groovy code stored on the Server by an HTTP request. The executed code has access to the ServerFacade and to the instances of the HTTP request and HTTP response, so it's possible to implement a custom CloverETL Server API in the Groovy code.
OSGi bundle). It can add a new API operation or even extend the Server Console UI. It is independent of the standard clover.war. CloverETL itself isn't based on OSGi technology, OSGi is used only optionally for extending server APIs. OSGi framework is completely disabled by default and is enabled only when the property "plugins.path" is set as described below.
OSGi plugin is the better choice, e.g. when a custom API has to use different libraries than the ones on the server classpath. Whereas Groovy uses the same classpath as CloverETL, the OSGi plugin has its own isolated classpath.
Connections (JDBC/JMS) may require third-party libraries. We strongly recommend adding these libraries to the app-server classpath. CloverETL allows you to specify these libraries directly in a graph definition so that CloverETL can load these libraries dynamically. However, external libraries may cause a memory leak, resulting in "java.lang.OutOfMemoryError: PermGen space"...
See details about the possibilities with CloverETL configuration in Part III, “Configuration” (p. 43). This property must be the absolute path to the directory or zip file with additional CloverETL engine plugins. Both the directory and the zip file must contain a subdirectory for each plugin. These plugins are not a substitute for plugins packed in a WAR file.
Chapter 27. Troubleshooting Graph hangs and is un-killable A graph can sometimes hang and be un-killable if a network connection in it hangs. This can be mitigated by setting a shorter tcp-keepalive so that the connection times out earlier. The default value on Linux is 2 hours (7,200 seconds).
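To check the current setting before changing it, a small Python sketch such as the following may help. It reads the standard Linux procfs entry `/proc/sys/net/ipv4/tcp_keepalive_time` and compares it with a shorter target. The helper names and the 600-second target are assumptions made for illustration only, not values recommended by this manual.

```python
from pathlib import Path

# Standard Linux procfs location of the keepalive idle time (in seconds).
KEEPALIVE_PATH = Path("/proc/sys/net/ipv4/tcp_keepalive_time")
LINUX_DEFAULT_SECONDS = 7200  # 2 hours, as stated in the text


def read_keepalive_seconds(path=KEEPALIVE_PATH):
    """Return the configured keepalive time, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None  # not Linux, or the file is unreadable


def keepalive_report(current_seconds, target_seconds):
    """Describe whether the current value is already at or below the target."""
    if current_seconds is None:
        return "keepalive time unknown on this system"
    if current_seconds <= target_seconds:
        return f"ok: {current_seconds}s <= target {target_seconds}s"
    return f"consider lowering: {current_seconds}s > target {target_seconds}s"


if __name__ == "__main__":
    # 600 seconds is an arbitrary illustrative target, not a recommendation.
    print(keepalive_report(read_keepalive_seconds(), 600))
```

The sketch only reads the value; changing it requires administrator rights and is left to the operating system's own tooling.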
CloverETL Server does not recognize any differences between cluster nodes. Thus, there are no "master" or "slave" nodes, meaning all nodes can be virtually equal. There is no single point of failure (SPOF) in the CloverETL cluster itself; however, SPOFs may be in the input data or some other external element.
Basically, the more nodes we have in the cluster, the more transformation requests (or HTTP requests in general) we can process at one time. This type of scalability is the CloverETL server's ability to support a growing number of clients. This feature is closely related to the use of an HTTP load balancer which is mentioned in the previous section.
Component Allocation Allocation of a single component can be derived in several ways (the list is ordered according to priority): • Explicit definition - all components have a common attribute Allocation. CloverETL Designer allows the user to use a convenient dialog. Figure 28.3. Component allocation dialog Three different approaches are available for explicit allocation definition: •...
Chapter 28. Clustering Features allocation is automatically derived from locations of the partitioned sandbox. So if you use one of these components with a file in a partitioned sandbox, a suitable allocation is used automatically. • Adoption from neighbour components By default, allocation is inherited from neighbour components. Components on the left side have higher priority.
As you can see in the screenshot above, you can specify the root path on the filesystem and you can use placeholders or an absolute path. Available placeholders are environment variables, system properties, or the CloverETL Server config property intended for this use, sandboxes.home. The default path is set as [user.data.home]/CloverETL/sandboxes/[sandboxID], where sandboxID is the ID specified by the user.
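The default path layout can be illustrated with a short Python sketch. It assumes the `${name}` placeholder syntax that this manual uses for the sandboxes.home.partitioned default; the functions `expand_placeholders` and `default_sandbox_path` are hypothetical and do not reproduce the Server's actual resolution logic.

```python
import re

# Matches ${name}; the ${...} syntax is an assumption (see the lead-in).
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand_placeholders(template, values):
    """Replace each ${name} with values[name]; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)


def default_sandbox_path(user_data_home, sandbox_id):
    """Build [user.data.home]/CloverETL/sandboxes/[sandboxID] from the text."""
    template = "${user.data.home}/CloverETL/sandboxes/${sandboxID}"
    return expand_placeholders(
        template,
        {"user.data.home": user_data_home, "sandboxID": sandbox_id},
    )


if __name__ == "__main__":
    # "/home/clover" and "invoices" are made-up example values.
    print(default_sandbox_path("/home/clover", "invoices"))
```

Leaving unknown placeholders untouched makes a missing value visible in the resulting path instead of silently producing an empty segment.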
So each physical location will cause a single worker to run. This worker does not have to actually store any data to "its" location. It is just a way to tell the CloverETL Server: "execute this part of ETL graph in parallel on these nodes"...
CloverETL Server evaluates the sandbox URL on each worker and provides an open stream to a local resource to the component. The sandbox URL may be used on a standalone server as well. It is an excellent choice when a graph references some resources from different sandboxes.
Graph Allocation Examples Basic component allocation This example shows a two-component graph, where allocation ensures that the first component will be executed on cluster node1 and the second component will be executed on cluster node2. Basic component allocation with remote data transfer Two components connected with an edge can have different allocations.
Example of Distributed Execution The following diagram shows a transformation graph used for parsing invoices generated by a few cell phone network providers in the Czech Republic. The size of these input files may be up to a few gigabytes, so it is very beneficial to design the graph to work in the cluster environment.
The part of the graph demarcated by the four cluster components could have its allocation specified by the file URL attribute as well, but this part does not work with files at all, so there is no file URL. Thus, we will use the "node allocation"...
• does not contain any data and since the graph does not read or write to this sandbox, it is used only for the definition of "nodes allocation" • on the following figure, allocation is configured for two cluster nodes •...
Scalability of the Example Transformation The example transformation has been tested in the Amazon Cloud environment with the following conditions for all executions: • the same master node • the same input data: 1.2 GB of input data, 27 million records •...
Chapter 29. Cluster Configuration A cluster can work properly only if each node is properly configured. Clustering must be enabled, nodeID must be unique on each node, all nodes must have access to the shared DB (direct connection or proxied by another cluster node) and shared sandboxes, and all properties for inter-node cooperation must be set according to the network environment.
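Two of these requirements (clustering enabled, unique node ID) can be checked mechanically. The Python sketch below parses per-node property text and reports violations. It uses the property names cluster.enabled and cluster.node.id from the example configuration later in this chapter; the helper functions themselves are hypothetical and not part of CloverETL.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_cluster(nodes):
    """Return a list of problems found across the per-node property dicts."""
    problems = []
    seen_ids = set()
    for index, props in enumerate(nodes):
        if props.get("cluster.enabled") != "true":
            problems.append(f"node #{index}: cluster.enabled is not true")
        node_id = props.get("cluster.node.id")
        if not node_id:
            problems.append(f"node #{index}: cluster.node.id is missing")
        elif node_id in seen_ids:
            problems.append(f"node #{index}: duplicate cluster.node.id {node_id}")
        else:
            seen_ids.add(node_id)
    return problems


if __name__ == "__main__":
    first = parse_properties("cluster.enabled=true\ncluster.node.id=node01\n")
    second = parse_properties("cluster.enabled=true\ncluster.node.id=node01\n")
    # Both nodes claim the same ID, so one problem is reported.
    print(check_cluster([first, second]))
```

Database and sandbox reachability depend on the network and cannot be verified from the property files alone, so the sketch deliberately leaves them out.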
String, URL http://localhost:8080/clover description: URL of the CloverETL cluster node. It must be an HTTP/HTTPS URL to the root of a web application, thus typically it would be "http://[hostname]:[port]/clover". Primarily it's used for synchronous inter-node communication from other cluster nodes. It's recommended to use a fully qualified hostname or IP address, so it's accessible from a client browser or CloverETL Designer.
Optional Properties Table 29.3. Optional properties - these properties aren't vital for cluster configuration - default values are sufficient
property: cluster.jgroups.external_address
type: String, IP address
description: IP address of the cluster node. Configure this only if the cluster nodes are on different sub-nets, so IP address of the network interface isn't...
Must be the same on all cluster nodes. It's protection against fake messages.
property: sandboxes.home.partitioned
type: String
default: ${user.data.home}/CloverETL/sandboxes-partitioned
description: This property is intended to be used as a placeholder in the location path of partitioned sandboxes. So the sandbox path is specified with the...
String local Change this property to "remote" if the node doesn't have a direct connection to the CloverETL Server database, so it has to use some other cluster node as a proxy to handle persistent operations. In such a case, the property "cluster.datasource.delegate.nodeIds" must also be properly configured.
At least one of the listed node IDs must be running, otherwise this node will fail. All listed node IDs must have a direct connection to CloverETL Server database properly configured. Property "cluster.datasource.delegate.nodeIds" is ignored by default. Property "cluster.datasource.type" must be set to "remote"...
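The interplay of the two properties described above can be sketched in Python as follows. The function `check_datasource` is hypothetical, and the comma-separated format of the node ID list is an assumption, since this excerpt does not state the separator.

```python
def check_datasource(props, running_node_ids):
    """Check the proxied-database settings of one node's properties."""
    ds_type = props.get("cluster.datasource.type", "local")  # default is "local"
    if ds_type != "remote":
        # Per the text, delegate.nodeIds is ignored unless type is "remote".
        return []
    # ASSUMPTION: node IDs are comma-separated; the manual excerpt does not say.
    raw = props.get("cluster.datasource.delegate.nodeIds", "")
    delegates = [n.strip() for n in raw.split(",") if n.strip()]
    if not delegates:
        return ["cluster.datasource.delegate.nodeIds must be set for type 'remote'"]
    if not any(n in running_node_ids for n in delegates):
        return ["none of the delegate nodes is running: " + ", ".join(delegates)]
    return []


if __name__ == "__main__":
    node02 = {
        "cluster.datasource.type": "remote",
        "cluster.datasource.delegate.nodeIds": "node01",
    }
    # node01 is running, so the proxied configuration is usable.
    print(check_datasource(node02, {"node01"}))
```

An empty result means no problem was detected; it does not prove that the delegate node itself has a working database connection.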
Example of 2 Node Cluster Configuration This section contains examples of CloverETL cluster nodes configuration. We assume that the user "clover" is running the JVM process and the license will be uploaded manually in the web GUI. In addition it is necessary to configure: •...
jdbc.password=clover
cluster.enabled=true
cluster.node.id=node02
cluster.http.url=http://192.168.1.132:8080/clover
cluster.jgroups.bind_address=192.168.1.132
cluster.group.name=TheCloverCluster1

If you use Apache Tomcat, the configuration is placed in the $CATALINA_HOME/webapps/clover/WEB-INF/config.properties file. The location and file name on other application servers may differ.

2-nodes Cluster with Proxied Access to Database
This cluster configuration is similar to the previous one, but only one node has direct access to the database.
These two lines describe access to the database via another node. 2-nodes cluster with load balancer If you use any external load balancer, the configuration of CloverETL Cluster will be the same as in the first example. Figure 29.3. Configuration of 2-nodes cluster, one node without direct access to database The cluster.http.url and cluster.jgroups.bind_address are urls of particular cluster nodes...
Jobs Load Balancing Properties Multiplicators of load balancing criteria. The load balancer decides which cluster node executes a graph. This means that any node may process a request for execution, but the graph may be executed on the same or on a different node according to the current load of the nodes and according to these multiplicators.
Running More Clusters If you run more than one cluster, each cluster has to have its own unique name. If the name is not unique, the nodes of different clusters may consider foreign cluster nodes as part of the same cluster. The cluster name is configured using the cluster.group.name option.
Cluster Reliability in Unreliable Network Environment CloverETL Server instances must cooperate with each other to form a cluster together. If the connection between nodes doesn't work at all, or if it's not configured, the cluster can't work properly. This chapter describes the behavior of cluster nodes in an environment where the connection between nodes is somehow unreliable.
the event from NodeB. Also heart-beat is vital for meaningful load-balancing. The same check-task mentioned above also checks heart-beat from all cluster nodes. Time-line describing the scenario: • 0s network connection between NodeA and NodeB is down •...
• Since the network is down, also heart-beat can't be delivered and maybe HTTP connections can't be established, the cluster reacts as described in the sections above. Even though the nodes may be suspended, parent job A keeps waiting for the event from job B •...
Chapter 30. Recommendations for Cluster Deployment 1. All nodes in the cluster should have a synchronized system date-time. 2. All nodes share sandboxes stored on a shared or replicated filesystem. The filesystem shared among all nodes is a single point of failure. Thus, the use of a replicated filesystem is strongly recommended. 3.
Chapter 31. Multiple CloverServer Instances on the same Host Running multiple CloverETL Server instances on the same host is not recommended. If you do so, you should ensure that the instances do not interfere with each other. • Each instance must run in a separate application server.
List of Figures
3.1. Adjusting Maximum heap size limit (p. 18)
3.2. Login page of CloverETL Server without license (p. 30)
3.3. Add new license form (p. 31)
3.4. Update license form (p. 32)
3.5. Clover Server as the only running application on IBM WebSphere (p. 36)
12.1.
23.9. Web GUI - "File event listeners" section (p. 169)
24.1. Glassfish JMX connector (p. 183)
24.2. WebSphere configuration (p. 184)
24.3. Launch Services and CloverETL Server as web application back-end (p. 186)
24.4. Launch Services section (p. 187)
24.5. Creating a new launch configuration (p. 188)
24.6.
List of Tables
1.1. CloverETL Server and CloverETL Engine comparison (p. 3)
2.1. Hardware requirements of CloverETL Server (p. 5)
2.2. CloverETL Server Compatibility Matrix (p. 6)
9.1. General configuration (p. 69)
9.2. Defaults for job execution configuration - see Job Config Properties for details ...
29.3. Optional properties - these properties aren't vital for cluster configuration - default values are sufficient (p. 215)
29.4. Load balancing properties (p. 222)...
List of Examples
17.1. Example of simple configuration defining one new server user (p. 124)...