Overview of chemalot_knime

The chemalot_knime package contains the chemalot_knime framework, which dynamically creates KNIME nodes from a configuration file that defines command line programs.
The command line nodes are executed on a remote server using ssh. The framework reads an XML configuration file and automatically creates KNIME nodes that execute command line programs for processing chemical compounds. A few lines of XML in the configuration file suffice to generate a new node. The only requirement for the command line tools is that they read SDF files from stdin and write SDF files to stdout.
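To make the stdin/stdout requirement concrete, here is a minimal sketch of a tool that satisfies it. It is not part of chemalot; the function names and the `record_index` tag are hypothetical and chosen only for illustration. It relies on the fact that each record in an SDF file ends with a line containing `$$$$`.

```python
import sys
from typing import Iterator, TextIO

RECORD_END = "$$$$"  # line that terminates every SDF record


def iter_sdf_records(stream: TextIO) -> Iterator[str]:
    """Yield each SDF record as text, including its terminating $$$$ line."""
    lines = []
    for line in stream:
        lines.append(line)
        if line.rstrip("\r\n") == RECORD_END:
            yield "".join(lines)
            lines = []
    # Pass through any trailing text that lacks a terminator.
    if any(part.strip() for part in lines):
        yield "".join(lines)


def add_tag(record: str, name: str, value: str) -> str:
    """Insert an SDF data item just before the record's $$$$ line."""
    body, sep, tail = record.rpartition(RECORD_END)
    if not sep:
        return record  # unterminated record: leave it unchanged
    return f"{body}> <{name}>\n{value}\n\n{RECORD_END}{tail}"


def filter_stream(stdin: TextIO, stdout: TextIO) -> None:
    """Read SDF records from stdin and write them, tagged, to stdout."""
    for index, record in enumerate(iter_sdf_records(stdin), start=1):
        # "record_index" is a hypothetical tag name used for illustration.
        stdout.write(add_tag(record, "record_index", str(index)))
```

To use the sketch as a command line filter, call `filter_stream(sys.stdin, sys.stdout)` from an `if __name__ == "__main__":` block. A tool shaped like this can be placed anywhere in a UNIX pipe of SDF tools.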

In addition to executing the command line tools on a remote server, the nodes can also generate the UNIX csh script that executes the same sequence of tools as a UNIX pipe. As a result, KNIME can be used to develop and debug complex UNIX pipes. Once the pipes have been tested and validated, lightweight execution is possible by extracting the csh syntax from the terminal command line node in KNIME and incorporating it into a UNIX csh script.
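The idea of turning a sequence of tools into a single pipe can be sketched as follows. This is an illustration only and not the code the framework uses to generate its scripts: the function name, the file names, and the second tool (`my_sdf_filter`) are hypothetical, and the babel options are assumed from the example configuration entry in this document.

```python
from typing import Sequence


def build_csh_script(commands: Sequence[str], infile: str, outfile: str) -> str:
    """Chain commands into one pipe that reads infile and writes outfile.

    Command strings are inserted verbatim, so any quoting needed by csh
    must already be present in them.
    """
    if not commands:
        raise ValueError("at least one command is required")
    first, *rest = commands
    # The input redirect belongs to the first stage, the output redirect
    # to the last stage of the pipe.
    stages = [f"{first} < {infile}", *rest]
    return "#!/bin/csh -f\n" + " | ".join(stages) + f" > {outfile}\n"


script = build_csh_script(
    ["babel -in .sdf -out .sdf -add2d",  # options assumed from the config example
     "my_sdf_filter"],                   # hypothetical second tool
    "in.sdf",
    "out.sdf",
)
print(script)
```

Because every tool reads SDF from stdin and writes SDF to stdout, the stages can be joined with `|` without temporary files.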

We recommend that, before installing the chemalot_knime package, you install the chemalot command line tools on your UNIX server and make sure they work and are included in your default path.

The configuration file (<chemalot_knime_dir>/config/cmdLine/commandLinePrograms.xml) in this package has predefined entries for all of the chemalot command line tools. The KNIME nodes for these tools are generated automatically, but they will only work if the command line tools themselves are in the user's path on the remote host.

Licensing

The chemalot_knime package is released under the GNU General Public License Version 3.

A copy of the license file can be found in the license folder.

Installation

This description assumes that you have downloaded and installed the KNIME Desktop from the KNIME Download Page. In this documentation the directory containing your KNIME executable is denoted <KNIME_INSTALL_DIR>. To install the chemalot_knime framework and KNIME nodes into an existing KNIME environment, follow these steps:

The installation was tested with KNIME version 3.2.0. Version 3.2.0 is required; newer versions might also work. Make sure you have the following modules installed:

Installation of the chemalot_knime package:

For the SDF command line nodes to work, you have to set up ssh using private keys in KNIME:

Test the chemalot_knime nodes:

Adding your own command line programs as KNIME nodes

Command line nodes are configured in the "<chemalot_knime_dir>/config/cmdLine/commandLinePrograms.xml" file. Open the file with your favorite text editor. Each KNIME node is defined by a <command> element. The example below defines the "babel" command line node:

<command name='babel' label='babel (OE)' subfolder='GNEStructManipulation'>
   <IO in="-in .sdf" out="-out .sdf"/>
   <default>-add2d</default>
   <ports in='sdf' out='sdf'/>
   <help option='--help all'/>
</command>
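To check which nodes a configuration file defines, the entries can be listed with a short script. The following is a sketch under stated assumptions: the `<commands>` wrapper element is hypothetical (the real root element of commandLinePrograms.xml is not shown here, which is why the code searches for `<command>` elements at any depth), and the `on_path` check assumes that the `name` attribute equals the executable name. The check is only meaningful when the script runs on the remote host where the tools are installed.

```python
import shutil
import xml.etree.ElementTree as ET

# Sample input: the babel entry from this document inside a
# hypothetical <commands> root element.
SAMPLE = """<commands>
<command name='babel' label='babel (OE)' subfolder='GNEStructManipulation'>
   <IO in="-in .sdf" out="-out .sdf"/>
   <default>-add2d</default>
   <ports in='sdf' out='sdf'/>
   <help option='--help all'/>
</command>
</commands>"""


def list_commands(xml_text: str) -> list:
    """Return one dict per <command> element found anywhere in the XML."""
    root = ET.fromstring(xml_text)
    commands = []
    for cmd in root.iter("command"):
        name = cmd.get("name", "")
        commands.append({
            "name": name,
            "label": cmd.get("label"),
            "subfolder": cmd.get("subfolder"),
            "default": (cmd.findtext("default") or "").strip(),
            # Assumption: the node name is also the executable name.
            "on_path": shutil.which(name) is not None,
        })
    return commands


for entry in list_commands(SAMPLE):
    print(entry)
```

To inspect a real configuration file, read its contents and pass the text to `list_commands`.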

Attributes of the <command> element:


   GNEAdvanced,
   GNEAdvanced/GNEAdvUtilities,
   GNEAdvanced/GNEAdvStructManipulation,
   GNEAdvanced/GNEAdvWriter,
   GNEAdvanced/GNEAdvStructManipulation/GNEAdvProps,
   GNEStructManipulation,
   GNEStructManipulation/GNEStrProps,
   GNEStructManipulation/GNEStrDiversity,
   GNEStructManipulation/GNEStrSearch,
   GNEStructManipulation/GNEStrQSAR,
   GNEWriter,
   GNEUserDefined (not currently used, but could be used for adding your own command line programs)
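As an illustration of adding your own program, the sketch below builds a <command> entry with the same structure as the babel example and prints it as XML. Every value is hypothetical: `my_sdf_filter` is an invented tool name, its options are invented, and placing it under GNEUserDefined is an assumption based on the list above. Check the output against the existing entries in commandLinePrograms.xml before using it.

```python
import xml.etree.ElementTree as ET


def make_command(name: str, label: str, subfolder: str,
                 io_in: str, io_out: str,
                 default: str, help_option: str) -> ET.Element:
    """Build a <command> element shaped like the babel example."""
    cmd = ET.Element("command",
                     {"name": name, "label": label, "subfolder": subfolder})
    ET.SubElement(cmd, "IO", {"in": io_in, "out": io_out})
    ET.SubElement(cmd, "default").text = default
    ET.SubElement(cmd, "ports", {"in": "sdf", "out": "sdf"})
    ET.SubElement(cmd, "help", {"option": help_option})
    return cmd


# All values below are hypothetical and only mirror the babel entry.
entry = make_command(
    name="my_sdf_filter",
    label="my SDF filter",
    subfolder="GNEUserDefined",
    io_in="-in .sdf",
    io_out="-out .sdf",
    default="-verbose",
    help_option="--help",
)
ET.indent(entry, space="   ")  # pretty-print; requires Python 3.9+
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Generating the entry programmatically is optional; typing the few lines of XML by hand, as described above, produces the same result.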

Sub elements of <command>:

Modifying the chemalot_knime package

To install the KNIME SDK environment in order to debug or modify the code, please follow the steps outlined in developer.readme.md.