2.6. Package Structure

2.6.1. Major Components

Assume the stable version of EMBOSS was downloaded and unpacked into the home directory /home/auser. The EMBOSS source tree then lives in the EMBOSS-6.4.0 directory, i.e. /home/auser/EMBOSS-6.4.0. The top-level directories in the source tree are summarised below.

emboss/
C source code and ACD files of the applications.

embassy/
(CVS release only.) Contains a subdirectory for each EMBASSY package, holding that package's C source code and ACD files.

ajax/
AJAX C-programming sub-libraries, which include functions for sequence reading and writing, file handling, string handling and so on.

nucleus/
High-level C library functions. These are almost exclusively molecular biology algorithms, for example alignments, pattern matching, restriction enzymes and isoelectric point calculation.

plplot/
Third-party graphics libraries licensed under the GNU LGPL. They are called from AJAX.

scripts/
Various scripts used by the EMBOSS developers and system administrators for EMBOSS maintenance. You should not normally need to use these files.

test/
Various files and databases used for quality assurance. You should not normally need to use these files.

jemboss/
Java source code and other files for the Jemboss graphical user interface to EMBOSS.

doc/
Subdirectories here contain application documentation in both HTML and plain-text formats. The plain-text files are used by the tfm program, which returns usage information for a named application.

m4/
(CVS release only.) Contains the files needed to create the configure file used for configuring EMBOSS.
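
As a quick sanity check on an unpacked tree, the summary above can be turned into a small script. The following is a minimal Python sketch and is not part of EMBOSS: the function name missing_top_level_dirs and its cvs flag are hypothetical, and the directory names simply transcribe the list above.

```python
from pathlib import Path

# Top-level directories listed in the summary above (both release types).
COMMON_DIRS = ("emboss", "ajax", "nucleus", "plplot",
               "scripts", "test", "jemboss", "doc")
# Directories the summary marks as "CVS release only".
CVS_ONLY_DIRS = ("embassy", "m4")


def missing_top_level_dirs(source_root, cvs=False):
    """Return the expected top-level directories absent from source_root.

    Hypothetical helper for illustration only; cvs=True additionally
    checks the directories the text marks as CVS-only.
    """
    root = Path(source_root)
    expected = COMMON_DIRS + (CVS_ONLY_DIRS if cvs else ())
    return [name for name in expected if not (root / name).is_dir()]
```

Calling missing_top_level_dirs("/home/auser/EMBOSS-6.4.0") should return an empty list when every directory named in the summary is present.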

2.6.2. Sub-components

The important subdirectories are described below.

ajax/acd
ACD file-handling functions. These control all aspects of ACD file parsing, command line handling and user-prompting. Functions to initialise ACD file parsing and a series of retrieval functions for each ACD datatype (integer, sequence, and so on) are included.

ajax/expat
This is an XML parser in which an application registers handlers for things the parser might find in the XML document (like start tags). See http://expat.sourceforge.net/.

ajax/zlib
This is used for data compression/decompression. See http://www.zlib.net/.

ajax/ajaxdb
Functions for handling sequence database access.

ajax/core
This is where the bulk of the AJAX sub-libraries are located.

ajax/ensembl
Functions for accessing the Ensembl database (http://www.ensembl.org).

ajax/graphics
Functions for handling graphics and printing.

ajax/pcre
Functions for handling Perl-compatible regular expressions (http://www.pcre.org/).

emboss/acd
Every EMBOSS application has an associated ACD (Ajax Command Definition) file describing the application interface. The files are kept in this directory and have the extension .acd. For example, the application water has the ACD file water.acd.

emboss/data
Many of the EMBOSS applications use data files that are kept here. This directory is in the search path used by the AJAX library functions for accessing data files used by the applications. For some databases there are many data files and these appear under separate subdirectories. For example, there are directories emboss/data/PROSITE, emboss/data/REBASE and emboss/data/PRINTS. Such directories must be populated with files before they are used. For example, the EMBOSS applications prosextract, rebaseextract and printsextract are used to populate the three directories mentioned, respectively.

plplot/lib
This is used by the graphics applications at run-time, for example, to load font files. The environment variable PLPLOT_LIB can also be used to point to the font files, but need not be set unless you change their location; see Section 6.3, “Global Command Line Qualifiers”.

doc/programs/*
Directories here contain the application documentation, which must adhere to the format used by existing applications; otherwise commands that use the documentation, such as tfm, will not work.
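
Several of the entries above describe simple path conventions, which can be illustrated in code. The following is a minimal Python sketch and is not part of EMBOSS: the function names acd_file, unpopulated_data_dirs and plplot_font_dir are hypothetical, and the fallback to plplot/lib inside the source tree is an assumption made for illustration (an installed EMBOSS may keep its font files elsewhere).

```python
import os
from pathlib import Path

# Data subdirectory -> program the text names for populating it.
EXTRACTORS = {
    "PROSITE": "prosextract",
    "REBASE": "rebaseextract",
    "PRINTS": "printsextract",
}


def acd_file(source_root, application):
    """Path of an application's ACD file, e.g. water -> emboss/acd/water.acd."""
    return Path(source_root) / "emboss" / "acd" / f"{application}.acd"


def unpopulated_data_dirs(source_root):
    """Map each missing or empty data subdirectory to its extraction program."""
    data_dir = Path(source_root) / "emboss" / "data"
    todo = {}
    for subdir, program in EXTRACTORS.items():
        path = data_dir / subdir
        if not path.is_dir() or not any(path.iterdir()):
            todo[subdir] = program
    return todo


def plplot_font_dir(source_root, environ=None):
    """Use PLPLOT_LIB when it is set, otherwise fall back to plplot/lib.

    The fallback location is an assumption of this sketch, not a
    statement of how EMBOSS itself resolves the font directory.
    """
    env = os.environ if environ is None else environ
    override = env.get("PLPLOT_LIB")
    return Path(override) if override else Path(source_root) / "plplot" / "lib"
```

For example, acd_file("/home/auser/EMBOSS-6.4.0", "water") gives the path of water.acd under emboss/acd, and unpopulated_data_dirs reports which of the three extraction programs still need to be run.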

2.6.3. Differences between CVS and Stable Versions

There is very little difference in package structure between the developers' (CVS) version and the stable version. The CVS version contains m4 files used during installation, whereas the stable release does not. The stable release contains configure scripts (used to configure EMBOSS), whereas the CVS version contains the files necessary to create them.

The CVS directory structure also has an extra emboss directory at the top level (/home/auser/emboss/). This provides a convenient place for the bin, lib and share directories if EMBOSS is installed using make install (see Section 2.7, “Installation”).
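
The m4 and configure distinction described in this section can serve as a rough test of which kind of tree you are looking at. The following is a minimal Python sketch and is not part of EMBOSS: the function name release_kind and its return values are hypothetical, and it assumes the configure script and the m4 directory sit at the top level of the source tree.

```python
from pathlib import Path


def release_kind(source_root):
    """Guess the release type from the markers the text describes.

    Returns "stable" if a configure script is present, "cvs" if only an
    m4 directory is present, and "unknown" otherwise.
    """
    root = Path(source_root)
    if (root / "configure").is_file():
        return "stable"
    if (root / "m4").is_dir():
        return "cvs"
    return "unknown"
```

A CVS checkout in which configure has already been generated would be reported as stable, so the result is best treated as a hint rather than a definitive answer.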