1. Introduction to EMBOSS

The European Molecular Biology Open Software Suite (EMBOSS) is a high-quality, well-documented package of open source software tools for molecular biology. It includes over 200 applications for molecular sequence analysis and other common bioinformatics tasks, and it integrates these core applications with a range of popular third-party software packages under a consistent and powerful command-line interface. The software has many useful features; for example, it automatically handles data in a variety of formats and allows transparent retrieval of sequence data from the web.
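As a rough illustration of the consistent command-line interface and automatic format handling, the sketch below assembles a call to the EMBOSS application seqret, which reads a sequence and writes it out again, optionally in another format. The Python wrapper, the helper names build_seqret_command and convert, and the file names example.fasta and example.gb are assumptions made for this example only and are not part of EMBOSS. The command is executed only if seqret is found on the PATH.

```python
import shutil
import subprocess


def build_seqret_command(source, destination, out_format):
    """Return the argument list for a seqret format conversion.

    The input format is not specified: seqret is left to detect it.
    """
    return [
        "seqret",
        "-sequence", source,        # input sequence (file name or other address)
        "-outseq", destination,     # output file
        "-osformat", out_format,    # requested output sequence format
        "-auto",                    # do not prompt for missing values
    ]


def convert(source, destination, out_format):
    """Run seqret if it is installed; return True when the command was run."""
    if shutil.which("seqret") is None:
        return False
    subprocess.run(build_seqret_command(source, destination, out_format), check=True)
    return True


# Hypothetical file names, used only to show the shape of the command.
command = build_seqret_command("example.fasta", "example.gb", "genbank")
print(" ".join(command))
```

Keeping the command construction separate from its execution means the argument list can be inspected or logged on a machine where EMBOSS is not installed.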

EMBOSS includes extensive C programming libraries with a clean and consistent API. Much useful functionality is built in, such as handling of the command line and of common file formats, which makes EMBOSS a powerful and convenient platform for developing and releasing bioinformatics programs. True to the spirit of Open Source, EMBOSS is free of charge to all, and the code is licensed for use by everyone under the GNU General Public Licenses (GPL and LGPL). No one individual or institute 'owns' the code, or ever will. Under the terms of the licenses, it can be downloaded via the Internet, copied, customised and passed on, so long as these same freedoms are preserved for others. Contributions are strongly encouraged!

EMBOSS is well established. It is used in demanding production environments, which reflects the maturity of the code base. A major new stable version is released each year. For those who need the latest code, the current source code tree can be downloaded via CVS. There have been many thousands of downloads, including site-wide installations by administrators across the world that cater for hundreds or even thousands of users. Many interfaces to EMBOSS are available, including easy-to-use web interfaces and powerful workflow software that enables applications to be combined into analysis pipelines.