The left hand side of the main Jemboss window (Section 188.8.131.52, “Main Jemboss Window”) gives access to all programs available through the Jemboss interface.
At the top of the pane, the category menus group together programs with similar analysis characteristics.
Alignment and then highlight
global from the submenu to see all programs that offer a global alignment of sequences. Highlight and click on
stretcher to see the program form appear in the central Jemboss pane.
Located on the Jemboss toolbar, the favourites menu offers a selection of commonly used programs. These can be edited (Section 9.7.2, “Programme Selection”) to customise the list and optimise program access.
Click on the
Favourites menu and select
Global Alignments. This will alter the program in the central pane to
Further down the left hand pane all the programs are listed alphabetically. The scroll bar to the right allows access to any one of these programs. However, if the name of the required program is known, access may be quicker using the
Go To box (Section 9.4.5, “Go To Box”).
Directly above the alphabetical program list is an entry field. Any entry accesses the program list and highlights a program name according to the letters in the entry field. This method can be faster than any other selection method as only a few letters of the program name need be typed in.
m in the
Go To box to highlight the first program beginning with
at into the
Go To box so the entry now reads
mat. This will highlight the first entry beginning with
mat, which is the local alignment program matcher. Press the Return key on the keyboard to bring up the matcher program form in the central pane.
The same text entry can be used to reselect the same program in the event of mis-entry (see Section 9.7.3, “Input/Output Options”).
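The prefix lookup performed by the Go To box can be pictured with a short Python sketch. This is only an illustration of the idea, not Jemboss code: the function name `first_match` and the sample program list are assumptions made for the example, and the real list depends on the installation.

```python
from bisect import bisect_left


def first_match(programs, prefix):
    """Return the first name (alphabetically) that starts with prefix, or None."""
    names = sorted(programs)
    # bisect_left finds where the prefix would sit in the sorted list;
    # the first name starting with the prefix, if any, is at that position.
    i = bisect_left(names, prefix)
    if i < len(names) and names[i].startswith(prefix):
        return names[i]
    return None


# Illustrative subset of program names only.
sample = ["needle", "matcher", "megamerger", "marscan", "stretcher", "water"]

print(first_match(sample, "m"))    # first program beginning with "m"
print(first_match(sample, "mat"))  # matcher
```

Each extra letter narrows the match, which is why typing only a few characters is usually enough.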
Should the results of the selected program require sequence features in any format, then the
Use Feature Information box at the top of the input section should be selected. This option is only available for those programs that retrieve sequences: seqret, seqretsplit, skipseq, splitter and union.
This is the default selection and allows entry of stored files (including listfiles (Section 6.6, “The Uniform Sequence Address (USA)”)) via drag and drop from the Local (Section 9.3.1, “Local File Management”) and Remote (Section 9.3.14, “Remote File management”) File Managers, as well as from the Sequence List (Section 184.108.40.206, “Sequence Input”). If the file to be dragged is a listfile (Section 220.127.116.11, “Re-writing a File with New Data”) then the entire entry must be prefixed with an
@ sign to indicate to Jemboss the nature of the data.
USAs (Section 6.6, “The Uniform Sequence Address (USA)”) can be entered directly into the field.
Browse files button to the right of the entry field. This immediately accesses the Jemboss home directory (Section 9.3.2, “Home Directory”). Double click on the
Example folder in this directory and select the
bgal_ecoli.fasta file (if this has not been created, see the practical in Section 18.104.22.168, “Saving Analysis Results” and Section 9.3.9, “Rename”) and open the file. The entire path of this file will now be written into the entry field.
Reset button to clear the entry field. Open the local file manager (Section 9.3.1, “Local File Management”) and drag the
bgal_ecoli.fasta file into the entry field. Once there is visual indication that the mouse is over the input field, drop the file by releasing the mouse button. The entire file path will be displayed in the field.
Reset button to clear the field once more.
Open the remote file manager (Section 9.3.14, “Remote File management”) and drag in the
bgal_frag.fasta file (see Section 9.3.16, “Moving Data between File Managers”). The remote path is displayed in the field.
Reset button to remove the remote entry. Click on the
Input Sequence Options button, select the
uniprot option from the
Databases available drop down menu and hit the
OK button. This database is now written in the entry field. Type
bgal_ecoli in the entry field after the colon. The
bgal_ecoli sequence will now be retrieved from the uniprot database.
The database retrieval option using such a USA might only be possible if the desktop computer is connected to the Internet as the sequence may need to be retrieved from a remote database.
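The kinds of entry accepted by the field (a stored file, a listfile prefixed with @, or a database:entry USA) can be told apart with a few simple rules. The Python sketch below is a simplified illustration and not the EMBOSS USA parser: the function name `classify_usa` and the `known_databases` set are assumptions, and forms such as `format::file` are deliberately not handled.

```python
def classify_usa(usa, known_databases=("uniprot", "uni")):
    """Classify an entry as a listfile, a database entry or a plain file.

    known_databases is an assumed set of database names; on a real
    installation it would come from the Databases available menu.
    """
    usa = usa.strip()
    if usa.startswith("@"):
        # A leading @ marks a listfile.
        return ("listfile", usa[1:])
    prefix, colon, rest = usa.partition(":")
    if colon and prefix.lower() in known_databases:
        # database:entry, for example uni:bgal_ecoli
        return ("database", prefix, rest)
    # Anything else is treated as a file path.
    return ("file", usa)


print(classify_usa("uni:bgal_ecoli"))
print(classify_usa("@mylist"))
print(classify_usa("Example/bgal_ecoli.fasta"))
```

Checking the prefix against a list of known databases keeps paths that happen to contain a colon from being mistaken for database entries.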
Selection of this option allows a sequence or a list of sequences to be pasted into a larger field. Sequences should be pasted in using the desktop shortcut for paste (
<CONTROL> + V for Windows,
<Apple> + V for Macintosh, middle mouse button for Unix).
This option is useful only for those programs requiring a number of input files such as
emma, the multiple sequence alignment tool. It consists of 20 File/Database Entry fields (Section 22.214.171.124, “File/Database Entry”) and accepts files specified in the usual manner.
Very few of these sequence attributes are necessary for a successful analysis run as they can be detected automatically.
Input Sequence Options button to see potential sequence attributes.
Lists all databases available for a particular installation of Jemboss. Full names plus any name derivatives are shown, e.g. both
uni are often used to specify the uniprot protein database.
Lists all of the EMBOSS-acceptable formats (Section A.1, “Supported Sequence Formats”). It is not normally necessary to specify the format as the program can generally detect this. However, if the sequence format is somewhat obscure (e.g.
jackknifer), it may be required.
Used if only a portion of a larger sequence need be analysed. Thus an entire database entry can be retrieved but only the relevant portion will undergo analysis.
Enter 300 in the begin field and 600 in the end field.
A selection here will ensure that the analysis run also includes a check of the reverse complement sequence. It can be used, for example, for nucleotide sequence translations and finding open reading frames or stem loops.
This specifies the type of sequence file used as input. This is generally obvious to the program, but may be necessary for specific types of sequences, for example a peptide sequence composed of a disproportionate number of alanines, threonines, glycines and cysteines, or a nucleotide sequence containing several ambiguity codes. Only one of these options may be selected.
Forces the program to return the sequence text in either upper or lower case. The default is upper case. Only one of these options may be selected.
The UFO (Uniform Feature Object) is the standard way of specifying file formats containing feature information (Section 5.3, “Introduction to Feature Formats”). In order to use this option the
Use feature information box (Section 126.96.36.199, “Features”) should be selected.
You use the
UFO features box to optionally load in a features file in association with any sequence you have specified on the main application form. The UFO command line syntax that needs to be used is explained elsewhere (Section 6.7, “The Uniform Feature Object (UFO)”).
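For readers who also use the command line, the choices made in the Input Sequence Options can be thought of as extra qualifiers added to the program call. The Python sketch below assembles such an argument list for seqret. It is a hedged illustration only: the helper name `input_sequence_args` is hypothetical, the qualifier names are assumed to correspond to the EMBOSS sequence qualifiers (see Section 6.1), and the command actually generated by Jemboss may differ.

```python
def input_sequence_args(usa, begin=None, end=None, reverse=False,
                        seqtype=None, case=None, sformat=None):
    """Build a seqret-style argument list from Input Sequence Options choices."""
    # Only one sequence type and only one case option may be chosen.
    if seqtype not in (None, "nucleotide", "protein"):
        raise ValueError("seqtype must be 'nucleotide' or 'protein'")
    if case not in (None, "upper", "lower"):
        raise ValueError("case must be 'upper' or 'lower'")

    args = ["seqret", "-sequence", usa]
    if sformat:
        args += ["-sformat", sformat]      # input sequence format
    if begin is not None:
        args += ["-sbegin", str(begin)]    # first position to use
    if end is not None:
        args += ["-send", str(end)]        # last position to use
    if reverse:
        args.append("-sreverse")           # reverse complement (nucleotide)
    if seqtype:
        args.append("-s" + seqtype)        # -snucleotide or -sprotein
    if case:
        args.append("-s" + case)           # -supper or -slower
    return args


# The begin/end practical above: positions 300 to 600 of uni:bgal_ecoli.
print(" ".join(input_sequence_args("uni:bgal_ecoli", begin=300, end=600)))
```

Passing the sequence type and case as single values mirrors the form, where only one option in each group can be selected.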
This is a large bar running halfway across the central pane with text in red capitals. Its action is to load the sequence in advance of the analysis run. This is only relevant in cases where there are parameter dependencies on the form which are based on the sequence. The most obvious of these cases are alignment programs, which select default matrices and penalties based on whether the sequence is nucleotide or protein.
LOAD SEQUENCE ATTRIBUTES bar to update the default gap penalties. Select
No to the confirmation message so the inputted start and end sites are not overwritten.
This bar will load sequence attributes for the entire sequence, and so will offer to override any attributes selected in the
Input Sequence Options (Section 9.4.8, “Input Sequence Options”).
uni:bgal1_entcl in the second sequence filename entry box and load sequence attributes for that sequence also. Look at the
end sequence attribute options to ensure the full 1028 residues of the sequence have been loaded.
Any options (Section 6.1, “Introduction to the EMBOSS Command Line”) needed for analysis of the input file are listed after the input section. These parameters are required for the analysis to complete. All mandatory parameters are subject to a default setting, which may or may not be visible to the user. Consult the documentation (Section 9.9, “Documentation”) for each program to ascertain these settings.
Use the drop down menu to alter the matrix selection to
EPAM250. Hit the
GO button to retrieve the local alignment. Minimize the Saved Results (Section 188.8.131.52, “Saved Results Window”) window.
Depending on the program, the output section may contain a single option to alter the output sequence format (such as matcher), or it may contain a more comprehensive list of parameters that may be included in the final output (e.g. remap). All output section parameters are subject to a default setting, which may or may not be visible to the user. Consult the documentation (Section 9.9, “Documentation”) for each program to ascertain these settings.
For all programs returning a sequence an
Output Sequence Name entry field is available, and will name the appropriate results tab (Section 184.108.40.206, “Saved Results Window”) with whatever name is entered. Only the filename is returned, and any filename extensions will be lost.
Select seqret by selecting the
Database Sequence Retrieval option from the
Favourites menu at the top of the Jemboss window. Type
uni:bgal_ecoli into the
Sequence Filename field and
bgal_ecoli_1 into the
Output Sequence Name field in the output section. Hit
GO and note the name of the results tab (Section 220.127.116.11, “Saved Results Window”) containing the returned sequence.
Currently the name is not transferred when the results are saved; it is for display purposes only.
Close the Results window.
Available for any program which outputs a sequence, the output sequence options allow the user to customise a returned sequence should there be such a requirement.
Separate file for each entry option can be toggled on and off and allows the data to be returned as separate results tabs and not as a single, multiple sequence file. This may be easier to view, but each tab must be saved separately whereas a single multiple data tab can be saved in one go.
The default output for any sequence in EMBOSS is fasta, but any one of the formats currently supported can be selected from the drop down menu.
Adds the specified extension to the filename. Anything entered here, however, is overridden by an entry in the
Output Sequence Name box (Section 9.4.18, “Output Section”).
This option is not available if the
Separate file for each entry option is selected.
This option is for programs which return more than one data file. The base filename chosen will be applied to all data and ascending numbers appended to the name.
This option is not available if the
Separate file for each entry option is selected.
The features format only needs to be specified here and no colon ('
:') is required. In order to use this option, the
Use Feature Information box (Section 18.104.22.168, “Features”) should be selected.
Use Feature Information box at the top of the seqret program form. Enter
uni:bgal_ecoli in the
Sequence Filename field (resetting any other entries if necessary). Open the
Output Sequence Options and delete any entries currently visible. In the
Features format entry field type
swiss. Hit the
Two results tabs will be returned. The first will be
bgal_ecoli.swiss and contain the features of this protein in swissprot format and the second,
bgal_ecoli.fasta, will be the sequence. If the
swiss format is not entered then Jemboss will return the features in the default GFF format.
Close the Results window.
The features output filename (only) needs to be specified here. In order to use this option, the
Use Feature Information box (Section 22.214.171.124, “Features”) should be selected.
Use Feature Information option should be selected on the seqret program form. Enter
uni:bgal_ecoli in the
Sequence Filename field (deleting anything else if necessary). Open the
Output Sequence Options and in the
Feature Format entry field, type
swiss. In the
Features Filename entry field type
OK to close the options menu and hit the
GO button. The results will be the same as for the previous example except that the output tab for the features is now called
Separate file for each entry option is selected then the individual sequences appear in separate tabs, but the feature information will appear consecutively in the same tab.
uni:bgal*_e* in the
Sequence Filename field. Select the
Separate file for each entry option in the output sequence options. Leave everything else as in the practical above and hit
GO. Scroll to the end of the
features tab and compare to the end of the
features tab for the last practical. The
bgal1_entcl features, and possibly others, should have been added.
Close the Results window.
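The Output Sequence Options described in this section can likewise be summarised as a set of command line qualifiers. The Python sketch below is an illustration under stated assumptions: the helper `output_sequence_args` is hypothetical, the qualifier names are assumed to match the EMBOSS output sequence qualifiers, and the restriction on combining options simply mirrors the behaviour of the Jemboss form described above.

```python
def output_sequence_args(single=False, osformat=None, extension=None,
                         base_name=None, feature=False,
                         feature_format=None, feature_name=None):
    """Build output qualifiers from Output Sequence Options choices."""
    # As described above, the extension and base filename fields are not
    # available when Separate file for each entry is selected.
    if single and (extension or base_name):
        raise ValueError("extension/base_name cannot be combined with single")

    args = []
    if feature:
        args.append("-feature")                    # Use Feature Information
    if single:
        args.append("-ossingle")                   # separate file per entry
    if osformat:
        args += ["-osformat", osformat]            # output sequence format
    if extension:
        args += ["-osextension", extension]        # filename extension
    if base_name:
        args += ["-osname", base_name]             # base filename
    if feature_format:
        args += ["-offormat", feature_format]      # features format
    if feature_name:
        args += ["-ofname", feature_name]          # features filename
    return args


# The features practical above: swiss format features for uni:bgal_ecoli.
print(output_sequence_args(feature=True, feature_format="swiss"))
```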
The default output format for any single sequence returned by EMBOSS is FASTA. The default for alignment programs may differ between programs and the default is displayed in parentheses. These defaults may be altered using the drop down menus.
There are two options for those programs which offer a graphical output. The default PNG output is a static line drawing of the output image. The alternative is Jemboss Graphics which can be selected from the drop down menu. This offers an interactive graphical display.
dotmatcher by typing
do into the
Go To field and hitting return. Enter
uni:bgal_ecoli into the first
Sequence Filename field and
uni:bgal1_entcl into the second. Hit the
GO button to return results as a static image.
PNG graphics files must be saved with a
.png extension to the filename to allow them to be recognised by the software.
Close the graphics window.
Leave the entries in
dotmatcher and alter the drop down menu in the
Output Section to read
Jemboss Graphics. Hit the
GO button. The results should now appear as an interactive graphic.
The font size may be altered using the drop down menu on the graphics toolbar. The view may be altered using the percentage zoom menu, also on the toolbar. Hover the mouse over anywhere on the graphic to see the coordinates of that location.
File menu on the graph display and select
An EMBOSS data file window opens to reveal a text version of the dotmatcher graphic. This information cannot be saved.
Options menu on the graphic toolbar to alter the axes and label information. Any alterations can be confirmed using the
OK button, which also closes the options window. The
APPLY button will effect the changes on the graphic without closing the window. These changes will remain even when the
CANCEL button is subsequently clicked.
Delete the text in the
Main Title field and enter
bgal_ecoli vs bgal1_entcl. Hit the
This field will accept an unlimited number of characters, but the title appears on only one line of the graphic, centred on the graph. Thus if the title is too long, it will run off the edge of the graphic.
The number format for both the X and Y axes can be altered using the drop down menu.
The number of ticks displayed on each axis can be entered in the appropriate fields. There is no limit to the number of ticks entered; however, too many will result in a thick, black, indistinguishable line under the axis.
End sites are labelled by default. The
Start site is always zero and the
End site represents the length of the sequence. This may lead to irregular axis numbering. There is no limit placed on new entries, thus if the
End site is longer than the actual sequence the graph will move to the left (on the X axis) and down (on the Y axis).
The title of each axis may be altered by entering the required text. There is no limit to the text that may be entered, but longer text may disappear off the end of the axis.
The height and width of the graph may be altered. The plot is created as a disproportional plot, but this can be altered by adjusting the height and/or width of the plot.
Should it be required, the colour of the graph can be altered by clicking once with the left hand mouse button on the
Graph Colour square. A new colour may be selected from the resulting palette. The colour affects the data only, and not the axes. The width of the graph line may also be made thicker by adjusting the
Graph Line Width. There is currently no limit to the line width which can be selected, but a larger line width may obscure data.
Graphics are saved as they are viewed on the screen, and can be saved in a variety of different formats.
File menu at the top of the dotmatcher graphic. Select the
Save field and the default PNG format. Alter the format using the right hand
Select Format drop down menu to
dotmatcher.jpeg into the
File Name field and save to the same folder as the other files in this section.
Advanced program parameters are hidden on the initial program form as they are not required for the analysis run. They are revealed by clicking on the
Advanced Options button and scrolling down the program form.
Advanced Options button to reveal the additional program parameters. Alter the window size to
5 and hit the
GO button to display small but almost identical matches. Compare the results with those of the analysis run using the larger window size.
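The effect of the window size can be seen with a toy dot plot. The Python sketch below is a deliberately simplified illustration and not the dotmatcher algorithm: it counts identical positions instead of using a scoring matrix, and the function name `dot_matches` and the short test sequences are assumptions made for the example.

```python
def dot_matches(a, b, window, threshold):
    """Return (i, j) start positions where the windows of a and b match.

    A window matches when at least `threshold` of its positions are
    identical. Real dot plot programs score windows with a comparison
    matrix; identity counting is used here to keep the sketch short.
    """
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            score = sum(x == y for x, y in zip(a[i:i + window], b[j:j + window]))
            if score >= threshold:
                hits.append((i, j))
    return hits


seq_a = "ACGTACGT"
seq_b = "ACGTTCGT"

large = dot_matches(seq_a, seq_b, window=4, threshold=4)
small = dot_matches(seq_a, seq_b, window=2, threshold=2)
print(len(large), "dots with window 4;", len(small), "dots with window 2")
```

A smaller window reports many more short matches, which is why the plot produced with a window size of 5 shows more dots than one produced with a larger window.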
The majority of programs do not require a great amount of compute power and the results are ready immediately. These programs are run interactively (Section 9.4.38, “Interactive”). Some analyses, however, take extra memory and time and it makes sense to run them in batch mode (Section 9.4.39, “Batch”). The mode in which any program is run can be altered using the drop down
Execution mode menu at the left of the
GO button (in older versions of Jemboss it appears at the bottom right).
This is the default for the majority of programs. Results appear in the Saved Results window (Section 126.96.36.199, “Saved Results Window”) on screen as soon as the analysis run finishes. During the run the Jemboss screen is locked and it is impossible to conduct any further analyses whilst the current one is running.
This is the mode in which the dotmatcher example above is run.
Any program can be altered to run in batch mode. This may be advantageous if, for example, the desktop computer is slow or there are a number of analyses which need to be carried out before comparing results.
Those analyses that require a greater amount of compute power are by default run in batch mode. The entire analysis is done in the background and Jemboss can continue to be used whilst the analysis is running.
Alter the drop down menu at the bottom left of the central Jemboss pane to read batch and hit the
GO button once again for the dotmatcher analysis. The process is sent to the Job Manager (Section 9.6.3, “Job Manager”) and is noted on screen by the message
sending batch process now. Results can be retrieved once the run is completed.
Any program set to run in batch by default may be altered to run in interactive mode; however, this would freeze the Jemboss window for the duration of the run.
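The difference between the two execution modes can be pictured with a small Python sketch. It is a conceptual illustration only and does not reflect how Jemboss or its Job Manager is implemented: `run_analysis` is a hypothetical stand-in for any analysis run, and a thread pool plays the role of the background queue.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_analysis(name, seconds=0.1):
    """Hypothetical stand-in for an analysis run that takes some time."""
    time.sleep(seconds)
    return f"{name} finished"


# Interactive mode: the caller waits, so nothing else can be done
# until the result comes back.
interactive_result = run_analysis("dotmatcher")

# Batch mode: the job is handed to a background worker and the caller
# is free to carry on; the result is collected once the run completes.
executor = ThreadPoolExecutor(max_workers=2)
job = executor.submit(run_analysis, "dotmatcher")
# ... other work could be done here while the job runs ...
batch_result = job.result()
executor.shutdown()

print(interactive_result)
print(batch_result)
```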