Ángel Núñez Pagán
Patricia Gordillo Blanco
Beatriz del Val Romero
The aim of this program is to find ORFs (Open Reading Frames) in prokaryote sequences in FASTA format, on the forward strand and in the 3 possible reading frames. Specifically, it reports only the ORFs located between two stop codons, so the part of the sequence between the beginning and the first stop codon is discarded, as is the part between the last stop codon and the end of the sequence. The program gives each ORF a score based on a codon usage bias table. Finally, it returns tabulated results in an ".out" file, which can be modified by the user.
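As an illustration of the search described above, here is a minimal Python sketch (the program itself is written in Perl). It assumes the standard stop codons TAA, TAG and TGA. The names find_orfs and score_orf are hypothetical, and the scoring formula (mean log-odds of each codon against a uniform 1/64 frequency) is only an assumption, since the exact score used by ORFFINDER is not described here.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed standard stop codons


def find_orfs(sequence):
    """Return (frame, start, end, orf) tuples for the forward strand.

    start and end are 1-based, inclusive coordinates of the codons lying
    strictly between two consecutive in-frame stop codons.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        previous_stop = None  # 0-based index of the last stop codon seen
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] not in STOP_CODONS:
                continue
            # Skip the stretch before the first stop and empty stop-stop gaps.
            if previous_stop is not None and i > previous_stop + 3:
                orfs.append((frame + 1, previous_stop + 4, i, seq[previous_stop + 3:i]))
            previous_stop = i
    return orfs


def score_orf(orf, codon_frequency):
    """Hypothetical score: mean log-odds of each codon versus a uniform 1/64.

    codon_frequency maps each codon to its relative frequency (non-zero).
    """
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    return sum(math.log(codon_frequency[c] * 64) for c in codons) / len(codons)


for frame, start, end, orf in find_orfs("ATGTAAGCTGCATGACC"):
    print(frame, start, end, orf)
```

Anything after the last stop codon of a frame is never reported, because an ORF is only emitted when a closing stop codon is found.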
1) This program is written in Perl, so a computer with UNIX installed is needed to run it.
2) Both parameters must be entered, and in the right order: first the name of the file containing the sequence, then the name of the file containing the codon usage bias table. Likewise, check that both the sequence and the table are written in the DNA alphabet (T rather than U).
3) Check that the sequence and table formats are the right ones. When downloading the table from the recommended web page, it is advisable to follow these steps:
- select the organism
- submit the obtained table, selecting "standard" format and "A style like Codon Frequency output in
- click on "submit" and save the resulting table in a .txt file
- run TRANSFORMER on the .txt file to get the same table in the format that ORFFINDER requires
4) Once the ORFFINDER outfile has been obtained, it can be filtered to write the ORFs that are most likely to be coding to other outfiles. The cut-off is set by the user through UNIX commands. It can be applied not only to the score, but also to the length of the ORFs.
5) The filtered file has to be converted into GFF format to view the results graphically. This can be done with a UNIX command.
6) The GFF file is converted into a PS (PostScript) file with the GFF2PS program.
7) The GHOSTVIEW program, which can be launched from the shell, is used to view the PS file on screen.
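To make the conversion in step 5 more concrete, here is a small Python sketch of how one tabulated ORF could be turned into a GFF record. It is not the UNIX command used by the authors. It assumes, hypothetically, that each line of the filtered file holds the reading frame, start, end and score, in that order. The function names, the source and feature labels, and the choice to store the reading frame in the group column are all assumptions. The GFF frame column is set to 0 because each ORF starts on a codon boundary.

```python
def orf_to_gff(seqname, frame, start, end, score, source="ORFFINDER", feature="ORF"):
    """Format one ORF as a tab-separated GFF record on the forward strand."""
    fields = [
        seqname,            # name of the sequence
        source,             # program that produced the feature (assumed label)
        feature,            # feature type (assumed label)
        str(start),         # 1-based start
        str(end),           # 1-based end, inclusive
        f"{score:.3f}",     # ORF score
        "+",                # only the forward strand is searched
        "0",                # the ORF begins on a codon boundary
        f"frame{frame}",    # group column: reading frame 1, 2 or 3 (assumption)
    ]
    return "\t".join(fields)


def table_to_gff(lines, seqname):
    """Convert lines of 'frame start end score' (assumed layout) to GFF records."""
    records = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        frame, start, end, score = line.split()[:4]
        records.append(orf_to_gff(seqname, int(frame), int(start), int(end), float(score)))
    return records


for record in table_to_gff(["1\t7\t12\t0.5", "3\t50\t500\t0.8"], "seq1"):
    print(record)
```

If the real ".out" file uses a different column order, only the unpacking line in table_to_gff needs to change.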
NOTE: all the necessary UNIX commands are illustrated below.
The file names are illustrative; any names can be chosen.
The sequence (first parameter) entry in NCBI can be found here.
The codon usage table (second parameter) comes from the web page suggested above.
NOTE: The parameters of the filter are changeable.
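As an illustration of what changing the filter parameters means, here is a hedged Python sketch of a filter with a score cut-off and a length cut-off. The function name filter_orfs, the row layout (frame, start, end, score) and the default cut-off values are hypothetical; they are not taken from ORFFINDER.

```python
def filter_orfs(rows, min_score=0.0, min_length=300):
    """Keep the ORFs whose score and length both reach the cut-offs.

    rows holds (frame, start, end, score) tuples with 1-based, inclusive
    coordinates; both default cut-offs are placeholders to be tuned.
    """
    kept = []
    for frame, start, end, score in rows:
        length = end - start + 1  # ORF length in nucleotides
        if score >= min_score and length >= min_length:
            kept.append((frame, start, end, score))
    return kept


rows = [(1, 7, 12, 0.5), (2, 20, 400, -0.2), (3, 50, 500, 0.8)]
print(filter_orfs(rows, min_score=0.0, min_length=100))
```

Raising min_score or min_length makes the filter stricter, and setting either to a very low value effectively disables that criterion.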
NOTE: To modify the format of the .ps file obtained, several GFF2PS parameters can be used. They are described in the user's manual on the GFF2PS web page.
All the people who helped us (you know who you are!)