#include <Dynalign_object.h>
Public Member Functions | |
Dynalign_object () | |
Dynalign_object (const char sequence1[], const char sequence2[], const bool IsRNA=true) | |
Dynalign_object (const char filename1[], const int type1, const char filename2[], const int type2, const bool IsRNA=true) | |
Constructor. | |
Dynalign_object (const char filename[]) | |
Constructor. | |
Dynalign_object (const char *filename, const short maxtrace, const short bpwin, const short awin, const short percent) | |
int | Dynalign (const short int maxtrace=20, const short int bpwin=5, const short int awin=1, const short int percent=20, const short int imaxseparation=-99, const float gap=0.4, const bool singleinsert=true, const char savefile[]=NULL, const bool optimalonly=false, const short int singlefold_subopt_percent=30, const bool local=false, const short int numProcessors=1, const int maxpairs=-1) |
Predict the lowest free energy structure common to two sequences and suboptimal solutions with the Dynalign algorithm. | |
void | WriteAlignment (const char filename[]) |
Write the alignment to disk. | |
int | ForceAlignment (const int i, const int k) |
Force an alignment during a Dynalign calculation. | |
int | GetForcedAlignment (const int i, const int seq) |
Get an alignment constraint. | |
int | ReadAlignmentConstraints (const char filename[]) |
Read alignment constraints from disk. | |
int | Templatefromct (const char ctfilename[]) |
Read a ct file to determine what pairs will be allowed for sequence 1 in a subsequent dynalign calculation. | |
int | Templatefromdsv (const char dsvfilename[], const float maxdsvchange) |
double | GetBestPairEnergy (const int sequence, const int i, const int j) |
Report the best energy for pair i-j from sequence number sequence (1 or 2). | |
double | GetLowestEnergy () |
Report the lowest total free energy change from a Dynalign calculation. | |
char * | GetErrorMessage (const int error) |
Return error messages based on code from GetErrorCode and other error codes. | |
void | SetProgress (TProgressDialog &Progress) |
void | StopProgress () |
~Dynalign_object () | |
Private Member Functions | |
void | CommonConstructor () |
void | AllocateForceAlign () |
void | storetemplatefilename (const char *name) |
Private Attributes | |
short ** | align |
short ** | forcealign |
bool | dsv_templated |
bool | ct_templated |
char * | templatefilename |
float | MAXDSV |
int | modificationflag |
dynalignarray * | w |
dynalignarray * | vmod |
varray * | v |
wendarray * | w5 |
wendarray * | w3 |
short * | lowend |
short * | highend |
datatable * | data |
short | gap |
short | lowest |
int | Maxtrace |
bool | savefileread |
double *** | array |
The Dynalign_object class provides an entry point for the Dynalign algorithm. The class is inherited from the TwoRNA class, which itself contains two instances of the class RNA.
Dynalign_object::Dynalign_object | ( | ) |
Constructor.
This is a default constructor that calls the TwoRNA default constructor.
Dynalign_object::Dynalign_object | ( | const char | sequence1[], | |
const char | sequence2[], | |||
const bool | IsRNA = true | |||
) |
Constructor.
This constructor is available to reach the TwoRNA constructors, from which this class is inherited. It uses two c strings to provide sequences. IsRNA is true for RNA folding and false for DNA. Input sequences should contain A,C,G,T,U,a,c,g,t,u,x,X. Capitalization makes no difference. T=t=u=U. If IsRNA is true, the backbone is RNA, so U is assumed. If IsRNA is false, the backbone is DNA, so T is assumed. x=X= nucleotide that neither stacks nor pairs. For now, any unknown nuc is considered 'X'. Both sequences are passed to underlying RNA classes for each sequence.
sequence1 | is a NULL terminated c string for sequence 1. | |
sequence2 | is a NULL terminated c string for sequence 2. | |
IsRNA | is a bool that indicates whether these sequences are RNA or DNA. true=RNA. false=DNA. Default is true. Both sequences must have the same backbone. |
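The character rules above can be illustrated with a small standalone sketch. The helper `normalizeSequence` is hypothetical and is not part of the class; it also assumes that output is upper-cased, which the text does not state (it only says capitalization makes no difference).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of Dynalign_object: applies the documented
// character rules to a raw sequence string.
std::string normalizeSequence(const std::string& raw, bool isRNA) {
    std::string out;
    out.reserve(raw.size());
    for (unsigned char c : raw) {
        const char up = static_cast<char>(std::toupper(c));
        switch (up) {
            case 'A': case 'C': case 'G':
                out.push_back(up);
                break;
            case 'T': case 'U':
                // T and U are interchangeable; the backbone decides which one is meant.
                out.push_back(isRNA ? 'U' : 'T');
                break;
            default:
                // 'X' and any unknown character: a nucleotide that neither stacks nor pairs.
                out.push_back('X');
                break;
        }
    }
    return out;
}
```

With this sketch, `normalizeSequence("acgtu", true)` yields `ACGUU`, and any unrecognized character is mapped to `X`.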
Dynalign_object::Dynalign_object | ( | const char | filename1[], | |
const int | type1, | |||
const char | filename2[], | |||
const int | type2, | |||
const bool | IsRNA = true | |||
) |
Constructor.
This constructor is available to reach the TwoRNA constructors, from which this class is inherited. It uses two c strings as filenames, accompanied by integers to set the file type. IsRNA is true for RNA folding and false for DNA. The existing files, specified by filenames, can either be a ct file, a sequence, or an RNAstructure save file. Therefore, the user provides a flag for the file type:
type = 1 => .ct file
type = 2 => .seq file
type = 3 => partition function save (.pfs) file
type = 4 => folding save file (.sav)
The file opening is performed by the constructors for the RNA classes that underlie each sequence. This constructor generates internal error codes that can be accessed by GetErrorCode() after the constructor is called. 0 = no error. The error code can be resolved to a c string using GetErrorMessage. Note that the constructor needs to be explicitly told, via IsRNA, what the backbone is because files do not store this information. Note also that save files explicitly store the thermodynamic parameters, therefore changing the backbone type as compared to the original calculation will not change structure predictions.
filename1 | is a null terminated c string and refers to sequence 1. | |
filename2 | is a null terminated c string and refers to sequence 2. | |
type1 | is an integer that indicates the file type for sequence 1. | |
type2 | is an integer that indicates the file type for sequence 2. | |
IsRNA | is a bool that indicates whether these sequences are RNA or DNA. true=RNA. false=DNA. Default is true. Only one backbone is allowed for both sequences. |
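As a minimal sketch of the file type flags, the integer values 1 to 4 described above can be given names. The enum `InputFileType` and the helper `fileTypeFromName` are hypothetical and not part of the class; returning 0 for an unrecognized extension is a convention of this sketch only.

```cpp
#include <string>

// Hypothetical names; only the integer values 1-4 come from the description above.
enum class InputFileType : int {
    CT = 1,            // .ct structure file
    Sequence = 2,      // .seq sequence file
    PartitionSave = 3, // .pfs partition function save file
    FoldingSave = 4    // .sav folding save file
};

// Guess the type flag from the filename extension.
// Returns 0 (a convention of this sketch) when the extension is not recognized.
int fileTypeFromName(const std::string& filename) {
    const std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos) return 0;
    const std::string ext = filename.substr(dot);
    if (ext == ".ct")  return static_cast<int>(InputFileType::CT);
    if (ext == ".seq") return static_cast<int>(InputFileType::Sequence);
    if (ext == ".pfs") return static_cast<int>(InputFileType::PartitionSave);
    if (ext == ".sav") return static_cast<int>(InputFileType::FoldingSave);
    return 0;
}
```

The result could then be passed as type1 or type2, after checking that it is non-zero.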
Dynalign_object::Dynalign_object | ( | const char | filename[] | ) |
Constructor.
This constructor allows the user to read a dynalign save file (.dsv) to get base pairing information. This constructor generates internal error codes that can be accessed by GetErrorCode() after the constructor is called. 0 = no error. The errorcode can be resolved to a c string using GetErrorMessage.
filename[] | is a cstring that indicates the filename of a .dsv file. |
Dynalign_object::Dynalign_object | ( | const char * | filename, | |
const short | maxtrace, | |||
const short | bpwin, | |||
const short | awin, | |||
const short | percent | |||
) |
Constructor.
This constructor is used to perform Dynalign refolding. This does not allow any changes in constraints, but does allow the creation of a different set of suboptimal structures. This constructor generates internal error codes that can be accessed by GetErrorCode() after the constructor is called. 0 = no error. The error code can be resolved to a c string using GetErrorMessage.
filename | is the name of a Dynalign save file name (.dsv). | |
maxtrace | is the maximum number of common structures to be determined. The recommended default is 20. | |
bpwin | is the base pair window parameter, where 0 allows the structures to have similar pairs and larger windows make the structures more diverse. The recommended default is 5. | |
awin | is the alignment window parameter, where 0 allows the alignments to be similar and larger values make the alignments more diverse. The recommended default is 1. | |
percent | is the maximum percent difference in total folding free energy change above the lowest for suboptimal common structures. The recommended default is 20. |
Dynalign_object::~Dynalign_object | ( | ) |
void Dynalign_object::AllocateForceAlign | ( | ) | [private] |
void Dynalign_object::CommonConstructor | ( | ) | [private] |
int Dynalign_object::Dynalign | ( | const short int | maxtrace = 20, | |
const short int | bpwin = 5, | |||
const short int | awin = 1, | |||
const short int | percent = 20, | |||
const short int | imaxseparation = -99, | |||
const float | gap = 0.4, | |||
const bool | singleinsert = true, | |||
const char | savefile[] = NULL, | |||
const bool | optimalonly = false, | |||
const short int | singlefold_subopt_percent = 30, | |||
const bool | local = false, | |||
const short int | numProcessors = 1, | |||
const int | maxpairs = -1 | |||
) |
Predict the lowest free energy structure common to two sequences and suboptimal solutions with the Dynalign algorithm.
In case of error, the function returns a non-zero error code that can be parsed by GetErrorMessage() or GetErrorMessageString().
maxtrace | is the maximum number of common structures to be determined. The default is 20. | |
bpwin | is the base pair window parameter, where 0 allows the structures to have similar pairs and larger windows make the structures more diverse. The default is 5. | |
awin | is the alignment window parameter, where 0 allows the alignments to be similar and larger values make the alignments more diverse. The default is 1. | |
percent | is the maximum percent difference in total folding free energy change above the lowest for suboptimal common structures. The default is 20. | |
imaxseparation | is the maximum separation between aligned nucleotides. Values >= 0 are the traditional parameter, those below zero trigger the HMM alignment method, which is now preferred. | |
gap | is the cost of adding gap nucleotides in the alignment in kcal/mol. | |
singleinsert | is whether single basepair inserts are allowed in one sequence vs the other. | |
savefile | is a c string with the name of a dynalign savefile (*.dsv) to be created. | |
optimalonly | can be used to turn on a calculation of only the energy (when true) and not the structures. | |
singlefold_subopt_percent | is the maximum % difference of folding energy above the lowest free energy structure for pairs in single sequence folding that will be allowed in the dynalign calculation. | |
local | is whether Dynalign is being run in local (true) or global mode (false). | |
numProcessors | is the number of processors to use for the calculation. This requires a compilation for SMP. | |
maxpairs | is under development for multiple sequence folding. Use -1 (default) for now. |
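The percent parameter can be read as an energy window above the lowest free energy. The arithmetic below is a sketch under one assumption: that the percent difference is taken relative to the magnitude of the lowest energy. The helpers `suboptimalEnergyCutoff` and `withinWindow` are hypothetical and are not part of the class.

```cpp
#include <cmath>

// Hypothetical helper: the highest (least favorable) energy still inside the window,
// assuming the window is percent of |lowestEnergy| above lowestEnergy.
double suboptimalEnergyCutoff(double lowestEnergy, int percent) {
    return lowestEnergy + std::fabs(lowestEnergy) * percent / 100.0;
}

// True if a structure with this energy would fall inside the window.
bool withinWindow(double energy, double lowestEnergy, int percent) {
    return energy <= suboptimalEnergyCutoff(lowestEnergy, percent);
}
```

Under that assumption, a lowest energy of -50.0 kcal/mol with percent = 20 gives a cutoff of -40.0 kcal/mol, so a structure at -45.0 would be kept and one at -39.0 would not.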
int Dynalign_object::ForceAlignment | ( | const int | i, | |
const int | k | |||
) |
Force an alignment during a Dynalign calculation.
Nucleotide i from sequence 1 will be aligned to nucleotide k in sequence 2 in a subsequent Dynalign calculation. The function returns 0 on success and a non-zero error code otherwise, which can be parsed by GetErrorMessage() or GetErrorMessageString().
i | is the index of nucleotide from sequence 1. | |
k | is the index of nucleotide from sequence 2. |
double Dynalign_object::GetBestPairEnergy | ( | const int | sequence, | |
const int | i, | |||
const int | j | |||
) |
Report the best energy for pair i-j from sequence number sequence (1 or 2).
This function reports the lowest folding free energy for any pairs between i-j in sequence number sequence (1 or 2). This requires a search over all possible pairs in the second sequence. NOTE: This function ONLY works after reading a Dynalign save file (.dsv) using the constructor. This is because the Dynalign energies are not normally stored after calling Dynalign. This function generates internal error codes that can be accessed by GetErrorCode() after this function is called. 0 = no error, 107 = Data not available, 108 = nucleotide out of range. The error code can be resolved to a c string using GetErrorMessage.
sequence | is an integer indicating the sequence # (must be 1 or 2). | |
i | is the 5' nucleotide in a pair. | |
j | is the 3' nucleotide in a pair. |
char * Dynalign_object::GetErrorMessage | ( | const int | error | ) |
Return error messages based on code from GetErrorCode and other error codes.
0 = no error
100-999 = Error associated with Dynalign, to be handled here.
>=1000 = Errors for underlying sequence, get message from TwoRNA base class.
Current errors handled here are:
100 "Nucleotide from sequence 1 is out of range.\n"
101 "Nucleotide from sequence 2 is out of range.\n"
102 "Alignment constraint file not found.\n"
103 "Error reading alignment constraint file.\n"
104 "CT file not found.\n"
105 "A template has already been specified; only one is allowed.\n"
106 "DSV file not found.\n"
107 "Data not available to calculate energy.\n"
108 "Nucleotide out of range.\n"
109 "Value of maxpairs is too large to be achievable.\n"
110 "Error reading thermodynamic parameters."
error | is the integer error code provided by GetErrorCode() or by a call to a function that returns an error. |
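The error code ranges can be sketched as a small standalone classifier. The names `ErrorSource`, `classifyError` and `dynalignMessage` are hypothetical and are not part of the class; the message strings are copied from the list of codes documented for GetErrorMessage, and only a few of them are reproduced here.

```cpp
#include <string>

// Hypothetical classification of the documented error code ranges.
enum class ErrorSource { None, Dynalign, UnderlyingSequence, Unknown };

ErrorSource classifyError(int code) {
    if (code == 0) return ErrorSource::None;
    if (code >= 100 && code <= 999) return ErrorSource::Dynalign;
    if (code >= 1000) return ErrorSource::UnderlyingSequence;
    return ErrorSource::Unknown; // range not described in the documentation
}

// A few of the documented Dynalign messages; others return an empty string in this sketch.
std::string dynalignMessage(int code) {
    switch (code) {
        case 100: return "Nucleotide from sequence 1 is out of range.\n";
        case 101: return "Nucleotide from sequence 2 is out of range.\n";
        case 104: return "CT file not found.\n";
        case 106: return "DSV file not found.\n";
        case 107: return "Data not available to calculate energy.\n";
        default:  return "";
    }
}
```

In real use the message lookup would be done by GetErrorMessage itself; the sketch only shows how the ranges partition the codes.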
Reimplemented from TwoRNA.
int Dynalign_object::GetForcedAlignment | ( | const int | i, | |
const int | seq | |||
) |
Get an alignment constraint.
i | is the nucleotide number. | |
seq | is the sequence (1 or 2) from which i is derived. |
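How a forced alignment might be stored and queried can be sketched with a standalone stand-in. The class `ForcedAlignmentTable` is hypothetical and does not describe the internals of Dynalign_object; it assumes 1-based nucleotide indices and uses 0 to mean "no constraint", neither of which is stated above. The return codes 100 and 101 follow the documented out-of-range errors for sequence 1 and sequence 2.

```cpp
#include <vector>

// Hypothetical stand-in for the forced-alignment bookkeeping.
class ForcedAlignmentTable {
public:
    ForcedAlignmentTable(int length1, int length2)
        : partnerOf1_(length1 + 1, 0), partnerOf2_(length2 + 1, 0) {}

    // Force nucleotide i (sequence 1) to align with nucleotide k (sequence 2).
    // Returns 0 on success, 100 or 101 if i or k is out of range.
    int force(int i, int k) {
        if (i < 1 || i >= static_cast<int>(partnerOf1_.size())) return 100;
        if (k < 1 || k >= static_cast<int>(partnerOf2_.size())) return 101;
        // Clear any earlier constraint involving i or k so both tables stay consistent.
        if (partnerOf1_[i] != 0) partnerOf2_[partnerOf1_[i]] = 0;
        if (partnerOf2_[k] != 0) partnerOf1_[partnerOf2_[k]] = 0;
        partnerOf1_[i] = k;
        partnerOf2_[k] = i;
        return 0;
    }

    // Partner of nucleotide i from sequence seq (1 or 2); 0 if none or out of range.
    int partner(int i, int seq) const {
        const std::vector<int>& table = (seq == 1) ? partnerOf1_ : partnerOf2_;
        if (i < 1 || i >= static_cast<int>(table.size())) return 0;
        return table[i];
    }

private:
    std::vector<int> partnerOf1_; // index: nucleotide in sequence 1
    std::vector<int> partnerOf2_; // index: nucleotide in sequence 2
};
```

Keeping one table per sequence makes the lookup by (i, seq) a direct index, at the cost of updating both tables on every change.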
double Dynalign_object::GetLowestEnergy | ( | ) |
Report the lowest total free energy change from a Dynalign calculation.
NOTE: This function ONLY works after reading a Dynalign save file (.dsv) using the constructor. This is because the Dynalign energies are not normally stored after calling Dynalign. This function generates internal error codes that can be accessed by GetErrorCode() after this function is called. 0 = no error, 107 = Data not available, 108 = nucleotide out of range. The error code can be resolved to a c string using GetErrorMessage.
int Dynalign_object::ReadAlignmentConstraints | ( | const char | filename[] | ) |
Read alignment constraints from disk.
The file format is:
i1 k1
i2 k2
-1 -1
where each line gives an aligned pair (i from sequence 1 and k from sequence 2). The file terminates with -1 -1 to indicate the file end. The function returns 0 on success and a non-zero error code otherwise, which can be parsed by GetErrorMessage() or GetErrorMessageString().
filename | is a c string that is the file name to be read. |
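The constraint file layout, pairs of integers ended by a -1 -1 terminator, can be read with a few lines of standard C++. This is a sketch of the format only, not the parser used by ReadAlignmentConstraints; the function name `parseAlignmentConstraints` is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical parser for the documented layout: "i k" pairs, terminated by "-1 -1".
// Appends each pair to `pairs` and returns true only if the terminator was found.
bool parseAlignmentConstraints(std::istream& in,
                               std::vector<std::pair<int, int> >& pairs) {
    int i = 0;
    int k = 0;
    while (in >> i >> k) {
        if (i == -1 && k == -1) return true; // end-of-file marker
        pairs.push_back(std::make_pair(i, k));
    }
    return false; // input ended or was malformed before the terminator
}
```

For example, wrapping the text "1 2\n5 7\n-1 -1\n" in a `std::istringstream` and passing it to the function fills `pairs` with (1, 2) and (5, 7). Each pair could then be handed to ForceAlignment.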
void Dynalign_object::SetProgress | ( | TProgressDialog & | Progress | ) |
Provide a TProgressDialog for following calculation progress. A TProgressDialog class has a public function void update(int percent) that indicates the progress of a long calculation.
Progress | is a TProgressDialog class. |
void Dynalign_object::StopProgress | ( | ) |
Provide a means to stop using a TProgressDialog. StopProgress tells the RNA class to no longer follow progress. This should be called if the TProgressDialog is deleted, so that this class does not make reference to it.
void Dynalign_object::storetemplatefilename | ( | const char * | name | ) | [private] |
int Dynalign_object::Templatefromct | ( | const char | ctfilename[] | ) |
Read a ct file to determine what pairs will be allowed for sequence 1 in a subsequent dynalign calculation.
This results in all pairs but those in the ct being disallowed.
ctfilename | is the name of the ct file to be read to provide the template. |
int Dynalign_object::Templatefromdsv | ( | const char | dsvfilename[], | |
const float | maxdsvchange | |||
) |
This reads a dsv file and only allows pairs with folding free energy change between the lowest and lowest + maxdsvchange in a subsequent dynalign calculation.
dsvfilename | is the name of the dsv file to be read to provide the template. | |
maxdsvchange | is a float that gives a percent difference in free energy above the lowest free energy change. |
void Dynalign_object::WriteAlignment | ( | const char | filename[] | ) |
Write the alignment to disk.
This function should be called after loading a dynalign save file or after a dynalign calculation has been performed. This function generates no error flag. Nothing can go wrong...
filename | is the file to which the alignment should be written. |
short** Dynalign_object::align [private] |
double*** Dynalign_object::array [private] |
bool Dynalign_object::ct_templated [private] |
datatable* Dynalign_object::data [private] |
bool Dynalign_object::dsv_templated [private] |
short** Dynalign_object::forcealign [private] |
short Dynalign_object::gap [private] |
short * Dynalign_object::highend [private] |
short* Dynalign_object::lowend [private] |
short Dynalign_object::lowest [private] |
float Dynalign_object::MAXDSV [private] |
int Dynalign_object::Maxtrace [private] |
int Dynalign_object::modificationflag [private] |
bool Dynalign_object::savefileread [private] |
char* Dynalign_object::templatefilename [private] |
varray* Dynalign_object::v [private] |
dynalignarray * Dynalign_object::vmod [private] |
dynalignarray* Dynalign_object::w [private] |
wendarray * Dynalign_object::w3 [private] |
wendarray* Dynalign_object::w5 [private] |