Cruciform DNA forms at sites of inverted repeats (IRs). These inverted repeats can be perfect, i.e., without any mismatched bp in the repeat sequence, or imperfect, i.e., containing one or more mismatched bp in the repeat sequence. Perfect or imperfect inverted repeats of 6 or more nucleotides pair with one another within each strand, forming two opposing stem-loop structures joined at a four-way junction (Figure 1 & Table 1). The first attempt to computationally predict cruciform-forming sequence motifs was made in 1995 (Schroth & Ho, 1995). In this study, the researchers developed a computer-aided pattern-matching method to search for putative cruciform-forming sequences. To do so, they searched for two complementary sets of DNA bases (IRs; >8 bases each, with no maximum size limit), in both directions of the sequence, that lie 3 to 6 nucleotides apart (a gap that may or may not conform to the IR symmetry). The algorithm scans the entire length of the DNA sequence to identify all putative centers of repeat, with the IR starting between 1.5 and 3.0 nucleotides on either side of each center. Not all IRs can form cruciforms. Studies show that longer repeat sequences form more stable non-B-DNA structures (Murchie & Lilley, 1992; Zheng et al., 1991). To select the repeats most likely to adopt non-B-DNA conformations, Schroth and Ho filtered the total population of repeats by the following criteria: (i) each half of the IR was >10 bp long; (ii) the repeat had no mismatches unless each half of the IR was >12 bp long.
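The search strategy described above can be illustrated with a minimal Python sketch. It is not the original Schroth and Ho program: it handles perfect repeats only, and the names `find_inverted_repeats`, `passes_length_filter`, and `IRHit` are hypothetical, introduced here for illustration. The defaults (arms of at least 9 bases, a spacer of 3 to 6 nucleotides) are taken from the description in the text, and the filter encodes criteria (i) and (ii) as stated there.

```python
from typing import NamedTuple

# Watson-Crick partners; any other symbol (e.g. N) never pairs.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


class IRHit(NamedTuple):
    start: int  # 0-based index of the first base of the left arm
    arm: int    # number of paired bases in each arm
    gap: int    # unpaired spacer bases between the two arms


def find_inverted_repeats(seq, min_arm=9, min_gap=3, max_gap=6):
    """Return perfect inverted repeats, keeping the longest arm per center."""
    seq = seq.upper()
    best = {}  # center key -> longest hit found around that center
    for gap in range(min_gap, max_gap + 1):
        for left_end in range(len(seq) - gap - 1):
            # Extend outward from the spacer while the bases stay complementary.
            left, right = left_end, left_end + gap + 1
            arm = 0
            while left >= 0 and right < len(seq) and PAIR.get(seq[left]) == seq[right]:
                arm += 1
                left -= 1
                right += 1
            center = 2 * left_end + gap + 1  # same value for nested hits
            if arm >= min_arm and (center not in best or arm > best[center].arm):
                best[center] = IRHit(left + 1, arm, gap)
    return sorted(best.values())


def passes_length_filter(arm, mismatches=0):
    """Criteria (i) and (ii) from the text: >10 bp per half, and
    mismatches tolerated only when each half is >12 bp."""
    return arm > 10 and (mismatches == 0 or arm > 12)


# Toy sequence: a 12-base arm, a 4-base spacer, and the arm's reverse complement.
example = "CC" + "GCATCGATTGCA" + "TTTT" + "TGCAATCGATGC" + "CC"
hits = find_inverted_repeats(example)
```

Keeping only the longest arm per center avoids reporting the same repeat several times with progressively wider spacers. Supporting imperfect repeats would require a mismatch-tolerant extension step, which this sketch deliberately leaves out.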
Another study aimed to identify potential cruciform structures in DNA sequences (Lexa et al., 2012). First, the authors identified all IRs in a DNA sequence using a modified Landau-Vishkin algorithm (Landau et al., 1986) with suffix arrays. The candidate IRs were then passed to UNAFold (Markham & Zuker, 2008) to assess their ability to form alternative structures. UNAFold calculates the free energy of a structure by assessing the optimal pairing of nucleotides in nucleic acid strands. Structures with the lowest free energy are hypothesized to exist in vivo and are therefore carried forward for further analysis. In the next step, IRs that form low-energy structures are simulated with an event-based stochastic mathematical model proposed by Matej Lexa and coworkers. This model operates on superhelical density, a measure of supercoiling relative to the relaxed state of DNA that characterizes the topological state of the molecule (Zheng et al., 1991). The model calculates the likelihood that a given DNA segment exists in the canonical or a non-canonical structure. An experimentally identified superhelical density threshold of -0.05 is used, above which no cruciform formation is possible in the simulated molecules (Singleton & Wells, 1982).
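To make the role of the superhelical density threshold concrete, the following Python sketch computes the superhelical density from the linking number using the standard definition, sigma = (Lk - Lk0) / Lk0, and compares it with the -0.05 cutoff. It is only an illustration of the threshold check, not the stochastic model of Lexa and coworkers. The function names are hypothetical, and the helical repeat of 10.5 bp per turn for relaxed B-DNA is an assumption that does not come from the text.

```python
HELICAL_REPEAT_BP = 10.5  # assumed bp per helical turn for relaxed B-DNA


def superhelical_density(n_bp, linking_number):
    """sigma = (Lk - Lk0) / Lk0, where Lk0 is the relaxed linking number."""
    lk0 = n_bp / HELICAL_REPEAT_BP
    return (linking_number - lk0) / lk0


def cruciform_permitted(sigma, threshold=-0.05):
    """True when sigma is at or below (more negative than) the threshold."""
    return sigma <= threshold


# A 4200 bp circle has Lk0 = 400 under the assumed helical repeat.
sigma_supercoiled = superhelical_density(4200, 376)  # 24 turns removed
sigma_near_relaxed = superhelical_density(4200, 392)  # 8 turns removed
```

Under these assumptions the first molecule falls below the threshold and could be considered for cruciform extrusion, whereas the second stays above it and would be excluded.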