MaD - Microarray Designer is a search tool and database of near-optimal designs for dual-channel microarray experiments. Using the search interface, you can browse the pre-generated designs or trigger a new search for designs. You can also upload your own design files, either to compare them with the database or to improve it, and you can download a daily snapshot of the database.

Microarrays are used to quantify the differential expression of genes across different experimental conditions. The cDNA from two samples is labelled with Cy3 (green) and Cy5 (red) fluorophores, respectively, and hybridized onto an array of probes. The relative intensities of the two fluorophores are then used to infer the differential expression of genes between the two samples. The data generated by microarray experiments are high-dimensional and contain a significant amount of noise. Careful planning of the experimental setup is therefore required in order to obtain statistically significant and biologically valid conclusions.


Figure: A sample design graph and design matrix for 4 samples and 5 arrays


Microarray experiment designs are represented as directed graphs in which the vertices represent the experimental samples (a.k.a. conditions, or varieties) and the edges represent individual microarray slides (a.k.a. arrays). The direction of an edge denotes the dye assignment of the two samples it connects (e.g., Cy3 Green --> Cy5 Red labeling). The corresponding design matrix has one row per array and one column per sample: each row stands for an edge, the columns for the source and destination samples of that edge hold 1 and -1, respectively, and all other entries in the row are 0.
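The mapping from a design graph to its design matrix can be sketched in a few lines of Python. This is only an illustration: the function name `design_matrix`, the zero-based sample indices, and the example edge list are assumptions made here, not part of the MaD software or the design shown in the figure.

```python
import numpy as np

def design_matrix(edges, n_samples):
    """Build an (arrays x samples) design matrix from directed edges.

    Each edge (src, dst) is one array: the source column gets +1 and the
    destination column gets -1, following the convention described above.
    """
    X = np.zeros((len(edges), n_samples), dtype=int)
    for row, (src, dst) in enumerate(edges):
        if src == dst:
            raise ValueError("an array must compare two different samples")
        X[row, src] = 1
        X[row, dst] = -1
    return X

# Hypothetical design: 4 samples in a loop plus one extra array (5 arrays).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
X = design_matrix(edges, n_samples=4)
print(X)
```

Every row contains exactly one 1 and one -1, so each row sums to zero. This is why the design matrix alone cannot identify absolute expression levels, only differences between samples.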

For a given experimental setup, various optimality criteria have been defined to estimate, in advance, the expected accuracy of the experimental results. A-, L-, and D-optimality are the most commonly used criteria (see Wit et al., 2005, for details). We have transformed these criteria so that, for each of them, the optimal design is the one that minimizes the criterion:



  • A-optimality is defined as the average variance of the parameter estimates.
  • L-optimality is defined as the variance of the parameter estimates with respect to all parameter contrasts (C).
  • D-optimality is defined in terms of the determinant of the information matrix derived from the design matrix. We take the logarithm for numerical convenience.
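To make the three criteria concrete, here is a minimal sketch of how such scores could be computed from a design matrix X. It is not the exact transformation used by MaD. It assumes a simple linear model with one parameter per sample, sample 0 as the reference, variances expressed in units of the error variance, and all pairwise differences between samples as the contrast set C. The function name `optimality_scores` is hypothetical.

```python
from itertools import combinations

import numpy as np

def optimality_scores(X):
    """Return A-, L- and D-type scores (smaller is better) for a design matrix.

    Assumptions of this sketch: sample 0 is the reference, the error
    variance is 1, and C holds all pairwise sample contrasts.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    Xr = X[:, 1:]                      # drop the reference sample's column
    info = Xr.T @ Xr                   # information matrix
    if np.linalg.matrix_rank(info) < n - 1:
        raise ValueError("design graph is not connected")
    cov = np.linalg.inv(info)          # covariance of the parameter estimates

    a_score = np.trace(cov) / (n - 1)  # average variance of the estimates

    # Average variance over all pairwise contrasts; the pseudo-inverse
    # handles the rank deficiency of the full X^T X.
    full_cov = np.linalg.pinv(X.T @ X)
    eye = np.eye(n)
    C = np.array([eye[i] - eye[j] for i, j in combinations(range(n), 2)])
    l_score = np.trace(C @ full_cov @ C.T) / len(C)

    # log det of the covariance = -log det of the information matrix.
    _, logdet = np.linalg.slogdet(info)
    d_score = -logdet

    return {"A": float(a_score), "L": float(l_score), "D": float(d_score)}

# Hypothetical 4-sample, 5-array design (loop plus one diagonal).
X = np.array([[1, -1, 0, 0],
              [0, 1, -1, 0],
              [0, 0, 1, -1],
              [-1, 0, 0, 1],
              [1, 0, -1, 0]])
print(optimality_scores(X))
```

Under these assumptions the dye orientation of an array does not change any of the three scores, because flipping the sign of a row leaves X^T X unchanged.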

For small experiments, it is possible to enumerate all of the designs and find the best one (the design that gives the minimal optimality value). However, the number of different designs (graphs) grows exponentially as the number of samples and arrays increases. Therefore, a guided search and sampling of the designs is required to find near-optimal designs. Here is a summary of each of the methods available through the search interface:
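For a feel of why exhaustive enumeration only works for small experiments, the sketch below enumerates every design for a given number of samples and arrays and keeps the best one. Several simplifications are assumptions of this example rather than MaD's procedure: designs are scored with an A-type criterion that uses sample 0 as the reference, dye orientation is ignored (it does not affect this score), and the names `a_score`, `best_design`, and `n_candidates` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

import numpy as np

def a_score(edges, n_samples):
    """Mean variance of the estimates (sample 0 as reference); inf if disconnected."""
    X = np.zeros((len(edges), n_samples))
    for row, (src, dst) in enumerate(edges):
        X[row, src], X[row, dst] = 1.0, -1.0
    info = X[:, 1:].T @ X[:, 1:]
    if np.linalg.matrix_rank(info) < n_samples - 1:
        return float("inf")
    return float(np.trace(np.linalg.inv(info)) / (n_samples - 1))

def n_candidates(n_samples, n_arrays):
    """Number of multisets of n_arrays unordered sample pairs."""
    pairs = comb(n_samples, 2)
    return comb(pairs + n_arrays - 1, n_arrays)

def best_design(n_samples, n_arrays):
    """Exhaustively score every candidate design and return the best one."""
    pairs = list(combinations(range(n_samples), 2))
    candidates = combinations_with_replacement(pairs, n_arrays)
    return min(candidates, key=lambda edges: a_score(edges, n_samples))

print(best_design(3, 3))
for n, m in [(4, 5), (6, 9), (10, 20)]:
    print(n, "samples,", m, "arrays:", n_candidates(n, m), "candidates")
```

Even with dye orientation ignored, the candidate count rises steeply with the size of the experiment, which is the motivation for the guided search methods listed next.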

  • Heuristic Search: This is a heuristic approach to searching for near-optimal designs, in which a given design is optimized by repeatedly adding and removing edges (slides) to improve the optimality score.
  • Interwoven Loop Design: This is a special class of designs where the samples are connected as a loop and additional edges are introduced at certain intervals (jumps). The values of the jumps are chosen so that the pairwise path lengths between samples are minimized. The original method, Interwoven Loop (smida), is also provided as an interface to the loop design as implemented in the smida software package. The smida loop construction may be slow if you are triggering new searches, so we recommend the use of our native heuristic construction. Note also that smida does not support loop construction if the number of arrays is not an exact multiple of the number of samples.
  • Simulated Annealing (smida): This is the simulated annealing approach described in Wit et al. (2005) and implemented in the smida software package.
  • Heuristic-enhanced Simulated Annealing: The Heuristic Search is integrated into the Simulated Annealing as one of the possible steps. The other steps include random changes to the edges of the graph. The probabilities of taking each of the possible steps have been re-optimized.
  • User-uploaded files: This is the collection of user-contributed design files.
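As an illustration of the interwoven loop idea described in the list above, the following sketch builds a loop design from a set of jumps and picks the jumps by brute force so that the mean shortest-path length between samples is as small as possible. It is neither MaD's native construction nor the smida implementation. The function names, the choice to always keep jump 1 as the base loop, and the use of the mean (rather than some other summary) of path lengths are all assumptions of this example.

```python
from collections import deque
from itertools import combinations

def interwoven_loop(n_samples, jumps):
    """Edges i -> (i + jump) mod n for every sample i and every jump."""
    return [(i, (i + j) % n_samples) for j in jumps for i in range(n_samples)]

def mean_path_length(n_samples, edges):
    """Mean shortest-path length over all sample pairs, ignoring edge direction."""
    neighbours = {i: set() for i in range(n_samples)}
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    total = 0
    for start in range(n_samples):
        dist = {start: 0}
        queue = deque([start])
        while queue:                      # breadth-first search from `start`
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total / (n_samples * (n_samples - 1))

def choose_jumps(n_samples, n_jumps):
    """Keep jump 1 (the base loop) and pick the other jumps by brute force."""
    extra = range(2, (n_samples + 1) // 2)   # jumps strictly below n / 2
    options = [(1,) + c for c in combinations(extra, n_jumps - 1)]
    return min(
        options,
        key=lambda js: mean_path_length(n_samples, interwoven_loop(n_samples, js)),
    )

jumps = choose_jumps(10, 2)
design = interwoven_loop(10, jumps)
print(jumps, len(design), "arrays")
```

Each jump contributes one array per sample, so a design built this way always uses a number of arrays that is an exact multiple of the number of samples.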

Except for the user-uploaded design files, only the best design found by each method is kept in the database. A background process on the server continuously re-runs the search methods to see whether a better design can be obtained, and updates the database accordingly.

References

  • Wit, E., Nobile, A., and Khanin, R. (2005). Near-optimal designs for dual channel microarray studies. Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(5):817–830.