This subsection examines how the number of memory arrays (n) affects the delay, area, and flexibility of an FCM architecture. For all architectures in this section, both the number of external data buses (m) and the number of external address buses (r) are set to 4; thus, all architectures can implement at most 4 logical memories. In addition, the nominal data width of each array is fixed at 16, and the set of allowable effective data widths is . Four memory sizes (parameter b) from 8Kbits to 64Kbits were considered. For each memory size, the number of arrays was varied from 4 to 64 (a greater number of arrays implies that each array is smaller, since the total number of bits is kept constant). Figure 5(a) shows that as the number of arrays increases, the chip area required to implement the configurable memory increases. This is due to the need for more decoders, drivers, sense amplifiers, and mapping blocks.
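To make the experimental sweep concrete, the following minimal sketch enumerates the array geometry implied by each (b, n) pair when the total number of bits is held constant. It is an illustration only: the function names are hypothetical, and it assumes that 1 Kbit is 1024 bits, that the four memory sizes are 8, 16, 32, and 64 Kbits, and that the number of arrays takes power-of-two values from 4 to 64. The text fixes only the endpoints of both ranges.

```python
KBIT = 1024  # assumption: 1 Kbit = 1024 bits

# Assumption: the text gives only the endpoints of these ranges; the
# intermediate power-of-two values are chosen here for illustration.
MEMORY_SIZES = [8 * KBIT, 16 * KBIT, 32 * KBIT, 64 * KBIT]  # parameter b
ARRAY_COUNTS = [4, 8, 16, 32, 64]                           # parameter n


def array_geometry(total_bits, num_arrays, nominal_width=16):
    """Return (bits_per_array, words_per_array) for an even split of total_bits."""
    if total_bits % num_arrays != 0:
        raise ValueError("total_bits must divide evenly among the arrays")
    bits_per_array = total_bits // num_arrays
    if bits_per_array % nominal_width != 0:
        raise ValueError("array size must be a multiple of the nominal width")
    return bits_per_array, bits_per_array // nominal_width


def sweep():
    """Map every (b, n) pair in the assumed sweep to its array geometry."""
    return {(b, n): array_geometry(b, n)
            for b in MEMORY_SIZES for n in ARRAY_COUNTS}


if __name__ == "__main__":
    for (b, n), (bits, words) in sorted(sweep().items()):
        print(f"b={b // KBIT:2d} Kbit  n={n:2d}  -> {bits:5d} bits/array, "
              f"{words:4d} words x 16")
```

Under these assumptions, a 64Kbit memory split into 8 arrays yields arrays of 8192 bits (512 words of 16 bits), whereas an 8Kbit memory split into 64 arrays yields arrays of only 128 bits (8 words of 16 bits).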
Figure 5(b) shows the effect on the average logical memory access time (in the 0.8 µm CMOS process used in ). As the number of arrays increases, the delay due to the mapping blocks increases. However, smaller arrays are faster because they have shorter wordlines and bitlines and smaller decoders. These two competing trends produce a minimum in the access time graph, which is especially clear in the 64Kbit case.
Figure 6 shows the effect of changing the number of arrays on the flexibility of the configurable memory (in order to focus on the dependency of flexibility on the number of arrays, only the 64Kbit results are shown). The vertical scale is the proportion of test configurations that could be successfully mapped. As mentioned in Section 3, one of the reasons a mapping might be unsuccessful is that the granularity of the arrays is too coarse. Since each array can be connected to at most one address bus, an array cannot be shared between two logical memories. Thus, if a logical memory uses only half an array, the remainder is wasted. If the arrays were only half the size, however, those bits would be available to implement another logical memory (recall that each logical configuration contains several logical memories). As the graph shows, the increase in flexibility is not significant, especially past eight arrays. Thus, the access time and area should be of primary concern when choosing a value for n.
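The granularity argument above can be illustrated with a deliberately simplified sketch. It is not the mapping algorithm used in the experiments: it counts bits only, ignores data widths and the mapping blocks, and assumes that each logical memory occupies a whole number of arrays because an array cannot be shared. All function names are hypothetical.

```python
def arrays_needed(logical_bits, bits_per_array):
    """Whole arrays consumed by one logical memory (integer ceiling division)."""
    return (logical_bits + bits_per_array - 1) // bits_per_array


def fits(logical_sizes, total_bits, num_arrays, max_logical=4):
    """Bit-count-only feasibility check; ignores widths and routing limits.

    max_logical=4 reflects the four external address and data buses.
    """
    if len(logical_sizes) > max_logical:
        return False
    bits_per_array = total_bits // num_arrays
    used = sum(arrays_needed(s, bits_per_array) for s in logical_sizes)
    return used <= num_arrays


def wasted_bits(logical_sizes, total_bits, num_arrays):
    """Bits left unused inside arrays that are only partly occupied."""
    bits_per_array = total_bits // num_arrays
    return sum(arrays_needed(s, bits_per_array) * bits_per_array - s
               for s in logical_sizes)


if __name__ == "__main__":
    total = 64 * 1024                          # 64Kbit configurable memory
    config = [24 * 1024, 24 * 1024, 8 * 1024]  # hypothetical logical memories
    for n in (4, 8):
        print(n, fits(config, total, n), wasted_bits(config, total, n))
```

For this hypothetical configuration, four arrays of 16 Kbits each are not enough: the three logical memories would need five arrays and would leave 24576 bits stranded in partly used arrays. With eight arrays of 8 Kbits each, the same configuration fits in seven arrays with no wasted bits.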
The results in Figures 5 and 6 apply only to the architecture parameters described above and the workload parameters in Table 3. For these parameters, the graphs suggest that the memory should be broken into eight blocks. Although these results are specific to the parameters shown, we believe that the trends shown in the graphs apply over a wide range of architecture and workload parameters.
It is important to point out that in order to concentrate on how the memory granularity affects flexibility, we have fixed the number of external buses (r and m) in this experiment. If these parameters were allowed to increase with n, architectures with more arrays would be able to implement configurations with more logical memories, and would thus be more flexible.
Figure 5: Effects of changing number of arrays (n) on delay and area
Figure 6: Effects of changing number of arrays (n) on flexibility