PARA'04 State-of-the-Art
in Scientific Computing
June 20-23, 2004

Updated: 9 February 2004

Compilation Techniques for a Chip-Multiprocessor with Two Execution Modes

Chao-Chin Wu
Department of Computer Science and Information Engineering
National Changhua University of Education,
Changhua, Taiwan, R.O.C.
Tel: +886-4-7232105 ext. 7120 Fax: +886-4-7211081
E-mail: ccwu@cc.ncue.edu.tw

Recent studies have shown that a conventional chip multiprocessor (CMP) cannot outperform a superscalar processor when executing integer-intensive applications. We have therefore proposed a novel microprocessor that supports both a speculative multithreading mode and a wide-superscalar execution mode. Both modes provide a peak issue rate of sixteen instructions per cycle. The former behaves like a conventional CMP, while the latter integrates all the processing elements into a single logical superscalar processor. Furthermore, we extend this microarchitecture to support a third execution mode, in which the processor switches between the first and second modes while executing an application, according to the characteristics of the upcoming instructions. This third mode can therefore exploit the advantages of both a CMP and a superscalar processor. According to our performance analysis, the processor can provide optimum system performance for all benchmark programs, regardless of workload characteristics. Furthermore, our CMP outperforms a conventional CMP, with a speedup of up to 1.32.

To improve the performance of an application running on this new microprocessor, compiler techniques can be applied to determine which parts of the application should be executed in which execution mode. The speculative multithreading mode outperforms the wide-superscalar mode only when it can exploit more parallelism across tasks. If complicated data and control dependences exist between tasks, data dependence violations may occur frequently during speculative execution. The speculative executions of the affected tasks then have to be terminated and restarted.

Therefore, we have developed several compilation techniques to analyse the dependences between tasks. Because the speculative multithreading mode executes at most four tasks in parallel at any time, tasks with dependence distances of four or more will not incur dependence violations. Consequently, compilers need only be concerned with tasks whose dependence distances are less than four. Several conventional loop optimization techniques have been examined and modified to lengthen dependence distances and thereby expose more loop-level parallelism. On the other hand, if complicated dependences exist and cannot be resolved, the compiler annotates these tasks for execution in the wide-superscalar mode. In that case, compilation techniques can be used to exploit instruction-level parallelism. In short, with the assistance of compilers, the system performance of the novel CMP architecture can be improved further.
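The dependence-distance rule above can be illustrated with a minimal sketch. It is not the compiler described in this abstract: the function names, the mode labels, and the simplified policy (any unresolved dependence shorter than the task window sends the loop to the wide-superscalar mode) are assumptions made purely for illustration. Only the window of four parallel tasks comes from the text.

```python
from typing import Iterable

# From the abstract: at most four tasks execute in parallel at any time.
MAX_PARALLEL_TASKS = 4


def may_violate(distance: int, window: int = MAX_PARALLEL_TASKS) -> bool:
    """Return True if a cross-task dependence of this distance can be violated.

    Tasks i and i + distance can be in flight together only when
    distance < window, so only those dependences are at risk.
    """
    if distance < 1:
        raise ValueError("cross-task dependence distance must be >= 1")
    return distance < window


def choose_mode(distances: Iterable[int], window: int = MAX_PARALLEL_TASKS) -> str:
    """Pick an execution mode for a loop (hypothetical, simplified policy).

    `distances` holds the dependence distances that remain after any loop
    transformations have been applied.
    """
    risky = [d for d in distances if may_violate(d, window)]
    return "wide-superscalar" if risky else "speculative-multithreading"


if __name__ == "__main__":
    print(choose_mode([4, 7]))  # all distances >= 4
    print(choose_mode([2, 5]))  # distance 2 is inside the four-task window
```

In this sketch, a loop whose remaining dependence distances are all four or more is assigned to the speculative multithreading mode, while a loop with any shorter distance is annotated for the wide-superscalar mode. A real compiler would presumably also weigh how often a short dependence actually occurs at run time.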
