Kernel MG is a benchmark of a three-dimensional multigrid method. I used segment-directed data distribution and, together with the overlapping methods, achieved an improvement of up to 26%. In class A of kernel MG, communication wait time accounts for about 50% of the execution time, calculation time for about 40%, and receive time for about 10%. If the communication wait time could be hidden completely, we could expect up to 50% improvement for this kernel, but the achieved improvement was only 26%. We should therefore investigate more effective improvements for this kernel.
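The arithmetic behind these figures can be sketched as follows. This is only an illustration under two assumptions that the text does not state explicitly: that "improvement" means the relative reduction in execution time, and that only the communication wait time can be hidden by overlapping. The function names are hypothetical.

```python
def upper_bound_improvement(wait, calc, recv):
    """Largest possible relative time reduction if all wait time is hidden.

    Assumption: calculation and receive time cannot be reduced by overlapping.
    """
    total = wait + calc + recv
    return wait / total


def hidden_wait_fraction(achieved, wait, calc, recv):
    """Share of the wait time that the achieved improvement corresponds to."""
    return achieved / upper_bound_improvement(wait, calc, recv)


# Approximate class A breakdown for kernel MG, taken from the text.
bound = upper_bound_improvement(wait=0.50, calc=0.40, recv=0.10)
hidden = hidden_wait_fraction(achieved=0.26, wait=0.50, calc=0.40, recv=0.10)

print(f"upper bound on improvement: {bound:.0%}")
print(f"share of wait time hidden:  {hidden:.0%}")
```

Under these assumptions, the 26% improvement corresponds to hiding roughly half of the communication wait time, which is consistent with the remaining room for tuning noted above.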
Table 11: Execution time and performance improvement. Case 1 is the sample problem on 6 SUN SparcStation-2 workstations, case 2 is the sample problem on 8 processors, and case 3 is the class A problem on 64 processors. Model 1 is the original (non-tuned) version, model 2 is calculation order tuning, model 3 is overlapping grid assignment to the neighboring processors, model 4 is communication/calculation overlapping, and model 5 is segment-directed task assignment.