Over the past few years, advances in CPU clock rates have stagnated as designers wrestle with power dissipation. Instead of pursuing higher clock rates, the semiconductor industry has shifted processor development toward multi-core designs. Multi-core processors deliver more total computation at lower power levels; however, applications must employ concurrent programming techniques to benefit from them. Sequential scientific computations will see only marginal performance gains from the next several generations of processors, chiefly as on-chip cache sizes increase.
To advance these sequential scientific computations, we explore a methodology for utilizing available computational resources through high-level algorithm analysis and partitioning. These techniques rely on an underlying model of computation and communication, which is used to expose and exploit parallelism.