As discussed in a previous blog post, “Moore’s Law, the perfect extrapolation or coming to an end?” written by stri, we are now running into difficulties with doubling transistor density and computing power. Thanks to engineers’ efforts, we have seen transistor density double every two years and processing speed improve exponentially over the last twenty years, as Moore extrapolated and predicted. While the number of transistors is still increasing at a fast rate today, the processing speed of computers has started to improve more slowly, as wire delay and circuit complexity have become bottlenecks. To overcome this, computer architects have come up with the idea of putting multiple cores in a computer so that programs can run in parallel and finish faster. As a result, most new computers on the market right now have multi-core CPUs, such as Intel’s Core 2 Duo/Quad and AMD’s quad-core Phenom processor. Moreover, Intel is planning to put six cores on its CPUs this year, and we expect to have many more cores (hundreds or more) in the future.
The linked article discusses the increasing demand for computing power from various industries, including financial and medical institutions. Unfortunately, high performance computing (HPC) is still hard to come by, and it has been “the playground of the privileged few” so far. While many HPC demands involve scientific computing, a serious bottleneck has been the lack of parallel, multi-threaded HPC applications, which are necessary to make full use of contemporary parallel high-performance computers. To close the gap between software parallelism and the power of these machines, it is important to parallelize the algorithms and methods we use in scientific computing.
From what we have learned in class, it would be difficult to parallelize the bisection, Newton’s, or secant method, since each has a single dependence chain: every step needs the previous step’s result before it can start. However, polynomial interpolation could be easily parallelized by distributing the computation of each coefficient across multiple cores. Also, our programming assignment 1, ray tracing, could have been accelerated by having multiple threads work on different rays. With a 400×400 image, for example, we would have 160,000 root findings to do; with four threads on a Core 2 Quad processor, each core would only have to do approximately 40,000 root findings and would ideally finish about 4 times faster. Since each ray takes a different amount of time to trace, allocating the right amount of work to each thread, or core, could be difficult. Also, when multiple threads read from and write to a shared location, synchronizing them can be a hard task. These issues are covered in the linked text on parallel programming.
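To make the ray tracing idea a bit more concrete, here is a minimal Python sketch of how the per-ray root findings could be split across workers. It is not the assignment's code: `trace_pixel` is a made-up stand-in for tracing one ray (it just finds a square root by bisection), and `render`, `trace_rows`, and `rows_per_task` are names invented for this illustration. The sketch hands out small groups of rows rather than one big quarter of the image per worker, so a worker that finishes early can pick up the next group, and each task returns its own list so no shared location is written to.

```python
from concurrent.futures import ThreadPoolExecutor


def bisect(f, lo, hi, tol=1e-6):
    """Bisection is sequential: each step needs the interval from the step before."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


def trace_pixel(x, y, width):
    """Hypothetical stand-in for one ray: solve t*t = c for a per-pixel c >= 1."""
    c = 1.0 + x + y * width
    return bisect(lambda t: t * t - c, 0.0, c + 1.0)


def trace_rows(width, row_start, row_stop):
    """One task: all pixels in rows [row_start, row_stop), independent of other tasks."""
    return [trace_pixel(x, y, width)
            for y in range(row_start, row_stop)
            for x in range(width)]


def render(width, height, workers=4, rows_per_task=8):
    """Split the image into small row groups and let idle workers pull the next one."""
    tasks = [(start, min(start + rows_per_task, height))
             for start in range(0, height, rows_per_task)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in task order, so pixels stay in row-major order.
        chunks = pool.map(lambda task: trace_rows(width, *task), tasks)
        return [value for chunk in chunks for value in chunk]
```

Calling `render(400, 400)` would perform the 160,000 root findings from the example above. One caveat: in standard CPython, threads doing pure-Python arithmetic do not actually run on several cores at once because of the global interpreter lock, so this only shows the structure; a process pool or a compiled language with real threads would be needed to approach the 4x speedup.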
Sources:
Article - http://www.hpcwire.com/hpc/2111454.html - The Next Challenge in High Performance Computing - Applications and Application Enablement by Ashwini K. Nanda
Text on parallel programming - http://www.mhpcc.edu/training/workshop/parallel_intro/MAIN.html
Moore’s law - http://en.wikipedia.org/wiki/Moore’s_law





