High-Level Parallel Programming and the Efficient Implementation of Numerical Algorithms
George Botorog

High-level programming languages generally yield less efficient programs than their lower-level counterparts. This drawback has led to a restricted use of high-level approaches in areas where efficiency is essential, such as parallel processing. On the other hand, parallel processing could benefit enormously from programming paradigms that offer a high level of abstraction. The present thesis aims to demonstrate that the two seemingly opposed goals, a high level of abstraction and efficiency, can be reconciled in a good trade-off. Moreover, it shows that high-level programming can be employed successfully in solving real-world problems in parallel.

The approach presented here is based on the concept of algorithmic skeletons, which represent useful patterns of parallel computation and are embedded in a sequential programming language. Because they are implemented as polymorphic higher-order functions, skeletons are highly general and flexible. This thesis shows that, in spite of their high level, skeletons make it possible to write parallel programs that achieve high efficiency and scalability.

The first part of the dissertation deals with the sequential host language Skil, an imperative language enhanced with a series of functional features. The imperative character of Skil, together with a specially developed compile-time technique for eliminating the functional features, leads to a highly efficient implementation of Skil programs. This is demonstrated by a series of parallel matrix applications, which run several times faster than previous skeleton-based implementations that use functional languages. At the same time, the Skil programs run only 3% to 40% slower than the corresponding hand-coded C programs with message passing.

The second part of the dissertation concentrates on the use of skeletons in solving real-world problems in parallel.
The application area considered here is that of multigrid methods, a class of numerical algorithms that provides the fastest known solvers for systems of discretized partial differential equations. Apart from their importance in numerical mathematics, multigrid methods are generally well suited to parallelization. Moreover, these methods fit naturally into the skeleton concept, as they use a small set of basic operations as building blocks. Run-time results show that the skeleton-based implementations of two multigrid applications scale well, with an efficiency of up to 70% on 256 processors.
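
The idea of a skeleton as a polymorphic higher-order function can be illustrated with a minimal sketch. It is written in Python rather than Skil, and it is not taken from the thesis: the names `partition`, `map_skeleton` and `fold_skeleton` are hypothetical, and a thread pool merely stands in for the processors of a parallel machine. Each skeleton takes a user-supplied function as an argument and hides the distribution of the data.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(data, p):
    """Split data into p contiguous blocks of near-equal size."""
    n = len(data)
    bounds = [n * i // p for i in range(p + 1)]
    return [data[bounds[i]:bounds[i + 1]] for i in range(p)]


def map_skeleton(f, data, p=4):
    """Apply f to every element; each block is handled by one worker.

    Assumption: f is a pure function of a single element.
    """
    blocks = partition(data, p)
    with ThreadPoolExecutor(max_workers=p) as pool:
        mapped = list(pool.map(lambda block: [f(x) for x in block], blocks))
    # Concatenate the per-block results in their original order.
    return [y for block in mapped for y in block]


def fold_skeleton(op, data, p=4):
    """Reduce each block locally, then combine the partial results.

    Assumptions: op is associative and data is non-empty.
    """
    blocks = [b for b in partition(data, p) if b]  # skip empty blocks
    with ThreadPoolExecutor(max_workers=p) as pool:
        partials = list(pool.map(lambda block: reduce(op, block), blocks))
    return reduce(op, partials)
```

Under these assumptions, `map_skeleton(lambda x: x * x, data)` and `fold_skeleton(lambda a, b: a + b, data)` work for any element type the argument function accepts, which is the sense in which such skeletons are polymorphic; the caller never sees how the data is split among workers.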