OpenMP FAQs
Q1: What is OpenMP?
Q2: What does the MP in OpenMP stand for?
Q3: Why a new standard?
Q4: How is the OpenMP specification different from the
X3H5 draft standard?
Q5: How does OpenMP compare with ... ?
Q6: What about nested parallelism?
Q7: What about task parallelism?
Q8: What if I just want loop-level parallelism?
Q9: What does orphaning mean?
Q10: What languages does OpenMP work with?
Q11: Is support for other languages planned?
Q12: Is OpenMP scalable?
Q13: What about non-shared memory machines or networks of workstations?
Q14: Why should OpenMP succeed when PCF and X3H5 failed?
Q15: How will OpenMP be managed as a specification over the long term? Who
owns it?
Q16: How do I get other questions answered?
Q1: What is OpenMP?
A1: OpenMP is a specification for a set of compiler directives, library routines,
and environment variables that can be used to specify shared memory parallelism
in Fortran and C/C++ programs.
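As a rough illustration of those three pieces working together, here is a minimal C sketch. The function names team_size and max_threads are hypothetical and chosen only for this example. The _OPENMP guard is there so the file still builds as ordinary serial C when the compiler is not run in OpenMP mode.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>   /* declarations of the OpenMP library routines */
#endif

/* Hypothetical helper: returns how many threads executed the parallel
   region. If the pragmas are ignored (no OpenMP), the result is 1. */
int team_size(void)
{
    int count = 0;

    /* Compiler directive: the block below runs once on every thread. */
    #pragma omp parallel
    {
        /* Directive: make the update of the shared counter atomic. */
        #pragma omp atomic
        count++;
    }

    assert(count >= 1);   /* at least the initial thread ran the block */
    return count;
}

/* Hypothetical helper: library routine that reports the upper bound
   on the size of a team formed by a parallel region. */
int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;   /* serial build: only one thread exists */
#endif
}
```

The environment-variable part needs no code: setting OMP_NUM_THREADS before running the program (for example, OMP_NUM_THREADS=4) requests a team size without recompiling, and team_size() would typically reflect that request.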
Q2: What does the MP in OpenMP stand for?
A2: The MP in OpenMP stands for Multi Processing. We provide Open specifications
for Multi Processing via collaborative work with interested parties from
the hardware and software industry, government and academia.
Q3: Why a new standard?
A3: Shared-memory parallel programming directives have never been standardized
in the industry. An earlier standardization effort, ANSI X3H5, was never formally
adopted. So vendors have each provided a different set of directives, very
similar in syntax and semantics, and each used a unique comment or pragma
notation for "portability". OpenMP consolidates these directive
sets into a single syntax and semantics, and finally delivers the long-awaited
promise of single-source portability for shared-memory parallelism.
OpenMP also addresses the inability of previous shared-memory directive sets
to deal with coarse-grain parallelism. In the past, limited support for coarse-grain
work led developers to think that shared-memory parallel programming
was inherently limited to fine-grain parallelism. This isn't the case with
OpenMP. Orphaned directives in OpenMP offer the features necessary to represent
coarse-grain algorithms.
Q4: How is the OpenMP specification different from the X3H5 draft standard?
A4: The ANSI/X3 authorized subcommittee X3H5 was chartered for the purpose
of developing an ANSI standard based on the work done by the Parallel Computing
Forum (PCF). PCF was an informal industry group that attempted to complete
the work on standardized spelling of DO loop oriented parallelism. They produced
a draft standard, but never completed the work. The OpenMP specification
addresses the same problem. The difference is that the OpenMP Architecture
Review Board completed the task and the specification is gaining industry-wide
support. The OpenMP specification is an agreement reached between industry
vendors and users; it is not a formal standard.
Q5: How does OpenMP compare with ... ?
A5: MPI? Message-passing has become accepted as a portable style of parallel
programming, but has several significant weaknesses that limit its effectiveness
and scalability. Message-passing in general is difficult to program and doesn't
support incremental parallelization of an existing sequential program. Message-passing
was initially defined for client/server applications running across a network,
and so includes costly semantics (including message queuing and selection
and the assumption of wholly separate memories) that are often not required
by tightly-coded scientific applications running on modern scalable systems
with globally addressable and cache coherent distributed memories.
HPF? HPF has never really gained wide acceptance among parallel application
developers or hardware vendors. Some applications written in HPF perform well,
but others find that limitations resulting from the HPF language itself or
the compiler implementations lead to disappointing performance. HPF's focus
on data parallelism has also limited its appeal.
Pthreads? Pthreads has never been targeted toward the technical/HPC market.
This is reflected in its minimal Fortran support and its lack of support for
data parallelism. Even for C applications, Pthreads requires programming at
a level lower than most technical developers would prefer.
FORALL loops? FORALL loops are not rich or general enough to use as a complete
parallel programming model. Their focus on loops and the rule that subroutines
called by those loops can't have side effects effectively limit their scalability.
FORALL loops are useful for providing information to automatic parallelizing
compilers and preprocessors.
BSP or LINDA or SISAL or...? There are lots of parallel programming languages
being researched or prototyped in the industry. These may be targeted towards
a specific architecture, or focused on exploring one key requirement. If you
have a question about how OpenMP compares with a specific language or model,
we can help you figure this out.
Q6: What about nested parallelism?
A6: Nested parallelism is permitted by the OpenMP specification. Supporting
nested parallelism effectively can be difficult, and we expect most vendors
will start out by executing nested parallel constructs on a single thread.
OpenMP encourages vendors to experiment with nested parallelism to help us
and the users of OpenMP understand the best model and API to include in our
specification. We will include the necessary functionality when we understand
the issues better.
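To make the nested case concrete, here is a small C sketch of one parallel region placed inside another. The function name nested_executions is hypothetical. The num_threads clause, used here only to keep the thread count small, comes from a later version of the specification than this FAQ describes, so treat it as an assumption.

```c
#include <assert.h>

/* Hypothetical helper: counts how many times the innermost block runs.
   num_threads(2) is an assumption (a clause from a later version of
   the specification), used only to bound the number of threads. */
int nested_executions(void)
{
    int inner_runs = 0;

    #pragma omp parallel num_threads(2)
    {
        /* Nested parallel construct. An implementation may execute it
           on a single thread, so each outer thread runs the inner
           block exactly once in that case. */
        #pragma omp parallel num_threads(2)
        {
            #pragma omp atomic
            inner_runs++;
        }
    }

    assert(inner_runs >= 1);
    return inner_runs;
}
```

If the inner construct is serialized, the count equals the size of the outer team. If the implementation really creates inner teams, the count can be as large as four here. Either outcome is a valid execution of the same source.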
Q7: What about task parallelism?
A7: Support for general task parallelism is not included in the OpenMP specification.
OpenMP encourages vendors to experiment with task parallelism to help us
and the users of OpenMP understand the best model and API to include in our
specification. We will include the necessary functionality when we understand
the issues better.
Q8: What if I just want loop-level parallelism?
A8: OpenMP fully supports loop-level parallelism. Loop-level parallelism is
useful for applications which have lots of coarse loop-level parallelism,
especially those that will never be run on large numbers of processors or
for which restructuring the source code is either impractical or disallowed.
Typically, though, the amount of loop-level parallelism in an application
is limited, and this in turn limits the scalability of the application.
OpenMP allows you to use loop-level parallelism as a way to start scaling
your application for multiple processors, but then move into coarser grain
parallelism, while maintaining the value of your earlier investment. This incremental
development strategy avoids the all-or-none risks involved in moving to message-passing
or other parallel programming models.
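A minimal C sketch of that first, loop-level step might look like the following. The function name sum_of_squares is hypothetical. The loop is an ordinary serial loop with one directive added, so the same source still compiles and gives the same answer when the directive is ignored.

```c
#include <assert.h>

/* Hypothetical helper: returns 0*0 + 1*1 + ... + (n-1)*(n-1). */
long sum_of_squares(int n)
{
    long total = 0;
    int i;

    assert(n >= 0);

    /* Loop-level parallelism: iterations are divided among the threads.
       Each thread accumulates a private partial sum, and the partial
       sums are combined into total when the loop ends. */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++) {
        total += (long)i * i;
    }

    return total;
}
```

For example, sum_of_squares(100) returns 328350 regardless of how many threads take part, because integer addition gives the same result in any order.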
Q9: What does orphaning mean?
A9: In early shared-memory models, parallel directives were only permitted
within the lexical extent of parallel regions. To the application programmer,
this meant that the directives had to be defined in such a way that all information
needed to parallelize a loop or subroutine had to be specified within the
source for that loop or subroutine. If another subroutine was called, parallel
information specific to that subroutine had to be specified at the call site,
or the called subroutine had to be (manually) inlined. This simplified model
was sufficient for the moderate, loop-level parallelism that dominated the
use of these models, but never allowed good scalability for very large applications.
For such large applications, programmers had to program outside the directive
set to achieve good performance, resulting in programs that were non-standard
and difficult to maintain.
Orphaning allows parallel directives to be specified outside
the lexical extent of parallel regions. A subroutine can be written for use
from a number of parallel regions, and the parallel directives needed by that
subroutine can be embedded within its source, instead of having to be replicated
everywhere that calls it. This is a natural place to specify the parallelism, and
it avoids the programming errors that result when the earlier style is used for
complex applications.
Orphaning is crucial to implementing coarse grain parallel algorithms, and to
the development of portable, parallel libraries.
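The idea can be sketched in C as follows. The function names scale and scale_twice are hypothetical. The work-sharing directive sits inside scale, outside the lexical extent of any parallel region, and it binds to whichever parallel region is active when the function is called.

```c
#include <assert.h>

/* Hypothetical library routine containing an orphaned directive:
   no parallel region appears anywhere in this function's source. */
void scale(double *a, int n, double factor)
{
    int i;

    /* Orphaned work-sharing directive. Called from inside a parallel
       region, the iterations are divided among that region's threads.
       Called from serial code, the loop simply runs on one thread. */
    #pragma omp for
    for (i = 0; i < n; i++) {
        a[i] *= factor;
    }
}

/* Hypothetical caller: one parallel region, several calls into the
   routine above, and no directives repeated at the call sites. */
void scale_twice(double *a, int n)
{
    assert(a != 0 || n == 0);

    #pragma omp parallel
    {
        scale(a, n, 2.0);  /* the implied barrier at the end of the   */
        scale(a, n, 3.0);  /* first loop orders the two passes        */
    }
}
```

Calling scale_twice on the array {1.0, 2.0, 3.0, 4.0} leaves it as {6.0, 12.0, 18.0, 24.0}. The same scale routine can also be called directly from serial code, which is what makes this style suitable for libraries.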
Q10: What languages does OpenMP work with?
A10: OpenMP is designed for Fortran, C, and C++, and works with whichever of
these languages the underlying compiler supports. The OpenMP specification does
not introduce any constructs that require specific Fortran 90 or C++ features.
OpenMP cannot be supported by compilers that do not support one of Fortran 77,
Fortran 90, ANSI C89, or ANSI C++.
Q11: Is support for other languages planned?
A11: The OpenMP ARB does not plan to introduce support for additional languages
at this point.
Q12: Is OpenMP scalable?
A12: OpenMP can deliver scalability for applications using shared-memory parallel
programming. Significant effort was spent to ensure that OpenMP can be used
for scalable applications. Ultimately, scalability is a property of the application
and the algorithms used. The parallel programming language can only support
scalability by providing constructs that simplify the specification of
the parallelism and that can be implemented with low overhead by compiler
vendors. OpenMP certainly delivers these kinds of constructs.
Q13: What about non-shared memory machines or networks of workstations?
A13: As much as it would be nice to think that a single programming model (OpenMP
or MPI or HPF or whatever) might run well on all architectures, this is not
the case today. OpenMP was designed to exploit certain characteristics of
shared-memory architectures. The ability to directly access memory throughout
the system (with minimum latency and no explicit address mapping) combined
with very fast shared memory locks, makes shared-memory architectures best
suited for supporting OpenMP.
Systems that don't fit the classic shared-memory architecture may provide
hardware or software layers that present the appearance of a shared-memory
system, but often at the cost of higher latencies or special limitations. For
example, OpenMP could be implemented for a distributed memory system on top
of MPI, in which case OpenMP's latencies would be greater than those of MPI (whereas
typically the reverse is the case on a shared-memory system). The extent to which these
latencies or limitations reduce application portability or performance will
help dictate whether vendors choose to develop OpenMP implementations for distributed
memory systems.
Q14: Why should OpenMP succeed when PCF and X3H5 failed?
A14: There are a variety of reasons OpenMP will succeed at being accepted as
a standard, where earlier efforts failed.
Goals - The OpenMP definition was driven by a small number of experts working
to an aggressive schedule, building primarily on current practice.
Technical - OpenMP includes better support for more scalable, coarse grain
parallelism and more directives for managing private/shared data. OpenMP also
includes query functions, environment variables, and conditional compilation
support.
Timing - The need for a standard in this area is better accepted throughout
the industry now, as vendors have begun to converge on system architectures
that combine aspects of both shared-memory and distributed architectures of
the past. A standard encouraging scalable, parallel application
development that can exploit shared-memory hardware is recognized as being
more important than ever. (Interest in PCF and X3H5 was partly derailed by
the appearance of pure distributed memory MPP systems, whose proponents were
arguing that shared-memory parallel programming was no longer interesting.)
Vendor support - The system vendors behind OpenMP collectively have delivered
a very large share of the shared-memory parallel systems in use today.
Q15: How will OpenMP be managed as a specification over the long term? Who
owns it?
A15: The OpenMP specifications are owned and managed by the OpenMP Architecture
Review Board (ARB).
Q16: How do I get other questions answered?
A16: You can send your questions to the OpenMP organization in the feedback
section of the web site. We endeavor to answer all queries within one week,
but sometimes it can take longer due to people's varying schedules, etc.