Guidelines for Writing System Software and Tools Requirements
Terms and Definitions
Terminology
The following terms have been defined for use in the documents associated with
this effort:
- Platform: Generic term for a parallel or clustered computing
system. Unless explicitly noted otherwise, capabilities apply
equally to shared-memory, distributed-memory, and SMP systems.
Note that a particular vendor may implement multiple platforms,
so an element described as platform-specific may be
supported by a vendor in more than one way.
- PE (Processing Element): Generic reference to the basic
hardware unit on a given platform; may correspond to "node",
"processor", "CPU", or "PE", depending on the system.
- API (Application Programming Interface): Syntax and
semantics for invoking services from within an executing
application. All APIs must be available to both Fortran and C programs,
although implementation issues (such as whether the Fortran
routines are simply wrappers for calling C routines) are up
to the supplier.
- Standard (as applied to APIs): Where an API is
required to be consistent across platforms, the reference
standard is named as part of the capability. The implementation
must include all routines defined by that standard (even if some
simply result in no-ops on a given platform).
- Published (as applied to APIs): Where an API is not
required to be consistent across platforms, the capability lists
it as "published," referring to the fact that it must be
documented and supported, although it may be vendor- or
even platform-specific.
- Current standard: Term applied when an API is not "frozen"
on a particular version of a standard, but should be upgraded
automatically by vendors as new specifications are released
(e.g., "MPI version 1.1" refers to the standard in effect at
the time of writing this document, while "current version of
MPI" refers to later versions that take effect during the
two-year lifetime of this document).
- Fully-supported (as applied to system software and tools):
A product-quality implementation, documented and maintained by
the HPC machine supplier or an affiliated software supplier.
- XXX-compatible (as applied to system software and tool
definitions): Requires that a capability be compatible, at
the interface level, with the referenced standard, although the
lower-level implementation details may differ substantially
(e.g., "DCE/DFS-compatible" means that the distributed file system
must be capable of handling standard DFS requests, but need
not conform to DFS implementation specifics).
- Single-point control (as applied to tool interfaces):
Refers to the ability to control or acquire information on
all processes/PEs using a single command or operation.
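The "Standard" definition above requires that every routine named by the reference standard be present, even where a routine can only be a no-op on a given platform. The following is a minimal sketch of how such a completeness check might look. The routine names, the ExamplePlatformAPI class, and the missing_routines helper are all hypothetical and are not taken from any real standard or vendor library.

```python
# Hypothetical routine names standing in for those of a reference standard.
REQUIRED_ROUTINES = {"init", "finalize", "barrier", "set_affinity"}


class ExamplePlatformAPI:
    """Illustrative stand-in for one vendor's implementation of the standard."""

    def init(self):
        return "initialized"

    def finalize(self):
        return "finalized"

    def barrier(self):
        return "synchronized"

    def set_affinity(self, pe=0):
        # Assumed case: this platform has no affinity control, so the
        # routine does nothing. It must still exist so portable code runs.
        return None


def missing_routines(implementation, required=REQUIRED_ROUTINES):
    """Return the sorted names of required routines the implementation lacks."""
    return sorted(
        name for name in required
        if not callable(getattr(implementation, name, None))
    )
```

Under these assumptions, an implementation meets the definition when missing_routines returns an empty list. A no-op routine such as set_affinity counts as present, while a routine that is absent altogether does not.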
Prioritization Levels
The capabilities established by the task force were assigned to one of five
priority levels:
- Baseline Development Environment (BDE): Capabilities that are
needed by an overwhelming majority of HPC user sites. The recommendation
of the task force is that the entire BDE be included on all procurements
for parallel and clustered computing systems.
- Desirable Level 1 (D1): Capabilities that are important to a
significant number of HPC user sites, but not to an overwhelming
majority.
- Desirable Level 2 (D2): Capabilities that are important to a
significant number of HPC user sites, but involve significant extra
implementation effort or embody a better way of doing something that can
already be accomplished (albeit crudely) using the BDE or D1 elements.
- Desirable Level 3 (D3): Capabilities that are important to many HPC
user sites, but can be interpreted as "value-added" rather than essential
needs.
- Omitted: A number of other capabilities were considered by the task
force. These were omitted from the final document either because the
technology is not sufficiently advanced to permit a clear specification of
the requirement, or because the group felt that the requirement did not
apply to a large enough cross-section of the HPC user community. They
provide important indicators of future needs, however. See the supporting
documentation for details.
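As an illustration of how the priority levels might be applied when assembling a procurement, the sketch below tags each capability with a level and selects those at the chosen levels. The Priority enum, the select_capabilities helper, and the capability names are hypothetical. Omitted capabilities are left out because they do not appear in the final document.

```python
from enum import Enum


class Priority(Enum):
    """Priority levels assigned by the task force (Omitted is excluded)."""
    BDE = "Baseline Development Environment"
    D1 = "Desirable Level 1"
    D2 = "Desirable Level 2"
    D3 = "Desirable Level 3"


def select_capabilities(capabilities, levels):
    """Return names of capabilities whose priority is in the given levels.

    capabilities: iterable of (name, Priority) pairs, order preserved.
    levels: collection of Priority members to include.
    """
    return [name for name, level in capabilities if level in levels]


# Placeholder capability names, for illustration only.
EXAMPLE_CAPABILITIES = [
    ("capability A", Priority.BDE),
    ("capability B", Priority.D1),
    ("capability C", Priority.D3),
]
```

Following the task force recommendation, a procurement would always include Priority.BDE in the selected levels. A site with further needs might add D1 or D2, for example select_capabilities(EXAMPLE_CAPABILITIES, {Priority.BDE, Priority.D1}).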