We can always build complicated solutions, but in some cases sane and
simple solutions exist. Let me clear up some of the misinformation in this
thread.
truncation to a sane value is the rule. This is nothing new, the rules are
similar to other data conversion standards such as XDR. Thus, if you send
otherwise [MAX|MIN]_LONG on the target machine. For floating point data the
architecture a sane value is obtained. Otherwise, the data will be replaced
with one of the extremes. This also applies to file operations for as long
as the correct external32 type is used.
as the source and target machine are correctly identified. This
heterogeneous architectures.
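
As a rough illustration of the "replace with one of the extremes" rule described above, here is a minimal C sketch. It assumes a 64-bit sender and a 32-bit receiver; the names narrow_saturating and narrow_array are hypothetical helpers made up for this example and are not part of MPI or of any MPI implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an MPI call): narrow a 64-bit signed value to
 * 32 bits. In-range values pass through unchanged; out-of-range values
 * are replaced with the nearest 32-bit extreme. */
int32_t narrow_saturating(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Apply the same rule to a whole receive buffer. */
void narrow_array(const int64_t *src, int32_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = narrow_saturating(src[i]);
}
```

With this sketch a value such as 3 arrives as 3, while anything above INT32_MAX arrives as INT32_MAX and anything below INT32_MIN arrives as INT32_MIN.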
George.
Post by George Reeke
Dear colleagues,
FWIW, years ago I was looking at this problem and developed my
--Be sure your code that works with ambiguous-length types like
'long' can handle different sizes. I have replacement unambiguous
typedef names like 'si32', 'ui64' etc. for the usual signed and
unsigned fixed-point numbers.
--Run your source code through a utility that analyzes a specified
set of variables, structures, and unions that will be used in
messages and builds tables giving their included types. Include
these tables in your makefiles.
--Replace malloc, calloc, realloc, free with my own versions,
where you pass a type argument pointing into this table along
with number of items, etc. There are separate memory pools for
items that will be passed often, rarely, or never, just to make
things more efficient.
--Do all these calls on the rank 0 processor at program startup and
call a special broadcast routine that sets up data structures on
all the other processors to manage the conversions.
--Replace MPI message passing and broadcast calls with new routines
that use the type information (stored by malloc, calloc, etc.) to
determine which variables to lengthen or shorten or swap on arrival
at the destination. Regular MPI message passing is used inside
these routines and can be used natively for variables that do not
ever need length changes or byte swapping (e.g. text). I have a
simple set of routines to gather statistics across nodes with sum,
max, etc. operations, but nothing too fancy. I do not have versions of
any of the MPI operations that collect or distribute matrices, etc.
--A little routine must be written for every union. This is called
from the package when a union is received to determine which
member is present so the right conversion can be done.
--There was a hook to handle IBM (hex exponent) vs IEEE floating
point, but the code never got written.
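
The table-driven idea in the list above (unambiguous typedefs plus a per-structure table that tells the receiver what to swap) might look roughly like the following. This is only a sketch under assumptions: the struct sample_msg, the field_desc table layout and the swap_fields routine are invented for illustration and are not taken from the package described here. Only the typedef names si32 and ui64 come from the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unambiguous-length names, defined here via <stdint.h> (an assumption). */
typedef int32_t  si32;
typedef uint64_t ui64;

/* Hypothetical message layout used only for this example. */
struct sample_msg {
    si32 count;
    ui64 stamp;
    char tag[4];            /* text: never byte-swapped */
};

/* One table entry per field that needs conversion: where it is, how wide. */
struct field_desc { size_t offset; size_t width; };

const struct field_desc sample_msg_fields[] = {
    { offsetof(struct sample_msg, count), sizeof(si32) },
    { offsetof(struct sample_msg, stamp), sizeof(ui64) },
};

/* Reverse the bytes of one field in place. */
static void swap_bytes(unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n / 2; i++) {
        unsigned char t = p[i];
        p[i] = p[n - 1 - i];
        p[n - 1 - i] = t;
    }
}

/* Walk the table and swap every listed field of a received buffer. */
void swap_fields(void *buf, const struct field_desc *tab, size_t nfields)
{
    assert(buf != NULL && tab != NULL);
    for (size_t i = 0; i < nfields; i++)
        swap_bytes((unsigned char *)buf + tab[i].offset, tab[i].width);
}
```

A receiver that knows the sender has the opposite byte order would call swap_fields(&msg, sample_msg_fields, 2) after the message arrives. Fields not listed in the table, such as the text in tag, are left alone.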
Because this is all very complicated and demanding on the
programmer, I am not making it publicly available, but will be
glad to send it privately to anyone who really thinks they can
use it and is willing to get their hands dirty.
Post by Jeff Squyres (jsquyres)
Post by dpchoudh .
Is this (heterogeneous cluster support) something that is specified by
the MPI standard (perhaps as an optional component)?
The MPI standard states that if you send a message, you should receive
the same values at the receiver. E.g., if you sent int=3, you should
receive int=3, even if one machine is big endian and the other machine is
little endian.
It does not specify what happens when data sizes are different (e.g., if
type X is 4 bits on one side and 8 bits on the other) -- there are no good
answers on what to do there.
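
To make the "send int=3, receive int=3" property concrete, here is a small C sketch of one common way to get it: both sides agree on a fixed byte order on the wire, so the host byte order no longer matters. The function names put_be32 and get_be32 are hypothetical, and this is not a description of how any particular MPI implementation does its conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write a 32-bit value in big-endian wire order.
 * Shifts operate on the value, not on memory, so the result is the same
 * on big-endian and little-endian hosts. */
void put_be32(unsigned char out[4], uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Hypothetical helper: read a 32-bit value back from big-endian wire order. */
uint32_t get_be32(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Whatever the sender's and receiver's native byte orders are, get_be32 applied to the bytes produced by put_be32 yields the original value.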
Post by dpchoudh .
Do people know if
MPICH, MVAPICH, Intel MPI etc. support it? (I do realize this is an
OpenMPI forum)