Discussion:
[OMPI users] Hang in mpi on 32-bit
Orion Poplawski
2018-11-27 04:11:39 UTC
Hello -

We are starting to see some MPI processes "hang" (they actually spin on the
CPU and never complete) on 32-bit architectures on Fedora during package tests.
Some examples:

hpl 2.2 and openmpi 2.1.5 on i686 and arm:

https://koji.fedoraproject.org/koji/taskinfo?taskID=31129461

hdf5 1.8.20 and openmpi 3.1.3 on i686 with the "t_cache" test.

https://copr-be.cloud.fedoraproject.org/results/@scitech/openmpi3.1/fedora-28-i386/00830432-hdf5/builder-live.log

I'm at a loss as to how to debug this further.
--
Orion Poplawski
Manager of NWRA Technical Systems 720-772-5637
NWRA, Boulder/CoRA Office FAX: 303-415-9702
3380 Mitchell Lane ***@nwra.com
Boulder, CO 80301 https://www.nwra.com/
Nathan Hjelm via users
2018-11-27 04:17:15 UTC
Can you try configuring with --disable-builtin-atomics and see if that fixes the issue for you?

-Nathan
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Orion Poplawski
2018-11-27 16:49:00 UTC
It does not appear to have any effect, at least not with 2.1.5.

Thanks.