I wrote a parallel transpose algorithm using MPI.
The program uses MPI_Alltoall and MPI_Gather.
It fails with a run-time error: "stack around variable d is corrupted".
Here d is the array used as the receive buffer in the MPI_Gather call.
Also, the output contains only the results of the process with rank 0, that is, only the first nlocal values, which belong to process 0.
What is wrong?