API Reference

mpi4torch.JoinDummies(loopthrough: torch.Tensor, dummies: List[torch.Tensor]) → torch.Tensor

This function joins multiple dummy dependencies into the DAG.

From the perspective of the forward pass, this function is mostly a no-op: it simply passes its first argument through and discards the dummies argument.

However, for the backward pass, the AD engine still considers the dummies as actual dependencies. The main use of this function is thus to manually encode dependencies that the AD engine does not see on its own. See also the introductory text in the Implications for mpi4torch section on how to use this function.

Parameters:
  • loopthrough – Variable to pass through.

  • dummies – List of tensors that are added as dummy dependencies to the DAG.

Returns:

Tensor that is a shallow copy of loopthrough, but whose grad_fn is JoinDummiesBackward.

Return type:

torch.Tensor
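
The following sketch is illustrative and not part of the API: it combines JoinDummies() with the in-place Reduce_() documented below, whose result is undefined on non-root ranks but still carries the dependency required by the backward pass.

    import torch
    import mpi4torch

    comm = mpi4torch.COMM_WORLD

    x = torch.ones(4, requires_grad=True)

    # Reduce onto rank 0 in-place; on non-root ranks the content of `reduced`
    # is undefined, but it still encodes the communication dependency.
    reduced = comm.Reduce_(2.0 * x, mpi4torch.MPI_SUM, 0)

    if comm.rank == 0:
        loss = reduced.sum()
    else:
        # Manually re-attach the dependency so it is not dropped from the DAG.
        loss = mpi4torch.JoinDummies(x.sum(), [reduced])

    loss.backward()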

mpi4torch.JoinDummiesHandle(handle: mpi4torch.WaitHandle, dummies: List[torch.Tensor]) → mpi4torch.WaitHandle

This function has the same purpose as JoinDummies(), but accepts mpi4torch.WaitHandle as its first argument.

Parameters:
  • handle – mpi4torch.WaitHandle to pass through.

  • dummies – List of tensors that are added as dummy dependencies to the DAG.

Returns:

A wait handle with the additional dummy dependencies added.

Return type:

mpi4torch.WaitHandle
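
An illustrative sketch, assuming at least two ranks; the tensor extra stands for any value whose dependency shall be attached to the pending send:

    import torch
    import mpi4torch

    comm = mpi4torch.COMM_WORLD

    x = torch.ones(4, requires_grad=True)

    if comm.rank == 0:
        extra = 3.0 * x
        handle = comm.Isend(2.0 * x, 1, 0)
        # Attach `extra` as a dummy dependency to the pending send.
        handle = mpi4torch.JoinDummiesHandle(handle, [extra])
        comm.Wait(handle)
    elif comm.rank == 1:
        y = comm.Recv(torch.empty(4), 0, 0)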

mpi4torch.MPI_MAX
mpi4torch.MPI_MIN
mpi4torch.MPI_SUM
mpi4torch.MPI_PROD
mpi4torch.MPI_LAND
mpi4torch.MPI_BAND
mpi4torch.MPI_LOR
mpi4torch.MPI_BOR
mpi4torch.MPI_LXOR
mpi4torch.MPI_BXOR
mpi4torch.MPI_MINLOC
mpi4torch.MPI_MAXLOC
mpi4torch.COMM_WORLD = <mpi4torch.MPI_Communicator object>

World communicator MPI_COMM_WORLD.

class mpi4torch.MPI_Communicator(comm: torch.ScriptClass)

MPI communicator wrapper class

The only supported ways to construct an MPI_Communicator are currently either through mpi4torch.COMM_WORLD or mpi4torch.comm_from_mpi4py().

Note

All methods with an underscore suffix are in-place operations.
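
A minimal usage sketch with the world communicator (rank and size are documented below):

    import mpi4torch

    comm = mpi4torch.COMM_WORLD
    print(f"Hello from rank {comm.rank} of {comm.size}")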

Allgather(tensor: Tensor, gatheraxis: int) → Tensor
Allreduce(tensor: Tensor, op: int) → Tensor

Combines values from all processes and distributes the result back to all processes.

The combination operation is performed element-wise on the tensor.

This is the wrapper function of MPI_Allreduce.

Parameters:
  • tensor – torch.Tensor that shall be combined element-wise across all processes.

  • op – The reduction operation, e.g. mpi4torch.MPI_SUM.

Returns:

Combined tensor of the same shape as the input tensor.

Return type:

torch.Tensor

Note

Only mpi4torch.MPI_SUM is supported in the backward pass at the moment.
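
A brief, illustrative sketch; the backward pass works here because the reduction operation is mpi4torch.MPI_SUM:

    import torch
    import mpi4torch

    comm = mpi4torch.COMM_WORLD

    x = torch.full((3,), float(comm.rank), requires_grad=True)
    y = comm.Allreduce(x, mpi4torch.MPI_SUM)   # every rank receives the element-wise sum
    loss = y.square().sum()
    loss.backward()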

Alltoall(tensor: Tensor, gatheraxis: int, scatteraxis: int, numelem: int) → Tensor
Bcast_(tensor: Tensor, root: int) → Tensor

Broadcasts a tensor from the root process to all other processes.

This is an in-place operation.

This is the wrapper function of MPI_Bcast.

Parameters:
  • tensor – torch.Tensor that shall be broadcast. The tensor needs to have the same shape on all processes, since this is an in-place operation.

  • root – The root process whose tensor shall be broadcast to the others.

Returns:

For rank == root this is the same as the input tensor. For all other processes this is the input tensor filled with the content from the root process.

Return type:

torch.Tensor
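
An illustrative sketch, assuming rank 0 is the root:

    import torch
    import mpi4torch

    comm = mpi4torch.COMM_WORLD

    # The buffer must have the same shape on every rank; after the call it
    # holds the content of the root rank (here rank 0) everywhere.
    buf = torch.arange(4.0) if comm.rank == 0 else torch.zeros(4)
    buf = comm.Bcast_(buf, 0)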

Gather(tensor: Tensor, gatheraxis: int, root: int) → Tensor
Irecv(tensor: Tensor, source: int, tag: int) → WaitHandle
Isend(tensor: Tensor, dest: int, tag: int) → WaitHandle
Recv(tensor: Tensor, source: int, tag: int) → Tensor
Reduce_(tensor: Tensor, op: int, root: int) → Tensor

Reduces multiple tensors of the same shape, scattered over all processes, to a single tensor of the same shape stored on the root process.

The combination operation is performed element-wise on the tensor.

This is an in-place operation.

This is the wrapper function of MPI_Reduce.

Parameters:
  • tensor – torch.Tensor that shall be reduced element-wise across all processes.

  • op – The reduction operation, e.g. mpi4torch.MPI_SUM.

  • root – The root process on which the reduced tensor shall be stored.

Returns:

For rank == root the result stores the reduced tensor. For all other processes the content of the resulting tensor is undefined; it does, however, still suffice as input for the second argument of mpi4torch.JoinDummies().

Return type:

torch.Tensor

Note

Only mpi4torch.MPI_SUM is supported in the backward pass at the moment.

Scatter(tensor: Tensor, scatteraxis: int, numelem: int, root: int) → Tensor
Send(tensor: Tensor, dest: int, tag: int) → Tensor
Wait(waithandle: WaitHandle) → Tensor
property rank: int

The rank or identification number of the local process with respect to this communicator.

The processes participating in a communicator are consecutively given ranks in the interval [0, mpi4torch.MPI_Communicator.size - 1].

property size: int

The size of the MPI communicator, i.e. the number of processes involved.

class mpi4torch.WaitHandle(raw_handle: List[Tensor])

Class representing a wait handle, as returned by the non-blocking MPI calls.

property dummy

A dummy variable that allows the WaitHandle to be used within the dummies argument of mpi4torch.JoinDummies() and mpi4torch.JoinDummiesHandle().
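
An illustrative sketch, assuming at least two ranks:

    import torch
    import mpi4torch

    comm = mpi4torch.COMM_WORLD

    if comm.rank == 0:
        comm.Send(torch.arange(4.0), 1, 0)
    elif comm.rank == 1:
        handle = comm.Irecv(torch.empty(4), 0, 0)
        # handle.dummy lets otherwise unrelated local work record a dependency
        # on the pending receive via JoinDummies().
        local = mpi4torch.JoinDummies(torch.ones(4), [handle.dummy])
        received = comm.Wait(handle)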

mpi4torch.comm_from_mpi4py(comm) → MPI_Communicator

Converts an mpi4py communicator to an mpi4torch.MPI_Communicator.
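
For example, to wrap mpi4py's world communicator:

    from mpi4py import MPI
    import mpi4torch

    comm = mpi4torch.comm_from_mpi4py(MPI.COMM_WORLD)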

mpi4torch.deactivate_cuda_aware_mpi_support() → None

Deactivates the CUDA-aware MPI support.

Calling this function forces mpi4torch to first move any tensor into main memory before calling an MPI function on it, and to move the result back into device memory after the MPI call has finished.

Note

This function is useful in situations in which MPI advertises CUDA-awareness but the backend does not actually support the functionality.
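
Usage sketch:

    import mpi4torch

    # Fall back to staging tensors through host memory around every MPI call.
    mpi4torch.deactivate_cuda_aware_mpi_support()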