Hardware context

Arbor provides two ways of working with hardware resources:

  • Prescribe the hardware resources and their contexts for use in Arbor simulations.

  • Query available hardware resources (e.g. the number of available GPUs) and initialize MPI.

Note that to use some hardware features, such as MPI or a GPU, Arbor must be built and installed with that feature enabled. Please refer to the installation guide for information on how to enable hardware support.

Available resources

The following helper functions check CMake options and environment variables, and configure and query MPI:

arbor.config()

Returns a dictionary of the options the Arbor library was configured with at compile time:

  • ARB_MPI_ENABLED

  • ARB_WITH_MPI4PY

  • ARB_GPU_ENABLED

  • ARB_VECTORIZE

  • ARB_WITH_PROFILING

  • ARB_USE_BUNDLED_LIBS

  • ARB_VERSION

  • ARB_ARCH

>>> import arbor
>>> arbor.config()
{'mpi': True, 'mpi4py': True, 'gpu': False, 'vectorize': True, 'profiling': True, 'bundled': True, 'version': '0.5.3-dev', 'arch': 'native'}

arbor.mpi_init()

Initialize MPI with MPI_THREAD_SINGLE, as required by Arbor.

arbor.mpi_is_initialized()

Check if MPI is initialized.

class arbor.mpi_comm
mpi_comm()

By default sets MPI_COMM_WORLD as communicator.

mpi_comm(object)
Parameters

object – A Python object to convert to an MPI communicator, for example an mpi4py communicator (see the example at the end of this section).

arbor.mpi_finalize()

Finalize MPI by calling MPI_Finalize.

arbor.mpi_is_finalized()

Check if MPI is finalized. Returns a bool.
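A minimal sketch of how these functions might be combined when managing MPI through Arbor rather than mpi4py (assuming Arbor was built with MPI support, as reported by arbor.config()):

import arbor

# initialize MPI (with MPI_THREAD_SINGLE) if that has not already happened
if not arbor.mpi_is_initialized():
    arbor.mpi_init()

# wrap MPI_COMM_WORLD; a specific communicator object could be passed instead
comm = arbor.mpi_comm()

# ... build a context from comm and run the simulation ...

# tear MPI down again at the end of the program
if not arbor.mpi_is_finalized():
    arbor.mpi_finalize()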

Env: Helper functions

The arbor.env module collects helper functions for interacting with the environment.

env.find_private_gpu(comm)

Requires GPU and MPI. Returns an integer id of a GPU such that each GPU is mapped to at most one MPI task (on the same node as the GPU). Raises an exception if

  • not built with GPU or MPI support

  • unable to satisfy the constraints above

  • handed an invalid or unknown MPI communicator object

env.thread_concurrency()

Returns the number of locally available CPU cores. Returns 1 if unable to detect the number of cores. Use with caution in combination with MPI.

env.get_env_num_threads()

Retrieve user-specified number of threads to use from the environment variable ARBENV_NUM_THREADS.

env.default_concurrency()

Returns number of threads to use from get_env_num_threads(), or else from thread_concurrency() if get_env_num_threads() returns zero.

env.default_gpu()

Determine GPU id to use from the ARBENV_GPU_ID environment variable, or from the first available GPU id of those detected.

env.default_allocation()

Returns a proc_allocation() with the number of threads initialized with default_concurrency() and gpu set to default_gpu(). Use with caution in combination with MPI.
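A brief sketch of how these helpers can be combined on a single node (assuming the arbor.env submodule is importable as shown; the value returned by default_gpu() on a build without GPU support depends on the installation):

import arbor

# threads: ARBENV_NUM_THREADS if set, otherwise the detected core count
threads = arbor.env.default_concurrency()

# GPU id: ARBENV_GPU_ID if set, otherwise the first detected GPU
gpu_id = arbor.env.default_gpu()

# proc_allocation combining the two defaults above
alloc = arbor.env.default_allocation()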

Prescribed resources

The Python wrapper provides an API for:

  • prescribing which hardware resources are to be used by a simulation, using proc_allocation.

  • creating opaque handles, called context, to the hardware resources used by simulations.

class arbor.proc_allocation

Enumerates the computational resources on a node to be used for a simulation, specifically the number of threads and identifier of a GPU if available.

proc_allocation([threads=1, gpu_id=None, bind_procs=False, bind_threads=False])
Parameters
  • threads (int) – Number of threads.

  • gpu_id (int or None) – Device ID of the GPU to use, or None for no GPU.

  • bind_procs (bool) – Create a process binding mask.

  • bind_threads (bool) – Create a thread binding mask.

threads

The number of CPU threads available, 1 by default. Must be at least 1.

gpu_id

The identifier of the GPU to use. Must be None, or a non-negative integer.

The gpu_id corresponds to the int device parameter used by CUDA API calls to identify gpu devices. Set to None to indicate that no GPU device is to be used. See cudaSetDevice and cudaDeviceGetAttribute provided by the CUDA API.

bind_procs

Try to generate a binding mask for all MPI processes on a node. This can help with performance by suppressing unneeded task migrations by the OS. See also affinity. Do not enable this if process binding is handled externally, e.g. by SLURM or OpenMPI, or disable it there first.

bind_threads

Try to generate a binding mask for all threads of an MPI process. This can help with performance by suppressing unneeded task migrations by the OS. See also affinity. If a process binding mask is set, either externally or by bind_procs, it will be respected.

has_gpu()

Indicates whether a GPU is selected (i.e., whether gpu_id is not None).

Here are some examples of how to create a proc_allocation.

import arbor

# default: one thread and no GPU selected
alloc1 = arbor.proc_allocation()

# 8 threads and no GPU
alloc2 = arbor.proc_allocation(8, None)

# reduce alloc2 to 4 threads and use the first available GPU
alloc2.threads = 4
alloc2.gpu_id  = 0
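The binding flags and has_gpu() can be exercised in the same way (a short sketch, using the keyword names from the constructor signature above):

import arbor

# 8 threads, the first GPU, and thread pinning; leave binding disabled if
# the launcher (e.g. SLURM or OpenMPI) already pins processes
alloc3 = arbor.proc_allocation(threads=8, gpu_id=0, bind_threads=True)

# True, since gpu_id is not None
print(alloc3.has_gpu())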

class arbor.context

An opaque handle for the hardware resources used in a simulation. A context contains a thread pool, and optionally the GPU state and MPI communicator. Users of the library do not directly use the functionality provided by context, instead they configure contexts, which are passed to Arbor interfaces for domain decomposition and simulation.

context()

When constructed without arguments, an undistributed context is automatically created using default_allocation().

context(threads=threads, gpu_id=gpu_id, mpi=mpi_comm, inter=mpi_comm)

Create a context.

Parameters
  • threads (int) – The number of threads available locally for execution. Must be set to 1 at minimum. Defaults to the maximum number of threads the system makes available (respecting optional affinity limits imposed through the environment) if gpu_id and mpi are not set, else defaults to 1.

  • gpu_id (int or None) – The non-negative identifier of the GPU to use, None by default. Can only be set when Arbor was built with GPU support.

  • mpi (arbor.mpi_comm or None) – The MPI communicator for distributed calculation, None by default. Can only be set when Arbor was built with MPI support.

  • inter – The MPI communicator for external coupling to Arbor, e.g. another simulator. None by default. Can only be set when Arbor was built with MPI support.

context(alloc, mpi=mpi_comm, inter=mpi_comm)

Create a context.

Parameters
  • alloc (proc_allocation) – The computational resources. It is advised to explicitly provide one if you are providing an MPI communicator for distributed calculation.

  • mpi (arbor.mpi_comm or None) – The MPI communicator for distributed calculation, None by default. Can only be set when Arbor was built with MPI support.

  • inter – The MPI communicator for external coupling to Arbor, e.g. another simulator. None by default. Can only be set when Arbor was built with MPI support.
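For example, the keyword form can be used directly, without constructing a proc_allocation first (a short sketch using the parameters listed above):

import arbor

# 4 threads; no GPU and no MPI by default
ctx = arbor.context(threads=4)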

Contexts can be queried for which features are enabled, whether a GPU is used, and how many threads are in the thread pool.

has_gpu

Query whether the context has a GPU.

has_mpi

Query whether the context uses MPI for distributed communication.

threads

Query the number of threads in the context’s thread pool.

ranks

Query the number of distributed domains. If the context has an MPI communicator, the result is equivalent to MPI_Comm_size; otherwise 1 is returned.

rank

The numeric id of the local domain. If the context has an MPI communicator, the result is equivalent to MPI_Comm_rank; otherwise 0 is returned.

Here are some simple examples of how to create a context:

import arbor
import mpi4py.MPI as mpi

# Construct an undistributed context using default_allocation(): no MPI.
context = arbor.context()

# Construct a context that:
#  * uses 8 threads in its thread pool;
#  * does not use a GPU, regardless of whether one is available;
#  * does not use MPI.
alloc   = arbor.proc_allocation(8, None)
context = arbor.context(alloc)

# Construct a context that uses:
#  * 4 threads and the first GPU;
#  * MPI_COMM_WORLD for distributed computation.
alloc   = arbor.proc_allocation(4, 0)
comm    = arbor.mpi_comm(mpi.COMM_WORLD)
context = arbor.context(alloc, comm)
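The query attributes can then be inspected on a constructed context (a short sketch continuing the MPI example above):

print(context.threads)   # 4: threads in the thread pool
print(context.has_gpu)   # True: gpu_id 0 was selected
print(context.has_mpi)   # True: an MPI communicator was provided
print(context.ranks)     # number of ranks, equivalent to MPI_Comm_size
print(context.rank)      # id of the local rank, equivalent to MPI_Comm_rank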