Hardware context¶
Arbor provides two ways of working with hardware resources:
Prescribe the hardware resources and their contexts for use in Arbor simulations.
Query available hardware resources (e.g. the number of available GPUs) and initialize MPI.
Note that to utilize some hardware features, Arbor must be built and installed with the feature enabled, for example MPI or a GPU. Please refer to the installation guide for information on how to enable hardware support.
Available resources¶
The following helper functions are available for checking CMake options and environment variables, as well as for configuring and checking MPI:
- arbor.config()¶
Returns a dictionary to check which options the Arbor library was configured with at compile time:
ARB_MPI_ENABLED
ARB_WITH_MPI4PY
ARB_GPU_ENABLED
ARB_VECTORIZE
ARB_WITH_PROFILING
ARB_USE_BUNDLED_LIBS
ARB_VERSION
ARB_ARCH
>>> import arbor
>>> arbor.config()
{'mpi': True, 'mpi4py': True, 'gpu': False, 'vectorize': True, 'profiling': True, 'bundled': True, 'version': '0.5.3-dev', 'arch': 'native'}
- arbor.mpi_init()¶
Initialize MPI with MPI_THREAD_SINGLE, as required by Arbor.
- arbor.mpi_is_initialized()¶
Check if MPI is initialized.
- class arbor.mpi_comm¶
- mpi_comm()¶
By default sets MPI_COMM_WORLD as communicator.
- mpi_comm(object)
- Parameters:
object – The Python object to convert to an MPI communicator, for example an mpi4py communicator.
- arbor.mpi_finalize()¶
Finalize MPI by calling MPI_Finalize.
- arbor.mpi_is_finalized()¶
Check if MPI is finalized. :rtype: bool
Env: Helper functions¶
The arbor.env module collects helper functions for interacting with the environment.
- env.find_private_gpu(comm)¶
Requires GPU and MPI. Returns an integer id of a GPU such that each GPU is mapped to at most one MPI task (on the same node as the GPU). Raises an exception if:
not built with GPU or MPI support
unable to satisfy the constraints above
handed an invalid or unknown MPI communicator object
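A hedged sketch of how this could be used to pair each MPI rank with its own GPU, assuming a GPU- and MPI-enabled build and that an mpi4py communicator is accepted as comm:

import arbor
import mpi4py.MPI as mpi

# ask Arbor to assign this rank a GPU that no other rank on the node uses
gpu_id = arbor.env.find_private_gpu(mpi.COMM_WORLD)

# use the private GPU in a distributed context
context = arbor.context(gpu_id=gpu_id, mpi=arbor.mpi_comm(mpi.COMM_WORLD))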
- env.thread_concurrency()¶
Returns the number of locally available CPU cores. Returns 1 if unable to detect the number of cores. Use with caution in combination with MPI.
- env.get_env_num_threads()¶
Retrieve the user-specified number of threads to use from the environment variable ARBENV_NUM_THREADS.
- env.default_concurrency()¶
Returns the number of threads to use: the value from get_env_num_threads(), or the value from thread_concurrency() if get_env_num_threads() returns zero.
- env.default_gpu()¶
Determine the GPU id to use from the ARBENV_GPU_ID environment variable, or else use the first available GPU id of those detected.
- env.default_allocation()¶
Returns a proc_allocation() with the number of threads initialized with default_concurrency() and the GPU set to default_gpu(). Use with caution in combination with MPI.
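A short sketch of how these helpers compose; the printed values depend on the machine and on the ARBENV_* environment variables:

import arbor

print(arbor.env.thread_concurrency())    # locally available CPU cores, or 1 if undetectable
print(arbor.env.get_env_num_threads())   # threads requested via ARBENV_NUM_THREADS
print(arbor.env.default_gpu())           # ARBENV_GPU_ID, or the first detected GPU

# proc_allocation built from default_concurrency() and default_gpu()
alloc = arbor.env.default_allocation()
print(alloc.threads, alloc.gpu_id)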
Prescribed resources¶
The Python wrapper provides an API for:
prescribing which hardware resources are to be used by a simulation using proc_allocation.
opaque handles to hardware resources used by simulations, called context.
- class arbor.proc_allocation¶
Enumerates the computational resources on a node to be used for a simulation, specifically the number of threads and identifier of a GPU if available.
- proc_allocation([threads=1, gpu_id=None, bind_procs=False, bind_threads=False])¶
- Parameters:
threads (int) – Number of threads.
gpu_id (int or None) – Device ID.
bind_procs (bool) – Whether to generate a process binding mask.
bind_threads (bool) – Whether to generate a thread binding mask.
- threads¶
The number of CPU threads available, 1 by default. Must be set to 1 at minimum.
- gpu_id¶
The identifier of the GPU to use. Must be None, or a non-negative integer.
The gpu_id corresponds to the int device parameter used by CUDA API calls to identify GPU devices. Set to None to indicate that no GPU device is to be used. See cudaSetDevice and cudaDeviceGetAttribute provided by the CUDA API.
- bind_procs¶
Try to generate a binding mask for all MPI processes on a node. This can help with performance by suppressing unneeded task migrations from the OS. See also affinity. Do not enable this if process binding is handled externally, e.g. by SLURM or OpenMPI, or disable it there first.
- bind_threads¶
Try to generate a binding mask for all threads of an MPI process. This can help with performance by suppressing unneeded task migrations from the OS. See also affinity. If a process binding mask is set, either externally or by bind_procs, it will be respected.
Here are some examples of how to create a proc_allocation.

import arbor

# default: one thread and no GPU selected
alloc1 = arbor.proc_allocation()

# 8 threads and no GPU
alloc2 = arbor.proc_allocation(8, None)

# reduce alloc2 to 4 threads and use the first available GPU
alloc2.threads = 4
alloc2.gpu_id = 0
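The binding flags can also be passed as keywords, as in this sketch; they only request a binding mask, and should not be combined with external binding by e.g. SLURM or OpenMPI (see bind_procs above):

# 4 threads on the first GPU, with process and thread binding requested
alloc3 = arbor.proc_allocation(threads=4, gpu_id=0,
                               bind_procs=True, bind_threads=True)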
- class arbor.context¶
An opaque handle for the hardware resources used in a simulation. A context contains a thread pool, and optionally the GPU state and MPI communicator. Users of the library do not directly use the functionality provided by context; instead they configure contexts, which are passed to Arbor interfaces for domain decomposition and simulation.
- context()¶
When constructed without arguments, an undistributed context is automatically created using default_allocation().
- context(threads=threads, gpu_id=gpu_id, mpi=mpi_comm, inter=mpi_comm)
Create a context.
- Parameters:
threads (int) – The number of threads available locally for execution. Must be set to 1 at minimum. Defaults to the maximum number of threads the system makes available (respecting optional affinity limits imposed through the environment) if gpu_id and mpi are not set, else defaults to 1.
gpu_id (int or None) – The non-negative identifier of the GPU to use, None by default. Can only be set when Arbor was built with GPU support.
mpi (arbor.mpi_comm or None) – The MPI communicator for distributed calculation, None by default. Can only be set when Arbor was built with MPI support.
inter – The MPI communicator for external coupling to Arbor, e.g. another simulator. None by default. Can only be set when Arbor was built with MPI support.
- context(alloc, mpi=mpi_comm, inter=mpi_comm)
Create a context.
- Parameters:
alloc (
proc_allocation
) – The computational resources. It is advised to explicitly provide one if you are providing an MPI communicator for distributed calculation.mpi (
arbor.mpi_comm
or None.) – The MPI communicator for distributed calculation,None
by default. Can only be set when Arbor was built with MPI support.inter – The MPI communicator for external coupling to Arbor, e.g. another simulator.
None
by default. Can only be set when Arbor was built with MPI support.
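For illustration, a minimal sketch using the keyword form of the constructor described above; gpu_id and mpi require the corresponding build options and are left unset here:

import arbor

# 4 local threads, no GPU, no MPI
ctx = arbor.context(threads=4, gpu_id=None)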
Contexts can be queried for information about which features are enabled, whether a GPU is used, and how many threads are in the thread pool.
- has_gpu¶
Query whether the context has a GPU.
- has_mpi¶
Query whether the context uses MPI for distributed communication.
- threads¶
Query the number of threads in the context’s thread pool.
- ranks¶
Query the number of distributed domains. If the context has an MPI communicator, the return value is equivalent to MPI_Comm_size. If the context has no MPI communicator, returns 1.
- rank¶
The numeric id of the local domain. If the context has an MPI communicator, the return value is equivalent to MPI_Comm_rank. If the context has no MPI communicator, returns 0.
Here are some simple examples of how to create a context:

import arbor
import mpi4py.MPI as mpi

# Construct a default context (thread count from default_allocation(), no GPU or MPI).
context = arbor.context()

# Construct a context that:
#  * uses 8 threads in its thread pool;
#  * does not use a GPU, regardless of whether one is available;
#  * does not use MPI.
alloc = arbor.proc_allocation(8, None)
context = arbor.context(alloc)

# Construct a context that uses:
#  * 4 threads and the first GPU;
#  * MPI_COMM_WORLD for distributed computation.
alloc = arbor.proc_allocation(4, 0)
comm = arbor.mpi_comm(mpi.COMM_WORLD)
context = arbor.context(alloc, comm)
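The query attributes can then be inspected on the resulting context, for example (a sketch continuing from the MPI example above; actual values depend on the run configuration and build options):

print(context.has_mpi, context.has_gpu)  # True, True for the MPI + GPU context above
print(context.threads)                   # 4
print(context.rank, context.ranks)       # this rank's id and the number of distributed domains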