Domain decomposition#

The Python API for partitioning a model over distributed and local hardware is described here.

Load balancers#

Load balancing generates a domain_decomposition given an arbor.recipe and a description of the hardware on which the model will run. Currently Arbor provides one load balancer, partition_load_balance(), and a function for creating custom decompositions, partition_by_group(). More load balancers will be added over time.

If the model is distributed with MPI, the partitioning of cells is itself performed in a distributed manner, using MPI communication. The returned domain_decomposition describes the cell groups on the local MPI rank.
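For example, a distributed context can be obtained by passing an MPI communicator to the context constructor. The following is a minimal sketch, assuming Arbor was built with MPI support and that mpi4py is installed:

import arbor
from mpi4py import MPI

# Wrap the mpi4py communicator and build a distributed execution context.
comm = arbor.mpi_comm(MPI.COMM_WORLD)
context = arbor.context(mpi=comm)

A load balancer called with such a context returns the decomposition for the calling rank only.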

arbor.partition_load_balance(recipe, context, hints)#

Construct a domain_decomposition that distributes the cells in the model described by an arbor.recipe over the distributed and local hardware resources described by an arbor.context.

The algorithm counts the number of cells of each type in the global model, then partitions the cells of each type equally over the available nodes. If a GPU is available, and if the cell type can be run on the GPU, the cells on each node are put into one large group to maximise the amount of fine-grained parallelism in the cell group. Otherwise, cells are grouped into small groups that fit in cache, and can be distributed over the available cores. Optionally, a dictionary of partition_hint objects for certain cell kinds can be provided; by default this dictionary is empty.

Note

The partitioning assumes that all cells of the same kind have equal computational cost, hence it may not produce a balanced partition for models with cells that have a large variance in computational costs.
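A minimal call looks as follows (a sketch only: my_recipe stands for a hypothetical user-defined recipe, as in the examples below):

import arbor

# Get a communication context (with 4 threads, no GPU).
context = arbor.context(threads=4, gpu_id=None)

# Initialise a hypothetical user-defined recipe with 100 cells.
recipe = my_recipe(100)

# Partition the model with the default (empty) hints dictionary.
decomp = arbor.partition_load_balance(recipe, context)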

arbor.partition_by_group(recipe, context, groups)#

Construct a domain_decomposition that assigns the groups described by the provided list of group_description objects to the local hardware of the calling rank.

The function expects to be called by each rank in a distributed simulation with the selected groups for that rank. For example, in a simulation of 10 cells on 2 MPI ranks where cells {0, 2, 4, 6, 8} of kind cable_cell are expected to be in a single group executed on the GPU on rank 0; and cells {1, 3, 5, 7, 9} of kind lif_cell are expected to be in a single group executed on the CPU on rank 1:

Rank 0 should run:

import arbor

# Get a communication context (with 4 threads, and 1 GPU with id 0)
context = arbor.context(threads=4, gpu_id=0)

# Initialise a recipe of user defined type my_recipe with 10 cells.
n_cells = 10
recipe = my_recipe(n_cells)

groups = [arbor.group_description(arbor.cell_kind.cable, [0, 2, 4, 6, 8], arbor.backend.gpu)]
decomp = arbor.partition_by_group(recipe, context, groups)

And Rank 1 should run:

import arbor

# Get a communication context (with 4 threads, and no GPU)
context = arbor.context(threads=4, gpu_id=None)

# Initialise a recipe of user defined type my_recipe with 10 cells.
n_cells = 10
recipe = my_recipe(n_cells)

groups = [arbor.group_description(arbor.cell_kind.lif, [1, 3, 5, 7, 9], arbor.backend.multicore)]
decomp = arbor.partition_by_group(recipe, context, groups)

The function expects cells connected by gap junctions to be in the same group; an exception is raised if this is not the case.

The function does not perform any checks on the validity of the generated domain_decomposition; validity is only checked when a simulation object is constructed using that domain_decomposition.
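Continuing the rank-local snippets above, the decomposition is therefore only validated at the following point (a hedged sketch; the argument order of the arbor.simulation constructor may differ between Arbor versions):

# The validity of decomp is checked here, when the simulation is built.
sim = arbor.simulation(recipe, context, decomp)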

Note

This function is intended for users who have a good understanding of the computational cost of simulating the cells in their network and want fine-grained control over the partitioning of cells across ranks. It is recommended to start with partition_load_balance() and to switch to this function if the observed performance across ranks is unbalanced (for example, if the performance of the network does not scale well with the number of nodes).

Note

This function relies on the user to decide the size of the cell groups. It is therefore important to keep in mind that smaller cell groups have better performance on the multicore backend and larger cell groups have better performance on the GPU backend.
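Under that guidance, one way to experiment with group sizes is to chunk the gids by hand. The following sketch assumes recipe, context and n_cells as in the examples above; group_size is an illustrative tuning knob, not an Arbor parameter:

import arbor

group_size = 16  # illustrative; smaller favours multicore, larger favours GPU
gids = list(range(n_cells))
groups = [arbor.group_description(arbor.cell_kind.cable,
                                  gids[i:i + group_size],
                                  arbor.backend.multicore)
          for i in range(0, len(gids), group_size)]

decomp = arbor.partition_by_group(recipe, context, groups)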

class arbor.partition_hint#

Provide a hint on how the cell groups should be partitioned.

partition_hint(cpu_group_size, gpu_group_size, prefer_gpu)#

Construct a partition hint with arguments cpu_group_size and gpu_group_size, and whether to prefer_gpu.

By default, the constructed partition hint has cpu_group_size = 1 (each cell is put in its own group), gpu_group_size = max (all cells are put in one group), and prefer_gpu = True (GPU usage is preferred).

cpu_group_size#

The size of the cell groups assigned to the CPU. Must be positive; otherwise it is set to the default value.

gpu_group_size#

The size of the cell groups assigned to the GPU. Must be positive; otherwise it is set to the default value.

prefer_gpu#

Whether GPU usage is preferred.

max_size#

Get the maximum size of cell groups.

An example of a partition load balance with hints reads as follows:

import arbor

# Get a communication context (with 4 threads, no GPU)
context = arbor.context(threads=4, gpu_id=None)

# Initialise a recipe of user defined type my_recipe with 100 cells.
n_cells = 100
recipe = my_recipe(n_cells)

# The hints prefer the multicore backend, so the decomposition is expected
# to never have cell groups on the GPU, regardless of whether a GPU is
# available or not.
cable_hint                  = arbor.partition_hint()
cable_hint.prefer_gpu       = False
cable_hint.cpu_group_size   = 3

spike_hint                  = arbor.partition_hint()
spike_hint.prefer_gpu       = False
spike_hint.cpu_group_size   = 4

hints = {arbor.cell_kind.cable: cable_hint,
         arbor.cell_kind.spike_source: spike_hint}

decomp = arbor.partition_load_balance(recipe, context, hints)

Decomposition#

As defined in Domain decomposition, a domain decomposition describes the distribution of the model over the available computational resources. The following data structures are used to describe domain decompositions.

class arbor.backend#

Enumeration used to indicate which hardware backend to execute a cell group on.

multicore#

Use multicore backend.

gpu#

Use GPU backend.

Note

Setting the GPU backend is only meaningful if the cell group type supports the GPU backend.

class arbor.domain_decomposition#

Describes a domain decomposition, that is, the distribution of cells across cell groups and domains. It holds the cell group descriptions (groups) for the cells assigned to the local domain, and a helper function (gid_domain()) for looking up the domain to which a cell has been assigned. The domain_decomposition object also holds metadata about the number of cells in the global model and the number of domains over which the model is distributed.

Note

The domain decomposition represents a division of all of the cells in the model into non-overlapping sets, with one set of cells assigned to each domain.

gid_domain(gid)#

A function for querying the domain id that a cell is assigned to (using global identifier arbor.cell_member.gid).

num_domains#

The number of domains that the model is distributed over.

domain_id#

The index of the local domain. Always 0 for non-distributed models, and corresponds to the MPI rank for distributed runs.

num_local_cells#

The total number of cells in the local domain.

num_global_cells#

The total number of cells in the global model (sum of num_local_cells over all domains).

num_groups#

The total number of cell groups on the local domain.

groups#

The descriptions of the cell groups on the local domain. See group_description.
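The attributes above can be inspected directly. For example, assuming decomp was produced by one of the snippets earlier in this section:

# Metadata about the decomposition as a whole.
print(decomp.num_domains, decomp.domain_id)
print(decomp.num_global_cells, decomp.num_local_cells, decomp.num_groups)

# The cell groups on the local domain.
for group in decomp.groups:
    print(group.kind, group.backend, group.gids)

# Look up the domain to which each cell in the model is assigned.
for gid in range(decomp.num_global_cells):
    print(gid, decomp.gid_domain(gid))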

class arbor.group_description#

Describes a set of cells of the same kind that are grouped together in a cell group in an arbor.simulation.

group_description(kind, gids, backend)#

Construct a group description with parameters kind, gids and backend.

kind#

The kind of cell in the group.

gids#

The list of gids of the cells in the cell group.

backend#

The hardware backend on which the cell group will run.