Domain decomposition#
The C++ API for partitioning a model over distributed and local hardware is described here.
Decomposition#
Documentation for the data structures used to describe domain decompositions.
-
class domain_decomposition#
Describes a domain decomposition: the distribution of cells across cell groups and domains. It holds cell group descriptions (groups) for cells assigned to the local domain, and a helper member (gid_domain) used to look up which domain a cell has been assigned to. The domain_decomposition object also holds meta-data about the number of cells in the global model and the number of domains over which the model is distributed.
Note
The domain decomposition represents a division of all of the cells in the model into non-overlapping sets, with one set of cells assigned to each domain. Whether a domain decomposition is generated by a load balancer or constructed directly by a user, the following conditions must be met; if they are not, an exception is thrown:
- Every cell in the model appears in one and only one cell group on one and only one local domain_decomposition object.
- The total number of cells across all cell groups on all domain_decomposition objects must match the total number of cells in the recipe.
- Cells that are connected via gap junction must be in the same cell group.
-
domain_decomposition(const recipe &rec, const context &ctx, const std::vector<group_description> &groups)#
The constructor takes:
- an arb::recipe that describes the model;
- an arb::context that describes the hardware resources;
- a vector of arb::group_description that contains the indices of the cells to be executed on the local rank, categorized into groups.
It is expected that a different arb::domain_decomposition object will be constructed on each rank in a distributed simulation, containing the selected cell groups for that rank. For example, consider a simulation of 10 cells on 2 MPI ranks, where cells {0, 2, 4, 6, 8} of kind cable_cell are meant to be in a single group executed on the GPU on rank 0, and cells {1, 3, 5, 7, 9} of kind lif_cell are meant to be in a single group executed on the CPU on rank 1.
Rank 0 should run:
std::vector<arb::group_description> groups = {
    {arb::cell_kind::cable, {0, 2, 4, 6, 8}, arb::backend_kind::gpu}};
auto decomp = arb::domain_decomposition(recipe, context, groups);
And Rank 1 should run:
std::vector<arb::group_description> groups = {
    {arb::cell_kind::lif, {1, 3, 5, 7, 9}, arb::backend_kind::multicore}};
auto decomp = arb::domain_decomposition(recipe, context, groups);
Important
Constructing a balanced domain_decomposition quickly becomes a difficult task for large and diverse networks. This is why Arbor provides load balancing algorithms that automatically generate a domain_decomposition from a recipe and context. A user-defined domain_decomposition built with this constructor is useful for cases where the provided load balancers are inadequate, or when the user has specific insight into running their model on the target computer.
Important
When creating your own domain_decomposition of a network containing gap junction connections, be sure to place all cells that are connected via gap junctions in the same group. For example, given A -gj- B -gj- C and D -gj- E: cells A, B and C need to be in a single group, and cells D and E need to be in a single group. They may all be placed in one group, but this is not required. Be mindful that smaller cell groups perform better on multi-core systems, and try not to overcrowd cell groups if not needed. Arbor-provided load balancers such as partition_load_balance() guarantee that this rule is obeyed.
-
int gid_domain(cell_gid_type gid)#
Returns the domain id of the cell with id gid.
-
int num_domains()#
Returns the number of domains that the model is distributed over.
-
int domain_id()#
Returns the index of the local domain. Always 0 for non-distributed models, and corresponds to the MPI rank for distributed runs.
-
cell_size_type num_local_cells()#
Returns the total number of cells in the local domain.
-
cell_size_type num_global_cells()#
Returns the total number of cells in the global model (sum of num_local_cells over all domains).
-
cell_size_type num_groups()#
Returns the total number of cell groups on the local domain.
-
const group_description &group(unsigned idx)#
Returns the description of the cell group at index idx on the local domain. See group_description.
-
const std::vector<group_description> &groups()#
Returns the descriptions of the cell groups on the local domain. See group_description.
-
class group_description#
The indices of a set of cells of the same kind that are grouped together in a cell group in an arb::simulation.
-
group_description(cell_kind k, std::vector<cell_gid_type> g, backend_kind b)#
Constructor.
-
const std::vector<cell_gid_type> gids#
The gids of the cells in the cell group.
-
const backend_kind backend#
The back end on which the cell group is to run.
Load balancers#
Load balancing generates a domain_decomposition given an arb::recipe and a description of the hardware on which the model will run. Currently Arbor provides one load balancer, partition_load_balance(), and more will be added over time.
If the model is distributed with MPI, the partitioning algorithm for cells is distributed with MPI communication. The returned domain_decomposition describes the cell groups on the local MPI rank.
-
domain_decomposition partition_load_balance(const recipe &rec, const arb::context &ctx)#
Construct a domain_decomposition that distributes the cells in the model described by rec over the distributed and local hardware resources described by ctx.
The algorithm counts the number of cells of each kind in the global model, then partitions the cells of each kind equally over the available nodes. If a GPU is available, and if the cell kind can be run on the GPU, the cells on each node are put into one large group to maximise the amount of fine-grained parallelism in the cell group. Otherwise, cells are grouped into small groups that fit in cache, and can be distributed over the available cores.
Note
The partitioning assumes that all cells of the same kind have equal computational cost, hence it may not produce a balanced partition for models with cells that have a large variance in computational costs.