Domain decomposition¶
The C++ API for partitioning a model over distributed and local hardware is described here.
Load balancers¶
Load balancing generates a domain_decomposition given an arb::recipe and a description of the hardware on which the model will run. Currently Arbor provides one load balancer, partition_load_balance(), and more will be added over time.
If the model is distributed with MPI, the partitioning algorithm for cells is
distributed with MPI communication. The returned domain_decomposition
describes the cell groups on the local MPI rank.
Note
The domain_decomposition type is independent of any load balancing algorithm, so users can define a domain decomposition directly, instead of generating it with a load balancer.
This is useful for cases where the provided load balancers are inadequate, or when the user has specific insight into running their model on the target computer.
Important
Users who supply their own domain_decomposition for a model with gap junction connections must place all cells that are connected via gap junctions, directly or through a chain of other cells, in the same group.
Example: A -gj- B -gj- C and D -gj- E.
Cells A, B and C need to be in a single group, and cells D and E need to be in a single group. All five cells may be placed in the same group, but this is not required.
Be mindful that smaller cell groups perform better on multi-core systems and
try not to overcrowd cell groups if not needed.
Arbor-provided load balancers, such as partition_load_balance(), guarantee that this rule is obeyed.
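The grouping rule can be derived mechanically by treating gap junctions as the edges of an undirected graph: each connected component has to land in a single cell group. The following is a minimal sketch of that idea using a union-find structure. The names `gid_type`, `find_root` and `gj_components` are hypothetical stand-ins written for this illustration; they are not part of the Arbor API, and the sketch assumes cells are numbered from 0.

```cpp
#include <cassert>
#include <map>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for Arbor's cell_gid_type.
using gid_type = unsigned;

// Find the representative of the set containing x, shortening the
// path to the root along the way (path halving).
inline gid_type find_root(std::vector<gid_type>& parent, gid_type x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Return the sets of cells that must share a cell group: the connected
// components of the graph whose edges are the gap junctions.
// Cells without gap junctions form components of size one.
inline std::vector<std::vector<gid_type>> gj_components(
    gid_type num_cells,
    const std::vector<std::pair<gid_type, gid_type>>& gap_junctions)
{
    std::vector<gid_type> parent(num_cells);
    std::iota(parent.begin(), parent.end(), 0u);

    // Merge the sets of the two cells joined by each gap junction.
    for (const auto& gj: gap_junctions) {
        assert(gj.first < num_cells && gj.second < num_cells);
        const gid_type root_a = find_root(parent, gj.first);
        const gid_type root_b = find_root(parent, gj.second);
        parent[root_a] = root_b;
    }

    // Collect cells by representative; std::map gives a deterministic order.
    std::map<gid_type, std::vector<gid_type>> by_root;
    for (gid_type gid = 0; gid < num_cells; ++gid) {
        by_root[find_root(parent, gid)].push_back(gid);
    }

    std::vector<std::vector<gid_type>> components;
    for (auto& entry: by_root) {
        components.push_back(std::move(entry.second));
    }
    return components;
}
```

With cells A to E numbered 0 to 4, the gap junctions (0,1), (1,2) and (3,4) yield the components {0, 1, 2} and {3, 4}. Components may be merged into larger groups, but must never be split.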
-
domain_decomposition
partition_load_balance
(const recipe &rec, const arb::context &ctx)¶ Construct a domain_decomposition that distributes the cells in the model described by rec over the distributed and local hardware resources described by ctx.
The algorithm counts the number of cells of each type in the global model, then partitions the cells of each type equally over the available nodes. If a GPU is available, and if the cell type can be run on the GPU, the cells on each node are put in one large group to maximise the amount of fine-grained parallelism in the cell group. Otherwise, cells are grouped into small groups that fit in cache and can be distributed over the available cores.
Note
The partitioning assumes that all cells of the same kind have equal computational cost, hence it may not produce a balanced partition for models with cells that have a large variance in computational costs.
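To make the two steps concrete, the sketch below splits the cells of a single kind evenly over the domains and then forms local groups. It is an illustration under stated assumptions, not Arbor's implementation: `gid_type`, `gid_range`, `local_range`, `make_groups` and the `cpu_group_size` parameter are hypothetical, each domain is assumed to receive a contiguous range of gids, and gap junctions and multiple cell kinds are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for Arbor's cell_gid_type.
using gid_type = unsigned;

// Half-open range [begin, end) of gids assigned to one domain.
struct gid_range {
    gid_type begin;
    gid_type end;
};

// Split num_cells cells as evenly as possible over num_domains domains.
// The first (num_cells % num_domains) domains receive one extra cell.
inline gid_range local_range(gid_type num_cells, unsigned num_domains, unsigned domain_id) {
    assert(num_domains > 0 && domain_id < num_domains);
    const gid_type base = num_cells / num_domains;
    const gid_type extra = num_cells % num_domains;
    const gid_type begin = domain_id * base + std::min<gid_type>(domain_id, extra);
    const gid_type size = base + (domain_id < extra ? 1 : 0);
    return {begin, begin + size};
}

// Form cell groups from the local range: one large group when targeting
// a GPU, otherwise groups of at most cpu_group_size cells.
inline std::vector<std::vector<gid_type>> make_groups(
    gid_range r, bool use_gpu, gid_type cpu_group_size)
{
    assert(cpu_group_size > 0);
    const gid_type n = r.end - r.begin;
    const gid_type group_size = use_gpu ? std::max<gid_type>(n, 1) : cpu_group_size;

    std::vector<std::vector<gid_type>> groups;
    for (gid_type first = r.begin; first < r.end; first += group_size) {
        const gid_type last = std::min<gid_type>(r.end, first + group_size);
        std::vector<gid_type> group;
        for (gid_type gid = first; gid < last; ++gid) {
            group.push_back(gid);
        }
        groups.push_back(std::move(group));
    }
    return groups;
}
```

For 10 cells over 3 domains this gives the ranges [0, 4), [4, 7) and [7, 10). Because the split only counts cells, two domains with the same number of cells can still have very different run times if the cells differ in cost, which is the limitation described in the note above.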
Decomposition¶
Documentation for the data structures used to describe domain decompositions.
-
enum class
backend_kind
¶ Used to indicate which hardware backend to use for running a cell_group.
-
enumerator
multicore
¶ Use multicore backend.
-
enumerator
gpu
¶ Use GPU backend.
Note
Setting the GPU backend is only meaningful if the cell_group type supports the GPU backend.
-
class
domain_decomposition
¶ Describes a domain decomposition and is solely responsible for describing the distribution of cells across cell groups and domains. It holds cell group descriptions (groups) for cells assigned to the local domain, and a helper function (gid_domain) used to look up which domain a cell has been assigned to. The domain_decomposition object also has meta-data about the number of cells in the global model, and the number of domains over which the model is distributed.
Note
The domain decomposition represents a division of all of the cells in the model into non-overlapping sets, with one set of cells assigned to each domain. A domain decomposition is generated either by a load balancer or is directly specified by a user, and it is a requirement that the decomposition is correct:
Every cell in the model appears exactly once, in one and only one cell group (groups) on one and only one local domain_decomposition object.
num_local_cells is the sum of the number of cells in each of the groups.
The sum of num_local_cells over all domains matches num_global_cells.
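The three requirements above can be expressed as a small consistency check. The sketch below uses simplified stand-in types (`toy_group`, `toy_decomposition`) that mirror only the fields named in this section; they are hypothetical and not the Arbor classes. It also assumes that gids are numbered contiguously from 0, and that the decompositions of all domains are available in one place, which in a real MPI run would require gathering them first.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-ins for illustration; not the Arbor types.
using gid_type = unsigned;

struct toy_group {
    std::vector<gid_type> gids;
};

struct toy_decomposition {
    unsigned num_local_cells = 0;
    unsigned num_global_cells = 0;
    std::vector<toy_group> groups;
};

// Check the correctness requirements over the decompositions of all domains.
inline bool is_consistent(const std::vector<toy_decomposition>& domains) {
    std::set<gid_type> seen;
    std::size_t total_local = 0;

    for (const auto& d: domains) {
        std::size_t in_groups = 0;
        for (const auto& g: d.groups) {
            for (gid_type gid: g.gids) {
                // A gid seen twice belongs to two groups or two domains.
                if (!seen.insert(gid).second) return false;
            }
            in_groups += g.gids.size();
        }
        // num_local_cells must equal the number of cells in the local groups.
        if (in_groups != d.num_local_cells) return false;
        total_local += d.num_local_cells;
    }

    // Every domain must report the same global count, equal to the sum
    // of the local counts.
    for (const auto& d: domains) {
        if (d.num_global_cells != total_local) return false;
    }

    // With unique gids numbered from 0, the largest gid must be below the
    // global count; together with the checks above, no cell is missing.
    if (!seen.empty() && *seen.rbegin() >= total_local) return false;

    return true;
}
```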
-
std::function<int(cell_gid_type)>
gid_domain
¶ A function for querying the id of the domain that a cell is assigned to (using the global identifier gid). It must be a pure function, that is, it has no side effects, and hence is thread safe.
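As an illustration of a pure lookup function, the sketch below builds a std::function<int(gid)> for a contiguous block distribution of cells over domains. The helper `make_block_gid_domain` and the `gid_type` alias are hypothetical and not part of Arbor; the block layout is one possible assignment, chosen here only because it can be computed without any shared state.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for Arbor's cell_gid_type.
using gid_type = unsigned;

// Build a lookup for a block distribution in which the first
// (num_cells % num_domains) domains hold one more cell than the rest.
inline std::function<int(gid_type)> make_block_gid_domain(gid_type num_cells, int num_domains) {
    assert(num_domains > 0);
    const gid_type nd = static_cast<gid_type>(num_domains);
    const gid_type base = num_cells / nd;
    const gid_type extra = num_cells % nd;

    // The lambda captures constants by value and touches no external state,
    // so it is pure and can be called concurrently from many threads.
    return [base, extra, nd](gid_type gid) -> int {
        // Number of cells held by the domains that have base + 1 cells.
        const gid_type boundary = extra * (base + 1);
        if (gid < boundary) {
            return static_cast<int>(gid / (base + 1));
        }
        if (base == 0) {
            // Only reachable for a gid outside the model; clamp to the last domain.
            return static_cast<int>(nd) - 1;
        }
        return static_cast<int>(extra + (gid - boundary) / base);
    };
}
```

For 10 cells over 3 domains, gids 0 to 3 map to domain 0, gids 4 to 6 to domain 1, and gids 7 to 9 to domain 2. A lookup that reads a mutable table or caches results would need extra care to remain thread safe.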
-
int
num_domains
¶ Number of domains that the model is distributed over.
-
int
domain_id
¶ The index of the local domain. Always 0 for non-distributed models, and corresponds to the MPI rank for distributed runs.
-
cell_size_type
num_local_cells
¶ Total number of cells in the local domain.
-
cell_size_type
num_global_cells
¶ Total number of cells in the global model (sum of num_local_cells over all domains).
-
std::vector<group_description>
groups
¶ Descriptions of the cell groups on the local domain. See group_description.
-
class
group_description
¶ The indexes of a set of cells of the same kind that are grouped together in a cell group in an arb::simulation.
-
group_description
(cell_kind k, std::vector<cell_gid_type> g, backend_kind b)¶ Constructor.
-
const std::vector<cell_gid_type>
gids
¶ The gids of the cells in the cell group.
-
const backend_kind
backend
¶ The back end on which the cell group is to run.
-