Good performance of a CUDA-based implementation requires grouping similar
operations into threads within the same kernel call. Hence, we need to identify
the subdomains
that include the boundary
and we need to establish a one-to-one mapping between these domains and a
one-dimensional index. To establish such a mapping, we iterate through all
relevant
-dimensional
subdomains
For each
we find the complement
as follows. Let
be the set of vertices of the subdomain
.
For each
we find an integer
such
that
or
Then
covers part of the boundary
.
There is no need to calculate
for every
.
Indeed,
thus a vertex
of a subdomain
is given
by
Hence
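The one-to-one mapping between boundary subdomains and a one-dimensional index can be illustrated with a minimal host-side sketch. It is not the construction used in this text: it assumes, purely for illustration, a regular grid with `n` cells per axis in `d` dimensions, and it treats a cell as a boundary subdomain when any of its coordinates is `0` or `n - 1`. The names `BoundaryIndexMap` and `build_boundary_index_map` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container (not from the source): a bidirectional map between
// boundary cells and a compact one-dimensional index.
struct BoundaryIndexMap {
    std::vector<std::size_t> compact_to_linear;  // 1D index -> row-major cell id
    std::vector<long> linear_to_compact;         // cell id -> 1D index, -1 if interior
};

// Assumption: a regular grid with n cells per axis in d dimensions.
// A cell counts as a boundary subdomain if any coordinate is 0 or n - 1.
BoundaryIndexMap build_boundary_index_map(std::size_t n, std::size_t d) {
    std::size_t total = 1;
    for (std::size_t k = 0; k < d; ++k) total *= n;

    BoundaryIndexMap map;
    map.linear_to_compact.assign(total, -1);

    for (std::size_t cell = 0; cell < total; ++cell) {
        // Decode the row-major cell id into its coordinates, one axis at a time.
        std::size_t rest = cell;
        bool touches_boundary = false;
        for (std::size_t k = 0; k < d; ++k) {
            const std::size_t coord = rest % n;
            rest /= n;
            if (coord == 0 || coord + 1 == n) touches_boundary = true;
        }
        if (touches_boundary) {
            // The next free compact index is the current size of the table.
            map.linear_to_compact[cell] =
                static_cast<long>(map.compact_to_linear.size());
            map.compact_to_linear.push_back(cell);
        }
    }
    return map;
}
```

Under these assumptions, a kernel launched with one thread per entry of `compact_to_linear` would have thread `t` process cell `compact_to_linear[t]`, so every thread in the call performs the same kind of boundary work.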