et
is a given array of numbers (integers or doubles). We would like to calculate
an array
s.t.
The presented procedure has
-proportional
processing time under assumption of infinite parallel processing capability.
We proceed as follows. Choose an integer
:
.
For step 0, we calculate in parallel
(
is the thread
index):
For step 1, we
calculate
We continue, for step
,
we
calculate
Summation in parallel for
and
.
|
The last step would be
when
We
have
Similarly,
Note that for each
is a partial sum. Next, we correct the rest of the numbers.
For step 0 we calculate (on a single
thread):
Thus,
Note that the final
contains the final partial sum. Hence, there is no need to increment this
position anymore. This fact is reflected in slightly different indexing for
the following steps.
For step 1 we calculate (on
threads,
is the thread
index):
We adopt the
convention
The index
varies from 0 to
except for the boundary where
cannot
exceed or be equal to
(because
indexes of
are in the same range as indexes of
):
Hence, the stated range for
follows.
We
have
thus
For general step
we calculate (on
threads,
is the thread
index):
Assuming
that
we
get
thus
At
we have the partial sums.
It remains to describe memory allocation and positioning for the intermediate
data
and
.
We place (see the figure (
Partial
summation in
parallel
))
|