The same project contains the classes Device2D and Block2D with similar
functionality and design ideas. However, there is one topic worth mentioning.
The CUDA documentation recommends using cudaMallocPitch for all 2D allocations.
The author dares to recommend exactly the opposite: do not use
cudaMallocPitch.
The author conducted a simple experiment combining cudaMallocPitch and
cudaMemGetInfo: the amount of free device memory, in bytes, was measured
before and after each allocation. A block of width=5*sizeof(double)=40 bytes
and height=1,000,000 rows was allocated with cudaMallocPitch. According to
cudaMemGetInfo, free memory dropped from 916,508,672 to 404,410,368 bytes,
i.e. the allocation consumed 512,098,304 bytes, or about 512 bytes per
40-byte row. On the same machine, a cudaMalloc-based allocation of
size=5,000,000*sizeof(double)=40,000,000 bytes changed the amount of free
memory from 914,108,416 to 874,000,384 bytes, a drop of 40,108,032 bytes (as
it should). The pitched allocation therefore cost roughly 12.8 times as much
memory for the same data.
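One reading of these figures is that each 40-byte row was padded to a
512-byte pitch. The sketch below illustrates that reading and is not code
from the project: pitchedBytes, linearBytes, freeDeviceBytes and
reportAllocationCost are hypothetical names, and the 512-byte alignment is an
assumption inferred from the measurement (the real pitch is chosen by the
driver and returned by cudaMallocPitch). The last two functions are compiled
only under nvcc and repeat the measurement with cudaMemGetInfo.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#ifdef __CUDACC__
#include <cuda_runtime.h>
#endif

// Bytes occupied by a pitched allocation if every row is padded up to a
// multiple of `alignment` bytes. The alignment value is an assumption; the
// actual pitch is decided by the driver.
std::size_t pitchedBytes(std::size_t widthBytes, std::size_t height,
                         std::size_t alignment)
{
    assert(alignment > 0);
    std::size_t pitch = (widthBytes + alignment - 1) / alignment * alignment;
    return pitch * height;
}

// Bytes occupied by the same data packed into one linear block.
std::size_t linearBytes(std::size_t widthBytes, std::size_t height)
{
    return widthBytes * height;
}

#ifdef __CUDACC__
// Free device memory in bytes, or 0 if the query fails.
static std::size_t freeDeviceBytes()
{
    std::size_t freeB = 0, totalB = 0;
    if (cudaMemGetInfo(&freeB, &totalB) != cudaSuccess) return 0;
    return freeB;
}

// Prints how much free memory each kind of allocation consumes.
void reportAllocationCost(std::size_t widthBytes, std::size_t height)
{
    void* p = nullptr;
    std::size_t pitch = 0;

    std::size_t before = freeDeviceBytes();
    if (cudaMallocPitch(&p, &pitch, widthBytes, height) == cudaSuccess) {
        std::size_t after = freeDeviceBytes();
        std::printf("cudaMallocPitch: pitch=%zu, consumed=%zu bytes\n",
                    pitch, before - after);
        cudaFree(p);
    }

    before = freeDeviceBytes();
    if (cudaMalloc(&p, widthBytes * height) == cudaSuccess) {
        std::size_t after = freeDeviceBytes();
        std::printf("cudaMalloc: consumed=%zu bytes\n", before - after);
        cudaFree(p);
    }
}
#endif
```

With the assumed alignment, pitchedBytes(40, 1000000, 512) gives 512,000,000
bytes against 40,000,000 for linearBytes(40, 1000000), which is close to the
measured drops; the small remainder is presumably allocation granularity.
Calling reportAllocationCost(5 * sizeof(double), 1000000) on a given device
shows the pitch actually chosen there, which may differ from 512.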