Control the geometry of automatic blocks
A family of utilities to control the automatic block size (or length) and shape.
Usage
getAutoBlockSize()
setAutoBlockSize(size=1e8)
getAutoBlockLength(type)
getAutoBlockShape()
setAutoBlockShape(shape=c("hypercube",
                          "scale",
                          "first-dim-grows-first",
                          "last-dim-grows-first"))
Arguments
- size
The auto block size (automatic block size) in bytes. Note that, except when the type of the array data is "character" or "list", the size of a block is its length multiplied by the size of an array element. For example, a block of 500 x 1000 x 500 doubles has a length of 250 million elements and a size of 2 GB (each double occupies 8 bytes of memory).
The auto block size is set to 100 MB at package startup and can be reset anytime to this value by calling setAutoBlockSize() with no argument.
- type
A string specifying the type of the array data.
- shape
A string specifying the auto block shape (automatic block shape). See makeCappedVolumeBox for a description of the supported shapes.
The auto block shape is set to "hypercube" at package startup and can be reset anytime to this value by calling setAutoBlockShape() with no argument.
Details
block size != block length
block length = number of array elements in a block
(i.e. prod(dim(block))).
block size = block length * size of the individual elements in memory.
For example, for an integer array, block size (in bytes) is
going to be 4 x block length. For a numeric array x
(i.e. type(x) == "double"), it's going to be 8 x block length.
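The relationship above can be checked with a few lines of base R. This is only a minimal sketch: `element_size` and `block_size_bytes` are hypothetical names introduced here for illustration, not part of DelayedArray, and the byte sizes assume R's ordinary dense atomic vectors.

```r
# Bytes per element for some dense atomic types in R.
# (hypothetical lookup table, for illustration only)
element_size <- c(raw = 1, logical = 4, integer = 4, double = 8, complex = 16)

# Block size in bytes = block length * size of one element.
block_size_bytes <- function(block_dim, type) {
    block_length <- prod(block_dim)   # number of array elements in the block
    block_length * element_size[[type]]
}

block_size_bytes(c(500, 1000, 500), "double")   # 2e+09 bytes, i.e. 2 GB
block_size_bytes(c(500, 1000, 500), "integer")  # 1e+09 bytes
```

The same block therefore takes half as much memory when it holds integers as when it holds doubles.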
In its current form, block processing in the DelayedArray package must decide the geometry of the blocks before starting the walk on the blocks. It does this based on several criteria. Two of them are:
The auto block size: maximum size (in bytes) of a block once loaded in memory.
The type() of the array (e.g. integer, double, complex, etc.)
The auto block size setting and type(x) control the maximum
length of the blocks. Other criteria control their shape. So for example
if you set the auto block size to 8GB, this will cap the length of
the blocks to 2e9 if your DelayedArray object x is of type
integer, and to 1e9 if it's of type double.
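The cap described above can be sketched in base R, assuming the maximum block length is simply the auto block size divided by the element size and rounded down. `max_block_length` is a hypothetical helper written for illustration, not a DelayedArray function; in the package itself, getAutoBlockLength(type) reports the actual value.

```r
# Hypothetical helper: largest number of elements of a given type that
# fits in 'size' bytes, assuming dense storage.
max_block_length <- function(size, type) {
    bytes_per_element <- switch(type,
        raw = 1, logical = 4, integer = 4, double = 8, complex = 16,
        stop("unsupported type: ", type))
    floor(size / bytes_per_element)
}

max_block_length(8e9, "integer")  # 2e+09
max_block_length(8e9, "double")   # 1e+09
max_block_length(140, "double")   # 17
```

The last call mirrors the setAutoBlockSize(140) case in the Examples section, where getAutoBlockLength() reports 17 for a matrix of doubles.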
Note that this simple relationship between block size and
block length assumes that blocks are loaded in memory as
ordinary (a.k.a. dense) matrices or arrays. With sparse blocks,
all bets are off. But the max block length is always taken to be
the auto block size divided by get_type_size(type())
whether the blocks are going to be loaded as dense or sparse arrays.
If they are going to be loaded as sparse arrays, their memory footprint
is very likely to be smaller than if they were loaded as dense arrays
so this is safe (although probably not optimal).
It's important to keep in mind that the auto block size setting
is a simple way for the user to put a cap on the memory footprint of
the blocks. Nothing more. In particular it doesn't control the maximum
amount of memory used by the block processing algorithm. Other factors
can dramatically affect memory usage, such as parallelization (where more
than one block is loaded in memory at any given time), what the algorithm
is doing with the blocks (e.g. something like blockApply(x, identity)
will actually load the entire array data in memory), what delayed
operations are on x, etc. It would be awesome to have a way to
control the maximum amount of memory used by a block processing algorithm
as a whole, but we don't know how to do that.
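As a rough illustration of why the auto block size is not a cap on total memory, the sketch below estimates how much memory the blocks alone can occupy when several workers each hold one block. `blocks_in_memory_bytes` and its `nworkers` argument are hypothetical names used for illustration only. The estimate ignores any copies made by the algorithm or by delayed operations, so real usage can be higher.

```r
# Hypothetical back-of-the-envelope estimate: each worker is assumed to hold
# at most one block at a time, and each block is at most 'auto_block_size' bytes.
blocks_in_memory_bytes <- function(auto_block_size, nworkers = 1) {
    auto_block_size * nworkers
}

blocks_in_memory_bytes(1e8)                 # 1e+08 bytes with a single worker
blocks_in_memory_bytes(1e8, nworkers = 8)   # 8e+08 bytes with 8 workers
```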
Value
getAutoBlockSize: The current auto block size in bytes
as a single numeric value.
setAutoBlockSize: The new auto block size in bytes as an
invisible single numeric value.
getAutoBlockLength: The auto block length as a single
integer value.
getAutoBlockShape: The current auto block shape as a
single string.
setAutoBlockShape: The new auto block shape as an invisible
single string.
See also
defaultAutoGrid and family to create automatic grids to use for block processing of array-like objects.
blockApply and family for convenient block processing of an array-like object.
The makeCappedVolumeBox utility to make capped volume boxes.
Examples
getAutoBlockSize()
#> [1] 1e+08
getAutoBlockLength("double")
#> [1] 12500000
getAutoBlockLength("integer")
#> [1] 25000000
getAutoBlockLength("logical")
#> [1] 25000000
getAutoBlockLength("raw")
#> [1] 100000000
m <- matrix(runif(600), ncol=12)
setAutoBlockSize(140)
#> automatic block size set to 140 bytes (was 1e+08)
getAutoBlockLength(type(m))
#> [1] 17
defaultAutoGrid(m)
#> 13 x 3 RegularArrayGrid object on a 50 x 12 array:
#> [,1] [,2] [,3]
#> [1,] [1-4,1-4] [1-4,5-8] [1-4,9-12]
#> [2,] [5-8,1-4] [5-8,5-8] [5-8,9-12]
#> [3,] [9-12,1-4] [9-12,5-8] [9-12,9-12]
#> [4,] [13-16,1-4] [13-16,5-8] [13-16,9-12]
#> [5,] [17-20,1-4] [17-20,5-8] [17-20,9-12]
#> [6,] [21-24,1-4] [21-24,5-8] [21-24,9-12]
#> [7,] [25-28,1-4] [25-28,5-8] [25-28,9-12]
#> [8,] [29-32,1-4] [29-32,5-8] [29-32,9-12]
#> [9,] [33-36,1-4] [33-36,5-8] [33-36,9-12]
#> [10,] [37-40,1-4] [37-40,5-8] [37-40,9-12]
#> [11,] [41-44,1-4] [41-44,5-8] [41-44,9-12]
#> [12,] [45-48,1-4] [45-48,5-8] [45-48,9-12]
#> [13,] [49-50,1-4] [49-50,5-8] [49-50,9-12]
lengths(defaultAutoGrid(m))
#> [1] 16 16 16 16 16 16 16 16 16 16 16 16 8 16 16 16 16 16 16 16 16 16 16 16 16
#> [26] 8 16 16 16 16 16 16 16 16 16 16 16 16 8
dims(defaultAutoGrid(m))
#> [,1] [,2]
#> [1,] 4 4
#> [2,] 4 4
#> [3,] 4 4
#> [4,] 4 4
#> [5,] 4 4
#> [6,] 4 4
#> [7,] 4 4
#> [8,] 4 4
#> [9,] 4 4
#> [10,] 4 4
#> [11,] 4 4
#> [12,] 4 4
#> [13,] 2 4
#> [14,] 4 4
#> [15,] 4 4
#> [16,] 4 4
#> [17,] 4 4
#> [18,] 4 4
#> [19,] 4 4
#> [20,] 4 4
#> [21,] 4 4
#> [22,] 4 4
#> [23,] 4 4
#> [24,] 4 4
#> [25,] 4 4
#> [26,] 2 4
#> [27,] 4 4
#> [28,] 4 4
#> [29,] 4 4
#> [30,] 4 4
#> [31,] 4 4
#> [32,] 4 4
#> [33,] 4 4
#> [34,] 4 4
#> [35,] 4 4
#> [36,] 4 4
#> [37,] 4 4
#> [38,] 4 4
#> [39,] 2 4
getAutoBlockShape()
#> [1] "hypercube"
setAutoBlockShape("scale")
#> automatic block shape set to "scale" (was "hypercube")
defaultAutoGrid(m)
#> 7 x 6 RegularArrayGrid object on a 50 x 12 array:
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] [1-8,1-2] [1-8,3-4] [1-8,5-6] [1-8,7-8] [1-8,9-10] [1-8,11-12]
#> [2,] [9-16,1-2] [9-16,3-4] [9-16,5-6] [9-16,7-8] [9-16,9-10] [9-16,11-12]
#> [3,] [17-24,1-2] [17-24,3-4] [17-24,5-6] [17-24,7-8] [17-24,9-10] [17-24,11-12]
#> [4,] [25-32,1-2] [25-32,3-4] [25-32,5-6] [25-32,7-8] [25-32,9-10] [25-32,11-12]
#> [5,] [33-40,1-2] [33-40,3-4] [33-40,5-6] [33-40,7-8] [33-40,9-10] [33-40,11-12]
#> [6,] [41-48,1-2] [41-48,3-4] [41-48,5-6] [41-48,7-8] [41-48,9-10] [41-48,11-12]
#> [7,] [49-50,1-2] [49-50,3-4] [49-50,5-6] [49-50,7-8] [49-50,9-10] [49-50,11-12]
lengths(defaultAutoGrid(m))
#> [1] 16 16 16 16 16 16 4 16 16 16 16 16 16 4 16 16 16 16 16 16 4 16 16 16 16
#> [26] 16 16 4 16 16 16 16 16 16 4 16 16 16 16 16 16 4
dims(defaultAutoGrid(m))
#> [,1] [,2]
#> [1,] 8 2
#> [2,] 8 2
#> [3,] 8 2
#> [4,] 8 2
#> [5,] 8 2
#> [6,] 8 2
#> [7,] 2 2
#> [8,] 8 2
#> [9,] 8 2
#> [10,] 8 2
#> [11,] 8 2
#> [12,] 8 2
#> [13,] 8 2
#> [14,] 2 2
#> [15,] 8 2
#> [16,] 8 2
#> [17,] 8 2
#> [18,] 8 2
#> [19,] 8 2
#> [20,] 8 2
#> [21,] 2 2
#> [22,] 8 2
#> [23,] 8 2
#> [24,] 8 2
#> [25,] 8 2
#> [26,] 8 2
#> [27,] 8 2
#> [28,] 2 2
#> [29,] 8 2
#> [30,] 8 2
#> [31,] 8 2
#> [32,] 8 2
#> [33,] 8 2
#> [34,] 8 2
#> [35,] 2 2
#> [36,] 8 2
#> [37,] 8 2
#> [38,] 8 2
#> [39,] 8 2
#> [40,] 8 2
#> [41,] 8 2
#> [42,] 2 2
## Reset the auto block size and shape to factory settings:
setAutoBlockSize()
#> automatic block size set to 1e+08 bytes (was 140)
setAutoBlockShape()
#> automatic block shape set to "hypercube" (was "scale")