blockApply() and family
A family of convenience functions to walk on the blocks of an array-like object and process them.
Usage
## Main looping functions:
blockApply(x, FUN, ..., grid=NULL, as.sparse=FALSE,
           BPPARAM=getAutoBPPARAM(), verbose=NA)
blockReduce(FUN, x, init, ..., BREAKIF=NULL, grid=NULL, as.sparse=FALSE,
            verbose=NA)
## Lower-level looping functions:
gridApply(grid, FUN, ..., BPPARAM=getAutoBPPARAM(), verbose=NA)
gridReduce(FUN, grid, init, ..., BREAKIF=NULL, verbose=NA)
## Retrieve grid context for the current block/viewport:
effectiveGrid(envir=parent.frame(2))
currentBlockId(envir=parent.frame(2))
currentViewport(envir=parent.frame(2))
## Get/set automatic parallel back-end:
getAutoBPPARAM()
setAutoBPPARAM(BPPARAM=NULL)
## For testing/debugging callback functions:
set_grid_context(effective_grid, current_block_id, current_viewport=NULL,
                 envir=parent.frame(1))
Arguments
- x
An array-like object, typically a DelayedArray object or derivative.
- FUN
For blockApply() and blockReduce(), FUN is the callback function to apply to each block of data in x. More precisely, FUN will be called on each block of data in x defined by the grid used to walk on x.
IMPORTANT: If as.sparse is set to FALSE, all blocks will be passed to FUN as ordinary arrays. If it's set to TRUE, they will be passed as SparseArray objects. If it's set to NA, then is_sparse(x) determines how they will be passed to FUN.
For gridApply() and gridReduce(), FUN is the callback function to apply to each **viewport** in grid.
Beware that FUN must take at least **two** arguments for blockReduce() and gridReduce(). More precisely:
  - blockReduce() will perform init <- FUN(block, init, ...) on each block, so FUN must take at least arguments block and init.
  - gridReduce() will perform init <- FUN(viewport, init, ...) on each viewport, so FUN must take at least arguments viewport and init.
In both cases, the exact names of the two arguments don't really matter. Also, FUN is expected to return a value of the same type as its 2nd argument (init).
- ...
Additional arguments passed to FUN.
- grid
The grid used for the walk, that is, an ArrayGrid object that defines the blocks (or viewports) to walk on.
For blockApply() and blockReduce(), the supplied grid must be compatible with the geometry of x. If not specified, an automatic grid is used. By default, defaultAutoGrid(x) is called to create an automatic grid. The automatic grid maker can be changed with setAutoGridMaker(). See ?setAutoGridMaker for more information.
- as.sparse
Passed to the internal calls to read_block. See ?read_block in the S4Arrays package for more information.
- BPPARAM
Either NULL, in which case blocks are processed sequentially, or a BiocParallelParam instance (from the BiocParallel package), in which case they are processed in parallel. The specific BiocParallelParam instance determines the parallel back-end to use. See ?BiocParallelParam in the BiocParallel package for more information about parallel back-ends.
- verbose
Whether block processing progress should be displayed or not. If set to NA (the default), verbosity is controlled by DelayedArray:::get_verbose_block_processing(). Setting verbose to TRUE or FALSE overrides this.
- init
The value to pass to the first call to FUN(block, init) (or FUN(viewport, init)) when blockReduce() (or gridReduce()) starts the walk. Note that blockReduce() and gridReduce() always operate sequentially.
- BREAKIF
An optional callback function that detects a break condition. Must return TRUE or FALSE. At each iteration, blockReduce() calls it on the result of init <- FUN(block, init), and gridReduce() calls it on the result of init <- FUN(viewport, init). The walk exits as soon as BREAKIF(init) returns TRUE.
- envir
Do not use (unless you know what you are doing).
- effective_grid, current_block_id, current_viewport
See Details below.
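The reduce semantics described for FUN, init, and BREAKIF can be pictured with a small base R loop. This is only a conceptual sketch: reduce_over_blocks() is a hypothetical helper written for illustration, not a DelayedArray function, and an ordinary list stands in for the blocks that a grid would define on x.

```r
## Conceptual sketch only: reduce_over_blocks() is a hypothetical helper,
## not a DelayedArray function. 'blocks' is an ordinary list standing in
## for the blocks that a grid would define on 'x'.
reduce_over_blocks <- function(FUN, blocks, init, ..., BREAKIF=NULL)
{
    for (block in blocks) {
        ## Same update rule as documented: init <- FUN(block, init, ...)
        init <- FUN(block, init, ...)
        if (!is.null(BREAKIF) && isTRUE(BREAKIF(init)))
            break
    }
    init
}

blocks <- list(1:4, c(5L, NA, 7L), 8:10)

## Callback that also counts how many blocks it was called on:
visited <- 0L
has_na <- function(block, init) {
    visited <<- visited + 1L
    anyNA(block) || init
}

## Without BREAKIF every block is visited:
full_walk <- reduce_over_blocks(has_na, blocks, init=FALSE)
visited_full <- visited

## With BREAKIF=identity the walk stops once the running result is TRUE:
visited <- 0L
early_exit <- reduce_over_blocks(has_na, blocks, init=FALSE,
                                 BREAKIF=identity)
visited_early <- visited
```

Both walks return the same value; the only difference is how many blocks the callback sees, which is the point of supplying BREAKIF when the result can be decided early.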
Details
effectiveGrid(), currentBlockId(), and currentViewport()
return the "grid context" for the block/viewport being currently processed.
By "grid context" we mean:
- The effective grid, that is, the user-supplied grid or defaultAutoGrid(x) if the user didn't supply any grid.
- The current block id (a.k.a. block rank).
- The current viewport, that is, the ArrayViewport object describing the position of the current block w.r.t. the effective grid.
Note that effectiveGrid(), currentBlockId(), and
currentViewport() can only be called (with no arguments) from
**within** the callback functions FUN and/or BREAKIF
passed to blockApply() and family.
If you need to be able to test/debug your callback function as a standalone function, set an arbitrary effective grid, current block id, and current viewport by calling set_grid_context(effective_grid, current_block_id, current_viewport) **right before** calling the callback function.
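As an illustration of the standalone testing workflow, the following sketch calls a callback directly on one block after setting the grid context by hand. It assumes the DelayedArray package is attached (which also makes read_block() from S4Arrays available) and that everything is run at top level, so the default envir arguments resolve to the same environment. The choice of block 2 is arbitrary.

```r
library(DelayedArray)

m <- matrix(1:60, nrow=10)
m_grid <- defaultAutoGrid(m, block.length=16, block.shape="hypercube")

## A callback that relies on the grid context:
FUN <- function(block) block + currentBlockId() * 1e3

## Extract the block we want to test on:
block2 <- read_block(m, m_grid[[2L]])

## Pretend that block 2 of 'm_grid' is being processed, then call the
## callback directly, outside of blockApply():
set_grid_context(m_grid, 2L)
res <- FUN(block2)
dim(res)
```

Since current_viewport is left to its default here, only the effective grid and the block id are set explicitly.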
Value
For blockApply() and gridApply(), a list with one
list element per block/viewport visited.
For blockReduce() and gridReduce(), the result of
the last call to FUN.
For effectiveGrid(), the grid (ArrayGrid object)
being effectively used.
For currentBlockId(), the id (a.k.a. rank) of the current block.
For currentViewport(), the viewport (ArrayViewport
object) of the current block.
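To make the difference between the two kinds of return value concrete, here is a minimal sketch, assuming the DelayedArray package is attached. It computes the same total twice: once by combining the list returned by blockApply(), and once by letting blockReduce() carry the running total. The object names are illustrative only.

```r
library(DelayedArray)

m <- matrix(1:60, nrow=10)
m_grid <- defaultAutoGrid(m, block.length=16, block.shape="hypercube")

## blockApply() returns a list with one element per block, so the
## per-block results must be combined by the caller:
block_sums <- blockApply(m, sum, grid=m_grid)
total_apply <- sum(unlist(block_sums))

## blockReduce() returns the result of the last call to FUN, that is,
## the final value of the running total:
add_block_sum <- function(block, init) init + sum(block)
total_reduce <- blockReduce(add_block_sum, m, init=0, grid=m_grid)

## Both agree with the sum computed on the whole matrix:
c(total_apply, total_reduce, sum(m))
```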
See also
- defaultAutoGrid and family to create automatic grids to use for block processing of array-like objects.
- ArrayGrid in the S4Arrays package for the formal representation of grids and viewports.
- read_block and write_block in the S4Arrays package.
- SparseArray objects implemented in the SparseArray package.
- MulticoreParam, SnowParam, and bpparam, from the BiocParallel package.
- DelayedArray objects.
Examples
m <- matrix(1:60, nrow=10)
m_grid <- defaultAutoGrid(m, block.length=16, block.shape="hypercube")
## ---------------------------------------------------------------------
## blockApply()
## ---------------------------------------------------------------------
blockApply(m, identity, grid=m_grid)
#> [[1]]
#> [,1] [,2] [,3] [,4]
#> [1,] 1 11 21 31
#> [2,] 2 12 22 32
#> [3,] 3 13 23 33
#> [4,] 4 14 24 34
#>
#> [[2]]
#> [,1] [,2] [,3] [,4]
#> [1,] 5 15 25 35
#> [2,] 6 16 26 36
#> [3,] 7 17 27 37
#> [4,] 8 18 28 38
#>
#> [[3]]
#> [,1] [,2] [,3] [,4]
#> [1,] 9 19 29 39
#> [2,] 10 20 30 40
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 41 51
#> [2,] 42 52
#> [3,] 43 53
#> [4,] 44 54
#>
#> [[5]]
#> [,1] [,2]
#> [1,] 45 55
#> [2,] 46 56
#> [3,] 47 57
#> [4,] 48 58
#>
#> [[6]]
#> [,1] [,2]
#> [1,] 49 59
#> [2,] 50 60
#>
blockApply(m, sum, grid=m_grid)
#> [[1]]
#> [1] 280
#>
#> [[2]]
#> [1] 344
#>
#> [[3]]
#> [1] 196
#>
#> [[4]]
#> [1] 380
#>
#> [[5]]
#> [1] 412
#>
#> [[6]]
#> [1] 218
#>
blockApply(m, function(block) {block + currentBlockId()*1e3}, grid=m_grid)
#> [[1]]
#> [,1] [,2] [,3] [,4]
#> [1,] 1001 1011 1021 1031
#> [2,] 1002 1012 1022 1032
#> [3,] 1003 1013 1023 1033
#> [4,] 1004 1014 1024 1034
#>
#> [[2]]
#> [,1] [,2] [,3] [,4]
#> [1,] 2005 2015 2025 2035
#> [2,] 2006 2016 2026 2036
#> [3,] 2007 2017 2027 2037
#> [4,] 2008 2018 2028 2038
#>
#> [[3]]
#> [,1] [,2] [,3] [,4]
#> [1,] 3009 3019 3029 3039
#> [2,] 3010 3020 3030 3040
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 4041 4051
#> [2,] 4042 4052
#> [3,] 4043 4053
#> [4,] 4044 4054
#>
#> [[5]]
#> [,1] [,2]
#> [1,] 5045 5055
#> [2,] 5046 5056
#> [3,] 5047 5057
#> [4,] 5048 5058
#>
#> [[6]]
#> [,1] [,2]
#> [1,] 6049 6059
#> [2,] 6050 6060
#>
blockApply(m, function(block) currentViewport(), grid=m_grid)
#> [[1]]
#> 4 x 4 ArrayViewport object on a 10 x 6 array: [1-4,1-4]
#>
#> [[2]]
#> 4 x 4 ArrayViewport object on a 10 x 6 array: [5-8,1-4]
#>
#> [[3]]
#> 2 x 4 ArrayViewport object on a 10 x 6 array: [9-10,1-4]
#>
#> [[4]]
#> 4 x 2 ArrayViewport object on a 10 x 6 array: [1-4,5-6]
#>
#> [[5]]
#> 4 x 2 ArrayViewport object on a 10 x 6 array: [5-8,5-6]
#>
#> [[6]]
#> 2 x 2 ArrayViewport object on a 10 x 6 array: [9-10,5-6]
#>
blockApply(m, dim, grid=m_grid)
#> [[1]]
#> [1] 4 4
#>
#> [[2]]
#> [1] 4 4
#>
#> [[3]]
#> [1] 2 4
#>
#> [[4]]
#> [1] 4 2
#>
#> [[5]]
#> [1] 4 2
#>
#> [[6]]
#> [1] 2 2
#>
## The grid does not need to be regularly spaced:
a <- array(runif(8000), dim=c(25, 40, 8))
a_tickmarks <- list(c(7L, 15L, 25L), c(14L, 22L, 40L), c(2L, 8L))
a_grid <- ArbitraryArrayGrid(a_tickmarks)
a_grid
#> 3 x 3 x 2 ArbitraryArrayGrid object on a 25 x 40 x 8 array:
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] [1-7,1-14,1-2] [1-7,15-22,1-2] [1-7,23-40,1-2]
#> [2,] [8-15,1-14,1-2] [8-15,15-22,1-2] [8-15,23-40,1-2]
#> [3,] [16-25,1-14,1-2] [16-25,15-22,1-2] [16-25,23-40,1-2]
#>
#> , , 2
#>
#> [,1] [,2] [,3]
#> [1,] [1-7,1-14,3-8] [1-7,15-22,3-8] [1-7,23-40,3-8]
#> [2,] [8-15,1-14,3-8] [8-15,15-22,3-8] [8-15,23-40,3-8]
#> [3,] [16-25,1-14,3-8] [16-25,15-22,3-8] [16-25,23-40,3-8]
#>
blockApply(a, function(block) sum(log(block + 0.5)), grid=a_grid)
#> [[1]]
#> [1] -10.44024
#>
#> [[2]]
#> [1] -8.743716
#>
#> [[3]]
#> [1] -4.505172
#>
#> [[4]]
#> [1] -2.552188
#>
#> [[5]]
#> [1] -2.872688
#>
#> [[6]]
#> [1] -5.912852
#>
#> [[7]]
#> [1] -5.315335
#>
#> [[8]]
#> [1] -9.333795
#>
#> [[9]]
#> [1] -11.17658
#>
#> [[10]]
#> [1] -31.44718
#>
#> [[11]]
#> [1] -49.80399
#>
#> [[12]]
#> [1] -43.43242
#>
#> [[13]]
#> [1] -14.80972
#>
#> [[14]]
#> [1] -16.44982
#>
#> [[15]]
#> [1] -23.90359
#>
#> [[16]]
#> [1] -29.13078
#>
#> [[17]]
#> [1] -35.85019
#>
#> [[18]]
#> [1] -42.52521
#>
## See block processing in action:
blockApply(m, function(block) sum(log(block + 0.5)), grid=m_grid,
           verbose=TRUE)
#> / reading and realizing block 1/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 2/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 3/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 4/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 5/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 6/6 ...
#> ok
#> \ processing it ...
#> ok
#> [[1]]
#> [1] 40.84448
#>
#> [[2]]
#> [1] 46.67745
#>
#> [[3]]
#> [1] 24.77323
#>
#> [[4]]
#> [1] 30.92372
#>
#> [[5]]
#> [1] 31.57089
#>
#> [[6]]
#> [1] 16.01257
#>
## Use parallel evaluation:
library(BiocParallel)
if (.Platform$OS.type != "windows") {
    BPPARAM <- MulticoreParam(workers=4)
} else {
    ## MulticoreParam() is not supported on Windows so we use
    ## SnowParam() on this platform.
    BPPARAM <- SnowParam(4)
}
blockApply(m, function(block) sum(log(block + 0.5)), grid=m_grid,
           BPPARAM=BPPARAM, verbose=TRUE)
#> [[1]]
#> [1] 40.84448
#>
#> [[2]]
#> [1] 46.67745
#>
#> [[3]]
#> [1] 24.77323
#>
#> [[4]]
#> [1] 30.92372
#>
#> [[5]]
#> [1] 31.57089
#>
#> [[6]]
#> [1] 16.01257
#>
## Note that blocks can be visited in any order!
## ---------------------------------------------------------------------
## blockReduce()
## ---------------------------------------------------------------------
FUN <- function(block, init) anyNA(block) || init
blockReduce(FUN, m, init=FALSE, grid=m_grid, verbose=TRUE)
#> / reading and realizing block 1/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 2/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 3/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 4/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 5/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 6/6 ...
#> ok
#> \ processing it ...
#> ok
#> [1] FALSE
m[10, 1] <- NA
blockReduce(FUN, m, init=FALSE, grid=m_grid, verbose=TRUE)
#> / reading and realizing block 1/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 2/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 3/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 4/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 5/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 6/6 ...
#> ok
#> \ processing it ...
#> ok
#> [1] TRUE
## With early bailout:
blockReduce(FUN, m, init=FALSE, BREAKIF=identity, grid=m_grid,
            verbose=TRUE)
#> / reading and realizing block 1/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 2/6 ...
#> ok
#> \ processing it ...
#> ok
#> / reading and realizing block 3/6 ...
#> ok
#> \ processing it ...
#> ok
#> BREAK condition encountered
#> [1] TRUE
## Note that this is how the anyNA() method for DelayedArray objects is
## implemented.
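The examples above do not exercise the lower-level gridApply() and gridReduce(). The sketch below is one possible way to use them, assuming the DelayedArray package is attached. The key difference is that FUN receives a viewport rather than a block of data, so the callback reads the data itself with read_block(). A fresh matrix (m2) is used so that the result does not depend on the NA inserted into m earlier. The names m2, m2_grid, and the callbacks are illustrative only.

```r
library(DelayedArray)

m2 <- matrix(1:60, nrow=10)
m2_grid <- defaultAutoGrid(m2, block.length=16, block.shape="hypercube")

## gridApply() calls FUN on each viewport, not on the data:
viewport_dims <- gridApply(m2_grid, function(viewport) dim(viewport))

## The callback can load the corresponding block explicitly:
sum_viewport <- function(viewport) sum(read_block(m2, viewport))
viewport_sums <- gridApply(m2_grid, sum_viewport)

## gridReduce() performs init <- FUN(viewport, init) on each viewport:
add_viewport_sum <- function(viewport, init)
    init + sum(read_block(m2, viewport))
grid_total <- gridReduce(add_viewport_sum, m2_grid, init=0)
```

Working at the viewport level is useful when the callback needs to decide for itself what to read, for example from more than one array sharing the same grid.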