blockApply() and family

A family of convenience functions to walk on the blocks of an array-like object and process them.

Usage

## Main looping functions:

blockApply(x, FUN, ..., grid=NULL, as.sparse=FALSE,
           BPPARAM=getAutoBPPARAM(), verbose=NA)

blockReduce(FUN, x, init, ..., BREAKIF=NULL, grid=NULL, as.sparse=FALSE,
            verbose=NA)

## Lower-level looping functions:
gridApply(grid, FUN, ..., BPPARAM=getAutoBPPARAM(), verbose=NA)
gridReduce(FUN, grid, init, ..., BREAKIF=NULL, verbose=NA)

## Retrieve grid context for the current block/viewport:
effectiveGrid(envir=parent.frame(2))
currentBlockId(envir=parent.frame(2))
currentViewport(envir=parent.frame(2))

## Get/set automatic parallel back-end:
getAutoBPPARAM()
setAutoBPPARAM(BPPARAM=NULL)

## For testing/debugging callback functions:
set_grid_context(effective_grid, current_block_id, current_viewport=NULL,
                 envir=parent.frame(1))

Arguments

x

An array-like object, typically a DelayedArray object or derivative.

FUN

For blockApply and blockReduce, FUN is the callback function to apply to each block of data in x. More precisely, FUN will be called on each block of data in x defined by the grid used to walk on x.

IMPORTANT: If as.sparse is set to FALSE, all blocks will be passed to FUN as ordinary arrays. If it's set to TRUE, they will be passed as SparseArray objects. If it's set to NA, then is_sparse(x) determines how they will be passed to FUN.

For gridApply() and gridReduce(), FUN is the callback function to apply to each **viewport** in grid.

Beware that FUN must take at least **two** arguments for blockReduce() and gridReduce(). More precisely:

blockReduce() will perform init <- FUN(block, init, ...) on each block, so FUN must take at least arguments block and init.
gridReduce() will perform init <- FUN(viewport, init, ...) on each viewport, so FUN must take at least arguments viewport and init.

In both cases, the exact names of the two arguments don't matter. Also, FUN is expected to return a value of the same type as its 2nd argument (init).
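As a minimal sketch of the two-argument contract described above, the following accumulates a running total with blockReduce() and counts array elements with gridReduce(). The callback names sum_block and count_elts are illustrative only and not part of the package.

```r
library(DelayedArray)

m <- matrix(1:60, nrow=10)
m_grid <- defaultAutoGrid(m, block.length=16, block.shape="hypercube")

## blockReduce(): the callback receives the block first, then the
## accumulator, and returns the updated accumulator.
sum_block <- function(block, init) init + sum(block)
total <- blockReduce(sum_block, m, init=0, grid=m_grid)

## gridReduce(): the callback receives a viewport (not the block data),
## so here we only use its dimensions.
count_elts <- function(viewport, init) init + prod(dim(viewport))
n_elts <- gridReduce(count_elts, m_grid, init=0)
```

Both callbacks return a numeric value, matching the type of init=0.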

...

Additional arguments passed to FUN.

grid

The grid used for the walk, that is, an ArrayGrid object that defines the blocks (or viewports) to walk on.

For blockApply() and blockReduce() the supplied grid must be compatible with the geometry of x. If not specified, an automatic grid is used. By default defaultAutoGrid(x) is called to create an automatic grid. The automatic grid maker can be changed with setAutoGridMaker(). See ?setAutoGridMaker for more information.
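For illustration, a user-supplied grid might be built with RegularArrayGrid() and passed via the grid argument. This is only a sketch: the 5 x 3 spacing is an arbitrary choice, and the variable names are hypothetical.

```r
library(DelayedArray)

m <- matrix(1:60, nrow=10)

## A regular grid of 5 x 3 blocks. Its reference dimensions are taken
## from dim(m) so that the grid is compatible with the geometry of m.
my_grid <- RegularArrayGrid(dim(m), spacings=c(5L, 3L))

## One result per block; each block should be 5 x 3.
block_dims <- blockApply(m, dim, grid=my_grid)
```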

as.sparse

Passed to the internal calls to read_block. See ?read_block in the S4Arrays package for more information.

BPPARAM

Either NULL, in which case blocks are processed sequentially, or a BiocParallelParam instance (from the BiocParallel package), in which case they are processed in parallel. The specific BiocParallelParam instance determines the parallel back-end to use. See ?BiocParallelParam in the BiocParallel package for more information about parallel back-ends.

verbose

Whether block processing progress should be displayed or not. If set to NA (the default), verbosity is controlled by DelayedArray:::get_verbose_block_processing(). Setting verbose to TRUE or FALSE overrides this.

init

The value to pass to the first call to FUN(block, init) (or FUN(viewport, init)) when blockReduce() (or gridReduce()) starts the walk. Note that blockReduce() and gridReduce() always operate sequentially.

BREAKIF

An optional callback function that detects a break condition. Must return TRUE or FALSE. At each iteration, blockReduce() and gridReduce() call BREAKIF on the updated value of init, that is, on the result of init <- FUN(block, init) for blockReduce() or init <- FUN(viewport, init) for gridReduce(), and exit the walk if BREAKIF(init) returns TRUE.
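A small sketch of a break condition on a numeric accumulator: the walk stops as soon as the running total exceeds a threshold. The threshold of 500 and the callback names are arbitrary choices made for this illustration.

```r
library(DelayedArray)

m <- matrix(1:60, nrow=10)
m_grid <- defaultAutoGrid(m, block.length=16, block.shape="hypercube")

## Accumulate the sum of the blocks visited so far.
running_sum <- function(block, init) init + sum(block)

## Stop walking once the running total exceeds 500.
over_500 <- function(init) init > 500

partial <- blockReduce(running_sum, m, init=0, BREAKIF=over_500,
                       grid=m_grid)
```

With this grid the first two blocks sum to 280 and 344, so the walk stops after the second block and the remaining four are never read.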

envir

Do not use (unless you know what you are doing).

effective_grid, current_block_id, current_viewport

See Details below.

Details

effectiveGrid(), currentBlockId(), and currentViewport() return the "grid context" for the block/viewport being currently processed. By "grid context" we mean:

The effective grid, that is, the user-supplied grid or defaultAutoGrid(x) if the user didn't supply any grid.
The current block id (a.k.a. block rank).
The current viewport, that is, the ArrayViewport object describing the position of the current block w.r.t. the effective grid.

Note that effectiveGrid(), currentBlockId(), and currentViewport() can only be called (with no arguments) from **within** the callback functions FUN and/or BREAKIF passed to blockApply() and family.

If you need to test or debug your callback function as a standalone function, set an arbitrary effective grid, current block id, and current viewport by calling

    set_grid_context(effective_grid, current_block_id, current_viewport)

**right before** calling the callback function.
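The following sketch shows one way this might look in practice. It assumes the code runs at the top level of a session or script, so that the default envir values line up, and it reads the block by hand with read_block(). The callback name add_block_id is hypothetical.

```r
library(DelayedArray)

m <- matrix(1:60, nrow=10)
m_grid <- defaultAutoGrid(m, block.length=16, block.shape="hypercube")

## A callback that relies on the grid context.
add_block_id <- function(block) block + currentBlockId() * 1e3

## Pretend that block 2 of m_grid is being processed, then call the
## callback directly on that block.
set_grid_context(m_grid, 2L)
block <- read_block(m, m_grid[[2L]])
res <- add_block_id(block)
```

Here res should match the second element returned by blockApply(m, add_block_id, grid=m_grid).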

Value

For blockApply() and gridApply(), a list with one list element per block/viewport visited.

For blockReduce() and gridReduce(), the result of the last call to FUN.

For effectiveGrid(), the grid (ArrayGrid object) being effectively used.

For currentBlockId(), the id (a.k.a. rank) of the current block.

For currentViewport(), the viewport (ArrayViewport object) of the current block.

Examples

m <- matrix(1:60, nrow=10)
m_grid <- defaultAutoGrid(m, block.length=16, block.shape="hypercube")

## ---------------------------------------------------------------------
## blockApply()
## ---------------------------------------------------------------------
blockApply(m, identity, grid=m_grid)
#> [[1]]
#>      [,1] [,2] [,3] [,4]
#> [1,]    1   11   21   31
#> [2,]    2   12   22   32
#> [3,]    3   13   23   33
#> [4,]    4   14   24   34
#> 
#> [[2]]
#>      [,1] [,2] [,3] [,4]
#> [1,]    5   15   25   35
#> [2,]    6   16   26   36
#> [3,]    7   17   27   37
#> [4,]    8   18   28   38
#> 
#> [[3]]
#>      [,1] [,2] [,3] [,4]
#> [1,]    9   19   29   39
#> [2,]   10   20   30   40
#> 
#> [[4]]
#>      [,1] [,2]
#> [1,]   41   51
#> [2,]   42   52
#> [3,]   43   53
#> [4,]   44   54
#> 
#> [[5]]
#>      [,1] [,2]
#> [1,]   45   55
#> [2,]   46   56
#> [3,]   47   57
#> [4,]   48   58
#> 
#> [[6]]
#>      [,1] [,2]
#> [1,]   49   59
#> [2,]   50   60
#> 
blockApply(m, sum, grid=m_grid)
#> [[1]]
#> [1] 280
#> 
#> [[2]]
#> [1] 344
#> 
#> [[3]]
#> [1] 196
#> 
#> [[4]]
#> [1] 380
#> 
#> [[5]]
#> [1] 412
#> 
#> [[6]]
#> [1] 218
#> 

blockApply(m, function(block) {block + currentBlockId()*1e3}, grid=m_grid)
#> [[1]]
#>      [,1] [,2] [,3] [,4]
#> [1,] 1001 1011 1021 1031
#> [2,] 1002 1012 1022 1032
#> [3,] 1003 1013 1023 1033
#> [4,] 1004 1014 1024 1034
#> 
#> [[2]]
#>      [,1] [,2] [,3] [,4]
#> [1,] 2005 2015 2025 2035
#> [2,] 2006 2016 2026 2036
#> [3,] 2007 2017 2027 2037
#> [4,] 2008 2018 2028 2038
#> 
#> [[3]]
#>      [,1] [,2] [,3] [,4]
#> [1,] 3009 3019 3029 3039
#> [2,] 3010 3020 3030 3040
#> 
#> [[4]]
#>      [,1] [,2]
#> [1,] 4041 4051
#> [2,] 4042 4052
#> [3,] 4043 4053
#> [4,] 4044 4054
#> 
#> [[5]]
#>      [,1] [,2]
#> [1,] 5045 5055
#> [2,] 5046 5056
#> [3,] 5047 5057
#> [4,] 5048 5058
#> 
#> [[6]]
#>      [,1] [,2]
#> [1,] 6049 6059
#> [2,] 6050 6060
#> 
blockApply(m, function(block) currentViewport(), grid=m_grid)
#> [[1]]
#> 4 x 4 ArrayViewport object on a 10 x 6 array: [1-4,1-4]
#> 
#> [[2]]
#> 4 x 4 ArrayViewport object on a 10 x 6 array: [5-8,1-4]
#> 
#> [[3]]
#> 2 x 4 ArrayViewport object on a 10 x 6 array: [9-10,1-4]
#> 
#> [[4]]
#> 4 x 2 ArrayViewport object on a 10 x 6 array: [1-4,5-6]
#> 
#> [[5]]
#> 4 x 2 ArrayViewport object on a 10 x 6 array: [5-8,5-6]
#> 
#> [[6]]
#> 2 x 2 ArrayViewport object on a 10 x 6 array: [9-10,5-6]
#> 
blockApply(m, dim, grid=m_grid)
#> [[1]]
#> [1] 4 4
#> 
#> [[2]]
#> [1] 4 4
#> 
#> [[3]]
#> [1] 2 4
#> 
#> [[4]]
#> [1] 4 2
#> 
#> [[5]]
#> [1] 4 2
#> 
#> [[6]]
#> [1] 2 2
#> 

## The grid does not need to be regularly spaced:
a <- array(runif(8000), dim=c(25, 40, 8))
a_tickmarks <- list(c(7L, 15L, 25L), c(14L, 22L, 40L), c(2L, 8L))
a_grid <- ArbitraryArrayGrid(a_tickmarks)
a_grid
#> 3 x 3 x 2  ArbitraryArrayGrid object on a 25 x 40 x 8 array:
#> , , 1
#> 
#>      [,1]             [,2]              [,3]             
#> [1,]   [1-7,1-14,1-2]   [1-7,15-22,1-2]   [1-7,23-40,1-2]
#> [2,]  [8-15,1-14,1-2]  [8-15,15-22,1-2]  [8-15,23-40,1-2]
#> [3,] [16-25,1-14,1-2] [16-25,15-22,1-2] [16-25,23-40,1-2]
#> 
#> , , 2
#> 
#>      [,1]             [,2]              [,3]             
#> [1,]   [1-7,1-14,3-8]   [1-7,15-22,3-8]   [1-7,23-40,3-8]
#> [2,]  [8-15,1-14,3-8]  [8-15,15-22,3-8]  [8-15,23-40,3-8]
#> [3,] [16-25,1-14,3-8] [16-25,15-22,3-8] [16-25,23-40,3-8]
#> 
blockApply(a, function(block) sum(log(block + 0.5)), grid=a_grid)
#> [[1]]
#> [1] -10.44024
#> 
#> [[2]]
#> [1] -8.743716
#> 
#> [[3]]
#> [1] -4.505172
#> 
#> [[4]]
#> [1] -2.552188
#> 
#> [[5]]
#> [1] -2.872688
#> 
#> [[6]]
#> [1] -5.912852
#> 
#> [[7]]
#> [1] -5.315335
#> 
#> [[8]]
#> [1] -9.333795
#> 
#> [[9]]
#> [1] -11.17658
#> 
#> [[10]]
#> [1] -31.44718
#> 
#> [[11]]
#> [1] -49.80399
#> 
#> [[12]]
#> [1] -43.43242
#> 
#> [[13]]
#> [1] -14.80972
#> 
#> [[14]]
#> [1] -16.44982
#> 
#> [[15]]
#> [1] -23.90359
#> 
#> [[16]]
#> [1] -29.13078
#> 
#> [[17]]
#> [1] -35.85019
#> 
#> [[18]]
#> [1] -42.52521
#> 

## See block processing in action:
blockApply(m, function(block) sum(log(block + 0.5)), grid=m_grid,
           verbose=TRUE)
#> / reading and realizing block 1/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 2/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 3/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 4/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 5/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 6/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> [[1]]
#> [1] 40.84448
#> 
#> [[2]]
#> [1] 46.67745
#> 
#> [[3]]
#> [1] 24.77323
#> 
#> [[4]]
#> [1] 30.92372
#> 
#> [[5]]
#> [1] 31.57089
#> 
#> [[6]]
#> [1] 16.01257
#> 

## Use parallel evaluation:
library(BiocParallel)
if (.Platform$OS.type != "windows") {
    BPPARAM <- MulticoreParam(workers=4)
} else {
    ## MulticoreParam() is not supported on Windows so we use
    ## SnowParam() on this platform.
    BPPARAM <- SnowParam(4)
}
blockApply(m, function(block) sum(log(block + 0.5)), grid=m_grid,
           BPPARAM=BPPARAM, verbose=TRUE)
#> [[1]]
#> [1] 40.84448
#> 
#> [[2]]
#> [1] 46.67745
#> 
#> [[3]]
#> [1] 24.77323
#> 
#> [[4]]
#> [1] 30.92372
#> 
#> [[5]]
#> [1] 31.57089
#> 
#> [[6]]
#> [1] 16.01257
#> 
## Note that blocks can be visited in any order!

## ---------------------------------------------------------------------
## blockReduce()
## ---------------------------------------------------------------------
FUN <- function(block, init) anyNA(block) || init
blockReduce(FUN, m, init=FALSE, grid=m_grid, verbose=TRUE)
#> / reading and realizing block 1/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 2/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 3/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 4/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 5/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 6/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> [1] FALSE

m[10, 1] <- NA
blockReduce(FUN, m, init=FALSE, grid=m_grid, verbose=TRUE)
#> / reading and realizing block 1/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 2/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 3/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 4/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 5/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 6/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> [1] TRUE

## With early bailout:
blockReduce(FUN, m, init=FALSE, BREAKIF=identity, grid=m_grid,
            verbose=TRUE)
#> / reading and realizing block 1/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 2/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> / reading and realizing block 3/6 ... 
#> ok
#> \ processing it ... 
#> ok
#> BREAK condition encountered
#> [1] TRUE

## Note that this is how the anyNA() method for DelayedArray objects is
## implemented.
