Macros

TensorCast.@cast (Macro)
@cast Z[i,j,...] := f(A[i,j,...], B[j,k,...])  options

Macro for broadcasting, reshaping, and slicing of arrays in index notation. Understands the following things:

  • A[i,j,k] is a three-tensor with these indices.
  • B[(i,j),k] is the same thing, reshaped to a matrix. Its first axis (the bracket) is indexed by n = i + (j-1) * N where i ∈ 1:N. This may also be written B[i\j,k] or B[i⊗j,k].
  • C[k][i,j] is a vector of matrices: on the left it is created by slicing, while on the right it implies glueing the slices into a 3-tensor A[i,j,k].
  • D[j,k]{i} is an ordinary matrix of SVectors, which may be reinterpreted from A[i,j,k].
  • E[i,_,k] has two nontrivial dimensions, and size(E,2)==1. On the right hand side (or when writing to an existing array) you may also write E[i,3,k] meaning view(E, :,3,:), or E[i,$c,k] to use a variable c.
  • f(x)[i,j,k] is allowed; f(x) must return a 3-tensor (and will be evaluated only once).
  • g(H[:,k])[i,j] is a generalised mapslices, with g mapping columns of H to matrices, which are glued into a 3-tensor A[i,j,k].
  • h(I[i], J[j])[k] expects an h which maps two scalars to a vector, which gets broadcasted h.(I,J'), then glued to make a 3-tensor.
  • K[i,j]' conjugates each element, equivalent to K'[j,i] which is the conjugate-transpose of the matrix.
  • M[i,i] means diag(M)[i], but only for matrices: N[i,i,k] is an error.
  • P[i,i'] is normalised to P[i,i′] with unicode \prime.
  • R[i,-j,k] means roughly reverse(R, dims=2), and Q[i,~j,k] similarly applies shuffle along that dimension.
  • S[i,T[j,k]] is the 3-tensor S[:,T] created by indexing a matrix S with T, whose integer entries must satisfy all(1 .<= T .<= size(S,2)).

The left and right hand sides must have all the same indices, and the only repeated index allowed is M[i,i], which is a diagonal not a trace. See @reduce and @matmul for related macros which can sum over things.
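
For instance, a minimal sketch combining the reshaping and slicing notations above (the array names are made up, and using TensorCast is assumed):

A = rand(2,3,4)                  # a 3-tensor A[i,j,k]

@cast B[(i,j),k] := A[i,j,k]     # reshaped to a 6×4 matrix
@cast C[k][i,j] := A[i,j,k]      # sliced into a vector of 4 matrices, each 2×3
@cast A2[i,j,k] := C[k][i,j]     # the slices glued back into a 3-tensor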

If a function of one or more tensors appears on the right hand side, then this represents a broadcasting operation, and the necessary re-orientations of axes are automatically inserted.
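
For example (a sketch with hypothetical vectors x and y):

x, y = rand(3), rand(4)

@cast Z[i,j] := x[i] + 10 * y[j]   # roughly Z = x .+ 10 .* transpose(y)
size(Z) == (3,4)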

The following actions are possible:

  • = writes into an existing array, overwriting its contents, while += adds (precisely Z .= Z .+ ...) and *= multiplies.
  • := creates a new array. This need not be named: Z = @cast [i,j,k] := ... is allowed.
  • |= insists that this is a copy, not a view.

Re-ordering of indices Z[k,j,i] is done lazily with PermutedDimsArray(A, ...). Reversing of an axis F[i,-j,k] is also done lazily, by Reverse{2}(F) which makes a view. Using |= (or broadcasting) will produce a simple Array.
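
A short sketch of these assignment forms (hypothetical arrays):

A = rand(3,4)

@cast B[j,i] := A[i,j]    # := makes a new object, here a lazy re-ordered view
@cast B2[j,i] |= A[i,j]   # |= forces a copy, an ordinary Matrix

C = zeros(4,3)
@cast C[j,i] = A[i,j]     # = overwrites the existing array C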

Options can be specified at the end; if there are several, they should be separated by commas (forming options::Tuple):

  • i:3 supplies the range of index i. Variables and functions like j:Nj, k:length(K) are allowed.
  • assert will turn on explicit dimension checks of the input. (Providing any ranges will also turn these on.)
  • cat will glue slices by things like hcat(A...) instead of the default reduce(hcat, A), and lazy will instead make a LazyStack.Stacked container.
  • nolazy disables PermutedDimsArray and Reverse in favour of permutedims and reverse, and Diagonal in favour of diagm for Z[i,i] output.
  • strided will place @strided in front of broadcasting operations, and use @strided permutedims(A, ...) instead of PermutedDimsArray(A, ...). For this you need using Strided to load that package.
  • avx will place @avx in front of broadcasting operations, which needs using LoopVectorization to load that package.

To create static slices D[k]{i,j} you should give all slice dimensions explicitly. You may write D[k]{i:2,j:2} to specify Size(2,2) slices. They are made most cleanly from the first indices of the input, i.e. this D from A[i,j,k]. The notation A{:,:,k} will only work in this order, and writing A{:2,:2,k} provides the sizes.
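
A minimal sketch of a static slice, assuming StaticArrays is loaded alongside TensorCast:

A = rand(2,2,5)

@cast D[k]{i:2,j:2} := A[i,j,k]   # D is a Vector of 5 SMatrix{2,2} slices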

TensorCast.@reduce (Macro)
@reduce A[i] := sum(j,k) B[i,j,k]             # A = vec(sum(B, dims=(2,3)))
@reduce A[i] := prod(j) B[i] + ε * C[i,j]     # A = vec(prod(B .+ ε .* C, dims=2))
@reduce A[i] = sum(j) exp( C[i,j] / D[j] )    # sum!(A, exp.(C ./ D') )

Tensor reduction macro:

  • The reduction function can be anything which works like sum(B, dims=(1,3)), for instance prod and maximum and Statistics.mean.
  • In-place operations Z[j] = sum(...) will construct the in-place (banged) version of the given function's name, which must work like sum!(Z, A).
  • The tensors can be anything that @cast understands, including gluing of slices B[i,k][j] and reshaping B[i\j,k]. See ? @cast for the complete list.
  • If there is a function of one or more tensors on the right, then this is a broadcasting operation.
  • Index ranges may be given afterwards (as for @cast) or inside the reduction sum(i:3, k:4).
  • All indices appearing on the right must appear either within sum(...) etc, or on the left.
F = @reduce sum(i,j)  B[i] + γ * D[j]         # sum(B .+ γ .* D')
@reduce G[] := sum(i,j)  B[i] + γ * D[j]      # F == G[]

Complete reduction to a scalar output F, or a zero-dim array G. G[] involves sum(A, dims=(1,2)) rather than sum(A).
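
A sketch of the in-place form from the list above, which constructs the banged function (the arrays here are made up):

B = rand(3,4)
A = zeros(3)

@reduce A[i] = sum(j) B[i,j]^2    # roughly sum!(A, B.^2)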

@reduce Z[k] := sum(i,j) A[i] * B[j] * C[k]  lazy, i:N, j:N, k:N

The option lazy replaces the broadcast expression with a BroadcastArray, to avoid materializing the entire array before summing. In the example this is of size N^3.

The option strided will place @strided in front of the broadcasting operation. You need using Strided for this to work.

@reduce sum(i) A[i] * log(@reduce [i] := sum(j) A[j] * exp(B[i,j]))
@cast W[i] := A[i] * exp(- @reduce S[i] = sum(j) exp(B[i,j]) lazy)

Recursion like this is allowed, inside either @cast or @reduce. The intermediate array need not have a name, like [i], unless writing into an existing array, like S[i] here.

TensorCast.@matmul (Macro)
@matmul M[a,c] := sum(b)  A[a,b] * B[b,c]

Matrix multiplication macro. Uses the same syntax as @reduce, but instead of broadcasting the expression on the right out to a 3-tensor before summing it along one dimension, this calls * or mul!, which is usually much faster. But it only works on expressions of suitable form.

With more than two tensors on the right, it proceeds left to right, and each summed index must appear on two tensors, probably neighbours.
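
For example, a chain of three ordinary matrices (a sketch, with made-up names):

A, B, C = rand(2,3), rand(3,4), rand(4,5)

@matmul Z[a,d] := sum(b,c) A[a,b] * B[b,c] * C[c,d]   # roughly (A * B) * C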

Note that unlike @einsum and @tensor, you must explicitly specify what indices to sum over. Both this macro and @reduce could in principle infer this, and perhaps I will add that later... but then I find myself commenting # sum_μ anyway.

@matmul Z[a⊗b,z] = sum(i,j,k)  D[i,a][j] * E[i⊗j,_,k,b] * F[z,3,k]

Each tensor will be pre-processed exactly as for @cast / @reduce, here glueing slices of D together, reshaping E, and taking a view of F. Once this is done, the right hand side must be of the form (tensor) * (tensor) * ..., which becomes mul!(ZZ, (DD * EE), FF).

@reduce V[i] := sum(k) W[k] * exp(@matmul [i,k] := sum(j) A[i,j] * B[j,k])

You should be able to use this within the other macros, as shown.

TensorCast.@pretty (Macro)
@pretty @cast A[...] := B[...]

Prints an approximately equivalent expression with the macro expanded. Compared to @macroexpand1, generated symbols are replaced with animal names (from MacroTools), comments are deleted, module names are removed from functions, and the final expression is fed to println().

To copy and run the printed expression, you may need various functions which aren't exported. Try something like using TensorCast: orient, star, rview, @assert_, red_glue, sliceview


String macros

These provide an alternative... but don't cover quite everything:

TensorCast.@cast_str (Macro)
cast" Z_ij := A_i + B_j "

String macro version of @cast, which translates things like "A_ijk" == A[i,j,k]. Indices should be single letters (primes are also allowed), for example:

julia> @pretty cast" X_αβ' = A_α3β' * log(B_β'$k)"
# @cast  X[α,β′] = A[α,3,β′] * log(B[β′,$k])
begin
    local kangaroo = view(A, :, 3, :)
    local turtle = transpose(view(B, :, k))
    X .= @. kangaroo * log(turtle)
end

Underscores (trivial dimensions) and colons (for mapslices) are also understood:

julia> @pretty cast" Y_ijk := f(A_j:k)_i + B_i_k + (C_k)_i"
# @cast  Y[i,j,k] := f(A[j,:,k])[i] + B[i,_,k] + C[k][i]

Operators := and = work as usual, as do options. There are similar macros reduce"Z_i = Σ_j A_ij" and matmul"Z_ik = Σ_j A_ij * B_jk".

TensorCast.@reduce_str (Macro)
reduce" Z_i := sum_j A_ij + B_j "

String macro version of @reduce. Indices should be single letters, apart from primes and constants. You may write \Sigma Σ for sum (and \Pi Π for prod):

julia> @pretty reduce" W_i = Σ_i' A_ii' / B_i'3^2  lazy"
# @reduce  W[i] = sum(i′) A[i,i′] / B[i′,3]^2  lazy
begin
    local mallard = orient(view(B, :, 3), (*, :))
    sum!(W, @__dot__(lazy(A / mallard ^ 2)))
end

Functions

These are not exported, but are called by the macros above, and visible in what @pretty prints out.

Reshaping & views

TensorCast.Fast.orient (Function)
B = orient(A, code)

Usually this calls _orient(A, code), which reshapes A such that its nontrivial axes lie in the directions where code contains a :, by inserting axes on which size(B, d) == 1 as needed.

When acting on A::Transpose, A::PermutedDimsArray etc, it will collect(A) first, because reshaping these is very slow. However, this is only done when the underlying array is a "normal" CPU Array, since e.g., for GPU arrays, collect(A) copies the array to the CPU. Fortunately, the speed penalty for reshaping transposed GPU arrays is lower than on the CPU.
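
A minimal sketch of the reshape it performs (orient is internal, imported here as in the @pretty docstring above):

using TensorCast: orient

v = rand(3)
o = orient(v, (*, :, *))    # nontrivial axis placed where the code has :
size(o) == (1,3,1)          # trivial axes inserted where the code has *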

TensorCast.Fast.PermuteDims (Function)
PermuteDims(A::Matrix)
PermuteDims(A::Vector)

Lazy like transpose, but not recursive: calls PermutedDimsArray unless eltype(A) <: Number.

TensorCast.Fast.rview (Function)
rview(A, :,1,:) ≈ (@assert size(A,2)==1; view(A, :,1,:))

This simply reshapes A so as to remove a trivial dimension where indicated. Throws an error if the size of A in the indicated dimension is not 1.

Will fail silently if given a (:,3,:) or (:,$c,:), for which needview!(code) = true, so the macro should catch such cases.
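
A small sketch (rview is internal, imported here as in the @pretty docstring above):

using TensorCast: rview

A = rand(3,1,4)
size(rview(A, :,1,:)) == (3,4)   # the trivial 2nd dimension is removed
# rview(rand(3,2,4), :,1,:)      # would throw, since that 2nd dimension is not 1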


Slicing & glueing

TensorCast.Fast.sliceview (Function)
sliceview(A, code)
slicecopy(A, code)

Slice array A according to code, a tuple of length ndims(A), in which : indicates a dimension of the slices, and * a dimension separating them. For example if code = (:,*,:) then slices are either view(A, :,i,:) or A[:,i,:] with i=1:size(A,2).
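
For example (a sketch; sliceview is internal, imported here as in the @pretty docstring above):

using TensorCast: sliceview

A = rand(2,3,4)
S = sliceview(A, (:,*,:))    # 3 slices, each view(A, :,i,:) of size (2,4)
length(S) == 3 && size(S[1]) == (2,4)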

TensorCast.Fast.glue (Function)
glue!(B, A, code)
copy_glue(A, code) = glue!(Array{T}(...), A, code)

Copy the contents of an array of arrays into one larger array, un-doing sliceview / slicecopy with the same code. This function can handle arbitrary codes. (Also called stack or align elsewhere.)

cat_glue(A, code)
red_glue(A, code)

The same result, but calling either things like hcat(A...) or things like reduce(hcat, A). The code must be sorted like (:,:,:,*,*), except that (*,:) is allowed.

glue(A)
glue(A, code)

If code is omitted, the default is like (:,:,*,*) with ndims(first(A)) colons first, then ndims(A) stars. If the inner arrays are StaticArrays (and the code is sorted) then it calls static_glue. Otherwise it will call red_glue, unless code is unsuitable for that, in which case copy_glue.
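
A sketch of glue un-doing sliceview (these are internal functions, and the exact import path below is an assumption):

using TensorCast: sliceview, glue

A = rand(2,3,4)
S = sliceview(A, (:,:,*))    # 4 matrix slices of size (2,3)
B = glue(S, (:,:,*))         # glued back into a 2×3×4 array
B == A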
