# Main concepts in GeLB
## Main data-structure: the `lattice`
All GeLB programs are centered on (at least one instance of) the lattice
data-structure.
A lattice is a (one-, two-, or three-dimensional) Cartesian grid of
nodes.
Each node has a state-vector attached to it (where the length of the
state-vector is the same for all nodes in a `lattice`).
To use LB terminology, each element of the state-vector represents a
node-local distribution function (DF), with indexing starting at zero.
As we will discuss later, each node will need to be assigned to a node_class
(think of "painting" the lattice using different colors).
**You can use more than one `lattice`**
For simulating athermal flows using LBM, a single distribution function is
sufficient.
However, when simulating thermal flows, we may want to use the
multi-distribution function (MDF) approach, where we have two distinct
distribution functions: one to simulate the fluid and one for the temperature field.
To account for such scenarios, GeLB programs may instantiate more than one
lattice, with the restriction that all lattices must be overlapping in
space (as is usually the case for MDF algorithms).
## `lattice` input / output (I/O) ports
When implementing LB algorithms using general-purpose programming languages
(GPLs), the lattice is typically a (d+1)-dimensional array (where d stands
for the number of dimensions of the simulated spatial domain, and the +1 comes
from the node-local state-vector).
The LB algorithm then directly reads from and writes to this array.
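As a rough point of reference (this is plain Python/NumPy, not GeLB code, and the grid sizes and the number of DFs are illustrative assumptions), such a (d+1)-dimensional array could look like this for a 2D lattice:

```python
import numpy as np

# A 2D lattice stored as a (2+1)-dimensional array: two spatial axes
# plus one axis for the node-local state-vector (the DFs).
nx, ny, q = 64, 32, 9            # illustrative sizes; q DFs per node
f = np.zeros((ny, nx, q))

f[1, 1, 0] = 1.0                 # write DF 0 of the node at (x, y) = (1, 1)
state = f[1, 1, :]               # read the whole state-vector of that node
```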
Unfortunately, an update-rule of a node may overwrite DFs which are still
needed by the update-rule at a neighboring node (within the same
time-iteration).
Such data dependencies are quite common, and often cause beginners (or even
experts) much grief.
Two solutions have emerged to deal with this problem:
1. The easiest solution (used by many LB implementations) is to simply instantiate two copies of the `lattice` (one holding the state at the current time-step, and another holding the state at the next time-step). To make things more efficient, we can say that odd time-steps are stored in the first `lattice`, and even time-steps are stored in the second one (see the sketch after this list). This approach works, but it introduces other problems:

   a. As the amount of memory used by the simulation is doubled, the maximum size of the `lattice` that can be simulated is halved (because system/GPU memory is finite).

   b. Also, the increased memory usage of the simulation usually places additional stress on the memory subsystem, reducing the efficiency of caches. Although (in theory) we still read and write the same amount of data from/to the memory, the way caches operate (in order to maintain data-consistency) means that doubling the memory footprint also generates more traffic on the memory bus. All this amounts to significantly longer simulation times, because modern computing hardware tends to be limited more by the volume of data transfers than by the number of arithmetic operations.
2. A more sophisticated approach is to keep a single `lattice`, and to orchestrate the updates of the DFs in a very careful manner, so that we do not overwrite any data which will still be needed. However, this approach also has its own pitfalls:

   a. It is difficult to understand, to implement, and (especially) to generalize: while it is doable to find a clever sequence of DF-updates which is applicable to simple / idealized cases (periodic domains or simple BCs), these sequences quickly become inapplicable, e.g. when a user decides to implement a more complex BC. In the worst case, the user may not even be aware of the underlying assumptions, leading to incorrect simulations (which are arguably worse than slow simulations).

   b. It has the potential to introduce other performance bugs: the core strength of modern computing hardware lies in its ability to perform parallel computations. For example, a common approach for GPUs is to assign each `node` to a compute-thread, and allow the hardware-scheduler to freely decide in which order and on which hardware compute-units to map these threads. Thus, if the prescribed sequence of DF-updates is not adapted to the actual hardware that runs the simulation, there is a real danger of under-utilizing the CPU/GPU/FPGA, by forcing a (partially-)serial update of the `nodes`.
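To make the first (two-copy) approach concrete, here is a minimal Python sketch of the ping-pong scheme (this is not GeLB code, and the "update-rule" is a trivial placeholder chosen only to keep the example short):

```python
import numpy as np

nx, ny, q = 64, 32, 9
f_even = np.zeros((ny, nx, q))        # holds the state of even time-steps
f_odd  = np.zeros((ny, nx, q))        # holds the state of odd time-steps

for step in range(100):
    # read from one copy, write into the other (ping-pong)
    src, dst = (f_even, f_odd) if step % 2 == 0 else (f_odd, f_even)
    dst[...] = src                    # placeholder update-rule: copy ...
    dst[:, :, 1] = np.roll(src[:, :, 1], 1, axis=1)   # ... and stream DF 1
```

Note that the memory footprint is indeed doubled: both copies live in memory for the whole simulation.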
The core problem is that, when using a GPL, the compiler has no easy way of
understanding the data-dependencies. In GeLB, we make it easier to compute
these data-dependencies automatically, by:
1. Forcing all data-operations on the `lattice` to be explicitly assigned either to the input (`in`) or to the output (`out`) port of the `lattice`. A read/write DF access operation is assigned to one of these two ports by simply appending the suffix `.in` or `.out` to the name of the `lattice` instance. For example, assuming that we have a 2D `lattice` which we named `f`, writing the new state (at time (n+1)) of DF 0 of the `node` at (x, y) = (1, 1) is done by assigning to `f.out[...]`; similarly, if a variable `alpha` depends on the current value of this DF (at time n), it reads that value through `f.in[...]`.

   **Important:** While we may read `f.in[...]` any number of times within a time-step, we may write (assign) to `f.out[...]` of each DF only once during each time-step! This prevents the use of DFs as "scratch space" during an iteration. However, from the point of view of GeLB, such tricks are implementation details, which should be generated automatically (when possible) by the GeLB compiler, by analyzing the data-dependencies graph.

2. Limiting which ports may be used for the different types of `lattice` operators (as discussed in the next section).
By accepting the above-mentioned constraints (which are enforced by the GeLB compiler),
we make the problem of generating a graph of data-dependencies much easier
(compared to that faced e.g. by C(++) or Fortran compilers, which always
have to assume the worst case, to ensure correctness).
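For readers coming from GPLs, the port separation can be pictured with the following Python analogy (this is not GeLB syntax; the array shapes and indexing convention are assumptions made only for illustration): `f.in[...]` behaves like a read-only snapshot of the state at time n, while `f.out[...]` behaves like a separate buffer for the state at time (n+1), to which each DF of each node may be assigned only once.

```python
import numpy as np

nx, ny, q = 64, 32, 9
f_in  = np.zeros((ny, nx, q))    # plays the role of f.in  (state at time n)
f_out = np.empty_like(f_in)      # plays the role of f.out (state at time n+1)

# the input port may be read any number of times ...
alpha = 0.5 * (f_in[1, 1, 0] + f_in[1, 2, 0])

# ... but each DF of each node is assigned through the output port exactly once
f_out[1, 1, 0] = alpha
```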
## `lattice` operators
TODO-insert-Figure-18-p80-from-thesis (initializer, dynamic, gauge)
After analyzing many LB code-samples, we identified three fundamental types of
`lattice` operations (each of them having a corresponding type of
`lattice` operator in GeLB):
- `initializer` operators: Before a simulation even begins, we want to initialize the DFs at each `node` of the `lattice`, to impose some initial conditions (ICs). In GeLB programs, this step is achieved by defining `initializer`s.

  For an `initializer`, the `lattice` is write-only (i.e. it is not allowed to read any DFs, because initially all DFs are assumed to be in an undefined state). Put otherwise, for a `lattice` named `f`, an `initializer` may not use `f.in[...]` values (the `input` port of the `lattice`).

  **Important:** The limitation mentioned above (when we discussed the `input` and `output` ports of `lattices`) is important here -- once a DF is written (by assigning a value to `f.out[...]` within the `initializer`), it cannot be overwritten (neither by the same `initializer`, nor by another one).

- `dynamic` operators: After the initialization phase, we enter a time-loop. During each iteration, at each `node` of the `lattice` we apply a certain update-rule, which computes the state-vector of the `node` at time (n+1), based on the state of the `node` (and usually also based on the state of some neighboring `nodes`) at time n. In GeLB programs, such an update-rule is specified by writing a `dynamic`.

  For a `dynamic`, the `lattice` is read-write, i.e. we can use both the `input` and the `output` ports (given a `lattice` named `f`, a `dynamic` may read `f.in[...]` values, and also write to `f.out[...]` values).

  **Important:** Similar to an `initializer`, a `dynamic` may only write to `f.out[...]` values once per time-step -- if we say that a certain DF will have some value at the next time-step, we cannot change our mind :).

- `gauge` operators: Computing without generating output would be meaningless, of course. Thus, every few time-iterations (or even at every iteration, or maybe just at the end of the simulation), we want to create some output based on the state of the `lattice`. In GeLB programs, such operations are specified by writing `gauge`(s). For example, in a LB fluid-flow simulation, we may want to write a `gauge` which evaluates the density at each `node`, and another which evaluates the momentum (or we may even create a single `gauge`, which evaluates both quantities, for greater efficiency). Within the body of a `gauge`, the `lattice` is read-only (we can only use the `input` port, i.e. `f.in[...]` for a `lattice` named `f`).
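Continuing the same Python analogy (again, not GeLB syntax; the function names and the trivial update-rule are placeholders), the three operator types correspond to three different access patterns on the input/output buffers:

```python
import numpy as np

nx, ny, q = 64, 32, 9

def initializer(f_out):
    # write-only: no DFs may be read, because they start out undefined
    f_out[...] = 1.0 / q

def dynamic(f_in, f_out):
    # read-write: reads the state at time n, writes the state at time n+1
    f_out[...] = f_in
    f_out[:, :, 1] = np.roll(f_in[:, :, 1], 1, axis=1)  # placeholder rule

def gauge(f_in):
    # read-only: derives an output quantity (here, the density at each node)
    return f_in.sum(axis=-1)

f_a, f_b = np.empty((ny, nx, q)), np.empty((ny, nx, q))
initializer(f_a)
src, dst = f_a, f_b
for step in range(10):                # the time-loop
    dynamic(src, dst)
    src, dst = dst, src               # ping-pong swap
density = gauge(src)                  # src now holds the latest state
```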
## `node_classes` and geometry
Another (trivial, but important) observation originating from analyzing many LB
implementations is that nodes can usually be assigned to a finite (and
small) set of classes.
For example, in a fluid flow problem where the fluid is surrounded by solid
(no-slip) walls, we may have one class for the nodes within the bulk of the
fluid domain and another class for the wall nodes.
Naturally, for simulations where we have to distinguish based on more ICs and/or
BCs, the number of node classes is larger; however, even for the most
demanding simulations, the number of different node classes is still
relatively small.
Accordingly, GeLB requires users to explicitly define each node_class.
In GeLB terminology, a node_class is defined by the set of initializer(s),
dynamic(s) and gauge(s) which need to be applied for a certain class of
node(s).
This idea is closely tied to the way we define the geometry in GeLB -- the
"paint" metaphor.
When it comes to defining the geometry for your simulation, you can think
of a lattice as a "canvas", where you are invited to "splash" different
colors (node_classes) onto the different nodes.
Each of your artistic "strokes" corresponds to a node-assignment command within
the `geometry` block -- for example, a command which assigns the node_class named
`WALL` to some region of the lattice, where `WALL` is the name of a node_class
defined by you in the `node_classes` block.
**Caution:**
The analogy is not perfect -- unlike colors in painting,
node_classes in GeLB may NOT overlap.
You can "splash" (assign) more than one node_class onto the same node,
but only the node_class assigned last is retained.
This property is useful when we want to assign most of a sub-domain to a
node_class, but we want to re-assign some smaller regions to different
node_classes, e.g. to implement BCs for immersed objects.
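As a small illustration of this last-assignment-wins behaviour (a Python sketch, not GeLB syntax; the class names and the geometry are made up for this example):

```python
import numpy as np

BULK, WALL, INLET = 0, 1, 2            # hypothetical node_class identifiers
node_class = np.full((32, 64), BULK)   # "paint" the whole canvas as BULK

node_class[0, :]  = WALL               # paint the first row as WALL
node_class[-1, :] = WALL               # paint the last row as WALL
node_class[:, 0]  = INLET              # re-paint the first column as INLET;
                                       # the two corner nodes keep only INLET,
                                       # because the last assignment wins
```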
## Flow-control in GeLB programs
The assignment of different nodes to different node_classes (which you can
think of as a lattice-wide switch-statement) is the only form of
flow-control that is supported by GeLB programs.
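In GPL terms, this lattice-wide switch is roughly what the following Python sketch does by hand (the class names and rules are placeholders; in GeLB, this dispatch is derived automatically from the node_class assignments, so you never write such a loop yourself):

```python
import numpy as np

BULK, WALL = 0, 1
node_class = np.full((4, 8), BULK)
node_class[0, :] = WALL                # e.g. the first row behaves as a wall

def bulk_rule(x, y):                   # placeholder bulk update-rule
    pass

def wall_rule(x, y):                   # placeholder wall (BC) update-rule
    pass

rules = {BULK: bulk_rule, WALL: wall_rule}
for (y, x), cls in np.ndenumerate(node_class):
    rules[cls](x, y)                   # every node runs the rule of its class
```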
However, after reading this manual, you will (hopefully) be convinced that you
can write your simulations without using if, switch, for, while,
do, and other such "beasts".
This design decision, inspired by functional programming languages, requires
a small amount of re-learning.
But, we promise it will be worth it!
## Conclusions and what comes next
Congratulations! You now have a pretty good idea about the structure of GeLB
programs.
In the following pages of this manual, we will:
- show you some sample GeLB programs, after which
- go into more detail about the rules (what you may and may not write) for
  each type of block in your GeLB programs.