Main concepts in GeLB
lattice
#
Main data-structure: the All GeLB
programs are centered on (at least one instance of) the lattice
data-structure.
A lattice
is a (one-, two-, or three-dimensional) cartesian grid of
node
s.
Each node
has a state-vector attached to it (where the length of the
state-vector is the same for all nodes
in a `lattice).
To use LB terminology, each element of the state-vector represents a
node
-local distribution function (DF), with indexing starting at zero.
As we will discuss later, each node
will need to be assigned to a node_class
(think of "painting" the lattice
using different colors).
You can use more than one lattice
For simulating athermal flows using LBM, a single distribution function is
sufficient.
However, when simulating thermal flows, we may want to use the
multi-distribution function (MDF) approach, where we have two distinct
distribution functions, to simulate the fluid and the temperature field.
To account for such scenarios, GeLB
programs may instantiate more than one
lattice
, with the restriction that all lattice
s must be overlapping in
space (as is usually the case for MDF algorithms).
lattice
input / output (I/O) ports#
When implementing LB algorithms using general-purpose programming languages
(GPLs), the lattice
is typically a (d+1)
-dimensional array (where d
stands
for the number of dimensions of the simulated spatial domain, and the +1
comes
from the node
-local state-vector).
The LB algorithm then directly reads from and writes to this array.
Unfortunately, an update-rule of a node
may overwrite DFs which are still
needed by the update-rule at a neighboring node
(within the same
time-iteration).
Such data dependencies are quite common, and often cause beginners (or even
experts) much grief.
Two solutions have emerged to deal with this problem:
The easiest solution (used by many LB implementations) is to simply instantiate two copies of the
lattice
(one holding the state at the current time-step, and another for holding the state at the next time-step). To make things more efficient, we can say that odd time-steps are stored in the firstlattice
, and even time-steps are stored in the second one. This approach works, but it introduces other problems:a. As the amount of memory used by the simulation is doubled, the maximum size of the
lattice
that can be simulated is halved (because system/GPU memory is finite).b. Also, the increased memory usage of the simulation usually places additional stress on the memory subsystem, reducing the efficiency of caches. Although (in theory) we still read and write the same amount of data from/to the memory, the way caches operate (in order to maintain data-consistency) means that doubling the memory footprint also generates more traffic on the memory bus. All this amounts to significantly longer simulation times, because modern computing hardware tends to be limited more by the volume of data transfers than by the number of arithmetic operations.
A more sophisticated approach is to keep a single
lattice
, and to orchestrate the updates of the DFs in a very careful manner, so that we do not overwrite any data which will still be needed. However, this approach also has its own pitfalls:a. Difficult to understand, to implement, and (especially) to generalize: While it is doable to find a clever sequence of DF-updates which is applicable to simple / idealized cases (periodic domains or simple BCs), these sequences quickly become un-applicable e.g. when a user decides to implement a more complex BC. In the worst case, the user may not even be aware of the underlying assumptions, leading to incorrect simulations (which are arguably worse than slow simulations).
b. Potential to introduce other performance bugs: The core strength of modern computing hardware consist of the ability to perform parallel computations. For example, a common approach for GPUs is to assign each
node
to a compute-thread, and allow the hardware-scheduler to freely decide in which order and on which hardware compute-units to map these threads. Thus, if the prescribed sequence of DF-updates is not adapted to the actual hardware that runs the simulation, there is a real danger of under-utilizing the CPU/GPU/FPGA, by forcing a (partially-)serial update of thenode
s.
The core problem is that, when using a GPL, the compiler has no easy way of
understanding the data-dependencies. In GeLB
, we make it easier to compute
these data-dependencies automatically, by:
Forcing all data-operations on the
lattice
to be explicitly assigned to either the input (in
) or to the output (out
) port of thelattice
. A read/write DF access operation is assigned to one of these two ports by simply appending the suffix.in
or.out
to the name of thelattice
instance. For example, assuming that we have a 2Dlattice
which we namedf
, we would use:to write the new state (at time
(n+1)
) of DF 0 from thenode
at(x, y) = (1, 1)
. Similarly, if a variablealpha
depends on the current value of this DF (at timen
), we would use:Important
While we may read
f.in[...]
any number of times within a time-step, we can only write (assign) tof.out[...]
of each DF only once during each time-step! This prevents the use of DFs as "scratch space" during an iteration. However, from the point of view ofGeLB
, such tricks are implementation details, which should be generated automatically (when possible) by theGeLB
compiler, by analyzing the data-dependencies graph.Limiting which ports may be used for different types of
lattice
operators (as discussed in the next section).
By accepting the above-mentioned constraints enforced by the GeLB
compiler,
the problem of generating a graph of data-dependencies becomes much easier
(compared to that faced e.g. by C(++)
or Fortran
compilers, which always
have to assume the worst cases, to ensure correctness).
lattice
operators#
TODO-insert-Figure-18-p80-from-thesis (initializer, dynamic, gauge)
After analyzing many LB code-samples, three types of fundamental lattice
operations were identified (each of them having a corresponding type of
lattice
operator in `GeLB):
initializer
operators: Before a simulation even begins, we want to initialize the DFs at eachnode
of thelattice
, to impose some initial conditions (ICs). InGeLB
programs, this step is achieved by defininginitializer
s.For an
initializer
, thelattice
is write-only (i.e. it is not allowed to read any DFs, because initially all DFs are assumed to be in an undefined state). Put otherwise, for alattice
namedf
, aninitializer
may not usef.in[...]
values (theinput
port of thelattice
).Important
The limitation mentioned above (when we discussed the
input
andoutput
ports oflattice
s) is important here -- once a DF is written (by assigning a value tof.out[...]
within theinitializer
, it cannot be overwritten (neither by the sameinitializer
, nor by another one).dynamic
operators: After the initialization phase, we enter a time-loop. During each iteration, at eachnode
of thelattice
we apply a certain update-rule, which computes the state-vector of thenode
at time(n+1)
, based on the state of thenode
(and usually also based on the state of some neighboringnode
s) at timen
. InGeLB
programs, such an update-rule is specified by writing adynamic
.For a
dynamic
, thelattice
is read-write, i.e. we can use both theinput
and theoutput
ports (given alattice
namedf
, adynamic
may readf.in[...]
values, and also write tof.out[...]
values).Important
Similar to an
initializer
, adynamic
may only write tof.out[...]
values once per time-step -- if we say that a certain DF will have some value at the next time-step, we cannot change our mind :).gauge
operators: Computing without generating output would be meaningless, of course. Thus, every few time-iterations (or even at every iteration), or maybe just at the end of the simulation), we want to create some output based on the state of thelattice
. InGeLB
programs, such operations are specified by writinggauge
(s). For example, in a LB fluid-flow simulation, we may want to write agauge
which evaluates the density at eachnode
, and another which evaluates the momentum (or we may even create a singlegauge
, which evaluates both quantities, for greater efficiency). Within the body of agauge
, thelattice
is read-only (we can only use theinput
port, i.e.f.in[...]
for alattice
namedf
).
node_classes
and geometry
#
Another (trivial, but important) observation originating from analyzing many LB
implementations is that node
s may be usually attributed to a finite (and
small) set of classes.
For example, in a fluid flow problem where the fluid is surrounded by solid
(no-slip) walls, we may have on class for the node
s within the bulk of the
fluid domain and another class for the wall node
s.
Naturally, for simulations where we have to distinguish based on more ICs and/or
BCs, the number of node
classes is larger; however, even for the most
demanding simulations, the number of different node
classes is still
relatively small.
Accordingly, GeLB
requires users to explicitly define each node_class
.
In GeLB
terminology, a node_class
is defined by the set of initializer
(s),
dynamic
(s) and gauge
(s) which need to be applied for a certain class of
node
(s).
This idea is closely tied to the way we define the geometry
in GeLB
-- the
"paint" metaphor.
When it comes to defining the geometry
for your simulation, you can think
of a lattice
as a "canvas", where you are invited to "splash" different
colors (node_class
es) onto the different node
s.
Each of your artistic "strokes" corresponds to a node-assignment command within
the geometry
block, such as:
where WALL
is the name of a node_class
defined by you in the node_classes
block.
caution
The analogy is not perfect -- unlike colors in painting,
node_class
es in GeLB
may NOT overlap.
You can "splash" (assign) more than one node_class
onto the same node
,
but only the node_class
assigned last is retained.
This property is useful when we want to assign most of a sub-domain to a
node_class
, but we want to re-assign some smaller regions to different
node_class
es, e.g. to implement BCs for immersed objects.
GeLB
programs#
Flow-control in The assignment of different node
s to different node_classes
(which you can
think of as a lattice
-wide switch
-statement) is the only form of
flow-control that is supported by GeLB
programs.
However, after reading this manual, you will (hopefully) be convinced that you
can write your simulations without using if
, switch
, for
, while
,
do
, and other such "beasts".
This design decision, inspired by functional programming languages, requires
a small amount of re-learning.
But, we promise it will be worth it!
#
Conclusions and what comes nextCongratulations! You now have a pretty good idea about the structure of GeLB
programs.
In the following pages of this manual, we will:
- show you some sample GeLB programs, after which
- we will go into more details about rules (what you may and may not write) for
each type of block in your
GeLB
programs.