
Main concepts in GeLB

Main data-structure: the lattice#

All GeLB programs are centered on (at least one instance of) the lattice data-structure. A lattice is a (one-, two-, or three-dimensional) Cartesian grid of nodes. Each node has a state-vector attached to it (where the length of the state-vector is the same for all nodes in a lattice). To use LB terminology, each element of the state-vector represents a node-local distribution function (DF), with indexing starting at zero. As we will discuss later, each node will need to be assigned to a node_class (think of "painting" the lattice using different colors).
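
To make this layout concrete, here is a minimal Python/NumPy sketch (an illustration, not GeLB code) of how such a lattice is typically stored in a general-purpose language; the D2Q9 velocity set, the array names, and the sizes are assumptions chosen only for this example.

    import numpy as np

    # Illustrative sizes for a 2D lattice with 9 DFs per node (D2Q9).
    nx, ny, q = 64, 32, 9

    # The lattice as a (d+1)-dimensional array: f[x, y, i] holds the i-th
    # distribution function of the node at (x, y); DF indexing starts at 0.
    f = np.zeros((nx, ny, q))

    # Each node is also assigned a node_class label (the "paint" color),
    # stored here as one small integer per node.
    node_class = np.zeros((nx, ny), dtype=np.int8)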

You can use more than one lattice

For simulating athermal flows using LBM, a single distribution function is sufficient. However, when simulating thermal flows, we may want to use the multi-distribution function (MDF) approach, where we have two distinct distribution functions: one for the fluid and one for the temperature field. To account for such scenarios, GeLB programs may instantiate more than one lattice, with the restriction that all lattices must overlap in space (as is usually the case for MDF algorithms).

lattice input / output (I/O) ports#

When implementing LB algorithms using general-purpose programming languages (GPLs), the lattice is typically a (d+1)-dimensional array (where d stands for the number of dimensions of the simulated spatial domain, and the +1 comes from the node-local state-vector). The LB algorithm then directly reads from and writes to this array. Unfortunately, an update-rule of a node may overwrite DFs which are still needed by the update-rule at a neighboring node (within the same time-iteration). Such data dependencies are quite common, and often cause beginners (or even experts) much grief. Two solutions have emerged to deal with this problem:

  1. The easiest solution (used by many LB implementations) is to simply instantiate two copies of the lattice (one holding the state at the current time-step, and another holding the state at the next time-step). To make things more efficient, we can say that odd time-steps are stored in the first lattice, and even time-steps in the second one (see the sketch after this list). This approach works, but it introduces other problems:

    a. As the amount of memory used by the simulation is doubled, the maximum size of the lattice that can be simulated is halved (because system/GPU memory is finite).

    b. Also, the increased memory usage of the simulation usually places additional stress on the memory subsystem, reducing the efficiency of caches. Although (in theory) we still read and write the same amount of data from/to the memory, the way caches operate (in order to maintain data-consistency) means that doubling the memory footprint also generates more traffic on the memory bus. All this amounts to significantly longer simulation times, because modern computing hardware tends to be limited more by the volume of data transfers than by the number of arithmetic operations.

  2. A more sophisticated approach is to keep a single lattice, and to orchestrate the updates of the DFs in a very careful manner, so that we do not overwrite any data which will still be needed. However, this approach also has its own pitfalls:

    a. Difficult to understand, to implement, and (especially) to generalize: While it is feasible to find a clever sequence of DF-updates which is applicable to simple / idealized cases (periodic domains or simple BCs), these sequences quickly become inapplicable, e.g. when a user decides to implement a more complex BC. In the worst case, the user may not even be aware of the underlying assumptions, leading to incorrect simulations (which are arguably worse than slow simulations).

    b. Potential to introduce other performance bugs: The core strength of modern computing hardware lies in its ability to perform parallel computations. For example, a common approach for GPUs is to assign each node to a compute-thread, and allow the hardware-scheduler to freely decide in which order and on which hardware compute-units to map these threads. Thus, if the prescribed sequence of DF-updates is not adapted to the actual hardware that runs the simulation, there is a real danger of under-utilizing the CPU/GPU/FPGA, by forcing a (partially-)serial update of the nodes.
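
As a concrete illustration of the first (two-copy, or "ping-pong") solution, here is a minimal Python/NumPy sketch written against a toy 1D lattice; the velocity set and the pure-streaming update are deliberately simplified assumptions, not GeLB code.

    import numpy as np

    nx, q = 100, 3            # toy D1Q3 lattice, purely illustrative
    c = [0, +1, -1]           # discrete velocity of each of the 3 DFs

    # Two full copies of the lattice: one holds time-step n, the other n+1.
    f_a = np.zeros((nx, q))
    f_b = np.zeros((nx, q))

    def stream(src, dst):
        # Read every DF from `src` and write the shifted result into `dst`.
        # Because reads and writes target different copies, no update can
        # clobber a DF that a neighboring node still needs at this step.
        for i, ci in enumerate(c):
            dst[:, i] = np.roll(src[:, i], ci, axis=0)

    for n in range(10):
        if n % 2 == 0:
            stream(f_a, f_b)   # even step: read the first copy, write the second
        else:
            stream(f_b, f_a)   # odd step: read the second copy, write the first

The price is exactly the drawbacks listed above: twice the memory footprint, and more pressure on the memory subsystem.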

The core problem is that, when using a GPL, the compiler has no easy way of understanding the data-dependencies. In GeLB, we make it easier to compute these data-dependencies automatically, by:

  1. Forcing all data-operations on the lattice to be explicitly assigned to either the input (in) or the output (out) port of the lattice. A read/write DF access operation is assigned to one of these two ports by simply appending the suffix .in or .out to the name of the lattice instance. For example, assuming that we have a 2D lattice which we named f, we would use:

    f.out[1, 1 | 0] = ...some expression...

    to write the new state (at time (n+1)) of DF 0 of the node at (x, y) = (1, 1). Similarly, if a variable alpha depends on the current value of this DF (at time n), we would use:

    alpha = ...expression involving f.in[1, 1 | 0]...
    Important

    While we may read f.in[...] any number of times within a time-step, we may write (assign) to f.out[...] of each DF only once during each time-step! This prevents the use of DFs as "scratch space" during an iteration. However, from the point of view of GeLB, such tricks are implementation details, which should be generated automatically (when possible) by the GeLB compiler, by analyzing the data-dependencies graph.

  2. Limiting which ports may be used for different types of lattice operators (as discussed in the next section).

Once these compiler-enforced constraints are accepted, the problem of generating a graph of data-dependencies becomes much easier (compared to the one faced e.g. by C(++) or Fortran compilers, which always have to assume the worst case, to ensure correctness).
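
The in/out port discipline can be mimicked in a general-purpose language. The following Python/NumPy sketch is only an analogy (the array names, the toy update-rule, and the relaxation factor are assumptions): all reads go to an "input" array, every DF of the "output" array is assigned exactly once, and the two arrays are swapped at the end of the time-step.

    import numpy as np

    nx, ny, q = 32, 32, 9
    f_in  = np.zeros((nx, ny, q))    # plays the role of f.in  (state at time n)
    f_out = np.empty((nx, ny, q))    # plays the role of f.out (state at time n+1)

    def update_node(x, y):
        # Reads may use f_in[...] as many times as needed ...
        rho = f_in[x, y, :].sum()
        # ... but each DF of f_out[...] is assigned exactly once per time-step
        # (a toy relaxation towards rho / q, just to have something to write).
        f_out[x, y, :] = f_in[x, y, :] + 0.1 * (rho / q - f_in[x, y, :])

    for x in range(nx):
        for y in range(ny):
            update_node(x, y)

    # End of the time-step: the freshly written state becomes the new input.
    f_in, f_out = f_out, f_in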

lattice operators#

TODO-insert-Figure-18-p80-from-thesis (initializer, dynamic, gauge)

After analyzing many LB code-samples, we identified three fundamental types of lattice operations (each of them having a corresponding type of lattice operator in GeLB):

  1. initializer operators: Before a simulation even begins, we want to initialize the DFs at each node of the lattice, to impose some initial conditions (ICs). In GeLB programs, this step is achieved by defining initializers.

    For an initializer, the lattice is write-only (i.e. it is not allowed to read any DFs, because initially all DFs are assumed to be in an undefined state). Put otherwise, for a lattice named f, an initializer may not use f.in[...] values (the input port of the lattice).

    Important

    The limitation mentioned above (when we discussed the input and output ports of lattices) is important here -- once a DF is written (by assigning a value to f.out[...]) within an initializer, it cannot be overwritten (neither by the same initializer, nor by another one).

  2. dynamic operators: After the initialization phase, we enter a time-loop. During each iteration, at each node of the lattice we apply a certain update-rule, which computes the state-vector of the node at time (n+1), based on the state of the node (and usually also based on the state of some neighboring nodes) at time n. In GeLB programs, such an update-rule is specified by writing a dynamic.

    For a dynamic, the lattice is read-write, i.e. we can use both the input and the output ports (given a lattice named f, a dynamic may read f.in[...] values, and also write to f.out[...] values).

    Important

    Similar to an initializer, a dynamic may only write to f.out[...] values once per time-step -- if we say that a certain DF will have some value at the next time-step, we cannot change our mind :).

  3. gauge operators: Computing without generating output would be meaningless, of course. Thus, every few time-iterations (or even at every iteration, or maybe just at the end of the simulation), we want to create some output based on the state of the lattice. In GeLB programs, such operations are specified by writing gauge(s). For example, in an LB fluid-flow simulation, we may want to write a gauge which evaluates the density at each node, and another which evaluates the momentum (or we may even create a single gauge which evaluates both quantities, for greater efficiency). Within the body of a gauge, the lattice is read-only (we can only use the input port, i.e. f.in[...] for a lattice named f).
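
The three access disciplines can be summarized with a small Python analogy (again an assumption-laden illustration, not GeLB syntax): an initializer only writes the output state, a dynamic reads the input state and writes the output state, and a gauge only reads.

    import numpy as np

    nx, q = 16, 9
    f_in  = np.zeros((nx, q))    # state at time n      (the "in" port)
    f_out = np.zeros((nx, q))    # state at time n + 1  (the "out" port)

    def initializer(x):
        # Write-only: assigns each DF of the output state once, reads nothing.
        f_out[x, :] = 1.0 / q

    def dynamic(x):
        # Read-write: reads f_in freely, assigns each DF of f_out exactly once
        # (a toy relaxation rule, purely for illustration).
        rho = f_in[x, :].sum()
        f_out[x, :] = f_in[x, :] + 0.1 * (rho / q - f_in[x, :])

    def gauge(x):
        # Read-only: may inspect f_in, may not write any DF.
        return f_in[x, :].sum()      # e.g. the density at node x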

node_classes and geometry#

Another (trivial, but important) observation originating from analyzing many LB implementations is that nodes can usually be attributed to a finite (and small) set of classes. For example, in a fluid flow problem where the fluid is surrounded by solid (no-slip) walls, we may have one class for the nodes within the bulk of the fluid domain and another class for the wall nodes. Naturally, for simulations where we have to distinguish based on more ICs and/or BCs, the number of node classes is larger; however, even for the most demanding simulations, the number of different node classes is still relatively small.

Accordingly, GeLB requires users to explicitly define each node_class. In GeLB terminology, a node_class is defined by the set of initializer(s), dynamic(s) and gauge(s) which need to be applied for a certain class of node(s). This idea is closely tied to the way we define the geometry in GeLB -- the "paint" metaphor. When it comes to defining the geometry for your simulation, you can think of a lattice as a "canvas", where you are invited to "splash" different colors (node_classes) onto the different nodes. Each of your artistic "strokes" corresponds to a node-assignment command within the geometry block, such as:

nodes[1, :] = WALL;

where WALL is the name of a node_class defined by you in the node_classes block.

Caution

The analogy is not perfect -- unlike colors in painting, node_classes in GeLB may NOT overlap. You can "splash" (assign) more than one node_class onto the same node, but only the node_class assigned last is retained. This property is useful when we want to assign most of a sub-domain to a node_class, but we want to re-assign some smaller regions to different node_classes, e.g. to implement BCs for immersed objects.
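
The "last assignment wins" behaviour can be pictured with a short Python/NumPy sketch (an analogy only; the class labels, sizes, and indexing are assumptions, not GeLB syntax): the whole canvas is painted as bulk fluid first, and the boundary rows/columns are then re-painted as walls.

    import numpy as np

    FLUID, WALL = 0, 1                     # illustrative node_class labels
    nx, ny = 10, 10

    node_class = np.empty((nx, ny), dtype=np.int8)

    # First "stroke": paint every node of the canvas as bulk fluid.
    node_class[:, :] = FLUID

    # Later "strokes" overwrite earlier ones: re-paint the boundary as walls.
    node_class[0, :]  = WALL
    node_class[-1, :] = WALL
    node_class[:, 0]  = WALL
    node_class[:, -1] = WALL
    # Each node retains only the node_class assigned to it last.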

Flow-control in GeLB programs#

The assignment of different nodes to different node_classes (which you can think of as a lattice-wide switch-statement) is the only form of flow-control that is supported by GeLB programs. However, after reading this manual, you will (hopefully) be convinced that you can write your simulations without using if, switch, for, while, do, and other such "beasts". This design decision, inspired by functional programming languages, requires a small amount of re-learning. But, we promise it will be worth it!
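
In a general-purpose language, this "lattice-wide switch" could be emulated roughly as follows (a Python analogy with hypothetical per-class update functions; GeLB expresses the same choice through node_class assignments rather than explicit loops and branches):

    import numpy as np

    FLUID, WALL = 0, 1
    nx, ny = 8, 8

    # The "painted" class map (here: fluid bulk with one wall row).
    node_class = np.full((nx, ny), FLUID, dtype=np.int8)
    node_class[0, :] = WALL

    def fluid_dynamic(x, y):
        pass    # placeholder for the bulk update-rule

    def wall_dynamic(x, y):
        pass    # placeholder for the no-slip boundary update-rule

    # The per-node choice of update-rule is the only "branching":
    # a lookup from node_class to the corresponding dynamic.
    dynamics = {FLUID: fluid_dynamic, WALL: wall_dynamic}
    for x in range(nx):
        for y in range(ny):
            dynamics[int(node_class[x, y])](x, y)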

Conclusions and what comes next#

Congratulations! You now have a pretty good idea about the structure of GeLB programs. In the following pages of this manual, we will:

  • show you some sample GeLB programs, after which
  • we will go into more detail about the rules (what you may and may not write) for each type of block in your GeLB programs.