10. Hyperflow Model

The hyperflow model on derivation graphs, accessible through hyperflow::Model (C++) and hyperflow.Model (Python), consists of a base model and several modules. Each module defines one or more sets of variables with associated constraints.

If you use the hyperflow functionality in your research you may want to cite [AFMS-Hyperflows]. Results on computational complexity of finding hyperflows can be found in [AFMS-NPFlow].

10.1. Mathematical Description

Given is first of all a directed multi-hypergraph (e.g., modelling a chemical reaction network), in the form of a derivation graph object (dg::DG/DG). Let this hypergraph be called \(\mathcal{H} = (V, E)\), with \(V\) being the set of vertices (e.g., molecules) and \(E\) being the set of directed hyperedges (e.g., reactions). As a running example, we use the network below.

Figure made with TikZ

Example of a directed multi-hypergraph.

Each hyperedge \(e\in E\) is an ordered pair of multisets of vertices \(e = (e^{out}, e^{in})\), the sources and targets of the edge (e.g., the reactants and products of the reaction). To denote the multiplicity of a vertex \(v\) in a multiset s we write \(m_v(s)\). For example, in the network above, the edge \(e_3 = (\{B, D, D\}, \{C\})\) could model a reaction \(B + 2\,D\rightarrow C\). We have \(m_D(e_3^{out}) = 2\) as \(D\) is used twice as a reactant. The multiplicity function thus gives the stoichiometric coefficients in a reaction. In the context of stoichiometric matrices the multiplicities for the educts and products correcpond to the entries of \(\mathbf{S}^-\) and \(\mathbf{S}^+\) respectively.

To model input and output of the network we extend it with half-edges at each vertex, and construct the extended hypergraph: \(\overline{\mathcal{H}} = (V, \overline{E})\), with \(\overline{E} = E\cup E^{in} \cup E^{out}\) where \(E^{in} = \{e^{in}_v = (\emptyset, \{v\}) \mid v\in V\}\) and \(E^{out} = \{e^{out}_v = (\{v\}, \emptyset) \mid v\in V\}\). If we extend our example network we arrive at the extended network shown below.

Figure made with TikZ

Example of an extended network. Each vertex has an additional input edge and output edge.

10.1.1. Basic Hyperflow Model

Given an extended hypergraph \(\overline{\mathcal{H}} = (V, \overline{E})\), an integer hyperflow is a function \(f\colon \overline{E}\rightarrow \mathbb{N}_0\) that assigns a non-negative integer to each hyperedge, including the input/output edges. It must satisfy the flow conservation constraint

\[\begin{align*} \sum_{e\in \text{out}_{\overline{E}}(v)} m_v(e^{out})f(e) - \sum_{e\in \text{in}_{\overline{E}}(v)} m_v(e^{in})f(e) &= 0 & \forall v\in V \end{align*}\]

where \(\text{out}_{\overline{E}}(v)\) and \(\text{in}_{\overline{E}}(v)\) are the sets of out-edges and in-edges of a vertex \(v\) respectively. An example of an integer hyperflow is shown below.

Figure made with TikZ

Example of an integer hyperflow on the extended example network.

Figure made with TikZ

The same integer hyperflow, where edges with no flow are hidden.

10.1.2. Expanded Network and Hyperflow Model

For many uses, the basic model described above is sufficient, but often it is advantageous to have more fine-grained control of how flow is routed. The network is therefore transformed further by replacing each vertex with a small network (a complete bipartite graph), such that one can restrict how incomming flow to a vertex is routed to outgoing edges. Formally we create the expanded network \(\widetilde{\mathcal{H}} = (\widetilde{V}, \widetilde{E})\), with

\[\begin{split}\begin{align*} \widetilde{V} &= \bigcup_{v\in V} V_v^{in} \cup \bigcup_{v\in V} V_v^{out} \\ V_v^{in} &= \{u^{in}_{ve} \mid \forall e\in \text{in}_{\overline{E}}(v)\}\\ V_v^{out} &= \{u^{out}_{ve} \mid \forall e\in \text{out}_{\overline{E}}(v)\}\\ \end{align*}\end{split}\]

and

\[\begin{split}\begin{align*} \widetilde{E} &= \bigcup_{v\in V}E_v \cup \{\widetilde{e}\mid e\in \overline{E}\} \\ \widetilde{e} &= (\widetilde{e}^{out}, \widetilde{e}^{in}) \\ \widetilde{e}^{in} &= \{u^{in}_{ve} \mid v \in e^{in}\} \\ \widetilde{e}^{out} &= \{u^{out}_{ve} \mid v \in e^{out}\} \\ E_v &= \left\{\left(\{u^{in}\}, \{u^{out}\}\right) \mid u^{in}\in V_v^{in}, u^{out}\in V_v^{out}\right\} \end{align*}\end{split}\]

That is, each original vertex is replaced with a vertex for each in-edge and for each out-edge of that replaced vertex. The original hyperedges are then reconnected to these new vertices, and a set of new 1-to-1 hyperedges, \(E_v\) are added for each of the original vertices that connect all in-vertices to all out-vertices. We call these new edges transit edges. An example of network expansion is shown below.

Figure made with TikZ

Example of an expanded network. The vertices are the small black dots, while the large circles are a visual aid to group the vertices and edges related to the orignal vertices. The blue transit edges are those that connect pairs of reverse internal hyperedges, while the red transit edges are those that connect the input and output edges.

It will be convenient to specify some particular transit edges:

  • The IO transit edges \(T_{IO}\) (blue edges in the example above): in each vertex, the transit edge from the input to the output.

  • The reverse edge transit edges \(T_{rev}\) (red edges in the example blue): in each vertex, the transit edges that go between edges that are reverse of each other. That is, for each pair of hyperedges \(e_1, e_2\in \overline{E}\) with \(e_1 = (V_a, V_b), e_2 = (V_b, V_a)\), each transit edge \((u_{ve_1}^{in}, u_{ve_2}^{out})\) for \(v\in V_b\) and \((u_{ve_2}^{in}, u_{ve_1}^{out})\) for \(v\in V_a\) are in \(T_{rev}\). Note that a loop edge \(e = (V_a, V_a)\) is also considered to be a reverse of it self, and are thus considered here.

An integer hyperflow on this expanded network is then defined as before, but just on \(\widetilde{\mathcal{H}}\). A hyperflow on our example of an expanded hypergraph is shown below. When projecting it back on to the non-expanded network, it is equal to the previously shown hyperflow.

Figure made with TikZ

Example of an integer hyperflow on an expanded hypergraph.

10.2. Base Model Specification

Hyperflows are found by creating a hyperflow::Model/hyperflow.Model object, where a network is given as dg::DG/DG. The model object contains first a specification of a hyperflow problem, which can modified, after which one or more hyperflow solutions can be found. Each solution is a hyperflow \(f\) as mathematically defined in the previous section. It is represented by a set of variables \(x_e\) for each \(e\in \widetilde{E}\). Multiple sets of auxiliary variables are defined for convenience, predominantly indicator variables, i.e., variables of value 0 or 1, that indicate various conditions. For example, for each edge of the original network \(e\in E\) there is a flow variable \(x_e\) and then an indicator variable \(z_e\) that indicates the condition \(x_e > 0\). That is, \(z_e\) is 1 if there is non-zero flow on the edge \(e\) and 0 if there is no flow.

An initial model object specifies that nothing can flow in and out of the network, i.e., \(x_v^{in} = x_v^{out} = 0\) for each vertex \(v\in V\). To remove this constraint, can specify a vertex as a source or sink using the methods hyperflow::Model::addSource()/hyperflow.Model.addSource() and hyperflow::Model::addSink()/hyperflow.Model.addSink().

By default a the model does not allowed to have flow on the edges in \(T_{rev}\) (see the section above), i.e., flow can not go through one edge and then reversing directly back through the inverse of that edge (unless it is the input/output edges). If you want to allow this, you can use hyperflow::Model::setAllowReversal()/hyperflow.Model.allowReversal. By default this flow reversal is however allowed for the input/output edges (on the edges in \(T_{IO}\). This means that if too much flow is going into the network, it is allowed to flow directly out again. You can disallow this through hyperflow::Model::setAllowIOReversal()/hyperflow.Model.allowIOReversal. For example, consider the left-most flow below.

Figure made with TikZ

Example of a flow through pairs of reverse edges. In particular, the flow through \(B + C\rightarrow D\) is unambiguously going back through the inverse, \(D\rightarrow B + C\), which often is undesirable.

Figure made with TikZ

A simplified version of the flow where the flow on the purple edges has been cancelled out. Now the flow through the edge \(B\rightarrow C\) is unambiguously going through the inverse, \(C\rightarrow B\).

Figure made with TikZ

A further simplified version of the flow where also the flow on the green edges has been cancelled out. Now the input flow to \(B\) is unambiguously going back out again.

Figure made with TikZ

A fully simplified version of the flow where also the input/output flow has been cancelled out.

The network expansion as explained in the previous section introduced many new vertices and transit edges. In the implementation the expansion is only being done as necessary. E.g., if all flow reversal is allowed, then each original vertex will be replaced by just a single transit edge and two vertices. All in-edges are connected to the same new vertex and all out-edges are similarly connected to the other new vertex.

10.2.1. Relaxed Mode

The flow function is defined as assigning integers to each hyperedges, encoding how many times teach edge must be used to achieve balance. If you want to relaxed the model such that the flow values become floating point numbers, you can do that through hyperflow::Model::setRelaxed()/hyperflow.Model.relaxed. The hyperflow model thus degenerate to be equivalent to the model in Flux Balance Analysis (FBA). However, note that many features of the model are disabled in relaxed mode, e.g., solution enumeration, indicator variables, and several modules, because they are not well-defined with non-integer hyperflows.

10.3. Variable Specifiers, Linear Expressions, and Constraints

The base model does not provide any constraints on the flow apart from which vertices are allowed to have input and output flow, and if allowed, then it is unbounded. The model allows for adding arbitrary linear constraints, but this requires us a way of specifying model variables in C++ and Python. What we will arrive at, is that we can write the following Python code (and a quite similar version can be written in C++).

dg = DG()
r = dg.build().addAbstract("""
	#1 A + 2 B -> X
	#2 B + 3 C -> Y + A
""")
A = r.getGraph("A")
B = r.getGraph("B")
C = r.getGraph("C")
X = r.getGraph("X")
Y = r.getGraph("Y")
e1 = r.getEdge("1")
e2 = r.getEdge("2")

flow = hyperflow.Model(dg)
flow.addSource(A)
flow.addSource(B)
flow.addSource(C)
flow.addSink(X)
flow.addSink(Y)
flow.addConstraint(inFlow <= 12)
flow.addConstraint(2*edgeFlow[e1] + edgeFlow[e2] <= 3)
flow.objectiveFunction = -2*outFlow[X] - 3*outFlow[Y]
flow.findSolutions(maxNumSolutions=9)
flow.solutions.print()

Here we have a three variable specifiers named:

  1. inFlow, which is used just as it is, thereby representing the sum of all in-flow variables, i.e., \(\sum_{v\in V}x_v^{in}\).

  2. edgeFlow, which is indexed by a DG.HyperEdge object, in two places, to retrive a variable specifier for the specific edge-flow variable.

  3. outFlow, which is similarly used twice to retrieve particular out-flow variables. In this example we index by a Graph, but we could also index with a DG.Vertex.

No matter if a variable specifier is indexed or not, it can be used to create a linear expression, and they can again be used to create linear constraints. That is, when using arithmetic and relational operators on these objects, you are not actually evaluating anything, but merely creating objects that represent those expressions. Variable specifiers and linear expressions can also be used to query a found flow solution, using hyperflow.Solution.eval()/hyperflow::Solution::eval().

10.3.1. Named Variable Specifiers

The named variable specifiers are defined both in the Python and C++ interface. In C++ the variable specifiers are defined in the namespace mod::hyperflow::vars, it may thus be useful to use the using-declaration using namespace mod::hyperflow::vars; when defining linear expressions/constraints. In Python the variable specifiers are defined in the submodule mod.hyperflow.vars, but the root module, mod, imports them so they are available directly.

Each defined specifier represents a sum of variables, and indexing the specifier retrieves a specifier for a single variable (or a sum of a subset of the variables) as indicated by the argument. E.g., inFlow represents the sum \(\sum_{v\in V}x^{in}_v\) while inFlow[a] retrieves the specifier for \(x^{in}_v\) where a is either the vertex \(v\) or a graph associated with that vertex in the underlying derivation graph \(\mathcal{H} = (V, E)\). In the table below, the “Type” column essentially indicates which objects can be used for indexing. See also the types hyperflow::VarSumVertex/hyperflow.VarSumVertex and hyperflow::VarSumEdge/hyperflow.VarSumEdge. The “Module” column refers to which part of the flow model the variables are defined in, with “Base” being the core model. When a variable specifier is used for specification, e.g., in the objective function or a constraint, then the corresponding module will automatically be enabled.

Specifier (Python/C++)

Type

Module

Description

inFlow / inFlow

Vertex

Base

The input flow variables, \(x^{in}_v\). When listing a solution the column header for this variable is In.

outFlow / outFlow

Vertex

Base

The output flow variables, \(x^{out}_v\). When listing a solution the column header for this variable is Out.

isInUsed / isInUsed

Vertex

Base

Indicator variables for \(x^{in}_v > 0\).

isOutUsed / isOutUsed

Vertex

Base

Indicator variables for \(x^{out}_v > 0\).

isInLessOut / isInLessOut

Vertex

Base

Indicator variables for \(x^{in}_v < x^{out}_v\).

isInGreaterOut / isInGreaterOut

Vertex

Base

Indicator variables for \(x^{in}_v > x^{out}_v\).

isInOutZero / isInOutZero

Vertex

Base

Indicator variables for \(x^{in}_v = x^{out}_v = 0\).

vertexFlow / vertexFlow

Vertex

Base

Variables for flow through each vertex, \(\sum_{v\in V}\sum_{e\in \delta^{in}(v)} x_e\). The indexed variable specifier retrieves the inner sum for some \(v\), \(\sum_{e\in \delta^{in}(v)} x_e\).

isVertexUsed / isVertexUsed

Vertex

Base

Indicator variables for \(\sum_{e\in \delta^{in}(v)} x_e\).

transitInternalFlow / transitInternalFlow

Vertex

Base

Variables for flow through each vertex, which enters and exits the vertex from/to the network. That is, excluding the flow that is covered by inFlow and that covered by outFlow. Note, the indexed specifier can only be used if the separation of the transit flow is possible in the underlying expanded derivation graph. The non-indexed specifier can only be used if all indexed specifiers exist. The necessary network expansion can be triggered by disabling IO flow reversal (hyperflow::Model::setAllowIOReversal(false) or set hyperflow.Model.allowIOReversal to False), or by explicitly calling hyperflow::Model::separateIOInternalTransit() / hyperflow.Model.separateIOInternalTransit() to make the variable available for the relevant vertices.

edgeFlow / edgeFlow

Edge

Base

The edge flow variables, \(x_e\), without the virtual input/output edges.

isEdgeUsed / isEdgeUsed

Edge

Base

Indicator variables for \(x_e > 0\) for the non-input/output edges.

isBothReverseUsed / isBothReverseUsed

Edge

Base

For each unique unordered pair of edges (not input/output) \(e_a = (S_a, T_a), e_b = (T_a, S_a)\) there is an indicator variable indicating \(x_{e_a} > 0 \wedge x_{e_b} > 0\). The indexed specifier can only be retrieved if the given dg::DG::HyperEdge/DG.HyperEdge actually has an inverse. Two edges being each others inverse maps to the same indexed indicator variable.

isOverallAutocata / isOverallAutocata

Vertex

Overall Autocatalysis

Indicator variables for overall autocatalysis. When listing a solution the column header for this variable is OA.

isOverallCata / isOverallCata

Vertex

Overall Catalysis

Indicator variables for overall catalysis. When listing a solution the column header for this variable is OC.

energy / energy

Vertex

Thermodynamics

Variables for the free energy of each molecule/vertex. When listing a solution the column header for this variable is G.

logConcentration / logConcentration

Vertex

Thermodynamics

Variables for the logarithm of the concentration for each molecule/vertex. Note that the concentration it self does not appear in the model, only these variables. When listing a solution the column header for this variable is logK.

deltaEnergy / deltaEnergy

Edge

Thermodynamics

The thermodynamic energy difference for each reaction/hyperedge.

10.3.2. Linear Expressions

When creating a model or querying solutions it can be useful to form linear expressions or constraints, where variable specifiers acts as the variables. The operators +, -, and * are overloaded on variable specifiers and linear expressions (hyperflow.LinExp and hyperflow::LinExp), such that you can build linear expressions as if they were normal expressions. Examples:

  • The variable specifier inFlow is a linear expression.

  • So is inFlow[A].

  • The expression 2 * inFlow + edgeFlow[B] is a linear expression.

  • So is 2 * (inFlow + edgeFlow[B]) + outFlow[A].

Generally the following expressions create linear expressions:

  • A named variable specifier.

  • An indexed named variable specifier.

  • Addition or subtraction of two linear expressions.

  • Multiplication of an integer or floating-point number with a linear expression.

  • Negation of a linear expression.

  • A default-constructed hyperflow.LinExp/hyperflow::LinExp represents the linear expression of value 0.

The last case can be quite useful if you need to dynamically build a linear expression, e.g., the sum of all inFlow of vertices satisfying some predicate:

expr = hyperflow.LinExp()
for v in flow.dg.vertices:
        if myPredicate(v):
                expr += inFlow[v]

10.3.3. Linear Constraints

Similar to how linear expressions can be created using ordinary operators, so can linear constraints. They are created by using one of the operators ==, <=, and >= with a linear expression on one side and either an integer or floating-point number on the other side. The constraints can be pased to hyperflow.Model.addConstraint()/hyperflow::Model::addConstraint(), as shown in the example above.

10.4. Objective Function

When solutions are found to the specified model, they are enumerated in order of optimality, with respect to the objective function. It is set by giving a linear expression to hyperflow.Model.objectiveFunction/hyperflow::Model::setObjectiveFunction(), and a solution is then considered better than another if it has a lower objective value, i.e., the objective function is minimized by the internal solver. So, if you want to maximize instead, simply negate your linear expression.

If no objective function is set, a default is determined by the enabled modules. Each module will add some linear expression to the objective function if it is enabled. If the objective function is explicitly specified by the user, it is not modified by the modules.

Module

Added Linear Expression

Base

edgeFlow \(+\) inFlow

Overall Autocatalysis

isOverallAutocata

Overall Catalysis

isOverallCata

Thermodynamics

\(0\)

10.5. Extending with Additional Variables

The base model and the extension modules each define one or more sets of variables. You can also extend the model by addding additional variables, either boolean, integer, or floating-point variables. This is done through hyperflow.Model.addBoolVariable() / hyperflow::Model::addBoolVariable(), hyperflow.Model.addIntVariable() / hyperflow::Model::addIntVariable(), and hyperflow.Model.addFloatVariable() / hyperflow::Model::addFloatVariable(), respectively. Each of these functions need a unique name for the variable to be created, and will then return a variable specifier (specifically an object of type hyperflow.VarCustom/hyperflow::VarCustom), which can then be used as any of the variable specifiers from the built-in modules.

10.6. Overall Autocatalysis Module

Access: overallAutocatalysis (Python) / overallAutocatalysis (C++)

If you use the overall autocatalysis module in your research you may want to cite [AFMS-Hyperflows] and [AFMS-Autocata].

This module adds one type of model of autocatalysis which requires a certain relation between the input and output flow, i.e., the model adds a requirement on the overall reaction each flow solution implements.

Specifically, the module introduces the variable specifier isOverallAutocata which represents a set of indicator variables; \(z^a_v\) for each vertex \(v\in V\). Each variable is constrained such that \(z^a_v\) is 1 if and only if \(0 < f(e^{in}_v) < f(e^{out}_v\). That is, a vertex is considered overall autocatalytic if it has non-zero in-flow, and has greater out-flow than in-flow.

When the module is enabled it will disable reversal of flow through pairs of reverse hyperedges (hyperflow::Model::setAllowReversal() / hyperflow.Model.allowReversal), including the IO edges (hyperflow::Model::setAllowIOReversal() / hyperflow.Model.allowIOReversal).

10.6.1. Breadth-First Exclusivity

The core constraints of the module mere requires that some compound is consumed and then produced in a higher quantity. In some cases, this can lead to surprising results, e.g., the one in the figure below.

Figure made with TikZ

Example of an overall autocatalytic flow which is not really chemicall autocatalytic. The compound \(C\) is the overall autocatalytic because it is put into the network once, and two copies are produced.

One way to rule this flow out is to refine autocatalysis to define that the autocatalytic compound must be exclusively overall autocatalytic, and there can not be a way to create it without it self being put in. That is, let \(F\subseteq V\) be the set of vertices with allowed input flow, then for each \(v\in F\), it is not allowed to be overall autocatalytic if it can be reached by a hyper-breadth-first traversal of the network, starting from \(F\backslash \{v\}\). Equivalently, consider a flow model where all vertices are allowed to have output flow, and only vertices in \(F\backslash \{v\}\) are allowed to have input flow, if there exists a flow which has output flow from \(v\), then \(v\) is not allowed to be overall autocatalytic in the original model.

This pre-computation to disallow vertices to be overall autocatalytic is enabled by default, but can be disabled with hyperflow::Model::OverallAutocatalysis::setBFSExclusive() / hyperflow.Model.OverallAutocatalysis.bfsExclusive, which, depending on the use-case, in particular is relevant in cases where the set of source compounds is very large.

Note, the pre-computation is not affected by custom constraints. If a vertex should not be considered part of the network for this pre-computation, use hyperflow::Model::exclude() / hyperflow.Model.exclude().

10.6.2. Strict Transit

The core constraints of the module mere requires that some compound is consumed and then produced in a higher quantity. However, that does not mean that such a compound can not servce as an intermediary compound as well, which can be considered misleadning. An example of such a flow is shown in the figure below.

Figure made with TikZ

Example of an overall autocatalytic flow where an overall autocatalytic compound is also an intermediary compound. The shown network is the expanded network, and \(A\), is the overall autocatalytic compound. As \(A\) also has transit flow from the network that goes into the network again, it is also an intermediary compound.

We can create a related flow without any network-to-network transit flow by re-routing the transit flow to become input and output instead. This is illustrated in the figure below.

Figure made with TikZ

A flow related to the one in the previous figure, with the same edge flow, but without the overall autocatalytic compound \(A\) being an intermediary.

By default the module adds constraints such that if a vertex \(v\in V\) is overall autocatalytic then it can not have any network-to-network transit flow, all flow through \(v\) must come from the input edge or go to the output edge.

The addition of these constraints can be disabled with hyperflow::Model::OverallAutocatalysis::setStrictTransit() / hyperflow.Model.OverallAutocatalysis.strictTransit.

10.7. Overall Catalysis Module

Access: overallCatalysis (Python) / overallCatalysis (C++)

If you use the overall catalysis module in your research you may want to cite [AFMS-Hyperflows].

This module adds one type of model of catalysis which requires a certain relation between the input and output flow, i.e., the model adds a requirement on the overall reaction each flow solution implements.

Specifically, the module introduces the variable specifier isOverallCata which represents a set of indicator variables; \(z^c_v\) for each vertex \(v\in V\). Each variable is constrained such that \(z^c_v\) is 1 if and only if \(0 < f(e^{in}_v) = f(e^{out}_v\). That is, a vertex is considered overall atalytic if it has non-zero in-flow, and has the same out-flow as in-flow.

When the module is enabled it will disable reversal of flow through pairs of reverse hyperedges (hyperflow::Model::setAllowReversal() / hyperflow.Model.allowReversal), including the IO edges (hyperflow::Model::setAllowIOReversal() / hyperflow.Model.allowIOReversal).

10.7.1. Strict Transit

Similarly to the Overall Autocatalysis Module, this module also has constraints for restricting the transit flow of overall catalytic vertices.

The strict transit constraints are enabled by default, but can be disabled with hyperflow::Model::OverallCatalysis::setStrictTransit() / hyperflow.Model.OverallCatalysis.strictTransit.

10.8. Thermodynamics Module

Access: thermodynamics (Python) / thermodynamics (C++)

If you use the overall catalysis module in your research you may want to cite [TODO-Thermodynamics].

Todo

Write section when it’s ready.