Use Zef Graphs in NetworkX

If you have existing graph analysis code written for NetworkX, or you want to run an analysis on a Zef graph that is not currently possible within the Zef ecosystem, you may want a "view" on the Zef graph that mimics the Python NetworkX graph types.

The proxy object zef.experimental.networkx.ProxyGraph presents a NetworkX-style interface, which is compatible with many of the NetworkX algorithms.

Note that this proxy object is lazy. It does not make a copy of the Zef graph, and accesses the graph data only when requested through the NetworkX proxy.
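To illustrate what "lazy" means here, the following is a minimal sketch of a read-only view that holds a reference to its backing data and reads it only on lookup. It is not how ProxyGraph is implemented; the class name LazyAdjacencyView and the dictionary backing store are hypothetical stand-ins chosen for illustration.

```python
from collections.abc import Mapping


class LazyAdjacencyView(Mapping):
    """Hypothetical read-only view over an adjacency dict; nothing is copied."""

    def __init__(self, backing):
        self._backing = backing  # keep a reference, not a copy
        self.reads = 0           # counts how often the backing data is consulted

    def __getitem__(self, node):
        # The backing data is touched only when a lookup is requested.
        self.reads += 1
        return tuple(self._backing[node])

    def __iter__(self):
        return iter(self._backing)

    def __len__(self):
        return len(self._backing)


# Stand-in for the graph data behind the view.
backing = {"alex": ["bob", "charlie"], "bob": [], "charlie": []}
view = LazyAdjacencyView(backing)
```

Because the class derives from Mapping and defines no __setitem__, item assignment on the view raises TypeError, which loosely mirrors the read-only nature of a proxy.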

Creating a proxy object

Starting with some dummy data:

from zef import *
from zef.ops import *

zg = Graph()
[
    # Four people, each with a name
    (ET.Person["alex"], RT.Name, "Alex"),
    (ET.Person["bob"], RT.Name, "Bob"),
    (ET.Person["charlie"], RT.Name, "Charlie"),
    (ET.Person["doug"], RT.Name, "Doug"),

    # Relations between people; "rel" labels the alex-bob friendship
    (Z["alex"], RT.FriendsWith["rel"], Z["bob"]),
    (Z["alex"], RT.FriendsWith, Z["charlie"]),
    (Z["bob"], RT.RivalsWith, Z["charlie"]),
    (Z["bob"], RT.RivalsWith, Z["doug"]),

    # Fields attached to the "rel" relation itself
    (Z["rel"], [(RT.Since, now()),
                (RT.MetAt, "Gym"),]),
] | transact[zg] | run

we can construct a proxy NetworkX graph of the friends network:

import networkx as nx
from zef.experimental.networkx import ProxyGraph

dg = ProxyGraph(now(zg), ET.Person, RT.FriendsWith)
ug = ProxyGraph(now(zg), ET.Person, RT.FriendsWith, undirected=True)

where dg is a directed graph (nx.DiGraph) and ug is undirected (nx.Graph). Both consist of the subgraph made up of only ET.Person entities and RT.FriendsWith relations.

It is possible to have different views simultaneously. For example:

dg_all = ProxyGraph(now(zg), ET.Person)
ug_all = ProxyGraph(now(zg), ET.Person, undirected=True)

will treat every relation between ET.Person entities as an edge; that is, RT.FriendsWith and RT.RivalsWith relations are not distinguished.
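The difference between these views can be reproduced with plain NetworkX, without Zef. The sketch below is an analogy only: the triples list, the string node labels, and the build_view helper are hypothetical and are not part of the ProxyGraph API. It filters edges by relation type (or keeps all of them) and builds either a directed or an undirected graph.

```python
import networkx as nx

# Hypothetical triples mirroring the dummy data: (source, relation, target).
triples = [
    ("alex", "FriendsWith", "bob"),
    ("alex", "FriendsWith", "charlie"),
    ("bob", "RivalsWith", "charlie"),
    ("bob", "RivalsWith", "doug"),
]
people = ["alex", "bob", "charlie", "doug"]


def build_view(relation=None, undirected=False):
    """Build a native graph over `people`, keeping only `relation` edges (all if None)."""
    g = nx.Graph() if undirected else nx.DiGraph()
    g.add_nodes_from(people)  # every person is a node, even without edges
    g.add_edges_from((s, t) for s, r, t in triples if relation in (None, r))
    return g


friends = build_view("FriendsWith")                        # analogous to dg
friends_undirected = build_view("FriendsWith", undirected=True)  # analogous to ug
everyone = build_view()                                    # analogous to dg_all
```

In the directed friends graph the edge runs only from alex to bob, while the undirected one can be queried in either direction. Doug is a node in all views but has an edge only when rival relations are included.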

Note that proxy views are of a GraphSlice, and so are immutable and will not advance with the Zef graph head.

Node/edge properties

Any AETs on the entities are interpreted as fields:

>>> for node in dg.nodes:
...     print(f"{node} has name {dg.nodes[node]['Name']}")

Node(#96) has name Alex
Node(#126) has name Bob
Node(#134) has name Charlie
Node(#142) has name Doug

Nodes are simple wrappers around a ZefRef object:

>>> for node in dg.nodes:
...     print(f"{node} is proxy for ZefRef {node.z}")

Node(#96) is proxy for ZefRef <ZefRef #96 ET.Person slice=2>
Node(#126) is proxy for ZefRef <ZefRef #126 ET.Person slice=2>
Node(#134) is proxy for ZefRef <ZefRef #134 ET.Person slice=2>
Node(#142) is proxy for ZefRef <ZefRef #142 ET.Person slice=2>

Edges can similarly possess fields:

>>> # We can also do lookups with ZefRefs
>>> z_alex = zg | now | all[ET.Person] | select_by_field[RT.Name]["Alex"] | collect
>>> z_bob = zg | now | all[ET.Person] | select_by_field[RT.Name]["Bob"] | collect
>>> info = dg[z_alex][z_bob]
>>> print(f"Edge information: {info}")

Edge information: {'Since': <Time 2022-03-09 12:50:19 (+0800)>, 'MetAt': 'Gym'}

Simple characterisations

Many simple NetworkX analysis functions will work directly on these graphs:

>>> nx.node_connectivity(ug_all)

1
>>> list(nx.connected_components(ug))

[{Node(#126), Node(#96), Node(#134)}, {Node(#142)}]
>>> nx.greedy_color(ug_all)

{Node(#126): 0, Node(#96): 1, Node(#134): 2, Node(#142): 1}
>>> nx.shortest_path(dg_all)

{Node(#96): {Node(#96): [Node(#96)],
             Node(#126): [Node(#96), Node(#126)],
             Node(#134): [Node(#96), Node(#134)],
             Node(#142): [Node(#96), Node(#126), Node(#142)]},
 Node(#126): {Node(#126): [Node(#126)],
              Node(#134): [Node(#126), Node(#134)],
              Node(#142): [Node(#126), Node(#142)]},
 Node(#134): {Node(#134): [Node(#134)]},
 Node(#142): {Node(#142): [Node(#142)]}}

Many other NetworkX features work as above.
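To sanity-check results like these without a Zef installation, the same network can be rebuilt as native NetworkX graphs. This is a sketch under the assumption that string labels stand in for the Node wrappers; the variable names mirror the views above but these are ordinary nx.Graph and nx.DiGraph objects, not ProxyGraph instances.

```python
import networkx as nx

people = ["alex", "bob", "charlie", "doug"]
friend_edges = [("alex", "bob"), ("alex", "charlie")]
rival_edges = [("bob", "charlie"), ("bob", "doug")]

# Undirected friends-only graph (stand-in for ug)
ug = nx.Graph()
ug.add_nodes_from(people)
ug.add_edges_from(friend_edges)

# Undirected and directed graphs over all relations (stand-ins for ug_all, dg_all)
ug_all = nx.Graph()
ug_all.add_nodes_from(people)
ug_all.add_edges_from(friend_edges + rival_edges)

dg_all = nx.DiGraph()
dg_all.add_nodes_from(people)
dg_all.add_edges_from(friend_edges + rival_edges)

connectivity = nx.node_connectivity(ug_all)
components = sorted(sorted(c) for c in nx.connected_components(ug))
path = nx.shortest_path(dg_all, "alex", "doug")

print(connectivity)  # 1
print(components)    # [['alex', 'bob', 'charlie'], ['doug']]
print(path)          # ['alex', 'bob', 'doug']
```

The values agree with the proxy output shown above: doug hangs off bob by a single edge, so removing one node disconnects the full graph, and doug is isolated once only friendships are considered.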

Complex algorithms

Many of the NetworkX algorithms need to build or modify a temporary graph of their own to compute the output. This fails on a ProxyGraph because it is immutable. To work around this, make a copy of the proxy object as a pure NetworkX object using to_native():

nx.minimum_spanning_tree(ug.to_native())
nx.maximum_branching(dg.to_native())
nx.average_clustering(dg.to_native())

The nodes in a graph returned by to_native() are still thin wrappers around a ZefRef and can be used to get back to the original graph.
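The copy-before-mutating pattern can be tried in plain NetworkX using a frozen graph as a rough analogy for a read-only proxy. This sketch assumes nothing about how ProxyGraph enforces immutability or how to_native() is implemented; nx.freeze and the nx.Graph copy constructor are used purely as stand-ins.

```python
import networkx as nx

# A small undirected graph, frozen to act as a read-only stand-in.
g = nx.Graph()
g.add_edges_from([("alex", "bob"), ("alex", "charlie"),
                  ("bob", "charlie"), ("bob", "doug")])
frozen = nx.freeze(g)

# Any attempt to modify the frozen graph raises NetworkXError.
try:
    frozen.add_edge("doug", "alex")
    mutation_failed = False
except nx.NetworkXError:
    mutation_failed = True

# Constructing a new graph from the frozen one yields a mutable copy,
# comparable in spirit to calling to_native() on a proxy.
native = nx.Graph(frozen)
native.add_edge("doug", "alex")  # allowed on the copy

# Algorithms can then be run on the mutable copy.
mst = nx.minimum_spanning_tree(native)
```

The original frozen graph is unaffected by changes to the copy, so the read-only view and the working copy can be used side by side.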