When microseconds are 'not fast enough'

Posted on Feb 22, 2019

video-game minetest yatm

What?

So a recent endeavor of mine was to work on my own minetest mod.

Boringly named Yet Another Tech Mod (abbreviated YATM), it’s same name as my unfinished minecraft mod, heck it’s the same thing, just a different game.

YATM?

Yes, YATM (I pronounce it as Yat-Em, you may do whatever you want, heck even Why-Eh-Tee-Em is fine in my book).

Okay, so what’s the problem with microseconds?

To quickly getting some basic knowledge out of the way:

YATM has a system called the “YATM network”, this system handles power transmission, grid solving and all that complex boring stuff.

It also has the job of handling tick updates for devices.

“Wait, you have a system that handles tick updates per device, why not just use the ABMs?”

Is a valid question, and my answer is: because it doesn’t fit with YATMs execution model.

But what does that even mean?

Now ABMs do a few things:

They determine when to execute an action
They have a specific function (the action) that is executed
Conditions under which it should be executed (i.e. what nodes to operate on, and what neighbours are required)
How often it should happen (i.e. interval)

But ABMs have one glaring issue:

The execute on one node at a time.

This makes power transmission very, very confusing.

The execution order of nodes is not guaranteed, you can’t have transient states with all nodes involved in the network. Transferring power from one node to another becomes a sequence of slow and complex rules.

To make it somewhat understandable, I tried to adapt a simple electrical rule: “Power flows through the path of least resistance.”

That is, power should flow to any node with less power than itself, however in practice, this presents a painful problem that plagued my original minecraft implementation: The “Glass is half empty, or half full” problem as I like to call it.

An example of this problem is: a node with ‘51 units’ of power will attempt to move it’s energy to a node with ‘50 units’.

What should happen to the energy in this scenario?

Do we transfer 1 unit to the target node? No, if we did that, when that node updates it will just transfer it back to it’s parent more than likely.
Do we establish a rule that says a node should not transfer power to a node if that node would become greater than the sender? It’s possible, BUT that requires some interface or knowledge of the target node
Do nothing? No, that breaks the system
Ditch this idea and try something else? Yes, that is exactly what I did.

Some may ask, what’s wrong with the solution in #2?

It requires that every node now either implement a ‘push’ based behaviour (i.e. send energy to this node) or a ‘pull’ based behaviour (i.e. take energy from this node) to handle energy.

Now imagine this running at the scale of several hundred nodes, and imagine doing that every frame.

You could say “well it would be pretty fast”, and it could be, but this would only be power transmission, we haven’t even touched on devices that need to do actual work with that power obtained.

Suddenly a 100 μs (microseconds) job turns into several hundred microseconds just for transmission and too much busy work for nodes that only serve a single purpose of transferring power in a ‘virtual’ circuit.

There is some advantage to this, you could have different nodes in the transmission affect the transmitted power, but YATM doesn’t really have time to fiddle around that, it just wants power, and wants to get things done.

So your solution?

Register all devices on a network once they arrive, it will detect every device on the network and then ‘index’ or group these nodes based on various tags.

This allows quite a few things:

Do you want all the energy_producers? You can grab the nodes registered under energy_producer
Want all the machines? You grab the machines group
Want all of x? You grab group of x

This is also used to determine all the power generated within a network.

But this method also has it’s own caveats: adding and removing nodes causes a massive refresh of the topography.

Granted, the nodes wouldn’t be added and removed so frequently to impact performance, but it is something to remember.

Every time a node is added to the network, or removed from the network, then the node must trigger a network refresh from it’s current location.

This refresh will do 2 main things:

Determine what nodes have been added/removed from the network due to the introduction or removal of a node
Detect any controller conflicts (i.e. a network cannot contain multiple controllers) and mark the network as such.

Okay, so you have a single network ‘service’ then

Exactly, using this service I can handle the annoyances (called power transmission and other forms of dispatching)

Now, this system works great, except for one problem.

lua just isn’t fast enough, hence the title of this article.

Granted, I can update a network with 30+ devices within 800 μs, but that is without all of the update functions and jobs being implemented, the biggest worry is this will climb up to several milliseconds.

And if you haven’t guessed it yet, anything over a millisecond is bad.

Every frame has to complete with 16 ms (that’s how much time we have per frame to do work).

And if one network takes that long, it’s gonna be a problem when there are more networks.

Granted there probably wouldn’t be that many large scale networks, and if it was, it would be running on better hardware, not my 8 year old laptop with a throttled CPU.

That won’t stop me from optimizing the crap out of it, if I can get that 800 μs done to even half, it would garner a huge performance boost.

But, it is too soon to be thinking about optimizing the code, once all the devices have their full implementations, then I can determine just how badly it’s performing.