As all communication with our devices takes place over the network, the high-resolution event streams are susceptible to network jitter, also known as packet delay variation. If the network jitter is large enough, it may even be noticeable in the audio output. Here we elaborate on the countermeasures we take on our devices to handle network jitter.
Network jitter and clock synchronization
When looking at the Chimaera setup routines, one can see that we offer three possibilities to (not) handle network jitter:
- No clock synchronization and immediate OSC message dispatch
- SNTPv4 clock synchronization and OSC bundle timestamping
- PTPv2 clock synchronization and OSC bundle timestamping
To better illustrate the differences, we have taken some measurements as follows:
OSC bundles were sent from the Chimaera at a steady rate of 2 kHz via a network switch to a Linux host (rt-kernel and rt-prioritized network interrupt thread). In the case of no clock synchronization, OSC bundles were not timestamped and were executed immediately at the host side upon reception. In the other two cases, the clock of the device was synchronized to the host via either SNTPv4 or PTPv2, and OSC bundles were timestamped to be executed 2 ms in the future at the host side. You may want to read up on OSC timestamping first.
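To make the "2 ms in the future" timestamp concrete, here is a minimal sketch of how such an OSC time tag could be encoded. OSC time tags use the 64-bit NTP fixed-point format (32 bits of seconds since 1900-01-01, 32 bits of fractional seconds). The function name `osc_timetag` and its parameters are hypothetical and are not taken from the Chimaera firmware.

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


def osc_timetag(unix_ns: int, delay_ns: int = 2_000_000) -> bytes:
    """Encode (unix_ns + delay_ns) as a big-endian 64-bit OSC time tag.

    unix_ns  -- current time in nanoseconds since the Unix epoch
    delay_ns -- how far in the future the bundle should execute
                (default 2 ms; hypothetical parameter name)
    """
    total_ns = unix_ns + delay_ns
    seconds, rem_ns = divmod(total_ns, 1_000_000_000)
    # Convert the sub-second remainder to a 32-bit binary fraction.
    fraction = (rem_ns << 32) // 1_000_000_000
    return struct.pack(">II", (seconds + NTP_UNIX_OFFSET) & 0xFFFFFFFF, fraction)
```

On a synchronized sender one might call `osc_timetag(time.time_ns())` and place the resulting 8 bytes right after the `#bundle` header. Integer nanoseconds are used here instead of floats to avoid losing sub-microsecond precision.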
The events were deserialized and injected into a 96 kHz audio stream, and the time difference in ms between consecutive OSC bundles was recorded. At the device, the difference between two consecutive OSC bundles was exactly 0.5 ms (1 / 2 kHz). Without network jitter, the difference would be identical at the host. But network jitter is a real issue, as can be seen in the graphs below, and we thus need to take it into account.
The time difference between two consecutive OSC bundles was 0.5019 ms ± 0.0251 ms (SD) when clocks were not synchronized. The time difference thus scatters around the expected 0.5 ms with a standard deviation of 0.0251 ms. This scatter is the network jitter.
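As an illustration of the statistic reported above, the following sketch computes the mean and standard deviation of the differences between consecutive arrival times. The function name and the input format (a list of dispatch times in ms) are assumptions, as is the use of the sample standard deviation; the original measurement scripts are not shown in this text.

```python
import statistics


def interarrival_stats(arrival_ms):
    """Return (mean, SD) of the differences between consecutive arrival times.

    arrival_ms -- dispatch times of successive OSC bundles in milliseconds
                  (hypothetical input format; needs at least three entries)
    """
    # Difference between each bundle and its predecessor.
    diffs = [b - a for a, b in zip(arrival_ms, arrival_ms[1:])]
    # Sample standard deviation (n - 1 denominator); an assumption.
    return statistics.mean(diffs), statistics.stdev(diffs)
```

For a jitter-free 2 kHz stream, every difference is 0.5 ms and the standard deviation is zero; any scatter in the arrival times shows up directly in the second return value.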
When clocks are synchronized, though, and OSC bundles are timestamped to be executed in the future, we can get rid of most of the network jitter. The time difference between two consecutive OSC bundles was 0.5019 ms ± 0.0096 ms (SD) for SNTPv4 synchronized clocks and 0.5021 ms ± 0.0059 ms (SD) for PTPv2 synchronized clocks.
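Timestamping removes jitter because the host no longer acts on a bundle when it arrives, but when its timestamp falls due. The host-side implementation is not described here, so the following is only a minimal sketch of one way to do this: events are queued by their due sample and released inside the audio block in which they fall. The class `EventScheduler` and all its method names are hypothetical.

```python
import heapq

SAMPLE_RATE = 96_000  # audio sample rate used in the measurements


class EventScheduler:
    """Queue timestamped events and release them per audio block."""

    def __init__(self):
        self._queue = []  # min-heap of (due_sample, sequence, payload)
        self._seq = 0     # tie-breaker that keeps insertion order stable

    def push(self, due_time_s, payload):
        """Queue an event that is due at due_time_s seconds of stream time."""
        due_sample = round(due_time_s * SAMPLE_RATE)
        heapq.heappush(self._queue, (due_sample, self._seq, payload))
        self._seq += 1

    def pop_due(self, block_start, block_size):
        """Return (offset_in_block, payload) for events due in this block."""
        block_end = block_start + block_size
        due = []
        while self._queue and self._queue[0][0] < block_end:
            due_sample, _, payload = heapq.heappop(self._queue)
            # Events that are already late fire at the start of the block.
            offset = max(0, due_sample - block_start)
            due.append((offset, payload))
        return due
```

As long as a bundle arrives before its due time (here, within the 2 ms of headroom), the order and delay of network delivery no longer affect the sample at which the event is applied. What remains is the error of the clock synchronization itself.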
As the events will ultimately trigger or control audio, and we are handling audio at 96 kHz (one sample lasts about 0.0104 ms), the event dispatch for the two cases with synchronized clocks is off by 1 sample at most. We thus have almost sample-accurate control signals at 96 kHz, which is awesome. In the case of no clock synchronization, event dispatch can be off by up to 10 samples. As we are dealing with control signals here, both are just fine.
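The conversion from a jitter value in milliseconds to audio samples is a simple multiplication by the sample rate. A short sketch, with a hypothetical helper name, applied to the three standard deviations reported above:

```python
SAMPLE_RATE_HZ = 96_000  # audio sample rate used in the measurements


def ms_to_samples(jitter_ms):
    """Convert a time span in milliseconds to a (fractional) sample count."""
    return jitter_ms * SAMPLE_RATE_HZ / 1000.0


# Standard deviations from the text, in milliseconds.
measured_sd_ms = {"none": 0.0251, "SNTPv4": 0.0096, "PTPv2": 0.0059}
sd_in_samples = {name: ms_to_samples(sd) for name, sd in measured_sd_ms.items()}
```

This gives standard deviations of roughly 2.4 samples without synchronization, 0.9 samples with SNTPv4 and 0.6 samples with PTPv2. Note that these are standard deviations, not worst-case deviations.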
Consider, however, that we have an almost direct connection (only via a switch) between device and host and are using Linux in a setup optimized for rt audio and networking. For a less well configured host, a much more complex network topology or another operating system, network jitter with no clock synchronization may increase to ranges where it can become noticeable in the audio output.
We thus propose to take advantage of clock synchronization whenever possible. Where a PTPv2 server implementation is available (e.g. linuxptp), it should be given preference over SNTPv4.
Network jitter with no clock synchronization and thus no OSC bundle timestamping.
Network jitter with clock synchronization via SNTPv4 and OSC bundle timestamping as a countermeasure.
Network jitter with clock synchronization via PTPv2 and OSC bundle timestamping as a countermeasure.