When the bridge sends a refresh message and a change in services (advertise/unadvertise) happens at the same time, the delta and refresh messages might be reordered.
Because tpbridge is multi-threaded and may reorder messages, the ndrxd of another cluster node might see an inconsistent service table for some period of time: either a new service is not seen until the next full refresh arrives, or a service has already been removed by a delta (delta set to 0) but a reordered, older refresh message indicates that the service is still available.
Two approaches may be used to solve this:
- Remember the highest message sequence number.
  - If a message arrives with an older number, ignore it.
  - The downside is that some refreshes may be lost. For example, the bridge's initial full refresh message may be lost, and the node would then start working from deltas only.
- Wait for a certain time period for the missing message sequence numbers.
  - If they arrive within that time period, process the older messages in order.
  - Only if they have not arrived within that time period, drop the older messages and process only the newer ones.
  - This approach is more complex, but it is probably the most correct solution.
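
The second approach can be sketched roughly as follows. This is only an illustration in Python, not the tpbridge or ndrxd implementation: the class name `ReorderBuffer`, its methods and the `wait_seconds` parameter are all hypothetical. Messages that arrive ahead of a gap are held back, released in order once the gap is filled, and the gap is skipped only after the wait period expires. Anything that arrives after its gap has been given up on is dropped.

```python
import time


class ReorderBuffer:
    """Deliver messages in sequence order, waiting a bounded time for gaps.

    Hypothetical sketch: none of these names come from tpbridge or ndrxd.
    """

    def __init__(self, wait_seconds=1.0, clock=time.monotonic):
        self.wait_seconds = wait_seconds
        self.clock = clock
        self.next_seq = None   # next sequence number expected for delivery
        self.pending = {}      # seq -> (arrival_time, message) held back

    def push(self, seq, message):
        """Accept one message; return the messages that are now safe to process."""
        if self.next_seq is None:
            self.next_seq = seq    # the first message defines the starting point
        if seq < self.next_seq:
            return []              # already delivered or given up on: drop it
        self.pending.setdefault(seq, (self.clock(), message))
        return self._drain()

    def poll(self):
        """Call periodically so that an expired gap does not block delivery."""
        return self._drain()

    def _drain(self):
        ready = []
        while self.pending:
            if self.next_seq in self.pending:
                ready.append(self.pending.pop(self.next_seq)[1])
                self.next_seq += 1
                continue
            lowest = min(self.pending)   # lowest held sequence number
            if self.clock() - self.pending[lowest][0] < self.wait_seconds:
                break                    # still inside the wait period
            self.next_seq = lowest       # wait period expired: skip the gap
        return ready
```

The first approach is the degenerate case of this sketch with `wait_seconds=0`: a gap is skipped immediately and any older message arriving later is ignored.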
If this causes problems in practice, the value of the <brrefresh>BRIDGE_REFRESH_TIME</brrefresh> tag can be lowered so that full refreshes are sent more often.
The issue can be detected by searching the ndrxd logs for the following text:
but we see it our view - there was some packet
000.000007|ndrxd-dom1.log|N:NDRX:3:9211474f:09546:7ff5d7e8e900:000:20210326:180407632191:md_brrefresh:bridge.c:0243:Service [TESTSV2] not present in refresh message, but we see it our view - there was some packet loss! - Removing it from our view!
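
As a rough aid for monitoring, the sketch below counts how often each service was removed because of this condition. It assumes only the message wording shown in the log line above; the function name `count_refresh_losses` is hypothetical and the script is not part of Enduro/X.

```python
import re
from collections import Counter

# Pattern taken from the sample log line above; the service name is in brackets.
LOSS_RE = re.compile(r"Service \[([^\]]+)\] not present in refresh message")


def count_refresh_losses(lines):
    """Return a Counter mapping service name -> number of matching log lines."""
    hits = Counter()
    for line in lines:
        match = LOSS_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits
```

It can be fed any iterable of log lines, for example an open ndrxd log file. A steadily growing count suggests that lowering the refresh time is worth trying.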