MikroTik failover LTE

In a nutshell

A stable Internet connection is becoming more and more important, especially during COVID-19, when a large part of society has to work remotely. One way to ensure constant Internet access is to add a second ISP, and many people choose an LTE connection for this.

I came across many ideas on how to handle the second link, which is generally only supposed to be a spare. Unfortunately, none of them met my expectations, although all of them "worked".

In search of inspiration I found an article by Timo Puistay published in Server Management in 2017. The method described there, which detects the link state by extending the route and manipulating routing, proved quite efficient. However, it could not detect a failure when the main link suffers packet loss or a drastic increase in packet transit time. I decided to extend the idea to cover all my needs.

Assumptions

The guidelines are as follows:

  • Fast detection of loss of the main connection;
  • Detect increased delay on the main connection and treat it as a problem;
  • Detect an Internet connection failure beyond my local gateway at the ISP;
  • Detect recovery of the main link;
  • Switch from the main connection to the backup with a delay;
  • Full automation;
  • Use my MikroTik router.

Network diagram

[Figure: MikroTik failover LTE network diagram]

Script configuration - Netwatch

The solution relies on the Netwatch tool built into MikroTik RouterOS, which runs scripts when the monitored host changes state, with additional verification by extending the route and checking the availability of the default gateway.

/system script
add dont-require-permissions=no name=NetWatch-check owner=admin policy=\
reboot,read,write,policy,test source="#Here you can change the value\r\
\n# time in minutes for how long the BACKUP link will be preferred\r\
\n# before we check if the main link works\r\
\n#\r\
\n:global nwwait 15;\r\
\n# leave unchanged\r\
\n:global nwgw2;\r\
\n:tonum nwgw2;\r\
\n:local nwstatus;\r\
\n:local nwgwstatus;\r\
\n:set nwgwstatus ([/tool netwatch get value-name=status [find comment=\
\"NetWatch\"]]);\r\
\n:set nwstatus ([/ip route get value-name=distance number=[/ip route f\
ind comment=\"BACKUP\"]]);\r\
\n:if (\$nwstatus = \"6\") do={\r\
\n:set nwgw2 (\$nwgw2 + 1)\r\
\n}\r\
\n:if ((\$nwgw2 > \$nwwait) and (\$nwgwstatus = \"up\")) do={ :log err\
or \"Master GW: OK\"\r\
\n/ip route set [find comment=\"BACKUP\"] distance=66;\r\
\n:set nwgw2 (0)\r\
\n}\r\
\n"
add dont-require-permissions=no name=NetWatch owner=admin policy=\
reboot,read,write,test source="/log error \"Master GW: PROBLEM\"\r\
\n/ip route set [find comment=\"BACKUP\"] distance=6\r\
\n\r\
\n"
add dont-require-permissions=no name=NetWatch-init owner=admin policy=\
ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon \
source="#number of checks before the switch\r\
\n:global nwwait 20;\r\
\n"

Scripts used:

  • NetWatch: launched when the main link fails;
  • NetWatch-check: run periodically to verify whether to switch back to the main link. In this script we can define how long, counted from the moment of switching, the BACKUP link remains the preferred link before we check whether the main link works properly. The default is 15 minutes.
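For readability, the logic of NetWatch-check can also be written without the export escaping. The block below is a minimal sketch of the same idea, not the exact script from the listing: it assumes the route and the Netwatch entry carry the comments BACKUP and NetWatch as above, the name routeDistance is introduced only for illustration, and the explicit initialisation of the counter is an addition that the original does not have.

```routeros
# Minutes the BACKUP link stays preferred before the main link is re-checked
:global nwwait 15
# Counter of minutes spent on BACKUP; keeps its value between runs
:global nwgw2
# Assumption: start from 0 if the counter has never been set
:if ([:typeof $nwgw2] = "nothing") do={ :set nwgw2 0 }

# Current Netwatch state of the main gateway ("up" or "down")
:local nwgwstatus [/tool netwatch get [find comment="NetWatch"] status]
# Current distance of the BACKUP route (6 means BACKUP is preferred)
:local routeDistance [/ip route get [find comment="BACKUP"] distance]

# While BACKUP is preferred, count one more scheduler run
:if ($routeDistance = 6) do={ :set nwgw2 ($nwgw2 + 1) }

# After the waiting period, return to the main link if its gateway answers
:if (($nwgw2 > $nwwait) and ($nwgwstatus = "up")) do={
    # The original listing logs this message at error level
    :log info "Master GW: OK"
    /ip route set [find comment="BACKUP"] distance=66
    :set nwgw2 0
}
```

Because the scheduler runs the script once a minute, the counter effectively measures minutes spent on the BACKUP link.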

Timed activation of the script

The next step is to schedule the NetWatch-check script so that it runs every minute.

/system scheduler add interval=1m name=NetWatch on-event=\
"/system script run NetWatch-check\r\
\n" policy=\
ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon \
start-date=apr/11/2020 start-time=17:31:44

Now it is time to configure the MikroTik Netwatch tool. Replace A.B.C.D with the address of the main link's default gateway. The timeout value determines how sensitive the failure detection is: if the host does not reply within the timeout, Netwatch treats it as down.

/tool netwatch add comment=NetWatch down-script="/system script run NetWatch\r\n" host=A.B.C.D interval=5s timeout=100ms
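Once the entry exists, it may be worth checking by hand that the pieces fit together. The commands below are a sketch of such a check, assuming the comments and script names from the listings above. Running the NetWatch script manually simulates a main link failure.

```routeros
# Show the Netwatch entry and its current status
/tool netwatch print detail where comment="NetWatch"

# Simulate a failure: run the down-script by hand
/system script run NetWatch

# The BACKUP route should now show distance=6
/ip route print detail where comment="BACKUP"

# Review the messages written by the scripts
/log print where message~"Master GW"
```

After such a test the BACKUP route stays preferred until NetWatch-check restores the distance, which happens only after the configured waiting period.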

Routing configuration

Finally, configure the routing, using the trick of extending the route and checking the availability of the default gateway. Replace A.B.C.D with the default gateway address of the main link and E.F.G.H with the default gateway address of the BACKUP link. The addresses 10.255.66.1 and 10.255.67.1 in the listing appear to serve as intermediate next hops that are resolved through the monitored hosts, not as real gateways.

/ip route
add comment=MASTER distance=10 gateway=10.255.66.1
add comment=BACKUP distance=66 gateway=10.255.67.1
add check-gateway=ping distance=1 dst-address=10.255.66.1/32 gateway=\
208.67.220.220 scope=10
add check-gateway=ping distance=1 dst-address=10.255.66.1/32 gateway=\
8.8.8.8 scope=10
add check-gateway=ping distance=1 dst-address=10.255.67.1/32 gateway=\
208.67.222.222 scope=10
add check-gateway=ping distance=1 dst-address=10.255.67.1/32 gateway=\
8.8.4.4 scope=10
add distance=1 dst-address=8.8.4.4/32 gateway=E.F.G.H scope=10
add distance=1 dst-address=8.8.8.8/32 gateway=A.B.C.D scope=10
add distance=1 dst-address=208.67.220.220/32 gateway=A.B.C.D scope=10
add distance=1 dst-address=208.67.222.222/32 gateway=E.F.G.H scope=10

We use the well-known DNS server addresses of Google (8.8.8.8, 8.8.4.4) and OpenDNS (208.67.220.220, 208.67.222.222) and check whether they are reachable through a given operator's link. If they are, the default route for that link is activated.
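To see what the router has decided, the routing state can be inspected from the terminal. This is a small sketch using the addresses and comments from the listing above. Each monitored host is pinned to one link by its /32 route, so a ping to it also tests that link.

```routeros
# 8.8.8.8 is routed via the main link, 208.67.222.222 via the backup link
/ping 8.8.8.8 count=3
/ping 208.67.222.222 count=3

# List the default routes; the one currently in use is flagged as active (A)
/ip route print where dst-address=0.0.0.0/0

# Show the BACKUP route with its current distance (66 normally, 6 after a failure)
/ip route print detail where comment="BACKUP"
```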

Summary and conclusions

In the above solution we verify the Internet link twice: once by checking the default gateway of the main link, and again by checking the availability of hosts "far in the Internet" reached through specific links. The solution has one limitation: if an operator changes the default gateway address when renegotiating the connection, we cannot set that gateway directly in a static configuration. In that case, use a separate router for this connection, as in the diagrams below, and point to it as the connection gateway.

[Figure: MikroTik failover scheme 2]
[Figure: MikroTik failover scheme 3]

I am convinced that the solution can be improved and developed further; the scripts I have prepared can be written differently while keeping the main principle of the solution.
For further discussion, please visit our FORUM, where you can share your configuration and comments with the whole community.

Author: Wojciech Repiński
