It’s generally good practice to make sure that any network you’re responsible for maintaining can still be reached if the main Internet (transit) connection fails, or if the routing equipment fails or is misconfigured. Sometimes it’s not feasible to have a second transit connection or redundant networking hardware, and so you need to get creative.

One of my clients is a not-for-profit with constrained financial resources. We wanted to have a way in to the network in the event of a failure of the main router, or in case someone (likely me) fat-fingers something and breaks the router config. And, while having a second transit connection would be nice, it’s just not something we can fit in the budget at the moment.

So, we had to get creative.

Before I came on board, they had purchased a consumer-grade LTE modem with the intention of using that as the backup access into the network, but hadn’t actually set it up yet. This blog post covers the steps I took to get it working.

Overview

The data centre in question is in the US, so we’re using a simple T-Mobile pay-as-you-go data service. This service is designed for outgoing connections, and doesn’t provide a publicly-reachable IP address that I could ssh to from outside the LTE network, so I need to set up some sort of tunnel to give me an endpoint on the Internet I can connect to that leads inside the client’s network. ssh itself is the obvious choice to set up that tunnel.

I’ve set the tunnel up to provide access to one of the client’s administrative hosts, which has direct serial access to about half the network equipment (including the main router). From that vantage point I should be able to fix most configuration issues that would prevent me from accessing the network through the normal transit connection, and can troubleshoot upstream transit problems as if I were standing there in the data centre.

The modem can be put into bridge mode, but can still have an IP address to manage its configuration. The LTE network wants to use DHCP to give our server an address. So, we’ll have the slightly unusual configuration of having both a static and DHCP address on the server interface that the modem is connected to. The server has other duties though, so we’ll have to make sure that things like the default route and DNS configuration aren’t overwritten; that requires some extra changes to the DHCP client config.

And finally, for the tunnel to work we need a host somewhere out on the Internet that we can still reach when the ‘home’ network goes down. In the rest of this post I’m going to refer to our local administrative host as HOST_A and the remote host we’re using for a tunnel endpoint as HOST_B. We’ll need some static routes on HOST_A that send all traffic for HOST_B through the LTE network, and then we can construct the ssh tunnel which we’ll use to proxy back into HOST_A.

Setting up the Modem

The modem we’re using is a Netgear LB2120 LTE Modem with an external antenna, to get around any potential interference from the cabinet itself or the computer equipment and wiring inside. We have pretty good reception (4-5 bars) from just placing the antenna on top of the cabinet.

The modem’s LAN port is connected directly to an ethernet port on HOST_A. We could also have run that connection through a VLAN on our switches, but since the modem and the server are in the same cabinet that would only serve to increase the possible ways this could fail, while providing no benefit. The main point here is that the modem is going to provide its own network, so it’s best not to have it on the same physical network (or VLAN) with other traffic.

This modem is designed to be able to take over in the event of the failure of a terrestrial network, which is what the WAN port is used for. But we don’t want to use that here, so that port is left empty.

Connect to the modem’s web interface (for this model, the default IP address and password are printed on the back).

In the Settings:Mobile tab, take a look at the APN details. This probably defaults to IPv4 only, so if you want to try to get IPv6 working (more on that later) you’ll have to update the PDP and PDP Roaming configuration here. In the Advanced tab, you want to put the modem into Bridge mode (which will also disable the DHCP server), and you may want to give it a different static address. The modem’s default network overlaps with private address space we already use, so I’m going to use 172.16.0.0/30 as an example point-to-point network to communicate with the modem. For that, you’d set the modem’s IP address to 172.16.0.1 and its netmask to 255.255.255.252. Once you submit the configuration changes, the modem should restart.

Setting up the Server

The server needs to have a static IP address on the point-to-point network for configuring the modem as well as a DHCP address assigned by the LTE network. Because we may want to bring these up and down separately, I suggest putting the DHCP address on a virtual interface. You also need to configure a static route on the DHCP-assigned interface that points to HOST_B, so that any outbound traffic from HOST_A to HOST_B goes across the LTE network instead of using your normal Internet links. On a Debian host, /etc/network/interfaces.d/LTE.conf might look something like this:

auto eth3
iface eth3 inet static
   address 172.16.0.2/30

auto eth3:0
iface eth3:0 inet dhcp
   post-up ip route add 192.0.2.1/32 dev eth3:0
   post-down ip route del 192.0.2.1/32 dev eth3:0
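
Once the interface is up, it’s worth confirming that traffic for HOST_B really does leave via the modem. The following is a minimal sketch, assuming the example address 192.0.2.1 for HOST_B and the eth3 interface from above; route_uses_lte is a hypothetical helper name, not part of any tool. Note that ip reports the underlying device (eth3) rather than the eth3:0 alias label.

```shell
#!/bin/sh
# route_uses_lte is a hypothetical helper, not part of iproute2.
#   $1: output of `ip route get <address>`
#   $2: name of the interface the modem is plugged into
# Succeeds only if the route leaves through that interface.
route_uses_lte() {
    case "$1" in
        *" dev $2 "*|*" dev $2") return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on HOST_A (192.0.2.1 stands in for HOST_B, as in the examples above):
#   route_uses_lte "$(ip -4 route get 192.0.2.1)" eth3 && echo "HOST_B is routed via LTE"
```

If the check fails while the normal transit link is up, traffic to HOST_B is still following the default route, and the tunnel would go down along with the main connection.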

You’ll also need to modify /etc/dhcp/dhclient.conf to disable some of the changes that it normally makes to the system. The default request sent by the Debian dhclient includes the following options:

request subnet-mask, broadcast-address, time-offset, routers,
   domain-name, domain-name-servers, domain-search, host-name,
   dhcp6.name-servers, dhcp6.domain-search, dhcp6.fqdn, dhcp6.sntp-servers,
   netbios-name-servers, netbios-scope, interface-mtu,
   rfc3442-classless-static-routes, ntp-servers;

I’ve modified ours to remove the routers, domain-name, domain-name-servers, domain-search, dhcp6.name-servers, dhcp6.domain-search, dhcp6.fqdn, and dhcp6.sntp-servers options. You also need to block changes to /etc/resolv.conf. Even though you’ve told dhclient not to request those options, the server may still supply them and dhclient will happily apply them unless you explicitly tell it not to.

request subnet-mask, broadcast-address, time-offset, host-name,
   netbios-name-servers, netbios-scope, interface-mtu,
   rfc3442-classless-static-routes;

supersede domain-name "example.net";
supersede domain-name-servers 198.51.100.1, 198.51.100.2;
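
An alternative to the supersede lines, common on Debian, is to stop dhclient-script from rewriting /etc/resolv.conf at all by overriding its make_resolv_conf function in an enter hook. This is a sketch, not the configuration described above: the file name is made up, and it assumes dhclient is only used for the LTE interface on this host, since the override applies to every interface dhclient manages.

```shell
# Hypothetical file: /etc/dhcp/dhclient-enter-hooks.d/keep-resolv-conf
# dhclient-script sources the files in this directory before it acts on a
# lease, so redefining make_resolv_conf here replaces the stock function.
make_resolv_conf() {
    # Deliberately do nothing: leave /etc/resolv.conf exactly as it is.
    :
}
```

With the hook in place, whatever DNS settings the LTE network hands out are ignored, regardless of what the request line asks for.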

Setting up the Tunnel

For this, you want to create an unprivileged user that doesn’t have access to anything sensitive. For the purposes of this post I’ll call the user ‘workhorse’. Set up the workhorse user on both hosts; generate an SSH key without a passphrase for that user on HOST_A, and put the public half in the workhorse user’s authorized_keys file on HOST_B.

We’re going to use SSH to set up the tunnel, but we need something to maintain the tunnel in the event it drops for some reason. There is a handy programme called autossh which does the job well. In addition to setting up the tunnel we need for access to HOST_A, it will also set up an additional tunnel that it uses to echo data back and forth between HOST_A and HOST_B to monitor its own connectivity, and restart the tunnel if necessary. We can combine that monitor with SSH’s own ServerAliveInterval and ServerAliveCountMax settings to be pretty sure that the tunnel will be up unless there’s a serious problem with the LTE network or modem.

I’ve chosen to run autossh from cron on every reboot, so I created an /etc/cron.d/ssh-tunnel file on HOST_A that looks like this:

@reboot workhorse autossh -f -M 20000 -qN4 -o "ServerAliveInterval 60" -o "ServerAliveCountMax 3" -R '*:20022:localhost:22' HOST_B

The -f option backgrounds autossh. -M 20000 sets up a listening port at HOST_B:20000 which sends data back to HOST_A:20001 for autossh to use to monitor the connection. You can explicitly specify the HOST_A port as well, if you prefer. The remaining options are standard SSH options which autossh passes on. Note that in my case HOST_B has an IPv6 address, but I haven’t configured the tunnel interface for IPv6, so I’m forcing ssh to use IPv4.
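
As a rough guide to how long a dead link can go unnoticed: ssh sends a keepalive after ServerAliveInterval seconds of silence, and gives up once ServerAliveCountMax of them go unanswered. The arithmetic for the values used above, as a small sketch (the variable names are only for illustration):

```shell
#!/bin/sh
interval=60    # ServerAliveInterval from the cron entry
count_max=3    # ServerAliveCountMax from the cron entry

# Approximate time before ssh declares the connection dead.
detect_seconds=$((interval * count_max))
echo "ssh drops an unresponsive connection after roughly ${detect_seconds} seconds"
```

So with these settings the tunnel can be down for about three minutes before ssh exits and autossh starts a new one. Lowering either value shortens that window at the cost of a little more traffic on a metered connection.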

You may need to modify the sshd_config on HOST_B to set GatewayPorts yes, depending on the default configuration. Otherwise you won’t get a remotely accessible port on HOST_B.
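
To see what HOST_B’s sshd is actually using, sshd -T prints the effective configuration (it needs to run as root). Here is a small sketch that checks that output; gateway_ports_enabled is a hypothetical helper name. It also accepts clientspecified, which lets the client choose the bind address, as the '*' in the -R option above does.

```shell
#!/bin/sh
# gateway_ports_enabled is a hypothetical helper.
# Reads `sshd -T` output on stdin; succeeds if remote forwards are allowed
# to bind to non-loopback addresses.
gateway_ports_enabled() {
    grep -Eiq '^gatewayports[[:space:]]+(yes|clientspecified)[[:space:]]*$'
}

# Usage on HOST_B, as root:
#   sshd -T | gateway_ports_enabled || echo "GatewayPorts is not enabled"
```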

Instead of using cron, you could also use something like supervisord or systemd to start (and re-start if necessary) the autossh process.

Using the Setup

Once this is all put together, you should be able to ssh to port 20022 on HOST_B, and wind up with a shell on HOST_A.

% ssh -p 20022 HOST_B
The authenticity of host '[HOST_B]:20022 ([192.0.2.1]:20022)'
can't be established.
ECDSA key fingerprint is SHA256:4v+NbLg2QYqe43WFR9QKXaVwCpcc71u5jJmxJdZVITQ.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[HOST_B]:20022,[192.0.2.1]:20022' (ECDSA) to the list of known hosts.
Linux HOST_A 4.9.0-6-amd64 x86_64 GNU/Linux

Last login: Wed Mar 13 18:02:09 2019 from 192.0.2.2

matt@HOST_A:~
20:05:59 (618) %

Why no IPv6?

T-Mobile support IPv6 on their LTE networks, so I have the APN for our modem set to IPV4V6 PDP. The server configuration has been a problem, however.

As with IPv4, we don’t want to get a default route for our LTE network because that would interfere with the normal traffic of the server. It seems like disabling the acceptance of Router Advertisement (RA) messages should be all that’s necessary, but that also disables SLAAC address assignment.

auto eth3
iface eth3 inet static
   address 172.16.0.2/30

auto eth3:0
iface eth3:0 inet dhcp
   post-up ip route add 192.0.2.1/32 dev eth3:0
   post-down ip route del 192.0.2.1/32 dev eth3:0

iface eth3:0 inet6 auto
   pre-up /sbin/sysctl -w net.ipv6.conf.eth3.accept_ra=0
   post-up ip route add 2001:db8::1/128 dev eth3:0
   post-down ip route del 2001:db8::1/128 dev eth3:0

I have also tried DHCPv6 (iface eth3:0 inet6 dhcp in place of the inet6 auto line above), but that fails to get the configuration I want as well, and it causes ifup to return a failure when configuring the interface. At least the SLAAC problem above fails silently, so I can leave the configuration in place without causing problems with interface management.
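
One combination that may be worth trying, and which hasn’t been tested on this setup: leave Router Advertisements enabled so that SLAAC still works, but tell the kernel not to learn a default router from them via accept_ra_defrtr. The sketch below only prints the settings. lte_ipv6_sysctls is a hypothetical helper name, eth3 is the example interface from above, and accept_ra=1 assumes the host isn’t forwarding IPv6 (a forwarding host needs accept_ra=2 instead).

```shell
#!/bin/sh
# lte_ipv6_sysctls is a hypothetical helper.
# Prints sysctl settings that keep RA processing (and so SLAAC) enabled on
# the given interface while ignoring the default router the RA advertises.
lte_ipv6_sysctls() {
    iface=$1
    printf 'net.ipv6.conf.%s.accept_ra=1\n' "$iface"
    printf 'net.ipv6.conf.%s.autoconf=1\n' "$iface"
    printf 'net.ipv6.conf.%s.accept_ra_defrtr=0\n' "$iface"
}

# Usage on HOST_A, as root:
#   lte_ipv6_sysctls eth3 | xargs -n1 sysctl -w
```

These would take the place of the accept_ra=0 pre-up line; the static /128 route to HOST_B is still needed either way.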

Perhaps you can find the right combination of options to make it work! I invite you to follow up, if you do.

Good luck!