Switching to Proton VPN

I’d been using ExpressVPN for some time, but got sick of the price increases, so I’m now using Proton VPN instead. At the time of this writing, ExpressVPN was about to raise its price to, I think, $150/yr. Proton is $70/yr, with a $20 discount for new customers. So 1/3 the cost up front, and a little less than half next year.

ExpressVPN uses their own application, which is pretty easy to use, whereas Proton is built on WireGuard. That is, Proton has you configure what you want on their website (country and server #, port forwarding, etc.) and then generates a config file for you to download and use with WireGuard. Not quite as easy, but WireGuard in general is pretty easy to use, so it should be fine.

The Situation So Far

TrueNAS Scale VM. There are NFS shares for torrents, videos, and music, amongst other things not relevant here. Currently on the 192.168.0.0/24 subnet (needs to be moved as I continue segmentation).
Debian VM running transmission-daemon. On the 10.10.20.0/24 subnet. I connect to it with transmission-remote from various machines to monitor my Linux ISOs. The “torrents” NFS share is mounted here and is what transmission uses as storage, so downloads don’t fill the small virtual drive for the OS.
Desktop/Laptop/Phone on one of the private IP ranges, depending on which device. Connecting to transmission-daemon remotely. NFS/SMB shares mounted, so that I can scrape/rename files before moving them to the appropriate folders.
Jellyfin. Not really important here. But has “videos” and “music” NFS shares mounted to use as libraries.

Transmission is already installed and configured on the VM, and on the client. Make sure to set a username and password; it’s not all that hard to configure (stop the transmission-daemon service, edit the .json file, and start the service again).

The Problem

First we just get rid of expressvpn.

sudo apt remove --purge expressvpn

Then we install wireguard.

sudo apt install wireguard wireguard-tools

Then we go onto Proton VPN, select our server and whatnot, and download our config file.

Now we just have to connect to the VPN and Bob’s your uncle! Right?

sudo wg-quick up ./proton-nl.conf

First, transmission-remote client loses its connection to the remote host. Maybe that’s not such a big deal. It’s still downloading in the background, right?

Give it an hour, because you’re doing other things.
Try and SSH in via the laptop, and get “connection refused”.
Go back to Proxmox web GUI and access the VM’s shell that way.
Disconnect from the VPN.
Reconnect to the remote host with the laptop to check your progress.

No progress has been made. All of the torrents show errors about storage not being available. Transmission is telling me that I need to set the storage location, or manually re-validate the existing data before it will continue downloading.

What’s going on here?

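When I connect to the VPN on that VM, all traffic gets sent into the tunnel, including traffic meant for my own local network, so everything local is effectively blocked. This is good, because I want to be safe on the internet, right? It’s working as intended.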

But I want to be able to SSH in if needed. I want to be able to connect with transmission-remote on port 9091. I need to connect to my NFS shares where the files are ultimately being downloaded to.

I can’t do any of that if I’m blocking local traffic.

So how do we get all traffic to flow through the wireguard connection, except for anything to/from other local IP ranges?

First, let’s try to understand a bit about what’s going on in the network stack.

How traffic is flowing nominally

The VM has 2 interfaces, loopback “lo” and ethernet “ens18”. When I bring the wireguard connection up, it creates a third interface. I’m using The Netherlands as my server, and called it “proton-nl.conf”, so the interface created is called “proton-nl”.

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether bc:24:11:cb:62:5f brd ff:ff:ff:ff:ff:ff
    altname enp0s18
    inet 10.10.20.2/24 brd 10.10.20.255 scope global ens18
       valid_lft forever preferred_lft forever
5: proton-nl: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
    inet 10.2.0.2/32 scope global proton-nl
       valid_lft forever preferred_lft forever

Notice that the VPN interface has a netmask of /32? So we only have one single address in that network space. Also notice that the IP address chosen for that interface doesn’t conflict with ens18. If I were already using the 10.2.0.x/24 subnet on this machine, I would have to choose something else in my wireguard config file.
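If you want to double-check that an address doesn’t land inside a subnet you’re already using, here’s a rough sketch in plain shell. The function names ip_to_int and in_cidr are my own, made up for illustration; they aren’t part of wireguard or iproute2, and the sketch only handles IPv4.

```shell
# Convert a dotted-quad IPv4 address (e.g. 10.2.0.2) to a single integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if address $1 falls inside CIDR block $2.
in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")     # part before the slash
  prefix=${2#*/}                 # part after the slash
  # Netmask: the top $prefix bits set, the rest cleared.
  mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# The VPN address from above against the subnet on ens18.
if in_cidr 10.2.0.2 10.10.20.0/24; then
  echo "conflict"
else
  echo "no conflict"
fi
# prints: no conflict
```

The same check with 10.10.20.2 against 10.10.20.0/24 would succeed, since that’s the VM’s own subnet.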

For reference, here’s the original wireguard config file:

[Interface]
# Bouncing = 8
# NetShield = 0
# Moderate NAT = off
# NAT-PMP (Port Forwarding) = on
# VPN Accelerator = on
PrivateKey = __________
Address = 10.2.0.2/32
DNS = 10.2.0.1

[Peer]
PublicKey = __________
AllowedIPs = 0.0.0.0/0, ::/0
Endpoint = x.x.x.x:51820

All traffic, local or not, is being routed to the VPN host in the Netherlands, because AllowedIPs = 0.0.0.0/0 matches every IPv4 destination. That host gets packets with a destination in 192.168.x.x or 10.x.x.x (besides 10.2.0.2), which are defined as private ranges, and it should be dropping them. I guess that’s the intended behavior, and it’s doing exactly that!

The Solution

We’re going to add static routes to our main routing table.

When the linux kernel has a packet to route, it walks the routing policy rules (the output of ip rule) in priority order, starting with the lowest number. Each rule points at a routing table. If that table has a matching route, the packet gets routed as the table directs. If nothing matches, the kernel goes on to the next rule and tries again.

$ ip rule
0:	from all lookup local
32764:	from all lookup main suppress_prefixlength 0
32765:	not from all fwmark 0xca6c lookup 51820
32766:	from all lookup main
32767:	from all lookup default

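You can read more about routing in linux here. I’m no expert yet. But per the great explanation, when wireguard is running and the VPN connection is up, wg-quick creates the two rules in the middle (32764 and 32765). Rule 32764 consults the main table but ignores its default route (that’s what suppress_prefixlength 0 means), and rule 32765 sends everything else to table 51820, which sends all traffic to the “proton-nl” interface.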

$ ip route list table 51820
default dev proton-nl scope link

So let’s add some routes to the main routing table and see if it works. The idea is that rule 32764 checks the main table first, and because these routes are more specific than a default route, they match there, never giving the packet a chance to be checked against the 51820 routing table. I want to cover my legacy 192.168.0.0/24 and the newer 10.10.0.0/16 (using the 3rd octet for different VLANs).

$ ip route add 10.10.0.0/16 via 10.10.20.1 dev ens18
$ ip route add 192.168.0.0/24 via 10.10.20.1 dev ens18

$ ip route show
default via 10.10.20.1 dev ens18 proto static
10.10.0.0/16 via 10.10.20.1 dev ens18
10.10.20.0/24 dev ens18 proto kernel scope link src 10.10.20.2
192.168.0.0/24 via 10.10.20.1 dev ens18

And it’s successfully showing entries 2 and 4 now. Essentially I’m just telling linux that if a packet’s destination is in one of those ranges, the next hop is the router that I myself (the transmission VM) am connected to. That next hop is reached via device ens18.

Once that happens, it does in fact go to 10.10.20.1, which is the router for my VLAN 20, and I have access control lists (ACLs) set up to allow traffic to other VLANs on 10.10.30/40/etc.x, or to 192.168.0.x. If those ACLs were not in place, the packets would die at my physical router, instead of dying in the VM. So make sure you have rules in place to handle them when they get to the desired device.

Here’s a better explanation here if you’re a visual learner like me.

Here’s what it looks like when done correctly:

The local traffic matches one of the new routes in the main table (consulted by rule 32764) and gets sent along to ens18. The other traffic only finds the default route there, which that rule ignores, so it continues on to table 51820 and gets forwarded to the VPN interface.
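One way to check this from the VM itself is to ask the kernel which interface it would pick for a given destination, using ip route get. Here’s a small sketch; route_dev is a helper name I made up, and it does nothing more than pull the word after “dev” out of whatever ip route get prints.

```shell
# Print the interface that follows "dev" in `ip route get` output (read from stdin).
route_dev() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage on the VM, with the VPN up (not run here):
#   ip route get 192.168.0.10 | route_dev
#   ip route get 1.1.1.1 | route_dev
```

If the routes are doing their job, I’d expect the first command to report ens18 and the second to report proton-nl. The two addresses are just examples; use any local and any public destination.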

Now we need to make sure that this is done automatically every time I bring the wireguard interface up, and undone when I take it down. (Not strictly necessary: it wouldn’t hurt anything to keep these routes in place regardless of VPN status. With the VPN down, they’d just duplicate what the default route already does.)

So go back and edit your wireguard config file with the PreUp and PostDown commands. And throw in PersistentKeepalive for good measure, since there could be long periods of time where this connection sits idle.

[Interface]
# Bouncing = 8
# NetShield = 0
# Moderate NAT = off
# NAT-PMP (Port Forwarding) = on
# VPN Accelerator = on
PrivateKey = __________
Address = 10.2.0.2/32
DNS = 10.2.0.1
PreUp = ip route add 192.168.0.0/24 via 10.10.20.1 dev ens18
PostDown = ip route del 192.168.0.0/24 via 10.10.20.1 dev ens18
PreUp = ip route add 10.10.0.0/16 via 10.10.20.1 dev ens18
PostDown = ip route del 10.10.0.0/16 via 10.10.20.1 dev ens18

[Peer]
PublicKey = __________
AllowedIPs = 0.0.0.0/0, ::/0
Endpoint = x.x.x.x:51820
PersistentKeepalive = 30
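If you have more than a couple of local subnets, typing the PreUp/PostDown pairs by hand gets tedious. Here’s a sketch of a shell helper that prints them for pasting into the [Interface] section. The name wg_route_hooks is hypothetical (it’s not part of wg-quick), and the gateway, device, and subnets below are just the ones from my setup.

```shell
# Print a PreUp/PostDown pair for each subnet.
# Arguments: gateway, device, then one or more subnets in CIDR form.
wg_route_hooks() {
  gateway=$1
  dev=$2
  shift 2
  for subnet in "$@"; do
    printf 'PreUp = ip route add %s via %s dev %s\n' "$subnet" "$gateway" "$dev"
    printf 'PostDown = ip route del %s via %s dev %s\n' "$subnet" "$gateway" "$dev"
  done
}

# Prints the same four lines used in the config above.
wg_route_hooks 10.10.20.1 ens18 192.168.0.0/24 10.10.0.0/16
```

This only generates text; it doesn’t touch the routing table or the config file.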

Bonus Points

Ensure port forwarding is enabled and working. Without it, peers may not be able to connect to you as intended, making things much slower. Tutorial here

Bind transmission-daemon to the IP address of the wireguard interface. Without this, if someone were to get through my router’s public IP address and try to connect as a peer, transmission would allow it. This is unlikely since I have only 3 ports open (my own wireguard instance, which no one has credentials to, and 80/443, which go to my reverse proxy). But if it did happen, transmission would indeed allow it.

You can use either the public IP, or in our case which is much easier, the local IP assigned to the interface. Which for me is 10.2.0.2, as defined in the WG config file.

Stop the transmission-daemon service, since you can’t make changes while it’s running.

$ service transmission-daemon stop

Modify the transmission config file at /etc/transmission-daemon/settings.json

"bind-address-ipv4": "10.2.0.2",
"rpc-whitelist": "127.0.0.1,10.10.*.*,192.168.0.*",
"rpc-whitelist-enabled": true,

This also restricts the transmission Remote Procedure Call (RPC) interface, i.e. what the transmission remote client uses to connect to the host, to local addresses. This way I can get to the web interface at 10.10.20.2:9091/transmission/web or use a remote client directly, but only from one of my local machines.
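To get a feel for which clients that whitelist lets in, here’s a rough shell sketch that treats each comma-separated entry as a wildcard pattern. It’s only an approximation of how I understand the whitelist to work, not Transmission’s actual matching code, and rpc_allowed is a name I made up.

```shell
# Succeed (exit status 0) if client address $1 matches any pattern in the
# comma-separated whitelist $2, where * matches any run of characters.
rpc_allowed() {
  client=$1
  old_ifs=$IFS
  IFS=,
  set -f                       # stop the shell expanding * against filenames
  for pattern in $2; do
    case $client in
      $pattern) IFS=$old_ifs; set +f; return 0 ;;
    esac
  done
  IFS=$old_ifs
  set +f
  return 1
}

whitelist='127.0.0.1,10.10.*.*,192.168.0.*'
for ip in 10.10.30.7 192.168.0.50 192.168.1.50 10.2.0.1; do
  if rpc_allowed "$ip" "$whitelist"; then
    echo "$ip allowed"
  else
    echo "$ip rejected"
  fi
done
```

With the whitelist from my settings.json, the first two addresses come out as allowed and the last two as rejected.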

End

That’s it. When I set up ExpressVPN, I never had these issues; I could still connect to local IPs just fine. I think their client is probably just doing these things automatically, in a way that’s transparent to the user. But it also raises the question: why is the ExpressVPN client for Windows 329MB in size?! Surely it’s not that sophisticated. And surely not worth an extra $80+/year!

But this wasn’t really all that hard. Most of this post was explaining the why behind it. The actual configuration to make it work is just 2 lines per local subnet (a PreUp and a PostDown command in the wireguard config file).