EXPLOINSIGHTS, INC. Sys-Admin – Page 4 – The EXPLOINSIGHTS, Inc. System Administrator's journey to establish and operate a Centralised Office IT Infrastructure

April 19, 2018 (updated August 4, 2018)

Linux – Full Disk Encryption with Remote Access

So if anyone is enjoying the ride brought on by NIST SP 800-171, you’ll know that data encryption is all the rage. And a lovely thing it is too – AES encryption is safe from ANYONE this side of quantum computers. What’s not to love here?
Windows has BitLocker. Easy to use, albeit proprietary. EXPLOINSIGHTS, Inc. uses it for mobile Windows devices as one of three major data encryption regimes employed as part of the NIST compliance for the company (translated into English: We don’t trust BitLocker alone :-)).
So, how hard can it be to enable full disk encryption on a Linux server? Answer: dead easy – it’s part of the installer process for Ubuntu, for example.

The more INTERESTING question, and the subject of this article: how EASY is it to remotely and securely decrypt a Linux server that employs full disk encryption? Answer: it’s a nightmare. Trust us, we know. 🙁
BUT it can be done. And this record here is to point people in the right direction for doing just that. Firstly, Google is your friend, so whatever your distro, ask Google. I use Ubuntu Linux, so I found several useful online articles. The rest of this article is Ubuntu based:

Perform your server (or Desktop) Linux install using full disk encryption. This HAS to be done locally, as do the next steps:

This post: https://www.theo-andreou.org/?p=1579 is genius. Except where it isn’t, if you see what we mean.

The article will work for you until you get to the line:

echo IP=10.0.0.67::10.0.0.1:255.255.255.0:encrypted-system:eth0:off >> /etc/initramfs-tools/initramfs.conf

He doesn’t explain the above line well. Here’s our attempt:

echo IP=HOSTIPADDRESS::HOSTDHCPSERVER:255.255.255.0:HOSTNAME:<SEE-BELOW>:off

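Hopefully our version clarifies things. HOSTIPADDRESS is the static address you want the server to use while it waits to be unlocked. The field we have labelled HOSTDHCPSERVER is, strictly speaking, the default gateway; on a typical small network that is the router, which also handles DHCP. When you create your server, it has a name. It could be, e.g. “MyServer”. Whatever it is, it goes in place of “HOSTNAME” above. This is not terribly important; but this next bit is: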
The <SEE-BELOW>:off really means you have to insert the name of your network adapter. In days of old, it was called ‘eth0’, but rarely now. So if you use the ‘eth0’ expression, as stated in the article, the darn thing WON’T WORK. We know. We tried. We failed. It took us ages to figure that out.
To find your network adapter name, use this command:
```
ls /sys/class/net
```
…and it will show you the names of your network adapters. Ours is enp3s0, so we have to enter:
```
IP=HOSTIP::HOSTDHCP:255.255.255.0:HOSTNAME:enp3s0:off
```
…at the end of the /etc/initramfs-tools/initramfs.conf file. Your adaptor name will likely be different.
Example, for a regular small-business router that has 192.168.1.1 as the router address (and is also the place where DHCP requests go):
```
echo IP=192.168.1.115::192.168.1.1:255.255.255.0:mypcservername:enp3s0:off >> /etc/initramfs-tools/initramfs.conf
```
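If you would rather not hand-type all those colons, here is a minimal sketch of a helper that assembles the line for you. The function names `build_ip_line` and `list_adapters` are our own inventions for illustration (they are not part of initramfs-tools), and the sketch assumes a static address with the "server" field left empty, exactly as in the examples above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any package.

# build_ip_line HOST_IP GATEWAY_IP NETMASK HOSTNAME DEVICE
# Prints a static "IP=" line in the same field order used above:
# host-ip : (empty) : gateway : netmask : hostname : device : off
build_ip_line() {
    printf 'IP=%s::%s:%s:%s:%s:off\n' "$1" "$2" "$3" "$4" "$5"
}

# list_adapters [DIR]
# Prints the adapter names found in DIR (default /sys/class/net),
# leaving out the loopback device "lo".
list_adapters() {
    ls "${1:-/sys/class/net}" | grep -v '^lo$'
}
```

For example, `build_ip_line 192.168.1.115 192.168.1.1 255.255.255.0 mypcservername enp3s0` prints the same line as the example above, ready to be appended (as root) to /etc/initramfs-tools/initramfs.conf. On Ubuntu the initramfs then needs regenerating (`sudo update-initramfs -u`) before the change takes effect.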
If you are trying a remote LUKS unlock then…GOOD LUCK! It’s hard. It’s also totally satisfying to watch a linux-OS hard disk encryption screen disappear after a remote login, and is perhaps even worth the effort (PERHAPS!!). It’s not easy… #JustSaying
We have successfully unlocked our servers remotely this way. We use public-key SSH login, and then busybox gives us a small menu of commands. The one we want is ‘unlock’. After entry, you are prompted for the decryption password. If the password is accepted, the drive unlocks and you see ‘cryptsetup: <drive-ref> set up successfully’.

April 10, 2018

VPN – To TCP, Or not to TCP – that is the question

So, when travelling, I notice the EXPLOINSIGHTS, Inc. OpenVPN server becomes a little unreliable (slow). It was configured originally using UDP as the transport protocol. UDP is the faster of the two protocol options (UDP and TCP). It’s fast because it throws data down the pipe and assumes it makes it to the right destination. This is brilliant…when it works. And when it doesn’t, you want to use the slightly slower but much more robust TCP, which actually checks to make sure the packets it sends make it to the right place!
So it’s a simple change: on Ubuntu, just edit your openvpn config at /etc/openvpn/server.conf, and change ‘proto udp’ to ‘proto tcp’. Save config, restart your openvpn server (sudo service openvpn restart – or your server equivalent, or reboot server if more convenient for you). If you run a different server distro, then just google your distro and openvpn. It will give you links to show you where your config file is, but it could well be the same as above (so look there first!).
Now you need to make the same change for each of your client config files. So whatever devices are using the VPN server, find their local config files (in Windows it’s user/yourname/openvpn/config/file-name.ovpn) and make the same change as above. No need to restart anything, just load the config into your client and run. Hey presto, your server is running with TCP. Enjoy a more reliable/consistent, and marginally slower VPN service.
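If you have several config files to change, the edit can be scripted. What follows is only a sketch under a few assumptions: the function name `switch_to_tcp` is made up for this post, and it expects the config to hold a plain `proto udp` line on its own. It keeps a `.bak` copy so you can back out.

```shell
#!/bin/sh
# Sketch only: switch_to_tcp is a hypothetical helper name.

# switch_to_tcp FILE
# Saves FILE as FILE.bak, then rewrites any line that reads
# "proto udp" to "proto tcp". All other lines are left untouched.
switch_to_tcp() {
    cp "$1" "$1.bak" &&
    sed 's/^proto[[:space:]][[:space:]]*udp[[:space:]]*$/proto tcp/' "$1.bak" > "$1"
}
```

On the server that would be something like `switch_to_tcp /etc/openvpn/server.conf` (run as root), followed by the restart described above. One thing worth checking, as far as I know: a server config that still contains an `explicit-exit-notify` line may refuse to start under TCP, since that option is meant for UDP only.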
🙂

April 7, 2018

Server Security

Routine maintenance and general configuration management of a new cloud server supporting EXPLOINSIGHTS, installed as an LXC instance, is going OK. Nextcloud’s security scan has given me a top rating for a new installation, which is encouraging but not enough to rest on laurels. This installation is currently a mirror (in terms of files) of a current install on a live server, but after testing, this will become the main cloud server for the organization’s needs and the older server will be retired.

This version is of course running with server-side file encryption. As part of the testing process, client-side encryption (a feature of Nextcloud version 13) will be evaluated, as BoxCryptor (the current Exploinsights, Inc. end to end encryption service) causes a few operational issues.

March 1, 2018 (updated July 31, 2018)

Server Outage Automated-Reporting

So there are a lot of methods of checking to see if your server(s) are running. I have spent some time adapting a simple ping script. My servers are all LAN side of a single IP address. If my external (WAN) IP goes down, it’s likely either my ISP, or a power cut that’s gone on long enough to kill the modem/router UPS. Not a lot I can do about that unless I am at the office (maybe not then!). BUT what about my multiple containerized LAN servers, which, as I now know, can just “stop working”?
I have today set up a simple PING-TEST script (which I downloaded and adapted). I run it via cron every hour, and it emails me if a server is down. It pings each LAN server and does nothing else if all is well. If all is not well, it emails me.
Why not run it every minute? Well, if a server is down, I might not be able to fix it quickly (or even at all) and it will email me the same bad news every minute until I do fix it. I could have just left the office for a week… I don’t think I need that kind of “excitement”. Every hour will do for now. 🙂
PS – here’s a link to the script I am currently running, if anyone is interested. The source of the original script is shown in the comments. As I run the script in cron, I let cron email me if there are any results. Nil output = no email. Any other output and my Inbox gets ping’d (pun intended). 🙂
Note – when viewing the link, only the first few lines are displayed. Feel free to download a copy, if you are that interested. Or just check out the referenced source.
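For anyone who just wants the gist without following the link, here is a minimal sketch of the idea. It is NOT the script I actually run; the function names are invented for this post and the ping flags assume the Linux (iputils) version of ping.

```shell
#!/bin/sh
# Minimal ping-test sketch (not the linked script).

# is_up HOST
# Succeeds if HOST answers a single ping within 2 seconds
# (-c count and -W timeout, as in Linux iputils ping).
is_up() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# report_down HOST...
# Prints one line per host that fails is_up.
# Prints nothing at all when every host answers, so cron stays silent.
report_down() {
    for host in "$@"; do
        is_up "$host" || echo "DOWN: $host"
    done
}
```

Finish the script with a call such as `report_down 192.168.1.10 192.168.1.11` (made-up addresses, use your own LAN servers) and schedule it hourly with a crontab entry along the lines of `0 * * * * /path/to/ping-test.sh`, where the path is whatever you saved the file as. Cron only sends mail when the job prints something, which is exactly the "nil output = no email" behaviour described above.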

March 1, 2018

Server Backups using LXD

So I am working on the process of server backups today. Most people do backups wrong, and I have been guilty of that too. You know it’s true when you accidentally delete a file, and you think ‘No worries, I’ll restore it from a backup…’; and about an hour later of opening archives and trying to extract the one file but finding some issue or other…makes you realize your backup strategy sucks. I am thus trying to get this right from the get-go today:
LXD makes the process easy (albeit with a few quirks). EXPLOINSIGHTS Inc. (EI) servers are structured such that each service is running in an LXD container. Today, there are several active, ‘production’ servers (plus several developmental servers, which are ignored in this posting):

Nextcloud – cloud file storage;
WordPress – this web-site/blog;
Onlyoffice – an ‘OnlyOffice’ document server;
Haproxy – the front-end server that routes traffic across the LAN

All of these services are running on one physical device. They are important to EI as customers access these servers (whether they know it or not), so they need to JUST WORK.
What can I do if the single device (‘server1’) running these services just dies? Well, I have battery backup, so a power glitch won’t do it. Check. And the modem/router are also on a UPS, so connectivity is good. Check. I don’t have RAID on the device but I do have new HDs – low risk (but not great). Half-check there. And if the device hardware just crashes and burns just because it can…well that’s what I want to fix today:
So my way of creating functionally useful backups is to do the following, as a simple loop in a script file:

For each <container name> on server1:
1. lxc stop <container-name>
2. lxc copy <container-name> TO server2:<container-name##>
3. lxc start <container-name>
Next <container-name>

The ‘##’ at the end of the lxc copy command is the week-number, so I can create weekly container backups EASILY and store them on server2. I had hoped to do this without stopping the containers, but the criu LXD add-on program (which is supposed to provide that very capability) is not performing properly on server2, so I have a brief server-outage when I run this script for each service for now. I thus have to try to run this at “quiet times”, if such a thing exists; but I can live with that for now.
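The loop above can be sketched in plain shell roughly as follows. This is an illustration rather than my production script: the function names are made up, it assumes server2 has already been added as an LXD remote under that name, and it uses the ISO week number from `date +%V` for the ‘##’ suffix. The `LXC` variable is only there so the loop can be dry-run by swapping in `echo`.

```shell
#!/bin/sh
# Sketch of the weekly backup loop. Names below are assumptions.
REMOTE="${REMOTE:-server2}"   # LXD remote that receives the copies
LXC="${LXC:-lxc}"             # set LXC=echo for a harmless dry run

# backup_name CONTAINER
# Prints CONTAINER with the two-digit ISO week number appended.
backup_name() {
    printf '%s%s\n' "$1" "$(date +%V)"
}

# backup_containers CONTAINER...
# Stops each container, copies it to REMOTE under its weekly name,
# then starts it again. A container that will not stop is skipped.
backup_containers() {
    for c in "$@"; do
        "$LXC" stop "$c" || continue
        "$LXC" copy "$c" "$REMOTE:$(backup_name "$c")"
        "$LXC" start "$c"
    done
}
```

Running `LXC=echo` first and then `backup_containers` with your container names prints the stop/copy/start sequence without touching anything, which is a cheap way to check the names before letting cron loose on it.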
I did a dry-run today: I executed the script, then I stopped two of the production containers. I then launched the backup containers with the command:

lxc start <container-name##>

I then edited the LAN addresses for these services and I was operational again IN MINUTES. The only user-experience change I noticed was my login credentials expired, but other than that it was exactly the same experience “as a user”. Just awesome!
Such a strategy is of no use if you need 100% up-time, but this works for EI for now until I develop something better. And to be clear, this solution is still far from perfect so it’s always going to be a work in progress:-
Residual risks include:

Both servers are on the same premises, so e.g. fire or theft risks are not covered (really hard to fix this because of data residency and control requirements);
This strategy requires human intervention to launch the backup servers, so there could be considerable downtime. Starting a backup lxd container for the haproxy server will also require changes at the router (this one container receives and routes all http and https traffic except ssh/vpn connections. The LAN router presently sends everything to this server. A backup container will have a different LAN IP address thus router reconfig is needed);
The cloud file storage container is not small – about 13GB today. 52 weeks of those will have a notable impact on storage at server2 (but storage is cheap);
I still have to routinely check that the backup containers actually WORK (so practice drills are needed);
I have to manually add new production containers to my script – easy to forget;
I don’t like scheduled downtime for the servers…

But overall, today, I am satisfied with this approach. The backup script will be placed in a cron file for auto-execution weekly. I may make my script a bit more friendly by sending log files and/or email notification etc., but for now a manual check-up on backup status will suffice.

February 28, 2018

Self-Hosting Journey Continues

Microsoft helped the journey to independence from them this week: EXPLOINSIGHTS Inc. (EI) signed-up for Project Online service, but after several days of frustration, the decision has been reversed.
Was it a bad program? No. Or more correctly: we don’t know. The service was never added to EI services portal. Even after about five or more emails to the Microsoft help-desk. So we never got to USE the program we PAID for.
We have been evaluating OnlyOffice as an alternative office suite to the Microsoft Office 365 products. That journey is still underway, as EI’s customers use Microsoft products (because their customer, the DoD, uses the same product), and compatibility is a concern; BUT OnlyOffice is definitely a contender. The self-hosted document server EI installed for online creation/editing of files on the EI Nextcloud cloud storage server has proved to be stable and reliable, which is an encouraging start.
With a need for a Project Management software package (and in the absence of a Microsoft option, even one we paid for!) the journey has expanded this week, and several Open Source Project Management packages are being evaluated, including:

Open Project
OnlyOffice Community Server (Project app)
Gantt Project
Redmine

These all have pros and cons of course. Web portal offerings are most attractive as they allow for Customer visibility of programs, but they seem to be the least configurable so far. The Open Source ‘Gantt Project’ product is excellent and is a virtual Microsoft Project replacement BUT it runs on Java, which is a system security weak-point. And it’s client-side desktop install only, so no server protection or easy customer access either. Of the four, we have already abandoned Open Project. It was easy to install and it has a great web portal BUT you can’t add a new task easily: it always appears at the end of a schedule, which makes for complicated-looking Gantt charts. The OnlyOffice portal is better, but it does not allow for dependencies across milestones, which is counter-intuitive and makes it easy to miss important implications for e.g. a slipping milestone/task.
Redmine has potential: it’s a server-side install BUT the system depends on third-party plugins to get really good features, and these are always an area of concern from a security perspective (and they make installation more difficult too).
As has proved to be the case for Office suite software, finding a replacement for Microsoft Project is not “easy”, but it’s a rewarding journey because of the greatly improved awareness it creates regarding the options.
Much more to do before EI can officially drop Microsoft, but the journey continues.

February 22, 2018 (updated July 27, 2018)

Ransomware

Another day, another government agency hit by ransomware:
Here
We need some serious effort to take down those behind these attacks; crypto currency does not help, as bad guys hide behind anonymous payments. I am also left wondering how long before I get hacked AND whether my backup strategy will work. That said, my backups are NOT CONNECTED TO THE INTERNET, so I at least have a fighting chance.
I back up whenever I am in my office, onto separate drives that are not internet connected, so ransomware cannot easily affect them. That doesn’t help my systems, which can be rebuilt, but my data at least are relatively safe.
My Nextcloud instance provides some protection too, as file versioning means my changed files are always retained even if changed by malware.
Good luck to those who have to worry about this stuff. #MeToo!
#Offline is sometimes the only way

February 19, 2018

Ubuntu Copying Microsoft

So Ubuntu are now starting to collect “telemetry data”. It’s an opt-out “feature” (but not 100%), but this is a bad move.
Maybe time to start evaluating other distros, because it’s a slippery slope that WILL only mine more and more data with every release; and ultimately it will conflict with compliance requirements for EXPLOINSIGHTS Inc., which are already difficult enough to manage.

February 9, 2018

HTTP:// is dead, long live HTTPS://

Google is effectively killing off HTTP web-sites. Wonderful news, as http sites are very weak links in the security chain. And there’s simply no excuse for running an insecure site these days, as high quality HTTPS certificates from a CA are free of charge (but not free of effort) – e.g. “Let’s Encrypt”.
Are you ready for the flip of the kill-switch?

February 6, 2018

Server Updates

Updating critical software is something not to be taken lightly. It’s nerve-wracking when your business operations rely upon such systems.
What has helped EXPLOINSIGHTS Inc. (EI) sleep better at night is the extensive use of unprivileged containers (a lightweight alternative to virtual machines). Most of the EI support software is installed in unprivileged LXC containers; LXC/LXD is a standard component of the Ubuntu 16.04 Linux distribution.
Today was a typical day for EI: an update to a major release of Nextcloud. This critical software houses EI’s data and is the hub for data sharing with customers and stakeholders. If this upgrade goes wrong, my customers can’t download their files. #Embarrassing – or maybe even worse; loss of critical data? To make it more enjoyable, I am not in the office today – so the update has to be performed remotely via secure SSH. That’s an excellent recipe for high stress…normally.
So how did EI mitigate this update risk? With the following simple command entered at the host machine terminal via secure SSH access (i.e. WITHOUT SuperUser privileges!):
lxc snapshot NC pre-13-upgrade
That’s it. Painless. Super-safe (no Superuser rights!). Blindingly fast. Very efficient. And this creates a full working snapshot of the EI current cloud configuration – files, links, settings, SSL-certs, SQL database, apache2 configs – absolutely everything needed to completely restore the setup should the upgrade process break something critical.
Breaking this command down:

lxc – this is the command-line client (typed in lowercase) for the Ubuntu LXC/LXD container manager, followed by three parameters:
- snapshot – tells LXC to take a full working snapshot of the running instance;
- NC – the name of the EI container that runs the Nextcloud instance – the one we want to backup;
- pre-13-upgrade – a name assigned to the snapshot (easy to remember).

Yes, it’s that simple. After that, the Nextcloud upgrade process was initiated…and as it happens, everything went smoothly, so the snapshot was NOT actually needed to recover the pre-version-13 upgrade – but it will be kept for a while just to make sure there are no bugs waiting in the shadows. Here’s the new EI cloud instance:

EI cloud software – UPDATED to latest version

If a major problem arises, then the following command entered at the same terminal, again as a non-SuperUser, restores the entire pre-version 13 instance:
lxc restore NC pre-13-upgrade
#NoWorries 🙂
This restore command overwrites the current instance with the pre-upgraded and fully functioning snapshot. The only risk is losing files/links created since completing the upgrade process – way better than a total rebuild.
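The snapshot-then-upgrade routine can also be wrapped in a few lines of shell. To be clear, this is a sketch and not what was run on the day (the commands above were typed by hand): the function name `upgrade_with_snapshot` is invented, and restoring automatically on failure is a design choice you may well prefer to make manually. The `LXC` variable exists only so you can dry-run it with `echo`.

```shell
#!/bin/sh
# Sketch only: upgrade_with_snapshot is a hypothetical helper name.
LXC="${LXC:-lxc}"   # set LXC=echo for a harmless dry run

# upgrade_with_snapshot CONTAINER SNAPSHOT COMMAND...
# Takes a snapshot of CONTAINER, runs COMMAND (your upgrade step),
# and restores the snapshot if COMMAND exits with an error.
upgrade_with_snapshot() {
    container=$1
    snap=$2
    shift 2
    "$LXC" snapshot "$container" "$snap" || return 1
    if "$@"; then
        echo "upgrade ok; snapshot $snap kept"
    else
        echo "upgrade failed; restoring $snap"
        "$LXC" restore "$container" "$snap"
    fi
}
```

Usage would be along the lines of `upgrade_with_snapshot NC pre-13-upgrade ./my-upgrade-step.sh`, where the last argument is a placeholder for whatever command performs your upgrade. Note that a restore throws away anything created since the snapshot, so an automatic rollback only makes sense straight after a failed upgrade.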
LXC makes it so convenient to update major platforms. The entire process was fast, safe and easy. And because all the work is performed in non-SuperUser unprivileged mode, it comes with the confidence of knowing you can’t accidentally break an important part of the core system on the way. It’s so good, it’s almost boring – but only almost.
Check out using LXC to run your small business support software; it’s better than prescription-grade sleeping tablets for helping you with the upgrade process! Official documents are here. And there’s a ton of useful tutorials to get you started – Google is your friend.