Signal

Nice to see the Signal messaging app getting some love today:

See here.

We agree with a lot of what is said in that article. Signal is a major upgrade over WhatsApp and is, in our view, quite simply the BEST, most SECURE and most PRIVATE messaging app available on any platform.

To hack Signal messages, a user’s device has to be compromised.

We encourage our Customers to use Signal to receive the second-factor passwords for file-download links from the servers where we store our precious Customer data.

Check out Signal – available from the iOS and Android app stores. 👍

Encrypting Data in the Cloud

So we saw this online article, and it got our attention:

https://www.windowscentral.com/how-encrypt-data-storing-it-cloud-and-why-you-should

What got our attention was that the title alone makes sense – you should ALWAYS end-to-end encrypt your personal data if you want to keep it positively secure. But the article doesn’t offer any useful information on how you can share end-to-end encrypted data, nor does it point out that you cannot share end-to-end encrypted data without also sharing the key in some form. We think that’s an important oversight in their publication.

We run our own cloud server, using Nextcloud (version 15 as of today, but the Version 16 upgrade is being pushed out now, so we expect to update our server shortly). You’d think that because we run our own server we wouldn’t need to bother encrypting our own cloud-stored data. And you’d be wrong! We have a seasoned set-up using Nextcloud and Cryptomator – two completely independent Open Source programs that handle, respectively, our cloud storage and the encrypted data within it. We wrote an article about how we implement and use this, including how we SHARE files with clients and stakeholders (see article here).

We applaud articles that advocate end-to-end encryption of cloud-stored data, but our own take goes deeper than the article on Windows Central’s web-site.

Go forth and encrypt!

Free Munfs ago I cooldan’t evan spill Enginear. Now I Are One

So we know that’s a pretty strange blog-post title, but it comes from an old Chemistry class long ago, when we were complaining about Engineers and how they like to improve things that are not broken (so that they become broken). Well, that’s what we did, and this post is written in the unlikely hope that the lesson can be passed on.

We were “cleaning up” our HAPROXY server configuration file because, well, it had comments in there from prior configuration set-ups and so forth. So what harm can you do when you are “just” removing comment lines from a config file, huh? But hey, while we were there, let’s make things “consistent”. We did. And shortly after, coincidentally, our superbly reliable and brilliant OnlyOffice document server…stopped working.

We thought maybe a router setting had gotten reset in a power outage (we have had that before). So we checked. Nope, not the router. Maybe the DNS was pointing to the inactive backup server? Nope, not that either. What could it possibly be?

We eventually discovered what we had done, which is why we now think we are Engineers: we tinkered with something that wasn’t broken, and then we broke it. It was in the HAPROXY configuration for our HTTPS back-end. This was the ‘new improved’ line (the IP addresses in the lines below have been changed, just because we could/should):

server is_onlyoffice 192.169.11.87:443 check

And this was the ‘old, unimproved’ line:

server is_onlyoffice 192.169.11.87:443

We added the ‘check’ because haproxy can probe a back-end to see whether it is alive – but that doesn’t work with every server, and as we now know, it certainly didn’t work with ours. So we ‘unimproved’ that configuration line and…we now have a functioning OnlyOffice document server again, which we also upgraded to the latest version to make sure we hadn’t completely wasted our time at the terminal.
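We never did pin down exactly why the bare ‘check’ upset this particular back-end, but for anyone experimenting: haproxy’s health checks can be made TLS-aware, which matters when the back-end only speaks HTTPS. A hedged sketch (the back-end name, address and tuning values are placeholders rather than our real configuration, and whether this suits you depends on your mode and where TLS is terminated):

backend onlyoffice_https
    # hypothetical back-end: force the health check itself to use TLS, without certificate verification
    server is_onlyoffice 192.169.11.87:443 check check-ssl verify none inter 10s fall 3 rise 2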

Key takeaway? If it ain’t broke, PLEASE, don’t fix it. Hopefully we can learn that lesson, even if no-one else can.

#Embarrassing 🙂

Nextcloud 15 – Evaluation Update

*** FURTHER UPDATE: 27-DEC-18
We continued operating the Nextcloud 15 instances in development mode. We are convinced it has some operational bugs, but as far as we can tell they do not impact stability or security (they are ‘annoying’ mostly because they are pervasive). The instances themselves are stable and operating without significant issue. We have continued with the auto-syncing instances plan on these development servers, and that has performed well. They act as one server: you change a file on one instance, and within a few minutes all servers have the change. It’s not a complete solution because it does not sync file-links and such, but it’s a reasonable back-up option for a power cut at one of the servers. We are going to upgrade this development set-up to Production status and begin the retirement of our Nextcloud 13 servers. This could be an interesting phase, since the Version 13 servers have performed well (but are approaching end-of-update life).
*** END OF UPDATE

As much as we really hate to say this: Nextcloud 15 is, SO FAR, a poor replacement for Nextcloud 13. Just too many bugs to make it worth our evaluation time right now. We have reported some of the bugs, but not all of them, as many are not reliably reproducible. Maybe we were too quick off the mark, but one thought struck us: would we have fielded our first Nextcloud a year ago if we had seen as many errors and strange results in THAT version? #NotSure. We are very glad we have Nextcloud version 13 running – it has proven to be rock-solid for us.

We will wait for the next stable release of Nextcloud version 15 before we even TRY to evaluate this updated version further. Hopefully it will be a lot more consistent and error-free.

We still like our plan to set up several auto-synchronizing Nextcloud ‘nodes’, but we have abandoned our plans to look at using Nextcloud 15 for this project, so it goes on-hold for a while.

Nextcloud Release 15

Well, we got onto the Nextcloud 15 release quickly and have created our first networked Nextcloud server instances. These are still a development effort – not yet good enough for Production use, but maybe after a patch or two they will be.

Our NEW configuration involves several simultaneous networked Nextcloud installs, so we have built-in server-independent redundancy. If any one instance goes down, the others are ready to take up the slack.

This is the first time we have tried a configuration like this, and it’s really just a step on the journey to a more robust system with better Operational Continuity.

We have servers that will, once the configuration is complete, be operated in different geographical locations to provide cover for a power outage or an even more catastrophic event at any one location (think meteorite hitting the office…). And we have tried to make this client-invisible, by configuring EACH server with basically two SSL certs – one a unique, server-specific certificate, the other a certificate shared by all servers:

server1   www.server1.com and cloud.server.com
server2   www.server2.com and cloud.server.com
server3   www.server3.com and cloud.server.com

The servers are sync’d via our sftp installations, which are the heart of our cloud file access system. A file can be changed either via the web portal or via an sftp connection, and the changes are propagated quickly to each of the networked servers. This gives each server a copy of the files – including the revised files. That in itself provides a back-up capability, but that’s not what we did this for:

The primary cloud.server.com IP address is configured at our DNS provider (we use Google’s service), and if the current live ‘cloud.server.com’ site goes down (power cut, malfunction, theft, fire or famine…etc.), we can change that DNS record to point at the next server’s IP address. This takes a little time (it’s a manual change at the Google DNS portal for now), but it does mean a file link issued from cloud.server.com will keep working from any of servers 1, 2 or 3, transparently to the user. This is still a bit fiddly and we know we really need to do more, but it’s a start.
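As an aside, after a manual record change you can watch the propagation yourself with dig; a generic sketch (the hostname and name-server below are placeholders, and lowering the record’s TTL ahead of a planned change shortens the wait):

# what a public resolver currently returns for the cloud hostname
dig +short A cloud.server.com @8.8.8.8

# ask the authoritative name-server directly (bypasses resolver caches)
dig +short A cloud.server.com @ns-cloud-a1.googledomains.com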

Ultimately we want to automate this a lot more (having to change a DNS record requires human intervention – an automatic weak link). But for now, it should give us somewhat improved operational continuity when we get our next major outage at one of the server locations.

Nextcloud’s server installation probing seems happy with our configuration:

This was welcome, as the installation process has changed from prior versions.

And SSL Labs is OK with our HTTPS configuration too (and yes, we note that TLS 1.1 is still enabled – we have not yet pulled the trigger on that one, but we are likely to once we are sure it won’t impact our customers):

We have a lot of work to do before we can launch this new configuration for Production, but hopefully this won’t take too long.

Nextcloud 15 has a lot of life in it, so hopefully this will allow us to find the time to strengthen our Operational Continuity further. But this is not a bad start. 🙂

Our Real-World Response to a Major Power Loss

So we suffered a MAJOR power failure in our office district earlier this week.  A major snowfall came to town, and a few power lines and supplies got damaged – damaged so badly that, in our case, we lost power.  Not for the usual few seconds or even minutes (which our battery backups can easily handle), but for what turned out to be a whole day.

Our servers, all three of them, went DOWN after the batteries powering them became exhausted.  (We were not present when they went down, we just knew it was coming…).

For a small business, an IT failure like this is a problem.  We don’t have a ‘gaggle’ of IT staff waiting in the wings to take care of issues like this, and we don’t have a redundant array of back-up servers that can seamlessly kick in, as presumably the big IT giants have.

BUT we did have a plan.  AND it did (most of) the job.  AND what made it easier for us was our favorite service: LXC.

We routinely make live snapshot backups of our LXC containers (a sketch of the kind of commands involved follows the list below).  We have several of them:

  • HAPROXY (our front end traffic handler)
  • Our Primary Nextcloud file sharing and cloud storage (mission-critical)
  • Our VPN server
  • Our OnlyOffice Document server
  • TWO web sites (this being one of them) 
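For anyone curious, the snapshot side of this is a one-liner per container; a quick sketch using the LXD ‘lxc’ client, with placeholder names rather than our real container names:

# take a live snapshot of a running container, then confirm it is listed
lxc snapshot haproxy pre-outage
lxc info haproxy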

We have backup containers stored across our servers, one of which is located in a facility that uses a different electrical power utility, deliberately sited away from the primary head-office.  And THAT is what we brought online.

So what happened?  We learned of the power cut (you can get text messages from most utility companies, but don’t expect them to be quick – so maybe get someone to call you too!).  We spun up our backup LXC containers, then we went to our domain name service web portal and changed the IP address of our HAPROXY server (switching from the Head Office to the alternate location).  We had to wait about an hour for the DNS records to propagate.  Then…our services were back online.

How did we do?

We did “OK”.  We did not do “GREAT”.  Embarrassingly, one of our LXC containers had NOT been copied at all, due to human error.  It turned out we had an experimental/development LXC container named after our primary web site.  So we thought we had a copy of the web-site container, but in fact it was a container for a new WordPress install, intended one day to become our primary web server.  We had to scramble there, so we give ourselves 3/10 for that one.  We also had THIS website, which worked flawlessly except that the last article posted had not been propagated to the backup server.  We give ourselves 9/10 for that one.

The HAPROXY server worked brilliantly – 10/10.  We did not have OnlyOffice copied over – 0/10 there.  And the big one: the Nextcloud server?  Well, that worked, BUT the url had to change.  It was the ONLY change.  We did not lose a file, but we really don’t like having to change our url.  So…we are going to try to devise a solution to that, but we give ourselves 9/10 there, since no services, files or sharing capabilities were lost or even delayed.

Oh, and THIS TIME our router worked as it should.  We had our settings correctly saved (a lesson from a prior power failure – this time we did not have to learn it the hard way again!)

Overall?  We give ourselves an 8/10, because we had to scramble a little when we realized a web container was NOT what we thought it was.  It’s not mission-critical, but it is ‘mission-desirable’, so we have to deduct some points.

We are completely convinced we could NOT have done this if we did not use the totally AWESOME LXC service.  It is such an incredibly powerful Linux service.  

MAJOR LESSONS AND TAKEAWAYS

  • Power utility companies SUCK at informing customers of power outages.  Don’t expect this to be reliable – we have a plan to be less reliant on them telling us about their outages in the future.
  • DNS propagation takes time.  That will give you some down-time if you have to change an IP address.  We think there’s a workaround here, using MULTIPLE domain names for each service, but all having one URL in common.  That has to be tested, and then implemented…and then we’ll write about it.  🙂
  • Erm, perhaps run LXC container copies in a SCRIPT so that you don’t get confused by poorly named R&D containers (see the sketch after this list).  <BLUSH>
  • An LXC lesson: containers DO NOT ALWAYS AUTOSTART after an unplanned shutdown of the PC.  You have to manually restart them after boot up.  This is EASY to do but you have downtime when you are used to them automatically starting after boot-up.  They auto-started in two of our servers, but not a third.  Go figure…
  • We have to figure out how to use ‘WAKE-ON-LAN’ to reliably avoid requiring a human to press a power button on a server.  More on that to come another day we think…
  • Having a PLAN to deal with a power outage is a BRILLIANT IDEA, because you probably will have a major outage one day.  A plan that involves an alternate power company is a GREAT idea.  We had a plan.  We had an alternate location.  We still had services when the district’s power went down.
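On the scripting point above: here is a minimal sketch of the kind of thing we have in mind (container and remote names are placeholders, not our real ones, and the exact lxc options you want will depend on your LXD version).  It also sets LXD’s boot.autostart key, which addresses the autostart surprise noted above:

#!/bin/bash
# Hypothetical nightly backup sketch: copy a fixed, explicit list of containers
# to a remote LXD host, so a mis-named R&D container can't sneak in (or drop out).
set -e
CONTAINERS="haproxy nextcloud vpn onlyoffice website1 website2"
REMOTE="backupserver"   # previously registered with: lxc remote add backupserver <IP-or-hostname>

for c in $CONTAINERS; do
    lxc snapshot "$c" nightly                          # live snapshot of the running container
    lxc copy "$c"/nightly "$REMOTE":"$c"-$(date +%F)   # push a dated copy to the backup host
    lxc delete "$c"/nightly                            # tidy up the local snapshot
    lxc config set "$c" boot.autostart true            # make sure it restarts after a reboot
done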

Updated WordPress – Good or Bad?

So lots of chatter on Twitter about the new WordPress, and especially the new editor.  With a lot more negative comments than positive.  So here’s to our new Editor – this is just a simple post to see if it works.  And if it doesn’t, we might not worry too much as we have recovered from a major (near city-wide) power outage that even killed off our servers.  Yet we stayed online, because we had a plan.

Look for our story on the power failure which includes some major take-away lessons for us, in an article to be posted soon.  Short version: it is worth planning for events like that.  We did, but it didn’t go perfectly.  Hopefully the next such excursion will go even better for us.

Dumb Admin LXC-Fan VPN “Tip”

So, if you are wondering why you can’t remotely decrypt your server after a reboot, you might check your VPN…

Ours was running in an LXC container…on the very computer we were trying to reboot. SO of course, the container WAS NOT RUNNING. After some panic when SSH did not respond, we found that NONE of our web services were responding either.  We actually really liked that!   It was only then that we figured out our dumb mistake…

“Pro Tip” – Disconnect the non-functioning VPN before you SSH to the encrypted server.   It’s a lot less stressful than trying to do it the other way round.  😐

TLS 1.3 has rolled out for Apache2

The best cryptography in HTTPS got a helping hand toward wider adoption, as Apache2 now incorporates TLS 1.3 support.

We have updated Apache2 in some of our LXC-based servers already, and the rest will be completed soon enough.  Apache version 2.4.37 gives us this upgraded TLS support.  #AWESOME.

And this upgrade is non-trivial.  TLS 1.3 is very robust at deterring eavesdropping on your connections, even by a determined hacker.   This is another step toward improving the security of the internet, and we welcome and support it.   TLS 1.3 is also FASTER (its handshake needs fewer round-trips), which is a welcome side-effect.

As part of our server-side Apache upgrade, this site now offers TLS 1.3 to your browser during the https handshake.  And it works too, as shown in a snapshot of this site/post from one of our Android mobile devices:

“The connection uses TLS 1.3.” 👍

We are now contemplating disabling the cryptographically weaker TLS 1.1 connections to our sites, which might block some users who still run old browsers, but it will make the connections more secure.   We think the risk of malware/cyberattacks on what might be OUR data outweighs some customer inconvenience (from blocking TLS 1.1).  We encourage EVERYONE who visits this site to use a modern, up-to-date browser like Google Chrome, Firefox etc.  We’ll post an update when we make the decision to actively block TLS 1.1, but if you use a really old browser you might never read it, because this site too will have dropped support for TLS 1.1 by then.  🙂
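When we do pull that trigger, the Apache side should be a one-line change; a sketch using standard mod_ssl directives (check where your own distribution keeps its SSL config before copying):

# allow only TLS 1.2 and TLS 1.3 (drops TLS 1.0 and 1.1)
SSLProtocol -all +TLSv1.2 +TLSv1.3

# then test and reload Apache
sudo apachectl configtest
sudo systemctl reload apache2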

For the curious, we recommend you visit the excellent SSL-Labs site to test your servers (if any), the sites you visit and your actual web-browser.  Happy browsing!

EXPLOINSIGHTS IT OVERVIEW

So we have been asked for more information about our Information System by small-business owners who are also coming to grips with ITAR, NIST-800-171 etc., and the need for a USABLE AND USEFUL ‘IT’ infrastructure.

Our methods might not suit everyone: our system is specifically built for remote access and remote maintenance, as we spend a LOT of time away from the corporate office (where we locate our servers).  But in case it helps, here’s a snapshot of what we do, in summary form (with hyperlinks added to most of the referenced software, so you can drill down into any item of interest):

Hardware: we use LAPTOPS to host our primary Linux servers.  They are extremely capable for our small-business needs AND they come with built-in battery back-up for the odd power cut or two.

Software: we run ALL of our services in LXD containers, hosted on a single hardware server that runs Ubuntu 16.04 as the host OS.  We make minimal changes to the real server; we deploy as much as we can via LXD (since its containers are, in our case, unprivileged and thus comparatively safe and secure by default).

We secure all of our hardware with Linux LUKS full-disk encryption, and we can remotely reboot AND unlock the machines via Dropbear SSH.  #PrettyCool
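For anyone wanting to replicate the remote-unlock part, a rough sketch on a recent Ubuntu/Debian release (the port and key path below are made-up examples, and older releases package and name things slightly differently, so check what your own version ships):

# on the server: add an SSH server to the initramfs so it answers before the disk is unlocked
sudo apt install dropbear-initramfs
echo 'DROPBEAR_OPTIONS="-p 2222 -s -j -k"' | sudo tee -a /etc/dropbear-initramfs/config
sudo cp ~/.ssh/id_ed25519.pub /etc/dropbear-initramfs/authorized_keys
sudo update-initramfs -u

# after a reboot, from another machine: connect and unlock the LUKS volume
ssh -p 2222 root@server.example.com
cryptroot-unlock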

ALL of our SSH connections (most via OpenSSH, except for the DropBear disk-decryption step after a reboot) require public/private keys.  Our port numbers are…unusual (we don’t exist on port 22, and we don’t make that search easy, but feel free to verify that).  #SSH-somewhat-Hardened
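The OpenSSH side of that is a handful of sshd_config lines; a sketch (the port number here is a made-up example, deliberately not one of ours):

# /etc/ssh/sshd_config – keys only, on a non-obvious port
Port 47123
PasswordAuthentication no
PubkeyAuthentication yes
PermitRootLogin no

# validate the config, then reload
sudo sshd -t && sudo systemctl reload ssh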

We employ 2FA for all of our server SSH logins.  That actually means we have TRIPLE factor authentication since our private SSH key needs a password too. #WayBetterThan-US-State-Department
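We won’t spell out our exact mechanism here, but for illustration, one common way to add a TOTP second factor to OpenSSH on Ubuntu is the Google Authenticator PAM module – treat this as a generic sketch, not necessarily what we run:

sudo apt install libpam-google-authenticator
google-authenticator    # run as the login user to generate the TOTP secret

# add to /etc/pam.d/sshd:
auth required pam_google_authenticator.so

# and in /etc/ssh/sshd_config, require the key AND the TOTP prompt:
ChallengeResponseAuthentication yes
AuthenticationMethods publickey,keyboard-interactive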

We have THREE servers that mirror capabilities (so that as and when one dies, we can still be online.  It works too, having tried it once already for real – yikes).  It’s not as good as what a large corporation will do, but it’s better than nothing.  #LiveBackupsAreEssential

Our LXD containers are all running on LUKS encrypted zfs drives. #VeryCool
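If you want the same layering, a rough sketch of the idea (device, mapper and pool names are placeholders; only do this to an empty disk, and your LXD version’s storage commands may differ slightly):

# encrypt the raw disk, open it, and build a zfs pool on the mapped device
sudo cryptsetup luksFormat /dev/sdb
sudo cryptsetup open /dev/sdb crypt_lxd
sudo zpool create lxd-pool /dev/mapper/crypt_lxd

# tell LXD to use the existing pool for container storage
lxc storage create encrypted zfs source=lxd-pool

# only needed if the default profile has no root disk yet
lxc profile device add default root disk path=/ pool=encrypted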

SOFTWARE:

We have already mentioned Ubuntu 16.04 server and LXD, but it’s so good it’s worth a re-mention.  🙂

We run Nextcloud server (latest version, or maybe latest-minus-one – we don’t enjoy being the first to field the latest version of this mission-critical software).  This is the HEART of our Operations, as ALL of our CUI/ITAR documents are managed via the totally brilliant Nextcloud.  All logins require a 2FA second factor.  We also employ Nextcloud’s server-side encryption (a sketch of the relevant occ commands follows the next paragraph); and…

We extensively use Cryptomator to provide END TO END encryption of ALL CUI/ITAR and other mission-critical data.
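Back on the Nextcloud server-side pieces mentioned above: both can be switched on from the command line with the occ tool.  A sketch, run from the Nextcloud installation directory as the web-server user (paths and user name vary by install):

# enable server-side encryption
sudo -u www-data php occ app:enable encryption
sudo -u www-data php occ encryption:enable

# enable a TOTP second factor for web logins
sudo -u www-data php occ app:enable twofactor_totp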

We use sftp via our strong SSH to access server files.  This is via a Nautilus/Cryptomator (Linux) or Mountain Duck (Windows) interface.

We run an OnlyOffice server so we can remotely create/edit CUI (Word, Excel and PowerPoint formats) even when overseas.  We also use the free Desktop Apps as part of our journey away from the expensive, metadata-mining Microsoft Office 365 products (the only software subscription we have).  Regrettably, DoD uses Microsoft, so it’s hard to completely eliminate the Office 365 products from our toolbox.  One day, maybe…

We use WordPress for our web-site services (including for THIS POST).

We use haproxy as our front-end server just behind the router.  Fast, reliable.

We run our own OpenVPN server and use it whenever we are in an un-trusted location.  We don’t use this to hide our identity/location (as many privacy-minded people, and even some bad guys, do).  Rather, we use it to prevent man-in-the-middle threats to our online data at hotels and other places that meet our high-risk profile.  It’s unwise to trust a free wifi hotspot, and even many paid-for services should be viewed with suspicion.  That said, our connectivity to our servers is always via HTTPS with LetsEncrypt certificates, so the VPN server is arguably overkill at times.

We use ‘andOTP’ to manage our multiple 2FA credentials.  We find it better than the standard Google Authenticator app.

We use android devices.  We employ full disk encryption on our android devices.

We still use Microsoft Office 365 for our email.  We constantly agonise over that, and maybe one day we will run our own mail server.  But not today.  We don’t use the Outlook app – we only access email via the web portal.  We think it sucks, but email is so hard for small businesses: customers’ servers will likely reject mail from a self-hosted server as an anti-spam measure, so you may never know whether some emails made it or not.  We can’t operate with that risk.  We think it’s important to have a reliable email service, and Office 365 does that job nicely, as much as we hate to admit it…  🙂

We believe we are COMPLIANT with all the regulations that impact our primary business.

Overall, we are quite pleased at how well our integrated systems WORK TOGETHER. It’s proven to be reliable, usable and satisfactory for our business needs.  How do you run YOUR small business IT?

Questions or comments as usual via email to: administration@exploinsights.com

Or you can message us on LinkedIn or Twitter, which is probably faster (but not private).

🙂