

fail2ban isn’t a WAF?
If you’re trying to do VDI in the cloud, that can get expensive fast on account of the GPU processing needed
Most of the protocols I know of that run CPU-only (and I’m perfectly happy to be proven wrong and introduced to something new) tend to fray at high latency or high resolution. The usual top two I’ve seen are VNC and RDP (the XRDP project on Linux), with NoMachine and plain X11 over SSH right behind them. I think NoMachine had the best performance of those, but it’s been a hot minute since I’ve personally used it. XRDP is the one I’ve used the most often; getting login/lock/unlock working was fiddly at first, but it seems to be holding stable now.
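If anyone wants to kick the tires on the XRDP route, the base install is only a couple of commands on a Debian/Ubuntu-family host (package name and port are the usual defaults; the fiddly part is the desktop session side, which varies by distro):

```
# minimal XRDP setup sketch on Debian/Ubuntu - adjust the package manager for your distro
sudo apt install xrdp
sudo systemctl enable --now xrdp
# default listen port is 3389/tcp - only open it to your LAN/VPN
```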
Jumping from the “basic connection, maybe barely but not always suitable for video” tier to the “ultra-high-grade, high-speed” tier, we have Parsec and Sunshine+Moonlight. Parsec is currently limited to Windows/Mac hosting (with a Linux client available), and both Parsec and Sunshine require or recommend a reasonable GPU to handle the encoding stage (although I believe Sunshine may support an x264 software encoder, which can exert a heavy CPU tax depending on your resolution). The specific problem of sourcing a GPU in the cloud (since you mention EC2) becomes the expensive part. This class of remote access tends to fray less at high resolution and frame rate because it’s designed to transport video and games, rather than taking shortcuts to get a minimum desktop visible.
If you have a server with out-of-band/lights-out management such as iDRAC (Dell), iLO (HPE), IPMI (generic; Supermicro and others), or equivalent, those can measure the server’s power draw at both PSUs and in total.
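For example, on a generic IPMI BMC you can usually pull that reading over the network with ipmitool (hostname and credentials below are placeholders, and not every BMC implements the DCMI power extension):

```
# query the BMC's power reading via DCMI (placeholder host/credentials)
ipmitool -I lanplus -H bmc.example.lan -U admin -P 'changeme' dcmi power reading
# fall back to the sensor listing if DCMI isn't supported
ipmitool -I lanplus -H bmc.example.lan -U admin -P 'changeme' sensor | grep -iE 'watt|power'
```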
Yeah that’s totally fair. I have nearly a kilowatt of real time power draw these days, Rome was not built in a day.
That’s the neat part - Ceph can use a full mesh of connections with just a pair of switches and one balance-slb 2-way bond per host. So each host only needs 2 NIC ports (they can even be on the same NIC; I’m using eno1 and eno2 of my R730’s 4-port LOM), and then you plug each of the two ports into a different switch (two switches total for redundancy, in case one goes down for maintenance or crashes). You just need to make sure the switches have a path to each other at the top.
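To give a rough idea of what that looks like per host on Proxmox with Open vSwitch installed, here’s a sketch of the bond in /etc/network/interfaces (interface names, bridge name, and the address are placeholders for my particular setup):

```
# 2-port balance-slb OVS bond for the Ceph network - values are placeholders
auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto bond0
iface bond0 inet manual
    ovs_type OVSBond
    ovs_bridge vmbr1
    ovs_bonds eno1 eno2
    ovs_options bond_mode=balance-slb

auto vmbr1
iface vmbr1 inet static
    address 10.10.10.11/24
    ovs_type OVSBridge
    ovs_ports bond0
```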
I think you’re asking too much from ZFS. Ceph, Gluster, or some other form of cluster native filesystem (GFS, OCFS, Lustre, etc) would handle all of the replication/writes atomically in the background instead of having replication run as a post processor on top of an existing storage solution.
You specifically mention a gap window - that gap window is not a bug, it’s a feature of using a replication timer, even if it’s based on an atomic snapshot. The only way to get around that gap is to use different tech. In this case, all of those above options have the ability to replicate data whenever the VM/CT makes a file I/O - and the workload won’t get a write acknowledgement until the replication has completed successfully. As far as the workload is concerned, the write just takes a few extra milliseconds compared to pure local storage (which many workloads don’t actually care about)
I’ve personally been working on a project to convert my lab from ESXi vSAN to PVE+Ceph, and conversions like that (even a simpler one like PVE+ZFS to PVE+Ceph) would require the target disk to be wiped at some point in the process.
You could try temporarily storing your data on an external hard drive via USB. Or, if you can get your workloads into a quiet state or maintenance window, you could use the replication you already have, rebuild the disk (but not the PVE OS itself) one node at a time, and restore/migrate the workloads to the new Ceph target as each node is completed.
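On the PVE side, once the new Ceph storage is defined, moving a guest’s disk over is roughly one command per disk - something like the following (VM ID, disk name, and storage ID are made up; newer PVE releases also spell it `qm disk move`):

```
# move a VM's disk from its current storage to a hypothetical Ceph-backed storage "ceph-vm"
qm move_disk 101 scsi0 ceph-vm --delete 1   # --delete 1 drops the old copy after a successful move
```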
On paper (I have not yet personally tested this), you could even take it a step further: for all of your VMs that connect to the NFS share for their data, you could replace that NFS container (a single point of failure) with the cluster storage engine itself. There’s not a rule I know of that says you can’t. That way, your VM data is written directly to the engine at a lower latency than VM -> NFS -> ZFS/Ceph/etc.
Yeah it’s a bit of a chonk. I don’t remember the exact itemization on the power bill and I don’t have one in front of me.
My server rack has
All together that draws… 0.1 kWh… in 0.327s.
In real-time terms, measured at the UPS, I have a running steady-state load of 900-1100W depending on what I have under load. I call it my computationally efficient space heater because it generates more heat than my apartment needs in winter on all but the coldest days. It has a dedicated 120V 15A circuit.
I wondered if someone would post that second one.
For the first, I think Square Enix got it right - headphones like the right image, but with the bridge between the ear cups flopped back on their head.
Alternatively, you could have headphones like the first but with the drivers in the upper cat ear portion by their actual ears.
OT but am I the only one that noticed the fox’s headphones aren’t on their ears?
Hardware RAID just works, and for many, that’s good enough. In more advanced systems, all it’s got to handle is a boot partition, and if you’re doing your job as a sysadmin, there’s zero important data in there that can’t be easily rebuilt or restored.
I never said I didn’t use software RAID, I just wanted to add information about hardware RAID controllers. Maybe I’m blind, but I’ve never seen a good implementation of software RAID for the EFI partition or boot sector. During boot, most systems I’ve seen will always try to access one partition directly, then fall back to a second, which bypasses the concept of a RAID - so the two would need to be kept manually in sync during updates.
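For what it’s worth, the “manually in sync” part can be as simple as a copy hooked into your update process - a rough sketch, assuming a second ESP already exists and that the mount points, device, and loader path below are placeholders for your distro:

```
# mirror the live ESP onto the second one (placeholder mount points)
rsync -a --delete /boot/efi/ /boot/efi2/
# register the second ESP with the firmware so it is actually bootable
efibootmgr -c -d /dev/sdb -p 1 -L "Fallback ESP" -l '\EFI\debian\grubx64.efi'
```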
Because of that, there’s one notable place where I won’t go software - I always use hardware RAID for at minimum the boot disk, because Dell firmware natively understands everything about it from a detect/boot/replace perspective (or, just as cleanly, sees nothing at all). All four of my primary servers boot from either a StarTech RAID card similar to a Dell BOSS or an array directly on the PERC. It’s only enough space to store the core OS.
Other than that, at home all my other physical devices are hypervisors (VMware ESXi for now, until I can plot a migration), dedicated appliance devices (Synology DSM uses mdadm), or don’t have redundant disks (my firewall - backed up to git - and my NUC Proxmox box; the firewalls and the PVE box all run ZFS for its features).
Three of my four ESXi servers run vSAN, which is like Ceph and replaces RAID. Like Ceph and ZFS, it requires using an HBA or passthrough disks for full performance. The last one is my standalone server. Notably, ESXi does not support any software RAID natively that isn’t vSAN, so both of the standalone server’s arrays are hardware RAID.
When it comes time to replace that Synology it’s going to be on TrueNAS
For recovering hardware RAID: the most guaranteed path to success is a compatible controller with a similar enough firmware version. You might be able to find software that can stitch images back together, but that’s a long shot and requires a ton of disk space (which you might not have if it’s your biggest server).
I’ve used dozens of LSI-based RAID controllers in Dell servers (of both PERC and LSI name brand) for both work and homelab, and they usually recover the old array to the new controller pretty well, and also generally have a much lower failure rate than the drives themselves (I find myself replacing the cache battery more often than the controller itself)
Only twice out of the handful of times I went to a RAID controller from a different generation
As others have pointed out, this is where backups come into play. If you have to replace the server with one from a different generation, you run the risk that the drives won’t import. At that point, you’d have to sanitize the superblock of the array and re-initialize it as a new array, then restore from backup. Now, the array might be just fine and you never notice a difference (like my users that had to replace a failed R815 with an 820), but the results really fall at the extremes - it either works or it faults, with no in-between.
Standalone RAID controllers are usually pretty resilient and fail less often than disks, but they are very much NOT infallible, as you are correct to assess. The advantage of software systems like mdadm, ZFS, and Ceph is that they remove the precise hardware compatibility requirements, but by no means do they remove the software compatibility requirements - you’ll still have to do your research and make sure the new version is compatible with the old format, or make sure it’s the same version.
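As a concrete example of that research, the on-disk format versions are easy to check before you try an import (device, dataset, and pool names below are placeholders):

```
# mdadm: check the superblock metadata version on a member disk
mdadm --examine /dev/sdb1 | grep -i 'version'
# OpenZFS: check the running version and which feature flags the pool has active
zfs version
zpool get all tank | grep 'feature@'
```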
All that said, I don’t trust embedded motherboard RAID to the same degree that I trust standalone controllers. A friend of mine about 8-10 years ago ran a RAID-0 on a laptop that got its superblock borked when we tried to firmware update the SSDs - it stopped detecting the array at all. We did manage to recover the data, but it needed multiple times the raw amount of storage to do so.
Apologies for being late, I wanted to be as correct as I could be.
So, straight to the point: Nextcloud by default uses plain files if you don’t configure the primary storage to be an S3/object store. As far as I can tell, this is not automatic and is an intentional change at system creation by the original admin. There is a third-party migration script, but there does not appear to be a first-party method of converting between the two. That’s very good news for you! (I think/hope)
My instance was set up as a standalone, so I cannot speak for the all-in-one image. Poking around the root data directory (`datadirectory` in the `config.php`), I was able to locate my user account by internal username - which, if you do not use LDAP, will be the shortened login name. On default LDAP configs, this internal username may be a GUID, but that can be changed during the LDAP enablement process by overriding the Internal Username field in the Expert LDAP settings.
Once in the user’s home folder in the root data directory, my subdirectory options are `cache`, `files`, `files_trashbin`, `files_versions`, and `uploads`.

- `files` contains the “live” structure of how I perceive my Nextcloud home folder in the Web UI and the Nextcloud Desktop sync engine.
- `files_trashbin` is an unstructured data folder containing every file that was deleted by this user and kept per the trash folder’s retention policy (this can be configured at the site level). Files retain their original name, but have a suffix added of the form `.d<numbers>`, where the numbers appear to be a Unix timestamp, likely the deletion date. A quick scan of these with the `file` command in Linux showed that each one had an expected file header based on its extension (e.g., a `.png` showed as a PNG image with an expected resolution). In the Web UI, there is metadata about which folder the file originally resided in, but I was not able to quickly identify this in the file structure. I believe this info is coming from the SQL database.
- `files_versions` is how Nextcloud stores its file version history (if enabled). Old versions are cleaned up per a set of default behaviors that keep more copies of more recent changes, up to a maximum-age deletion threshold set at the site level. This folder is stored in approximately the same structure as the main `files` live structure, however each copy of each version has a suffix of the form `.v<numbers>`, where the number appears to be the Unix timestamp the version was taken (*I have not verified that this exactly matches what the UI shows, nor have I read the source code that generates this). I’ve spot-checked via the Linux `file` command and sha256 that the files in this versions structure appear to be real data - tested one Excel doc and one plain text doc.

I think that should give a fairly rough answer to your original question, but if I left something out you’re curious about, let me know.
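For anyone who wants to repeat that spot check, it’s only a couple of commands - the file name and timestamp here are made up purely for illustration:

```
# hypothetical trashbin entry - the name and the .d suffix value are illustrative
file 'report.png.d1700000000'        # should identify as PNG image data
date -d @1700000000                  # decode the suffix as a Unix timestamp (GNU date)
sha256sum 'report.png.d1700000000'   # compare against a known-good copy if you still have one
```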
Finally, I wanted to thank you for making me actually take a look at how I had decided to configure and back up my Nextcloud instance, and ngl it was kind of a mess. The trash bin and versions can both get out of hand if you have frequently changing or deleted/recreated files (I have network synchronization glued onto some of my games that do not have good remote save support). Setting a retention policy on trash and versions cleaned up a lot of extraneous data, as only one of the two had been (partially) configured before.
I can see a lot of room for improvements… just gotta rip the band-aid off and make intelligent decisions rather than just slapping on an `rsync` job that connects to the Nextcloud instance and replicates down the files and backend database. Not terrible, but not great.
In the backend I’m already using ZFS for my files and Redis database, but my core SQL database was located on the server’s root partition (which is XFS - I’d rather not mess with a DKMS module from a boot CD if something happens and upstream borks the compile, which is precisely what happened when I upgraded to OpenZFS 2.1.15).
I do not have automatic ZFS snapshots configured at this time, but based on the above, I’m reasonably confident that I could get data back from a ZFS snapshot if any of the normal guardrails within Nextcloud failed or did not work as intended (trash bin and internal version history). Plus, the data in that cursed rsync backup should be at least 90% functional.
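If I do turn snapshots on, the manual equivalent of what I’d be relying on is only a couple of commands (dataset and snapshot names here are placeholders):

```
# take a point-in-time snapshot of the hypothetical dataset holding the Nextcloud data
zfs snapshot tank/nextcloud@pre-cleanup
# confirm it exists
zfs list -t snapshot tank/nextcloud
# individual files can later be copied back out of the hidden .zfs/snapshot directory
```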
Probably best to go with something in the 3.5" line, unless you’re going enterprise 2.5" (which are entirely different birds than consumer drives)
Whatever you get for your NAS, make sure it’s CMR and not SMR. SMR drives do not perform well in NAS arrays.
Many years ago I bought some low-cost 2.5" Barracudas for my servers, only to find out years later that they were SMR, which may have been a contributing factor to them not being as fast as I expected.
TLDR: Read the datasheet
I don’t have a full answer on snapshots right now, but I can confirm Nextcloud has VFS support on Windows. I’ve been working on a project to move myself over to it from Syno Drive. Client-wise, the two have fairly similar features with one exception - Nextcloud generates one Explorer sidebar object per connection, whereas I think Synology handles these as shortcuts inside the one directory. I’d prefer if NC did the latter, or at least let me choose, but I’m happy with what I’ve got for now.
As for the snapshotting, you should be able to snapshot the underlying FS/DB at the same time, but I haven’t poked deeply at that. Files I believe are stored plain (I will disassemble my Nextcloud server to confirm this tonight and update my comment), but some do preserve version history, so I want to be sure before I give you final confirmation. The Nextcloud root data directory is broken up by internal user ID, which is an immutable field (you cannot change your username, even in LDAP), probably because of this filesystem layout.
One thing that may interest you is the external storage feature, which I’ve been working on migrating a large data set of mine to:
Admin docs for reference: https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/external_storage_configuration_gui.html
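If you prefer a CLI view over the GUI page linked above, the External Storage app also exposes occ commands - e.g., listing the configured mounts looks roughly like this (web server user and install path are assumptions for a typical Debian-style install):

```
# list configured external storage mounts, run from the Nextcloud install directory
sudo -u www-data php occ files_external:list
```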
I use LDAP user auth to my nextcloud, with two external shares to my NAS using a pass-through session password (the NAS is AD joined to the same domain as Nextcloud uses for LDAPS). I don’t know if/how the “store password in database” option is encrypted, but if anyone knows I would be curious, because using session passwords prevents the user from sharing the folder to at least a federated destination (I tried with my friend’s NC server, haven’t tried with a local user yet but I assume the same limitations apply). If that’s your vibe, then this is a feature XD.
One of my two external storage mounts is a “common” share with multiple users accessing the same directory, and the second share is \\nas.example.com\home\nextcloud. Internally, these are handled (I believe) by PHP spawning `smbclient` subprocesses, so if you have lots of remote files and don’t want to nuke your Nextcloud, you will probably need to increase the PHP child limits (that took me too long to solve lol).
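For reference, the limits I mean are the PHP-FPM pool worker settings - a sketch of the knobs involved (the file path and numbers are placeholders; tune them to your own load and RAM):

```
; e.g. /etc/php/8.2/fpm/pool.d/www.conf - path varies by distro and PHP version
pm = dynamic
pm.max_children = 120     ; raise this if smbclient-backed requests start queuing
pm.start_servers = 12
pm.min_spare_servers = 6
pm.max_spare_servers = 18
```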
That funny sub-mount name above handles an edge case where Nextcloud/DAV can’t handle directories with certain characters - notably the # that Synology uses to expose their #recycle and #snapshot structures. This means the SMB remote mount currently has a limitation where you can’t mount the base share of a Synology NAS that has this feature enabled. I tried a server-side Nextcloud plugin to filter this out before it was exposed to DAV, but it was glitchy. Unsure if that was because I simply had too many files for it to handle (thanks to the way Synology snapshots are exposed) or something else - either way, I worked around the problem for now by never mounting a base share of my Synology NAS. Other snapshot exposure methods may be affected too - I have a ZFS TrueNAS Core box, so maybe I’ll throw that at it and see if I can break Nextcloud again :P
Edit addon: OP just so I answer your real question when I get to this again this evening - when you said that Nextcloud might not meet your needs, was your concern specifically the server-side data format? I assume from the rest of your questions that you’re concerned with data resilience and the ability to get your data back without any vendor tools - that it will just be there when you need it.
As others said, depends on your use case. There are lots of good discussions here about mirroring vs single disks, different vendors, etc. Some backup systems may want you to have a large filesystem available that would not be otherwise attainable without a RAID 5/6.
Enterprise backups tend to follow the recommendation called 3-2-1: three copies of your data, on two different types of media, with one copy off-site.
On my home system, I have 3-2-0 for most data and 4-3-0 for my most important virtual machines. My home system doesn’t have an off-site, but I do have two external hard drives connected to my NAS.
Story time
I had one of my two backup drives fail a few months ago. Literally, actually nothing of value was lost - I just went down to the electronics shop and bought a bigger drive from the same vendor (preserving the one-of-each-vendor approach). Reformatted the disk, recreated the backup job, then ran the first transfer. Pretty much not a big deal; all the data was still in two other places - the source itself, and the NAS primary array.
The most important thing to determine when you plan a backup: think about how valuable the data is to you - that’s how much you might be willing to spend on keeping it safe.
Running Nextcloud (non-Docker version) and I don’t see nearly so many client updates - usually one every few weeks, which is a reasonable expected pace. Server updates are less frequent.
On Windows (all of my primary devices), I just install the NC client update and skip the Explorer restart, pending a full reboot later. ’Tis the nature of literally anything that deeply integrates with Explorer. I’ve seen Explorer “death” during updates from several vendors that have similar Explorer plugins, not just NC. Explorer sometimes just decides to nope out even without NC updating.
Now on one device I hadn’t opened for a while, I saw NC run two updates in a row, but that was my fault for procrastinating the first one.
Here’s the desktop release history: https://github.com/nextcloud/desktop/releases
I don’t see a “one every day” within the block of time between Dec 6 and today, unless you had the release candidate builds which may have been more frequent in a few spots.
In the IT world, we just call that a server. The usual golden rule for backups is 3-2-1: three copies of the data, on two different types of media, with one copy off-site.
So, if the data is only server side, it’s just data. If the data is only client side, it’s just data. But if the data is fully replicated on both sides, now you have a backup.
There’s a related adage regarding backups: “if there’s two copies of the data, you effectively have one. If there’s only one copy of the data, you can never guarantee it’s there”. Basically, it means you should always assume one copy somewhere will fail and you will be left with n-1 copies. In your example, if your server failed or got ransomwared, you wouldn’t have a complete dataset since the local computer doesn’t have a full replica.
I recently had a backup drive fail on me, and all I had to do was buy a new one. No data loss; I just regenerated the backup as soon as the new drive was spun up. I’ve also had to restore entire servers that have failed - minimal data loss since the last backup, and nothing I couldn’t rebuild.
Edit: I’m not saying what you’re asking for is wrong or bad, I’m just saying “backup” isn’t the right word to ask about. It’ll muddy some of the answers as to what you’re really looking for.
Found the FF14 fan lol
The release names are hilarious