• 0 Posts
  • 97 Comments
Joined 2 years ago
Cake day: June 20th, 2023



  • solrize@lemmy.world to Selfhosted@lemmy.world · Selfhosting wikipedia
    4 days ago

    I haven’t looked in a few years but 20TB is probably plenty. I agree that Wikipedia lost its way once it got all that attention online and all that search traffic. Everyone should have their own copy of Wikipedia. I used to download the daily incremental data dumps but got tired of it. I still have a few TB of them around that I’ve been wanting to merge.


  • The text is in not-exactly-convenient database dumps (see other commenter’s link) and there are daily diffs (mostly bot noise), but then there are the images and other media, which are way up in the terabytes by now. There are some docs, maybe out of date, about how to run the software yourself. It’s written in PHP and it’s big and complicated.
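
To make the "not-exactly-convenient" part concrete, here is a minimal sketch of streaming through an XML text dump without loading it all into memory. It assumes the dump is a bzip2-compressed MediaWiki XML export with `<page>`, `<title>` and `<text>` elements; the helper names (`local_name`, `iter_pages`) and the tiny sample document are made up for illustration, and a real dump needs more care (redirects, namespaces, multiple revisions).

```python
import bz2
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a real dump (assumption: real dumps use the same element names).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Alpha</title><revision><text>first article</text></revision></page>
  <page><title>Beta</title><revision><text>second article</text></revision></page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, text) pairs one page at a time from an XML dump stream."""
    title, text = None, ""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's children to keep memory low


if __name__ == "__main__":
    # bz2.open accepts a path or a file object; a real run would pass the dump's path.
    compressed = io.BytesIO(bz2.compress(SAMPLE))
    with bz2.open(compressed, "rb") as dump:
        for page_title, page_text in iter_pages(dump):
            print(page_title, len(page_text))
```

This only covers the text. The images and other media are a separate, much larger download.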





  • I see, fair enough. Replication is never instantaneous, so do you have definite bounds on how much latency you’ll accept? Do you really want independent git servers online? Most HA systems have a primary and a failover, so users only see one server. If you want to use Ceph, in practice all servers would be in the same DC. Is that ok?

    I think I’d look in one of the many git books out there to see what they say about replication schemes. This sounds like something that must have been done before.


  • Why do you want 5 git servers instead of, say, 2? Are you after something more than high availability? Are you trying to run something like GitHub where some repos might have stupendous concurrent read traffic? What about update traffic?

    What happens if the servers sometimes get out of sync for 0.5 sec or whatever, as long as each is in a consistent state at all times?

    Anyway my first idea isn’t rsync, but rather, use update hooks to replicate pushes to the other servers, so the updates will still look atomic to clients. Alternatively, use a replicated file system under Ceph or the like, so you can quickly migrate failed servers. That’s a standard cloud hosting setup.

    What real world workload do you have, that appeared suddenly enough that your devs couldn’t stay on top of it, and you find yourself seeking advice from us relatively clueless dweebs on Lemmy? It’s not a problem most git users deal with. Git is pretty fast and most users are ok with a single server and a backup.
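
A rough sketch of the hook idea mentioned above, written as a `post-receive` hook in Python. Git feeds the hook one line per updated ref on stdin (`<old> <new> <refname>`), and the client's push does not return until the hook finishes. The replica URLs and function names are hypothetical. Note that `post-receive` runs after the primary has already accepted the push, so a replica failure can only be reported, not rolled back; this is an approximation of "looks atomic", not a guarantee.

```python
import subprocess
import sys

# Hypothetical replica remotes; substitute your own servers.
REPLICAS = [
    "git@git2.example.com:project.git",
    "git@git3.example.com:project.git",
]


def parse_updates(lines):
    """Parse post-receive stdin lines of the form '<old> <new> <refname>'."""
    updates = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            updates.append(tuple(parts))
    return updates


def to_refspec(update):
    """Turn one ref update into a push refspec."""
    _old, new, ref = update
    if set(new) == {"0"}:
        return ":" + ref  # all-zero new value means the ref was deleted
    return "+" + ref + ":" + ref  # '+' so forced updates on the primary carry over


def build_push_command(replica, updates):
    """Build one 'git push' that carries every ref update to a replica."""
    # --atomic asks the replica to apply all refs or none of them.
    return ["git", "push", "--atomic", replica] + [to_refspec(u) for u in updates]


def replicate(stdin, replicas=REPLICAS, run=subprocess.run):
    """Push the received updates to each replica; return 0 only if all succeed."""
    updates = parse_updates(stdin)
    if not updates:
        return 0
    failures = 0
    for replica in replicas:
        result = run(build_push_command(replica, updates))
        if result.returncode != 0:
            print(f"replication to {replica} failed", file=sys.stderr)
            failures += 1
    return 1 if failures else 0

# In the actual hooks/post-receive file, finish with:
#     sys.exit(replicate(sys.stdin))
```

Pushing to the replicas one after another keeps the sketch simple but adds their latency to every client push, which is where the question about acceptable lag comes back in.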





  • Dedi will perform a lot better and be more consistent and reliable. They’re not THAT expensive if you’re making nontrivial use of them. Otherwise maybe you can keep moving around between Contabo products. Keep in mind too that HDD performance will seem a lot better when you’re not sharing it with dozens of other users. I have an HDD server and it’s fine for browsing. Might not be great for large seek-intensive databases but I’m not currently doing that.

    Anyway you can also ask on lowendspirit.com which is a forum about budget VPS.




  • I used Squirrelmail briefly. It had a minor security bug which was easy to fix, but when I reported it to the devs, I couldn’t convince them that it was actually a bug. I decided that they weren’t paranoid enough to be working on that type of software, so I stopped using it.

    Currently I’m not self-hosting email but am using mxroute.com which has a FOSS mail client that seems ok. I can’t check right now what it is, but maybe later.

    Fastmail’s webmail is pretty good and they said something a while back about releasing it as FOSS but idk if that has happened.

    Right now I mostly use Thunderbird rather than webmail. It sucks in many ways but I’ve had too much going on to pursue alternatives.

    I think Google got it right early on when they realized that email clients should be backed by a serious search engine. The search features of a typical IMAP server aren’t enough and the one in Thunderbird is crap. So I think this is an area where FOSS clients could use some work, if it hasn’t already been done.





  • solrize@lemmy.world to Selfhosted@lemmy.world · calibre 8.0
    1 month ago

    Thanks, yeah, I don’t have a Kobo reader, so I was asking if there was a way to read paid-for Kobo downloaded books that have DRM, similar to how DeCSS lets you watch DVDs that you bought. I don’t mind paying for books but don’t want a locked-down reading device with its own crappy software and possibly invasive phoning home.