#archlinux-ports | Logs for 2017-08-03

[00:16:43] -!- piernov has quit [Quit: No Ping reply in 120 seconds.]
[00:17:51] -!- piernov has joined #archlinux-ports
[00:24:54] -!- piernov has quit [Quit: No Ping reply in 120 seconds.]
[00:25:42] -!- isacdaavid has quit [Ping timeout: 260 seconds]
[00:27:01] -!- piernov has joined #archlinux-ports
[01:14:05] -!- p71 has quit [Read error: Connection reset by peer]
[01:35:20] -!- {levi} has quit [Quit: Leaving]
[01:49:48] -!- p71 has joined #archlinux-ports
[03:10:14] -!- Faalagorn has quit [Remote host closed the connection]
[03:56:52] -!- eschwartz has quit [Ping timeout: 260 seconds]
[03:58:43] -!- guys has quit [Ping timeout: 258 seconds]
[04:04:31] -!- guys has joined #archlinux-ports
[04:06:32] -!- eschwartz has joined #archlinux-ports
[04:53:09] <tyzoid> deep42thought: Gluster looks like it *might* work, but georeplication through gluster is only one-way. We'd likely set up a shared NFS using gluster, using georeplication to back it up to offline/local disks.
[04:53:17] <tyzoid> Assuming we went with gluster.
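The one-way geo-replication tyzoid mentions can be sketched roughly as follows; the volume and host names are hypothetical placeholders, and this assumes passwordless ssh between the nodes is already set up:

```shell
# Hypothetical sketch: push-style, one-way geo-replication from a local
# gluster volume to a remote backup volume (rsync over ssh underneath).
gluster volume geo-replication mirrorvol backuphost::backupvol create push-pem
gluster volume geo-replication mirrorvol backuphost::backupvol start

# Check that the session is replicating:
gluster volume geo-replication mirrorvol backuphost::backupvol status
```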
[05:52:05] -!- guys has quit [Quit: A random quit message]
[05:52:42] -!- anyone has joined #archlinux-ports
[06:47:12] -!- deep42thought has joined #archlinux-ports
[07:02:44] -!- deep42thought has quit [Remote host closed the connection]
[07:38:24] -!- deep42thought has joined #archlinux-ports
[07:43:31] -!- deep42thought has quit [Remote host closed the connection]
[08:22:43] -!- eschwartz has quit [Ping timeout: 240 seconds]
[08:26:57] -!- eschwartz has joined #archlinux-ports
[08:30:19] -!- deep42thought has joined #archlinux-ports
[09:28:07] -!- Faalagorn has joined #archlinux-ports
[09:47:21] -!- Faalagorn has quit [Ping timeout: 255 seconds]
[10:02:24] -!- Faalagorn has joined #archlinux-ports
[10:11:36] -!- alyptik has quit [Ping timeout: 260 seconds]
[10:13:27] -!- alyptik has joined #archlinux-ports
[14:07:22] -!- dantob has joined #archlinux-ports
[14:48:23] <tyz> deep42thought: Hi
[14:48:37] <tyz> deep42thought: Not sure if you got my message from last night
[14:49:00] <deep42thought> got it
[14:50:02] <tyz> deep42thought: Yeah, so it looks like if we want to use glusterfs, we'd just use the distributed feature of the system itself, basically setting up a raid1-like config between all the servers
[14:50:18] <tyz> deep42thought: Then if we had local disks/archive disks, we could make backups there.
[14:50:27] <deep42thought> and potentially other (less public) servers
[14:50:33] <tyz> right
[14:51:02] <deep42thought> the problem I see with glusterfs is that it's not that different from simple backups via rsync, then
[14:51:03] <tyz> so I think it could work, but I'm not sure how it would function given the latency between the nodes.
[14:51:19] <tyz> deep42thought: That's gluster georeplication, which is essentially rsync
[14:51:39] <deep42thought> but we need geo-replication due to the otherwise high latency
[14:51:40] <tyz> deep42thought: but the general NFS-like architecture allows the shared filesystem we're looking for
[14:51:47] <tyz> deep42thought: We'd need to do some testing
[14:52:01] <deep42thought> glusterfs is really bad for high latency
[14:52:08] <tyz> deep42thought: Perhaps we could set up a simple two-node system between your server and mine just to check?
[14:52:28] <tyz> deep42thought: If we set it up as raid-1, I don't see the latency being too bad of an issue for reads.
[14:52:49] <tyz> the writes will be more common though, but it can't hurt to test.
[14:53:12] <tyz> deep42thought: The main issue is we don't want one-way replication, we want many-way replication
[14:53:22] <deep42thought> https://www.nuxeo.com
[14:53:23] <phrik> Title: Some GlusterFS Experiments and Benchmarks | Nuxeo (at www.nuxeo.com)
[14:54:03] <deep42thought> but many-way replication on high latency connection won't give a fast file system
[14:54:33] <deep42thought> ah, ok, these stats are for writes only
[14:55:13] <deep42thought> what I also don't like about gluster (vs. infinity.sh) is that it does store all data on the local server, too
[14:55:19] <deep42thought> i.e. it's not asymmetric
[14:55:21] <deep42thought> afaik
[14:58:16] <tyz> deep42thought: gluster can be asymmetric, my vision of a raid-1 like config wasn't but we could totally make it asymmetric.
[14:58:31] <tyz> deep42thought: All you need to do is set up multiple bricks on different servers.
[14:58:46] <deep42thought> ah, right
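The raid-1-like layout being discussed corresponds to a replicated gluster volume; a minimal sketch, with hypothetical server names and brick paths, in which only the designated storage servers contribute bricks:

```shell
# Hypothetical sketch: a two-way replicated ("raid-1 like") volume.
# Only server1 and server2 hold data; other machines can still mount
# the volume as clients without contributing bricks.
gluster volume create mirrorvol replica 2 \
    server1.example.org:/data/brick1 \
    server2.example.org:/data/brick1
gluster volume start mirrorvol
```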
[14:59:14] <tyz> deep42thought: Re the writes issue: Writes will be our most common operation
[15:00:30] <deep42thought> if you have your production directory copied to the local glusterfs mountpoint: yes
[15:00:53] <deep42thought> but this is, how it should be done anyway, right?
[15:00:55] <tyz> deep42thought: My thought was taking the prod directory, and periodically updating the gluster mountpoint
[15:01:01] <deep42thought> yeah
[15:01:04] <deep42thought> that's what I meant
[15:01:15] <tyz> deep42thought: That way, the prod system isn't affected by the slowness of whatever mount we choose
[15:01:42] <deep42thought> actually, then we don't care about latency - we just copy many files in parallel
[15:01:47] <tyz> deep42thought: The other option is to make a container that has read keys to all of the servers/filesystems, which we would boot and then use to copy each other's stuff.
[15:01:47] <deep42thought> :-/
[15:02:19] <tyz> deep42thought: Though I like having a more integrated solution like gluster
[15:03:39] <deep42thought> or just set up read-only rsync daemons?
[15:04:01] <tyz> deep42thought: That's what my idea for the containerized process was
[15:04:18] <deep42thought> but actually, glusterfs sounds better (redundancy is handled automatically)
[15:04:30] <tyz> deep42thought: That's what I was thinking
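The read-only rsync daemon alternative floated here would amount to a module definition like the following; paths and the module name are hypothetical:

```ini
# /etc/rsyncd.conf -- hypothetical read-only module on each server
uid = nobody
gid = nobody
read only = yes

[mirror]
    path = /srv/mirror
    comment = read-only package mirror
```

Peers would then pull with something like `rsync -a rsync://server1.example.org/mirror/ /srv/backup/server1/`.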
[15:05:30] <deep42thought> does glusterfs need local space (besides for caching)?
[15:05:47] <tyz> deep42thought: Only for storing the bricks, afaik
[15:06:02] <tyz> deep42thought: Also, I thought it only did in-memory caching
[15:06:07] <deep42thought> and I can't connect to a glusterfs if I don't contribute bricks?
[15:06:20] <tyz> deep42thought: No, anyone can connect, it acts like NFS
[15:06:27] <deep42thought> ah, good
[15:06:46] <tyz> deep42thought: The client and server are separated in the gluster architecture, we're just making all of our servers also clients.
[15:06:58] <deep42thought> I would rather not
[15:07:04] <tyz> but there's no need to actually add storage capability to the servers themselves.
[15:07:10] <tyz> deep42thought: why not?
[15:07:51] <deep42thought> because e.g. the build master does not need and does not have much space, thus it would not be a good idea to let it provide backup space
[15:09:15] <tyz> deep42thought: Okay, that's fair.
[15:11:57] <deep42thought> ok, so let's set up some glusterfs-cluster between your and my server
[15:12:00] <tyz> deep42thought: Do you have time to set up a test system between our two servers?
[15:12:02] <tyz> ah, lol
[15:12:28] <deep42thought> do you have a guide?
[15:13:30] <tyz> deep42thought: I'm using the one on digitalocean: https://www.digitalocean.com
[15:13:32] <phrik> Title: How To Create a Redundant Storage Pool Using GlusterFS on Ubuntu Servers | DigitalOcean (at www.digitalocean.com)
[15:13:47] <tyz> I'm running ubuntu server 14.04 on my mirror, so it seems to work fine there.
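The peering step the guide walks through boils down to a couple of commands; sketched here with tyzoid's stated host, the rest hypothetical:

```shell
# Hypothetical sketch of the initial two-node peering step.
# Run on one node; the other node's gluster daemon must be reachable.
gluster peer probe cdn.tyzoid.com

# Verify the peer shows up as connected before creating a volume:
gluster peer status
```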
[15:14:13] <tyz> deep42thought: What worries me is that there's no auth requirement to start the peering process
[15:14:35] <deep42thought> hmm
[15:14:37] <tyz> I wonder if there's a way to configure one
[15:15:56] <tyz> looks like auth can be set up between clients and brick servers, via ssl/tls
[15:16:08] <deep42thought> http://gluster.readthedocs.io
[15:16:10] <phrik> Title: Export and Netgroup Authentication - Gluster Docs (at gluster.readthedocs.io)
[15:16:30] <deep42thought> I have no idea if that's what we're looking for ...
[15:17:32] <tyz> deep42thought: No, that's ip restrictions on client connections to the server
[15:17:38] <deep42thought> hmpf
[15:17:44] <tyz> deep42thought: Let's do the test first, then we can see about auth.
[15:17:54] <tyz> deep42thought: We don't need to put sensitive data on it yet.
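The client-to-brick-server TLS auth tyz found can be enabled per volume; a hedged sketch, assuming certificates have already been placed at the paths gluster expects (`/etc/ssl/glusterfs.pem`, `glusterfs.key`, `glusterfs.ca`) and with hypothetical certificate names:

```shell
# Hypothetical sketch: TLS on the client and server paths of a volume.
gluster volume set mirrorvol client.ssl on
gluster volume set mirrorvol server.ssl on

# Restrict mounting to clients whose certificate CN is on this list:
gluster volume set mirrorvol auth.ssl-allow 'buildmaster,mastermirror'
```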
[15:18:33] <deep42thought> ok
[15:19:28] <tyz> deep42thought: Let me know when you start peering with me, my host is cdn.tyzoid.com
[15:53:42] -!- eschwartz has quit [Ping timeout: 240 seconds]
[16:02:57] -!- eschwartz has joined #archlinux-ports
[16:11:56] <deep42thought> somehow I'm not happy with how moving packages between the repos works
[16:12:09] <tyz> isn't it just a simple case of moving the files then repo-add?
[16:12:18] <deep42thought> in principle: yes, but:
[16:12:29] <deep42thought> the build master is different than the master mirror
[16:12:42] <deep42thought> so it needs to download the files from the master mirror first
[16:13:27] <tyz> deep42thought: If you're worried about inconsistent state, you could keep a staging repo tree, then do a copy from the staging repo tree over to the true tree
[16:13:49] <tyz> not sure if it's possible to merge repo database files
[16:13:58] <deep42thought> but the tree might get large
[16:14:06] <tyz> deep42thought: It'd only be a diff tree
[16:14:15] <tyz> so only things that change between mirror updates
[16:14:18] <deep42thought> e.g. current community-testing plus testing is 10GB, which is more than the size of / on the build master
[16:14:48] <tyz> deep42thought: The idea is that the extra tree is just a staging area
[16:15:00] <deep42thought> I do something similar, I think
[16:15:01] <tyz> deep42thought: The majority of files should live on the master mirror
[16:15:23] <tyz> but then you can download them to a staging area, and do a copy from localfs to localfs
[16:15:31] <tyz> (or a move)
[16:15:40] <deep42thought> but I still need to download all packages that should be moved
[16:15:57] <deep42thought> and this seems to take a lot of time
[16:16:05] <deep42thought> (not really sure, why, though)
[16:16:22] <tyz> Right, but the point is that the download location is a cache dir, which means that you don't add it to the mirror until the dl is complete
[16:16:49] <deep42thought> ah, you mean a cache on the mastermirror server
[16:16:55] <tyz> deep42thought: right
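The cache-dir idea above can be sketched as a tiny shell sequence; the paths and the simulated download are hypothetical stand-ins:

```shell
# Sketch of the staging flow: fetch into a cache dir first, then
# publish into the webroot with a rename on the same filesystem,
# so readers never observe a half-downloaded package.
set -e
root=$(mktemp -d)
cache="$root/cache"      # incomplete downloads live here
webroot="$root/webroot"  # what the mirror actually serves
mkdir -p "$cache" "$webroot"

# "Download" a package into the cache (stand-in for rsync/curl):
printf 'pkgdata' > "$cache/foo-1.0-1-x86_64.pkg.tar.xz.part"

# Only once the transfer completes does the file get its final name...
mv "$cache/foo-1.0-1-x86_64.pkg.tar.xz.part" "$cache/foo-1.0-1-x86_64.pkg.tar.xz"

# ...and only then is it moved into the webroot; mv within one
# filesystem is a rename(2), which is atomic.
mv "$cache/foo-1.0-1-x86_64.pkg.tar.xz" "$webroot/"
ls "$webroot"
```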
[16:17:20] <tyz> deep42thought: It may be possible - though I'm not sure - to do an overlayfs over an existing directory with a tmpfs on top
[16:17:24] <tyz> as your caching dir
[16:17:32] <tyz> with the root being the master mirror
[16:17:46] <tyz> then when you're ready to "apply", carry all those changes down to the root at the same time.
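The overlayfs-with-tmpfs idea sketches out roughly as below; mount points and the mirror path are hypothetical, and this needs root:

```shell
# Hypothetical sketch: a tmpfs upper layer over the mirror webroot.
# Writes land in the tmpfs; the lower directory stays untouched until
# the changes are explicitly copied down ("applied").
mount -t tmpfs tmpfs /mnt/upper
mkdir -p /mnt/upper/up /mnt/upper/work /mnt/staging
mount -t overlay overlay \
    -o lowerdir=/srv/mirror,upperdir=/mnt/upper/up,workdir=/mnt/upper/work \
    /mnt/staging
```

Applying would then mean syncing the contents of `/mnt/upper/up` down into `/srv/mirror`.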
[16:17:59] <deep42thought> I think the copy steps are atomic enough
[16:18:09] <deep42thought> my worry is the huge amount of time it takes at all
[16:18:14] <tyz> deep42thought: Right, but this would allow you to do the repo-add step before doing the copy
[16:18:23] <deep42thought> I do already
[16:19:10] <deep42thought> i download the database files locally, do the repo-add and repo-removes (with local copies of the packages) and then copy the database file via rsync and move the packages within the master mirror via sftp
[16:19:43] <deep42thought> before the last two steps, nothing on the master mirror has changed
[16:19:50] <deep42thought> and those two steps happen pretty fast
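The workflow deep42thought describes can be reconstructed as a hedged sketch; repo names, file names, and the `mastermirror` host are placeholders, not the actual setup:

```shell
# Hypothetical reconstruction of the package-move workflow.
# 1. Fetch the database files and the packages to be moved:
rsync mastermirror::repos/testing/testing.db.tar.gz .
rsync mastermirror::repos/core/core.db.tar.gz .
rsync mastermirror::repos/testing/foo-1.0-1-x86_64.pkg.tar.xz .

# 2. Update the databases locally (nothing on the mirror changes yet):
repo-add core.db.tar.gz foo-1.0-1-x86_64.pkg.tar.xz
repo-remove testing.db.tar.gz foo

# 3. Push the new databases, then move the package files on the
#    master mirror itself -- the only two steps that touch the mirror:
rsync core.db.tar.gz testing.db.tar.gz mastermirror::repos/
echo 'rename testing/foo-1.0-1-x86_64.pkg.tar.xz core/foo-1.0-1-x86_64.pkg.tar.xz' \
    | sftp mastermirror
```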
[16:19:54] <tyz> I see.
[16:20:01] <deep42thought> the problem is the first part, which takes like forever
[16:20:14] <deep42thought> (e.g. my ssh connection died, before it was finished)
[16:20:30] <tyz> deep42thought: When that happens, is the master mirror left in an inconsistent state?
[16:20:37] <deep42thought> no
[16:20:52] <deep42thought> only if it's aborted in the middle of the last two steps
[16:21:28] <tyz> deep42thought: So what happens if you set up a glusterfs mirror between the buildmaster and the master mirror?
[16:21:43] <tyz> then it'll be passively copied, and you can just copy off of the local mirror?
[16:22:07] <tyz> that'll basically create a shared directory between the two servers
[16:22:23] <deep42thought> but I would need the complete space of the master mirror on the build master
[16:22:29] <deep42thought> which I don't want
[16:22:32] <tyz> all the bricks could live on the master mirror
[16:22:48] <tyz> deep42thought: just glusterfs-client on the buildmaster, connecting up to the mirror
[16:22:49] <deep42thought> then the access would be so slow, that the operations would not be atomic anymore
[16:22:58] <tyz> deep42thought: They don't need to be?
[16:23:17] <tyz> deep42thought: The glusterfs mount on the master mirror would be a different directory
[16:23:38] <tyz> so when you need to do the copy, you're copying from the glusterfs dir to the mirror webroot, which is a local copy
[16:23:42] <deep42thought> Then I don't understand, what problem you want to tackle with glusterfs
[16:24:04] <tyz> deep42thought: keeping a persistent shared directory, so that ssh death isn't a problem
[16:24:15] <tyz> deep42thought: Also reducing the need to maintain that connection
[16:24:43] <tyz> It'd essentially be a glusterfs with only one node and two clients
[16:24:54] <tyz> where the server node is itself a client of itself
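The "one node, two clients" shape maps onto gluster like this; host names, the volume name, and paths are hypothetical, and a single brick on a system partition may need `force`:

```shell
# Hypothetical sketch: all storage lives on the master mirror; both
# machines mount the volume as ordinary clients.

# On the master mirror (the only brick holder):
gluster volume create sharedvol mastermirror.example.org:/data/brick
gluster volume start sharedvol

# On the build master AND on the master mirror itself:
mount -t glusterfs mastermirror.example.org:/sharedvol /mnt/shared
```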
[16:25:18] <deep42thought> and you would use this solely for moving packages and package databases around
[16:25:40] <tyz> deep42thought: Correct. That would eliminate the sshfs/rsync between the buildmaster and the mirror
[16:26:04] <deep42thought> I'm not sure if this is a simplification
[16:26:08] <tyz> deep42thought: And since the buildmaster is in france, it shouldn't be that high of a latency
[16:26:42] <tyz> deep42thought: It's less of a simplification and more of an optimization. That means you don't have to maintain that connection.
[16:27:05] <tyz> deep42thought: Instead of doing a manual sync from buildmaster up to the mirror, it always syncs
[16:27:06] <deep42thought> but I need to take care on both ends about moving files
[16:27:27] <tyz> deep42thought: Not really. It moves the "caretaking" from the buildmaster up to solely your master mirror
[16:29:04] <deep42thought> but since all the "intelligence" runs on the build master, it'd need to tell the master mirror which packages it requires
[16:37:17] <tyz> true
[16:42:59] <deep42thought> cu later
[16:44:30] -!- deep42thought has quit [Quit: Leaving.]
[17:24:56] -!- dantob has quit [Quit: dantob]
[17:46:33] -!- deep42thought has joined #archlinux-ports
[17:59:27] -!- isacdaavid has joined #archlinux-ports
[18:20:12] -!- isacdaavid has quit [Ping timeout: 240 seconds]
[18:21:10] -!- isacdaavid has joined #archlinux-ports
[18:45:32] -!- eschwartz has quit [Remote host closed the connection]
[18:46:03] -!- eschwartz has joined #archlinux-ports
[20:59:13] -!- alyptik has quit [Ping timeout: 248 seconds]
[20:59:36] -!- alyptik has joined #archlinux-ports
[21:36:00] -!- typikal has joined #archlinux-ports
[21:36:19] -!- alyptik has quit [Ping timeout: 260 seconds]
[21:37:11] typikal is now known as alyptik
[22:12:17] -!- fsckd has quit [Ping timeout: 248 seconds]
[23:04:06] -!- tyz has quit [Quit: WeeChat 1.9]
[23:17:22] -!- fsckd has joined #archlinux-ports