15:41:16 <dgilmore> #startmeeting RELENG (2016-05-23)
15:41:16 <zodbot> Meeting started Mon May 23 15:41:16 2016 UTC.  The chair is dgilmore. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:41:16 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:41:16 <zodbot> The meeting name has been set to 'releng_(2016-05-23)'
15:41:16 <dgilmore> #meetingname releng
15:41:16 <zodbot> The meeting name has been set to 'releng'
15:41:16 <dgilmore> #chair dgilmore nirik tyll sharkcz bochecha masta pbrobinson pingou maxamillion
15:41:16 <zodbot> Current chairs: bochecha dgilmore masta maxamillion nirik pbrobinson pingou sharkcz tyll
15:41:19 <dgilmore> #topic init process
15:41:35 <nirik> morning
15:41:42 * pbrobinson o/
15:42:52 <dgilmore> let's get started
15:43:05 <dgilmore> #topic ostree status
15:43:28 <dgilmore> #info atomic repos removed from the mirrors
15:43:33 <maxamillion> sorry I'm late
15:43:46 <dgilmore> so we had to remove the atomic repos from the mirrors last week
15:44:02 <pbrobinson> maxamillion: dgilmore was 11 mins late so whatever ;-)
15:44:20 <maxamillion> \o/
15:44:25 <dgilmore> it was causing issues for the mirrors mirroring /pub/fedora/, and load issues for the netapp
15:44:26 * nirik is planning a blog post/mailing list post on mirror changes later this week.
15:44:35 <maxamillion> nirik: +1
15:44:59 <dgilmore> I did implement automated cleanup of the rawhide ostree repo
15:45:13 <dgilmore> it now only has 2 weeks of history
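(A minimal sketch of that kind of cleanup, assuming the repo path; ostree's own prune command can enforce a time window like this:)

    # keep only commits from the last two weeks, then drop unreachable objects
    ostree prune --repo=/mnt/koji/ostree/rawhide \
        --keep-younger-than="2 weeks ago" --refs-only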
15:45:48 <dgilmore> we are exporting the ostree repo on kojipkgs
15:46:08 <dgilmore> #info need to investigate pulp for ostree repo management
15:46:26 <dgilmore> #info need to engage mirrors for ways to mirror ostree repos
15:46:45 * masta is here
15:46:46 <dgilmore> we need to work out a better way than rsync to mirror the content
15:47:01 <dgilmore> and better repo management
15:48:01 <dgilmore> does anyone have any questions, comments or concerns?
15:48:44 <dgilmore> we also need to work with ostree upstream and mirrormanager to make sure that mirror support works
15:48:56 <dgilmore> so that users actually use the mirror network
15:49:36 <maxamillion> +1
15:49:43 <nirik> yeah.
15:49:44 <dgilmore> this is going to be critically important in f25
15:50:14 <maxamillion> dgilmore: either that or we need to have it be able to use something like pulp-crane that will serve 302 redirects to the content's final resting place
15:50:37 <maxamillion> (i.e. - have something like crane but for ostrees instead of docker)
15:50:40 <dgilmore> as there will be a workstation ostree, and xdgapps (renamed to something) are delivered as ostree layers
15:50:46 <pbrobinson> or add a crane style implementation to mirror manager that will work for both docker and ostree
15:50:53 <nirik> flatpak
15:51:10 <dgilmore> thanks nirik
15:51:28 <nirik> probably something good to discuss at flock with mm developers at some point
15:51:41 <dgilmore> I am expecting us to be delivering more content via ostree repos
15:51:48 <dgilmore> nirik: yeah
15:52:19 <dgilmore> one of my flock talks is on getting new artifacts into fedora
15:52:30 <dgilmore> there are so many things to be aware of
15:53:05 <nirik> yeah
15:53:23 <dgilmore> I expect we will need to have some more discussions around this leading up to f25 and a bunch of work
15:53:54 <dgilmore> at least workstation ostree is just a preview thing at this point
15:54:21 <dgilmore> anyway
15:54:22 <dgilmore> #topic Secondary Architectures updates
15:54:23 <dgilmore> #topic Secondary Architectures update - ppc
15:54:32 <dgilmore> pbrobinson: how is ppc looking?
15:54:39 <pbrobinson> not bad
15:54:43 <dgilmore> the hub migrated last week right?
15:54:44 <pbrobinson> we moved the hub on Friday!
15:54:53 <pbrobinson> so PPC is now 100% ansible
15:54:55 <dgilmore> #info ppc hub now in ansible
15:55:03 <pbrobinson> I pushed a bunch of DNS cleanups this morning
15:55:34 <dgilmore> cool
15:55:42 <dgilmore> any compose issues to note?
15:55:47 <pbrobinson> there's a few minor bits of cleanup to do on the PPC infra but the vast majority of the ppc infra rebuild is now complete :) (finally)
15:55:53 <nirik> hurray!
15:56:13 <nirik> pbrobinson: should we shutdown the old hub now? or leave it until we remove that host?
15:56:45 <pbrobinson> nirik: I was going to give it a couple of days more, I had a req for the contents of a home dir
15:57:02 <pbrobinson> but I think we should shut it down and ship it off to IBM
15:57:02 <nirik> ok. I see it's still trying to run some crons.
15:57:26 <pbrobinson> nirik: I'll login and have a look and see what those are shortly
15:58:01 <pbrobinson> but overall by the end of the week I think it can be killed off once and for all
15:58:12 <dgilmore> excellent
15:58:22 <nirik> great
15:58:22 <pbrobinson> compose side of things we have a grub2 issue for f-24
15:58:38 <dgilmore> pjones working on it?
15:58:52 <pbrobinson> and I need to look at why rawhide compose is failing (actually all secondary arches are but all for different reasons :-/ )
15:58:57 <pbrobinson> yep, he is
15:59:24 <pbrobinson> but overall, other than the grub2 issue, ppc is looking pretty good.
15:59:32 <nirik> there's a "samba needs something systemd doesn't provide" thing in primary at least
15:59:49 <dgilmore> cool
15:59:54 <dgilmore> #topic Secondary Architectures update - s390
16:00:01 <dgilmore> sharkcz: how is s390?
16:00:28 <pbrobinson> making s390 compose complete is my other task for this week
16:00:36 <sharkcz> dgilmore: build-wise it looks good, but still working on the compose side
16:00:37 <dgilmore> okay
16:00:52 <dgilmore> webdav not working quite right?
16:00:57 <pbrobinson> I have a couple of days PTO at the end of the week but hope to get it glued together before then
16:01:02 <pbrobinson> it's very slow
16:01:17 <pbrobinson> it kind of works, but sadly not well enough, it seems
16:01:24 <dgilmore> :( okay
16:01:44 <dgilmore> I need to take some PTO myself
16:02:10 <dgilmore> #info s390 composes still being worked on
16:02:25 <pbrobinson> I have family in town over the next 10 days
16:02:25 <dgilmore> #topic Secondary Architectures update - arm
16:02:32 <dgilmore> how is aarch64 pbrobinson?
16:02:47 <pbrobinson> arm is looking reasonable, the shim/grub2 issue is now closed
16:03:04 <pbrobinson> build wise it's looking very good with only about 20 packages behind
16:03:22 <pbrobinson> I've been working on disk images and have them composing using imagefactory
16:03:31 <dgilmore> nice
16:03:35 <pbrobinson> now I just need to get boards like pine64 booting
16:03:53 <pbrobinson> got a fix into blivet and that's now stable
16:04:08 <pbrobinson> so we're reasonably good there
* nirik ordered one of those, don't have it yet tho
16:04:43 <pbrobinson> at the moment I have issues between u-boot/grub2/kernel that I'm trying to work out
16:04:48 <dgilmore> nirik: I have a couple of them coming, I think mine is held up on the ABS case
16:05:05 <pbrobinson> it's almost there; I suspect it's just a little tweak needed, I just need to work it out.
16:05:31 <dgilmore> pbrobinson: cool, I will poke at it myself when I get the hardware
16:05:40 <dgilmore> unless its sorted before then
16:05:44 <pbrobinson> I plan on poking at it again this evening and looking at a couple of the other devices I have, but Pine64 is the one I want to get working as a lot of people will have it soon
16:06:04 <pbrobinson> I want it sorted with everything stable by freeze so I have less than a week :)
16:06:45 <dgilmore> freeze will be on us really soon
16:06:56 <pbrobinson> yes, a touch over a week right?
16:07:29 <dgilmore> next tuesday
16:07:43 <dgilmore> which is the day after a public holiday in the US
16:07:54 <pbrobinson> yes, next Mon is a pub holiday here too
16:08:16 <dgilmore> will need to make sure that the freeze announcement email goes out thursday
16:08:17 <pbrobinson> in fact I think across most of the EU
16:08:48 <dgilmore> #action dgilmore to send freeze announcement email Thursday
16:09:18 <dgilmore> anything else we need to discuss?
16:09:20 <nirik> oh yeah, I forgot about that.
16:09:35 <nirik> well, we will still be pushing updates in until tuesday no?
16:10:24 <dgilmore> nirik: yeah, but people need to be sure that they get the request in before they go away if they are taking monday off
16:10:34 <dgilmore> and getting the request in Tuesday will be too late
16:10:58 <pbrobinson> I think I'm on push duty next week
16:11:03 <nirik> sure... just needs to be a bit different than the normal "freeze has started" one...
16:11:06 <dgilmore> pbrobinson: I think its me
16:11:09 <dgilmore> you are this week
16:11:14 * nirik is this week
16:11:16 <pbrobinson> nirik: is this week
16:11:17 <maxamillion> :)
16:11:23 <dgilmore> my bad
16:11:23 <pbrobinson> and I normally follow him
16:11:42 <nirik> we should add maxamillion into the fun sometime. ;)
16:11:44 <dgilmore> nirik: yeah, will need to be a bit different
16:11:50 <pbrobinson> poor maxamillion
16:12:00 <dgilmore> nirik: yeah, it is on the cards to happen
16:12:18 <maxamillion> nirik: +1
16:12:26 <maxamillion> I absolutely want to join the fun
16:12:28 <dgilmore> #topic next meeting
16:12:50 <maxamillion> nirik: I need to learn that stuff and get thrown into rotation :)
16:12:58 <dgilmore> given that next monday is a public holiday for most of us, we will not be having a meeting next week
16:13:33 <dgilmore> #info meeting on May 30 will be skipped
16:13:39 <dgilmore> #topic open floor
16:13:50 <nirik> I had one note...
16:14:07 <dgilmore> nirik: go for it
16:14:17 <nirik> I've scheduled outages for tomorrow/wed for mass update/reboots... tomorrow is the build side ones.
16:14:36 <nirik> I don't expect any issues, just want to make sure we are all up to date before freeze.
16:14:40 <pbrobinson> nirik: this is just the standard patch and reboot cycle?
16:14:43 <maxamillion> nirik: +1
16:14:44 <nirik> yep.
16:14:57 <nirik> pbrobinson: I can do the secondary arch stuff, or leave it for you if you prefer... whatever works.
16:14:59 <pbrobinson> cool, let me know if there's any you want me to do
16:15:19 <nirik> and will likely do primary builders sometime this week as time permits.
16:15:29 <pbrobinson> nirik: I'm happy for you to run with it, it should all be in ansible now so straightforward
16:15:34 <nirik> yep.
16:15:43 <dgilmore> nirik: cool. I am working on some patches for koji
16:15:59 <dgilmore> so we can disable the login button, and fix some kojiweb bugs
16:15:59 <pbrobinson> nirik: leave the secondary builders to me if you like, I need to do a arm/ppc updates push so should do that first
16:16:04 <nirik> dgilmore: nice.
16:16:12 <nirik> pbrobinson: sounds good.
16:16:27 <dgilmore> I will try to get them in place
16:16:57 <pbrobinson> dgilmore: we'd likely be better to hold off than rush and end up with breakage?
16:17:21 <dgilmore> pbrobinson: well I would test in stage first, they are mostly done at this point
16:17:33 <pbrobinson> cool
16:17:36 <pbrobinson> just checking
16:17:55 <dgilmore> I need to make sure we get the oz with the anaconda logging in place also
16:18:00 <dgilmore> we lost it last month
16:18:22 <nirik> our infra/builder repos need a housecleaning
16:18:31 <dgilmore> yeah
16:19:04 <dgilmore> nirik: I would really like to see us do real builds and make a compose for them all
16:19:19 <dgilmore> but that may be a slightly down the road thing
16:19:41 <nirik> we have talked about that in the past... it would be nice to have a build target for infra/builder stuff... so we could build things with deps that aren't in yet
16:19:53 <dgilmore> yeah
16:19:59 <nirik> right now we only have copr, but that's not nearly as... auditable
16:20:03 <dgilmore> maybe we should sit down and figure it out
16:20:46 <dgilmore> I have one thing to ring up when this is done
16:20:56 <nirik> sure. I think we should. It would be helpful for infra.
16:21:18 <dgilmore> nirik: cool. maybe once freeze is in place we can talk about it
16:21:50 <nirik> ok.
16:22:00 <dgilmore> alright
16:22:08 <dgilmore> my thing is /pub
16:22:23 <dgilmore> I ran hardlink across all of /pub last week
16:22:34 <dgilmore> I think it took about 3 days to run
16:22:38 <maxamillion> jeez
16:22:42 <nirik> yeah, it was a long long time.
16:22:43 <dgilmore> but saved over 568G
16:22:47 <maxamillion> nice
16:22:51 <nirik> and it also made things slow
16:23:01 <dgilmore> right :(
16:23:17 <dgilmore> so we need to do better about making sure that we hardlink
16:23:38 <dgilmore> as we rsync content in place using appropriate --link-dest
16:23:51 <dgilmore> some of it was hardlinking noarch drpms between arches
16:24:27 <dgilmore> there was a bunch of savings going on in the gnome software centre screenshots directory
16:24:43 <dgilmore> a lot of duplicated content there
16:25:22 <dgilmore> it's not something that I would like us to run often
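(For reference, such a pass is a single hardlink invocation over the volume; a sketch using the Fedora hardlink utility's common flags, with a dry run first:)

    hardlink -c -n -v /pub   # dry run: only report what would be linked
    hardlink -c /pub         # merge byte-identical files into hardlinks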
16:25:23 <nirik> what --link-dest do we pass?
16:25:40 <dgilmore> nirik: depends on what the thing is
16:26:02 <nirik> we might check them over again... also in new rsync you can pass more than one.
16:26:19 <dgilmore> the reason that I manually rsync RC composes is to make sure I get all the different link-dest places in
16:26:53 <nirik> yeah, I saw there was a lot of hardlink between alpha/beta and development/24
16:27:08 <dgilmore> nirik: yeah I pass multiple for RC's, pub/fedora/linux/development/24/Everything and the previous compose Everything
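(Illustrative of that invocation; rsync accepts several --link-dest directories, and unchanged files are hardlinked against whichever one matches. The paths below are examples, not the exact releng script:)

    rsync -avH \
        --link-dest=/pub/fedora/linux/development/24/Everything \
        --link-dest=/pub/alt/stage/24_RC-1.1/Everything \
        "$compose/Everything/" /pub/fedora/linux/releases/24/Everything/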
16:27:37 <nirik> the server/cloud/everything trees likely have a lot too
16:27:47 <dgilmore> there should be a lot to /pub/alt/stage/ also
16:27:54 <dgilmore> yep
16:28:14 <dgilmore> something we need to be careful of
16:28:32 <dgilmore> I suspect a lot of hardlinking was broken in the archive process
16:28:36 <nirik> yeah, perhaps we should do a /pub hardlink a few weeks after every cycle or something...
16:28:52 <nirik> or at archive time perhaps
16:28:58 <dgilmore> nirik: probably wouldn't hurt
16:29:11 <dgilmore> there are probably a few times we should do it
16:29:25 <dgilmore> and there are parts of the tree we should do more often
16:29:38 <dgilmore> /pub/alt probably weekly
16:29:57 <nirik> it hammers the storage pretty hard tho.
16:30:07 <dgilmore> it does
16:30:39 <dgilmore> maybe with the new filelist update we can be a bit more calculated about it
16:30:53 <dgilmore> and limit the parts of disk we run it on
16:31:21 <nirik> could be... might be able to use those lists to find all the things that should be hardlinked and just run on them
16:31:36 <dgilmore> yeah
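(A sketch of that idea, assuming the list is tab-separated with mtime, type, size, and path columns: only files sharing a size can be duplicates, so only those paths get handed to hardlink:)

    cd /pub/fedora
    # column layout assumed: mtime, type, size, path
    awk -F'\t' '$2 == "f" { print $3 "\t" $4 }' fullfiletimelist-fedora |
        sort -n |
        awk -F'\t' '$1 == size { print prev; print $2 } { size = $1; prev = $2 }' |
        sort -u | xargs -d '\n' hardlink -c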
16:31:41 <dgilmore> which reminds me
16:32:03 <dgilmore> tibbs_: has been working on a new thing to make mirroring fedora simpler
16:32:06 <dgilmore> and faster
16:32:28 <tibbs|w> Well, at least faster.
16:32:39 * nirik has been testing on download-ib01
16:32:48 <tibbs|w> Ah, cool.  For me it's working well.
16:32:49 <dgilmore> in order to have it work we need to update a file on the master mirror every time we update content
16:33:13 <dgilmore> right now I believe nirik has it running in a cron job
16:33:39 <dgilmore> but we should look at extending the scripts for RC's and secondary arches, and other content pushes to run it
16:33:41 <nirik> yeah, there's actually multiple of them
16:34:15 <nirik> epel and fedora (taken care of in the updates sync script)
16:34:20 <nirik> alt (in cron hourly)
16:34:25 <dgilmore> nightly.sh for rawhide and branched will need to run it on /pub/fedora and /pub/alt
16:34:29 <nirik> archive (manual for now)
16:34:42 <dgilmore> nirik: fedora-secondary?
16:34:42 <nirik> fedora-secondary (added to pbrobinson's list to add)
16:35:08 <dgilmore> nirik: fedora is updated by more than updates pushes
16:35:56 <nirik> dgilmore: oh, right, the rawhide/branched ones update it too
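(The regeneration step itself is small; the quick-fedora-mirror repo ships a create-filelist helper for producing these lists. A hedged sketch of the hook each push script would call, exact invocation per that repo's README:)

    for module in fedora fedora-secondary alt; do
        # assumes create-filelist writes the list to stdout
        create-filelist /srv/pub/$module > /srv/pub/$module/fullfiletimelist-$module.new &&
            mv /srv/pub/$module/fullfiletimelist-$module{.new,}
    done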
16:36:50 <dgilmore> ideally we probably want to have something listening for content being changed, and running it
16:37:09 <dgilmore> rather than updating 20 or 30 places to update it
16:37:17 <nirik> yeah, possibly incron might work, but it's a ton of files/dirs to watch
16:38:10 <dgilmore> yeah
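(An incrontab entry has the shape sketched below; the catch is that incron watches individual directories with no recursion, hence the "ton of files/dirs to watch" concern. update-filelist here is a hypothetical wrapper script:)

    # <directory>  <events>  <command>
    /pub/fedora/linux/updates/24/x86_64  IN_CLOSE_WRITE,IN_MOVED_TO  /usr/local/bin/update-filelist fedora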
16:38:11 <dgilmore> 11T  8.7T  1.8T  84% /pub
16:38:26 <dgilmore> pub is using 8.7T of 11T
16:38:37 <dgilmore> it is a big volume
16:38:54 <nirik> also, if it matters at all... fedora_koji is not hardlinked very much.
16:38:59 <nirik> but that's probably fine.
16:39:10 <dgilmore> nirik: it should be
16:39:19 <dgilmore> what parts are not hardlinked?
16:39:29 <tibbs|w> Your pub is way smaller than my pub.  I'm at pretty much 12T.
16:39:58 <dgilmore> tibbs|w: something is wrong on your side then, that is the master
16:40:07 <nirik> dgilmore: well, the netapp's dedupe is saving a ton
16:40:17 <nirik> Filesystem                used      saved       %saved
16:40:17 <nirik> /vol/fedora_koji/  33834717248  24311087116          42%
16:40:29 <dgilmore> nirik: okay, I wonder if that will go away as we purge old mashes for updates
16:40:30 <pbrobinson> nirik: but the nfs will still report the total, not the deduped size, right?
16:41:04 <nirik> Filesystem               total       used      avail capacity  Mounted on
16:41:04 <nirik> /vol/fedora_koji/         38TB       31TB     6716GB      83%  /fedora_koji
16:41:37 <tibbs|w> I don't rightly know.  But I did a complete rsync run over all of the modules (just plain rsync, not using my script) and that's what I got.
16:42:02 <dgilmore> tibbs|w: there is a lot of content hardlinked between modules
16:42:07 <tibbs|w> I can run it again, but the script does find newly hardlinked files.
16:42:37 <nirik> to add to confusion... the actual ftp volume is 10T ;)
16:43:02 <tibbs|w> If hardlink didn't take three days to run....
16:43:13 <nirik> so the 11 there must be dedupe savings + snapshots
16:43:15 <tibbs|w> Anyway, I didn't mean to distract from the meeting.
16:43:22 <nirik> Filesystem                used      saved       %saved
16:43:22 <nirik> /vol/fedora_ftp/    9280723952 1498453156          14%
16:43:28 <dgilmore> tibbs|w: no problems
16:43:42 <nirik> note that the netapp dedupe is block level so it's likely always going to get more than hardlinking
16:43:52 <tibbs|w> The repo is https://pagure.io/quick-fedora-mirror if anyone wants to look.
16:44:13 <tibbs|w> The script is in zsh currently, but has no other deps besides rsync.
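(For anyone wanting to try it: the script reads a config file; a sketch of one, with variable names as in the repo's README at the time, worth double-checking against the current docs:)

    # quick-fedora-mirror.conf (the script itself is zsh)
    REMOTE=rsync://dl.fedoraproject.org
    MODULES=(fedora-enchilada fedora-epel)   # modules to mirror in one run
    DESTD=/srv/mirror                        # local mirror root
    TIMEFILE=/var/lib/quick-fedora-mirror/timefile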
16:44:29 <dgilmore> nirik: yeah it can probably dedupe signed and unsigned rpm content
16:44:36 <dgilmore> something we can not do with hardlink
16:46:54 <dgilmore> anyway, we are over on time
16:47:08 <dgilmore> does anyone have anything else? or should I wrap up?
16:47:13 <tibbs|w> BTW, if you add up DIRECTORY_SIZES.txt you do get nearly 12TB, so it must be the netapp.
16:49:34 <dgilmore> tibbs|w: that does not account for cross module hardlinking
16:49:42 <tibbs|w> You're right.
16:49:50 <dgilmore> I wonder whether, if you used buffet, it would hardlink more
16:50:11 <tibbs|w> Well, the script transfers all of the modules you mirror together.
16:50:17 <maxamillion> buffet?
16:50:34 <tibbs|w> fedora-buffet is the module which contains everything.
16:50:53 <maxamillion> ah
16:50:56 <tibbs|w> So even if you don't mirror buffet, the script should detect hardlinks between the modules which you do mirror.
16:51:16 <maxamillion> I was thinking some storage
16:51:22 <tibbs|w> Well, the script doesn't do it; rsync should do it.  It appears to do so when I try.
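(That is standard rsync behavior: with -H, hardlinks are reproduced only when both ends of a link appear in the same run, which is why syncing modules together matters. An illustrative invocation, module names as published on dl.fedoraproject.org:)

    rsync -avH \
        rsync://dl.fedoraproject.org/fedora-enchilada \
        rsync://dl.fedoraproject.org/fedora-alt \
        /srv/mirror/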
16:51:33 <maxamillion> I was thinking it was some storage util I didn't know about
16:52:08 <dgilmore> I do not have enough disk to mirror all of fedora
16:52:25 <dgilmore> I have a bunch of 1T disks I could use, just nothing to run them in
16:52:56 <tibbs|w> I have 4x4TB RAID0 in each of my mirror hosts.
16:53:30 <dgilmore> nice, I also have a 400G a month data cap
16:53:54 <tibbs|w> Yeah, that wouldn't go too far even if you sneakernet the initial data.
16:54:45 <dgilmore> nope
16:55:06 <tibbs|w> These are just three old machines I threw together.  Sadly only 32GB of RAM apiece which limits their speed significantly.  I just wanted to have something I could experiment with.
16:55:43 * nirik notes our download servers only have 32GB. ;)
16:55:47 <dgilmore> tibbs|w: :) I am glad you could. I think it's going to be useful; just need to get the mirrors to use it.
16:56:01 <tibbs|w> Yeah, one thing at a time.
16:56:13 <tibbs|w> But every client that uses it is load saved, I think.
16:56:24 <tibbs|w> There is one big issue, though: directory timestamps.
16:56:46 <tibbs|w> It's fixable, but I don't know how important it is that all mirrors have exactly the same timestamps on the directories.
16:57:48 <tibbs|w> I can focus on that if people think it's important.
16:57:54 <dgilmore> tibbs|w: not sure it matters too much
16:58:15 <tibbs|w> Someone brought it up on the mirror list.
16:58:17 <dgilmore> it will mean you get some churn if you switch the mirror you pull from
16:58:19 * nirik either
16:58:36 <tibbs|w> If you're using this script, you won't actually get churn.
16:59:14 <tibbs|w> And rsync doesn't care about timestamps on directories anyway.  It will copy them but that's about it.
16:59:48 <dgilmore> so not really a big deal, the content and permissions are more critical
17:00:09 <dgilmore> as in making sure it's correctly readable during staging
17:01:06 <tibbs|w> In the end it uses rsync, so it does whatever rsync does.
* dgilmore needs to wrap this up, I have another meeting starting now
17:01:23 <dgilmore> tibbs|w: works for me :)
17:01:33 <nirik> so right now during stage if you have access you sync, if you don't you get errors for that content.
17:01:34 <tibbs|w> Feel free to ask questions wherever you find me, or file issues in pagure.
17:01:56 <dgilmore> #endmeeting