15:41:16 #startmeeting RELENG (2016-05-23)
15:41:16 Meeting started Mon May 23 15:41:16 2016 UTC. The chair is dgilmore. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:41:16 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:41:16 The meeting name has been set to 'releng_(2016-05-23)'
15:41:16 #meetingname releng
15:41:16 The meeting name has been set to 'releng'
15:41:16 #chair dgilmore nirik tyll sharkcz bochecha masta pbrobinson pingou maxamillion
15:41:16 Current chairs: bochecha dgilmore masta maxamillion nirik pbrobinson pingou sharkcz tyll
15:41:19 #topic init process
15:41:35 morning
15:41:42 * pbrobinson o/
15:42:52 let's get started
15:43:05 #topic ostree status
15:43:28 #info atomic repos removed from the mirrors
15:43:33 sorry I'm late
15:43:46 so we had to remove the atomic repos from the mirrors last week
15:44:02 maxamillion: dgilmore was 11 mins late so whatever ;-)
15:44:20 \o/
15:44:25 it was causing issues for the mirrors in mirroring /pub/fedora/, and it was causing load issues for the netapp
15:44:26 * nirik is planning a blog post/mailing list post on mirror changes later this week.
15:44:35 nirik: +1
15:44:59 I did implement automated cleanup of the rawhide ostree repo
15:45:13 it now only has 2 weeks of history
15:45:48 we are exporting the ostree repo on kojipkgs
15:46:08 #info need to investigate pulp for ostree repo management
15:46:26 #info need to engage mirrors for ways to mirror ostree repos
15:46:45 * masta is here
15:46:46 we need to work out a better way than rsync to mirror the content
15:47:01 and better repo management
15:48:01 does anyone have any questions, comments or concerns?
15:48:44 we also need to work with ostree upstream and mirrormanager to make sure that mirror support works
15:48:56 so that users actually use the mirror network
15:49:36 +1
15:49:43 yeah.
15:49:44 this is going to be critically important in f25
15:50:14 dgilmore: either that or we need to have it be able to use something like pulp-crane that will serve 302 redirects to the content's final resting place
15:50:37 (i.e. - have something like crane but for ostrees instead of docker)
15:50:40 as there will be a workstation ostree, and xdgapps (renamed to something) are delivered as ostree layers
15:50:46 or add a crane style implementation to mirrormanager that will work for both docker and ostree
15:50:53 flatpak
15:51:10 thanks nirik
15:51:28 probably something good to discuss at flock with mm developers at some point
15:51:41 I am expecting us to be delivering more content via ostree repos
15:51:48 nirik: yeah
15:52:19 one of my flock talks is on getting new artifacts into fedora
15:52:30 there are so many things to be aware of
15:53:05 yeah
15:53:23 I expect we will need to have some more discussions around this leading up to f25 and a bunch of work
15:53:54 at least workstation ostree is just a preview thing at this point
15:54:21 anyway
15:54:22 #topic Secondary Architectures updates
15:54:23 #topic Secondary Architectures update - ppc
15:54:32 pbrobinson: how is ppc looking?
15:54:39 not bad
15:54:43 the hub migrated last week right?
15:54:44 we moved the hub on Friday!
15:54:53 so PPC is now 100% ansible
15:54:55 #info ppc hub now in ansible
15:55:03 I pushed a bunch of DNS cleanups this morning
15:55:34 cool
15:55:42 any compose issues to note?
15:55:47 there are a few minor bits of cleanup to do on the PPC infra but the vast majority of the ppc infra rebuild is now complete :) (finally)
15:55:53 hurray!
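
[Note on the rawhide ostree cleanup mentioned earlier in the meeting: the actual script is not shown in the log, but a minimal sketch of that kind of pruning could look like the following. The repo path is an assumption for illustration, not the real Fedora infrastructure path.]

    #!/bin/sh
    # Hypothetical sketch: keep only ~2 weeks of history in the rawhide
    # ostree repo. REPO is an assumed path, used here only as an example.
    REPO=/mnt/koji/ostree/rawhide

    # Drop commits older than 14 days; --refs-only limits pruning to
    # commits no longer reachable from any ref.
    ostree prune --repo="$REPO" --refs-only --keep-younger-than="14 days ago"
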
15:56:13 pbrobinson: should we shut down the old hub now? or leave it until we remove that host?
15:56:45 nirik: I was going to give it a couple of days more, I had a req for the contents of a home dir
15:57:02 but I think we should shut it down and ship it off to IBM
15:57:02 ok. I see it's still trying to run some crons.
15:57:26 nirik: I'll login and have a look and see what those are shortly
15:58:01 but overall by the end of the week I think it can be killed off once and for all
15:58:12 excellent
15:58:22 great
15:58:22 on the compose side of things we have a grub2 issue for f-24
15:58:38 pjones working on it?
15:58:52 and I need to look at why rawhide compose is failing (actually all secondary arches are but all for different reasons :-/ )
15:58:57 yep, he is
15:59:24 but overall, other than the grub2 issue, ppc is looking pretty good.
15:59:32 there's a "samba needs something systemd doesn't provide" thing in primary at least
15:59:49 cool
15:59:54 #topic Secondary Architectures update - s390
16:00:01 sharkcz: how is s390?
16:00:28 making the s390 compose complete is my other task for this week
16:00:36 dgilmore: build-wise it looks good, but still working on the compose side
16:00:37 okay
16:00:52 webdav not working quite right?
16:00:57 I have a couple of days PTO at the end of the week but hope to get it glued together before then
16:01:02 it's very slow
16:01:17 it kind of works but not really well enough it seems sadly
16:01:24 :( okay
16:01:44 I need to take some PTO myself
16:02:10 #info s390 composes still being worked on
16:02:25 I have family in town over the next 10 days
16:02:25 #topic Secondary Architectures update - arm
16:02:32 how is aarch64 pbrobinson?
16:02:47 arm is looking reasonable, the shim/grub2 issue is now closed
16:03:04 build-wise it's looking very good with only about 20 packages behind
16:03:22 I've been working on disk images and have them composing using imagefactory
16:03:31 nice
16:03:35 now I just need to get boards like pine64 booting
16:03:53 got a fix into blivet and that's now stable
16:04:08 so we're reasonably good there
16:04:17 * nirik ordered one of those, don't have it yet tho
16:04:43 at the moment I have issues between u-boot/grub2/kernel that I'm trying to work out
16:04:48 nirik: I have a couple of them coming, I think mine is held up on the ABS case
16:05:05 it's almost there, I suspect it's just a little tweak needed, I just need to work it out.
16:05:31 pbrobinson: cool, I will poke at it myself when I get the hardware
16:05:40 unless it's sorted before then
16:05:44 I plan on poking at it again this evening and looking at a couple of the other devices I have, but Pine64 is the one I want to get working as a lot of people will have it soon
16:06:04 I want it sorted with everything stable by freeze so I have less than a week :)
16:06:45 freeze will be on us really soon
16:06:56 yes, a touch over a week right?
16:07:29 next Tuesday
16:07:43 which is the day after a public holiday in the US
16:07:54 yes, next Mon is a public holiday here too
16:08:16 will need to make sure that the freeze announcement email goes out Thursday
16:08:17 in fact I think across most of the EU
16:08:48 #action dgilmore to send freeze announcement email Thursday
16:09:18 anything else we need to discuss?
16:09:20 oh yeah, I forgot about that.
16:09:35 well, we will still be pushing updates in until Tuesday, no?
16:10:24 nirik: yeah, but people need to be sure that they get the request in before they go away if they are taking Monday off
16:10:34 and getting the request in Tuesday will be too late
16:10:58 I think I'm on push duty next week
16:11:03 sure... just needs to be a bit different than the normal "freeze has started" one...
16:11:06 pbrobinson: I think it's me
16:11:09 you are this week
16:11:14 * nirik is this week
16:11:16 nirik: is this week
16:11:17 :)
16:11:23 my bad
16:11:23 and I normally follow him
16:11:42 we should add maxamillion into the fun sometime. ;)
16:11:44 nirik: yeah, will need to be a bit different
16:11:50 poor maxamillion
16:12:00 nirik: yeah it is on the cards to happen
16:12:18 nirik: +1
16:12:26 I absolutely want to join the fun
16:12:28 #topic next meeting
16:12:50 nirik: I need to learn that stuff and get thrown into rotation :)
16:12:58 given that next Monday is a public holiday for most of us, we will not be having a meeting next week
16:13:33 #info meeting on May 30 will be skipped
16:13:39 #topic open floor
16:13:50 I had one note...
16:14:07 nirik: go for it
16:14:17 I've scheduled outages for tomorrow/wed for mass update/reboots... tomorrow is the build side ones.
16:14:36 I don't expect any issues, just want to make sure we are all up to date before freeze.
16:14:40 nirik: this is just the standard patch and reboot cycle?
16:14:43 nirik: +1
16:14:44 yep.
16:14:57 pbrobinson: I can do the secondary arch stuff, or leave it for you if you prefer... whatever works.
16:14:59 cool, let me know if there's any you want me to do
16:15:19 and will likely do primary builders sometime this week as time permits.
16:15:29 nirik: I'm happy for you to run with it, it should all be in ansible now so straightforward
16:15:34 yep.
16:15:43 nirik: cool. I am working on some patches for koji
16:15:59 so we can disable the login button, and fix some kojiweb bugs
16:15:59 nirik: leave the secondary builders to me if you like, I need to do an arm/ppc updates push so should do that first
16:16:04 dgilmore: nice.
16:16:12 pbrobinson: sounds good.
16:16:27 I will try to get them in place
16:16:57 dgilmore: we'd likely be better to hold off than rush and end up with breakage?
16:17:21 pbrobinson: well I would test in stage first, they are mostly done at this point
16:17:33 cool
16:17:36 just checking
16:17:55 I need to make sure we get the oz with the anaconda logging in place also
16:18:00 we lost it last month
16:18:22 our infra/builder repos need a housecleaning
16:18:31 yeah
16:19:04 nirik: I would really like to see us do real builds and make a compose for them all
16:19:19 but that may be a slightly down the road thing
16:19:41 we have talked about that in the past... it would be nice to have a build target for infra/builder stuff... so we could build things with deps that aren't in yet
16:19:53 yeah
16:19:59 right now we only have copr, but that's not nearly as... auditable
16:20:03 maybe we should sit down and figure it out
16:20:46 I have one thing to bring up when this is done
16:20:56 sure. I think we should. It would be helpful for infra.
16:21:18 nirik: cool. maybe once freeze is in place we can talk about it
16:21:50 ok.
16:22:00 alright
16:22:08 my thing is /pub
16:22:23 I ran hardlink across all of /pub last week
16:22:34 I think it took about 3 days to run
16:22:38 jeez
16:22:42 yeah, it was a long long time.
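
[A minimal sketch of the hardlink run described above; the exact flags used are not shown in the log. -v is verbose output, and -c tells the hardlink utility to compare file content only, ignoring ownership and timestamp differences.]

    # Illustrative invocation: replace duplicate files under /pub with
    # hardlinks to one another to reclaim space.
    hardlink -v -c /pub
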
16:22:43 but saved over 568G
16:22:47 nice
16:22:51 and it also made things slow
16:23:01 right :(
16:23:17 so we need to do better about making sure that we hardlink
16:23:38 as we rsync content in place using appropriate --link-dest
16:23:51 some of it was hardlinking noarch drpms between arches
16:24:27 there was a bunch of savings going on in the gnome software centre screenshots directory
16:24:43 a lot of duplicated content there
16:25:22 it's not something that I would like us to run often
16:25:23 what --link-dest do we pass?
16:25:40 nirik: depends on what the thing is
16:26:02 we might check them over again... also in new rsync you can pass more than one.
16:26:19 the reason that I manually rsync RC composes is to make sure I get all the different places in
16:26:53 yeah, I saw there was a lot of hardlinking between alpha/beta and development/24
16:27:08 nirik: yeah I pass multiple for RCs, pub/fedora/linux/development/24/Everything and the previous compose's Everything
16:27:37 the server/cloud/everything trees likely have a lot too
16:27:47 there should be a lot to /pub/alt/stage/ also
16:27:54 yep
16:28:14 something we need to be careful of
16:28:32 I suspect a lot of hardlinking was broken in the archive process
16:28:36 yeah, perhaps we should do a /pub hardlink a few weeks after every cycle or something...
16:28:52 or at archive time perhaps
16:28:58 nirik: probably wouldn't hurt
16:29:11 there are probably a few times we should do it
16:29:25 and there are parts of the tree we should do more often
16:29:38 /pub/alt probably weekly
16:29:57 it hammers the storage pretty hard tho.
16:30:07 it does
16:30:39 maybe with the new filelist update we can do it in a more calculated way
16:30:53 and limit the parts of disk we run it on
16:31:21 could be... might be able to use those lists to find all the things that should be hardlinked and just run on them
16:31:36 yeah
16:31:41 which reminds me
16:32:03 tibbs_ has been working on a new thing to make mirroring fedora simpler
16:32:06 and faster
16:32:28 Well, at least faster.
16:32:39 * nirik has been testing on download-ib01
16:32:48 AH, cool. For me it's working well.
16:32:49 in order to have it work we need to update a file on the master mirror every time we update content
16:33:13 right now I believe nirik has it running in a cron job
16:33:39 but we should look at extending the scripts for RCs and secondary arches, and other content pushes to run it
16:33:41 yeah, there's actually multiple of them
16:34:15 epel and fedora (taken care of in the updates sync script)
16:34:20 alt (in cron hourly)
16:34:25 nightly.sh for rawhide and branched will need to run it on /pub/fedora and /pub/alt
16:34:29 archive (manual for now)
16:34:42 nirik: fedora-secondary?
16:34:42 fedora-secondary (added to pbrobinson's list to add)
16:35:08 nirik: fedora is updated by more than updates pushes
16:35:56 dgilmore: oh, right, the rawhide/branched ones update it too
16:36:50 ideally we probably want to have something listening for content being changed, and running it
16:37:09 rather than updating 20 or 30 places to update it
16:37:17 yeah, possibly incron might work, but it's a ton of files/dirs to watch
16:38:10 yeah
16:38:11 11T 8.7T 1.8T 84% /pub
16:38:26 pub is using 8.7T of 11T
16:38:37 it is a big volume
16:38:54 also, if it matters at all.. fedora_koji is not hardlinked very much at all.
16:38:59 but that's probably fine.
16:39:10 nirik: it should be
16:39:19 what parts are not hardlinked?
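
[An illustration of the --link-dest usage discussed above. The paths are made up for the example; newer rsync does accept multiple --link-dest options, each naming an existing tree against which unchanged files are hardlinked instead of copied.]

    # Hypothetical staging of an RC compose while hardlinking against
    # already-published trees. All paths below are illustrative only.
    rsync -avH \
        --link-dest=/pub/fedora/linux/development/24/Everything/ \
        --link-dest=/pub/alt/stage/24_Beta-1.2/Everything/ \
        /mnt/koji/compose/24_RC-1.1/Everything/ \
        /pub/alt/stage/24_RC-1.1/Everything/
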
16:39:29 Your pub is way smaller than my pub. I'm at pretty much 12T.
16:39:58 tibbs|w: something is wrong on your side then, that is the master
16:40:07 dgilmore: well, the netapp's dedupe is saving a ton
16:40:17 Filesystem          used         saved        %saved
16:40:17 /vol/fedora_koji/   33834717248  24311087116  42%
16:40:29 nirik: okay, I wonder if that will go away as we purge old mashes for updates
16:40:30 nirik: but the nfs will still report the total, not the de-duped size, right?
16:41:04 Filesystem          total  used  avail   capacity  Mounted on
16:41:04 /vol/fedora_koji/   38TB   31TB  6716GB  83%       /fedora_koji
16:41:37 I don't rightly know. But I did a complete rsync run over all of the modules (just plain rsync, not using my script) and that's what I got.
16:42:02 tibbs|w: there is a lot of content hardlinked between modules
16:42:07 I can run it again, but the script does find newly hardlinked files.
16:42:37 to add to confusion... the actual ftp volume is 10T ;)
16:43:02 If hardlink didn't take three days to run....
16:43:13 so the 11 there must be dedupe savings + snapshots
16:43:15 Anyway, I didn't mean to distract from the meeting.
16:43:22 Filesystem          used        saved       %saved
16:43:22 /vol/fedora_ftp/    9280723952  1498453156  14%
16:43:28 tibbs|w: no problems
16:43:42 note that the netapp dedupe is block level so it's likely always going to get more than hardlinking
16:43:52 The repo is https://pagure.io/quick-fedora-mirror if anyone wants to look.
16:44:13 The script is in zsh currently, but has no other deps besides rsync.
16:44:29 nirik: yeah it can probably dedupe signed and unsigned rpm content
16:44:36 something we cannot do with hardlink
16:46:54 anyway, we are over on time
16:47:08 does anyone have anything else? or should I wrap up?
16:47:13 BTW, if you add up DIRECTORY_SIZES.txt you do get nearly 12TB, so it must be the netapp.
16:49:34 tibbs|w: that does not account for cross-module hardlinking
16:49:42 You're right.
16:49:50 I wonder if it would hardlink more if you used buffet
16:50:11 Well, the script transfers all of the modules you mirror together.
16:50:17 buffet?
16:50:34 fedora-buffet is the module which contains everything.
16:50:53 ah
16:50:56 So even if you don't mirror buffet, the script should detect hardlinks between the modules which you do mirror.
16:51:16 I was thinking some storage
16:51:22 Well, the script doesn't do it; rsync should do it. It appears to do so when I try.
16:51:33 I was thinking of some storage util I didn't know about
16:52:08 I do not have enough disk to mirror all of fedora
16:52:25 I have a bunch of 1T disks I could use, just nothing to run them in
16:52:56 I have 4x4TB RAID0 in each of my mirror hosts.
16:53:30 nice, I also have a 400G a month data cap
16:53:54 Yeah, that wouldn't go too far even if you sneakernet the initial data.
16:54:45 nope
16:55:06 These are just three old machines I threw together. Sadly only 32GB of RAM apiece which limits their speed significantly. I just wanted to have something I could experiment with.
16:55:43 * nirik notes our download servers only have 32GB. ;)
16:55:47 tibbs|w: :) I am glad you could, I think it's going to be useful, just need to get the mirrors to use it.
16:56:01 Yeah, one thing at a time.
16:56:13 But load is saved for every client, I think.
16:56:24 There is one big issue, though: directory timestamps.
16:56:46 It's fixable, but I don't know how important it is that all mirrors have exactly the same timestamps on the directories.
16:57:48 I can focus on that if people think it's important.
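
[On the cross-module hardlink point above: rsync only reproduces hardlinks between files that are part of the same transfer, which is why transferring all mirrored modules together matters. A rough sketch of a single-transfer sync, assuming a pre-built file list and a hypothetical destination; fedora-buffet is the umbrella module named in the discussion.]

    # Pull only the paths listed in a pre-built file list from one
    # umbrella module, so that -H can reproduce hardlinks spanning the
    # sub-trees. The file list name and destination are assumptions.
    rsync -avH --files-from=/tmp/wanted-files.txt \
        rsync://dl.fedoraproject.org/fedora-buffet/ /srv/mirror/
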
16:57:54 tibbs|w: not sure it matters too much
16:58:15 Someone brought it up on the mirror list.
16:58:17 it will mean you get some churn if you switch the mirror you pull from
16:58:19 * nirik either
16:58:36 If you're using this script, you won't actually get churn.
16:59:14 And rsync doesn't care about timestamps on directories anyway. It will copy them but that's about it.
16:59:48 so not really a big deal, the content and permissions are more critical
17:00:09 as in making sure it's correctly readable during staging
17:01:06 In the end it uses rsync, so it does whatever rsync does.
17:01:17 * dgilmore needs to wrap this up, I have another meeting starting now
17:01:23 tibbs|w: works for me :)
17:01:33 so right now during stage if you have access you sync, if you don't you get errors for that content.
17:01:34 Feel free to ask questions wherever you find me, or file issues in pagure.
17:01:56 #endmeeting