20:00:28 #startmeeting
20:00:38 #topic Infrastructure -- Who's here?
20:00:57 ping
20:01:05 * tmz listens in for once
20:01:07 * johe waves
20:01:12 * SmootherFrOgZ is
20:01:19 * rjune_wrk lurks
20:02:00 * dgilmore
20:02:22 k, lets get started
20:02:26 #topic Infrastructure -- tickets
20:02:28 .tiny https://fedorahosted.org/fedora-infrastructure/query?status=new&status=assigned&status=reopened&group=milestone&keywords=~Meeting&order=priority
20:02:34 here
20:02:35 so
20:02:38 .ticket 1503
20:02:40 abadger1999: around?
20:02:52 * nirik is here in the back.
20:03:50 * mmcgrath assumes he's not around
20:04:11 so
20:04:12 mmcgrath: I'm here. Nothing new on this since last week.
20:04:13 * herlo is here for the infra meeting
20:04:20 abadger1999: oh, k. Sounds good.
20:04:30 #topic Infrastructure -- Proxy timeouts
20:04:32 No one's objected, I guess we can take it off of the meeting for a bit
20:05:24 So, after much debugging, testing, and re-debugging, we determined the problem was a difference between how ProxyPass handles proxying and how RewriteRule [P] handles proxying.
20:05:35 Annoying on all fronts
20:05:39 and was causing great issues.
20:05:57 but things seem to have calmed down quite a bit. I'm going to re-run our metrics tests in a bit to see if we're back down to pre-merge levels.
20:06:06 anyone have any questions on that?
20:06:24 yes
20:06:28 what did we settle on
20:06:36 ProxyPass or RewriteRule
20:06:39 well, we're back to RewriteRule
20:06:43 what proxy do we use, if i may ask
20:06:44 which is what we were before the merge.
20:06:50 Do we know what the difference is or just that they are different?
20:06:51 johe: apache + mod_proxy
20:07:01 abadger1999: well, according to a guy in #httpd there isn't one.
20:07:14 hah
20:07:15 I suspect ProxyPass takes on some default values.
20:07:16 mmcgrath, hehehe
20:07:16 thought about nginx for proxy?
20:07:35 johe: people bring up $OTHER_PROXY all the time but no one's ever really been able to answer why it'd be worth moving to.
20:08:10 I think RewriteRule might either take on different timeout values than ProxyPass or maybe doesn't even have them and behaves in a more raw manner.
20:08:20 There were also some keepalive values we altered.
20:08:24 okay, we should discuss this later on :-)
20:08:43 johe: we can discuss it now if you want, if not now take it to the list.
20:09:04 i'll take it to the list
20:09:08 johe: it is a fairly simple process we use
20:09:08 cool
20:09:24 Ok
20:09:28 sooo....
20:09:29 * mmcgrath thinks.
20:09:36 #topic Infrastructure -- Databases
20:09:42 So we continue to see some database issues
20:09:56 but the wiki outages seem to be gone
20:09:57 https://admin.fedoraproject.org/haproxy/proxy1/
20:10:10 shows 0 downtime since the last haproxy reboot.
20:10:14 that's a very good thing.
20:10:23 The last outstanding issue seems to be with smolt.
20:10:46 now, this is something that came about after the merge, but I actually think we were having it before.
20:10:56 I think ricky disabled some caching that we were doing against smolt.
20:11:11 so even when it went down, nagios didn't notice because it kept pulling up the cached version.
20:11:22 There's still a few options we're investigating
20:11:39 mostly based around switching to innodb, changing how we do backups, and using different queries for render-stats
20:11:45 as render-stats seems to be the thing taking everything down
20:12:34 mmcgrath I was talking with ricky about myisam->innodb last night, specifically about the host_links table
20:12:37 what does render-stats do?
20:12:57 ssmoogen: it generates - http://smolts.org/static/stats/stats.html
20:13:12 thekad: you have some experience with this?
20:13:24 myisam vs innodb I mean
20:13:24 mmcgrath: i think innodb would be a helpful move
20:13:43 dgilmore: everything we've heard has said it will be a good thing, except for when we actually go to do it.
20:13:53 importing from an innodb dump I think took 14 hours, vs less than 1 hour for myisam.
20:13:54 mmcgrath, a bit, yeah, I was telling him that the amount of time that table takes to load doesn't sound too far-fetched
20:13:58 so we're worried about other impacts.
20:14:07 load from a mysqldump I mean
20:14:15 thekad: will we see a similar slow down (more than an order of magnitude) for usage of the table?
20:14:36 mmcgrath: you should be able to convert on the fly
20:14:43 but not sure what effect that would have
20:14:53 mmcgrath, that's possible, I was telling him to maybe run a couple tests with a tenth of the rows, then doubling it
20:15:10 dgilmore: we were trying, we never got through one, usually killed it around 20 hours
20:15:33 mmcgrath, the thing is, all the sanity checks that you disable while loading it, are done during every transaction
20:15:36 thekad: so I think the general thought is we want to move to innodb, but we want to see what we're getting ourselves into :)
20:15:43 mmcgrath: gahh ok
20:16:13 thekad: our understanding is innodb will be 'slower' than myisam. Do you know exactly what that means?
20:16:22 I know it's larger than myisam.
20:16:32 so i'd just assumed that extra time was just because it's reading more from the disks.
20:16:48 mmcgrath: in some ways it will be quicker
20:16:54 mmcgrath, innodb does some data integrity checks on every operation, myisam doesn't support FKs for example, so that brings an overhead
20:17:05 since you can do row level locking for updates
20:17:23 thekad: 'every operation' we talking writes or reads or both?
20:17:26 so there could be multiple updates at once
20:17:36 mmcgrath, writes, I mean update, insert, delete
20:17:39 mmcgrath: i think he means writes here
20:17:46 thekad: how do its reads compare?
20:18:38 mmcgrath, almost the same, it degrades a bit because of the IO operations, but makes up for it by using indexes and other stuff innodb has
20:18:47 so it should be mostly the same, if not a lil' bit faster
20:19:26 thekad: are there different indexing options we should look at for our larger tables? or do the types of indexes offered by myisam directly map to innodb indexes?
20:19:28 thekad my minimal reading was that if one needed to scale to mysql clustering.. innodb was needed.
20:19:39 the clear advantage of using innodb is data integrity, if you don't mind about that too much, there's no real gain from innodb
20:19:56 ssmoogen, that is correct
20:20:28 mmcgrath, I think you can tweak them a little, may need to do some more reading on that though
20:20:38 k
20:20:49 thekad: well thanks for your help on that, please do stick around and help us through that transition.
20:21:08 right now ricky's got lead on that, i'm sort of just keeping tabs and testing from time to time.
20:21:19 mmcgrath, there can be some more options such as clustering, sharding, master/slave, we can check them all and evaluate some more
20:21:21 Anyone have any questions on this before we move on?
20:21:30 mmcgrath, ok
20:21:51 thekad: thanks
20:22:15 so with our merge outages and smolt, thats really about all I've been up to.
20:22:18 I'll open the floor
20:22:30 #topic Infrastructure -- Open Floor
20:22:30 my only question is how we can test to see how we spread the load
20:22:46 ssmoogen: well, I've done clustering in the past and we can do that if we have to :)
20:22:49 I'd like to avoid it though
20:23:35 So yeah, bodhi now supports EPEL.
20:23:41 lmacken: take it
20:23:45 It required more hacking than I expected, but the result makes bodhi much more flexible and gives me a much better idea of what is needed from the model in the upcoming bodhi rewrite.
20:23:48 mmcgrath, my only experience with mysql has been with spread out clusters :). If you have another app get another DB :).. which I usually thought was horse manure but was what the Mysql people I dealt with believed in
20:23:55 oh sorry..
20:23:58 I also fixed a lot of other bugs in the process.
20:24:00 Also, I just updated the Bodhi SOP with details on how to push updates as well: https://fedoraproject.org/wiki/Bodhi_Infrastructure_SOP
20:24:03 ssmoogen: :)
20:24:05 that's all I got :)
20:24:07 thanks lmacken and dgilmore
20:24:30 lmacken: thanks.
20:24:33 lmacken, does bodhi have workflows?
20:24:37 Yay!
20:24:45 dgilmore: how have the pushes been going?
20:24:48 as in forcing people to push things into testing versus publish to production?
20:24:49 lmacken: did we get the bug fixed preventing the summary email of testing updates being sent?
20:24:54 ssmoogen: not exactly, but Fedora Community will be putting those in place soon
20:25:08 dgilmore: yeah, I sent those out by hand last night
20:25:13 mmcgrath: I have one issue i need to fix. all packages are getting touched on each update
20:25:21 lmacken: ok ive not seen them
20:25:28 dgilmore: on the master mirror?
20:25:47 dgilmore: about how long does it take to do a push (both real time and actual person typing time)
20:25:53 mmcgrath: yeah
20:26:10 lmacken: Seems like the wrong layer... if it's just in Community, people can still circumvent it.
20:26:26 mmcgrath: seems to take about 3-4 hours for a push.
which is longer than the 30 minutes or so previously
20:26:32 like cvs force tagging and our koji workflow
20:26:36 abadger1999: right, but the workflow layer has to encompass more than just bodhi (eg: cvs, koji, etc)
20:27:09 dgilmore: I've never actually gone through the bodhi process. Is it pretty intensive work for 3-4 hours? Or is it type a few things and go get some tea?
20:27:10 As a whole... but the bottom layers have to support the locking down that we want to do at the top.
20:27:33 dgilmore: http://lists.fedoraproject.org/pipermail/epel-package-announce/2009-July/000004.html -- I may have accidentally sent it to the package-announce-list by hand instead
20:27:35 lmacken: i have to say anything workflow wise in fedora community feels like the wrong place for it
20:27:54 lmacken: yep wrong list
20:27:54 well, propose something better :)
20:28:09 the plan for fcomm v2.0 was initially: workflows
20:28:28 I wanted to tackle security bug tracking in there as well as others
20:28:39 lmacken: it all has to work without fedora community
20:29:06 it can happen in community but it needs to work with koji,cvs,bodhi,pkgdb also
20:29:07 all of the workflows already work w/o it.
20:29:13 we're just making it easier
20:29:34 lmacken: maybe i misunderstood what you're trying to do
20:29:36 lmacken: Not in this case. In this case, we're trying to prevent people from doing something.
20:29:47 lmacken: So that has to be enabled at the bottom layer.
20:30:00 abadger1999: which in preventing things it needs to happen at the lowest level not the highest
20:30:28 anyway, we can take the workflow tangent elsewhere :)
20:30:37 yeah
20:31:01 mmcgrath: real world doing things is about 30 minutes or so a push
20:31:01 hi guys.. I need a picture on both sides ... I sense people may be in violent agreement but IRC is not the medium to convey the pictures
20:31:18 dgilmore: k.
20:31:46 ssmoogen: agreed.
we'll make sure to lay out a full roadmap for any workflow things we are thinking about attempting, and take everyones thoughts and suggestions into consideration beforehand
20:31:57 a lot of it is checking on what people are trying to push into stable
20:32:11 mmcgrath: biggest issue so far is people pushing straight to stable
20:32:13 I'm going to try and get EPEL integrated into Fedora Community soon as well
20:32:13 dgilmore, and hitting them with a stun gun when it isnt security :)
20:32:17 and bypassing testing
20:32:47 ssmoogen: right the only ones ive allowed through are security ones
20:32:56 like mmcgrath's nagios packages
20:33:01 * nb|away is here but late
20:33:02 dgilmore: ah, the issue being that's not in the spirit of epel?
20:33:11 mmcgrath: right
20:33:20 These things need to get discussed here. we're looking at how to enforce policy on all of our apps... that needs to be talked about as a fedora infrastructure issue.
20:33:29 it was something much more easily controlled in the old setup
20:34:10 mmcgrath: i think i can do some work on bodhi to support epel policies better
20:34:19 yeah.
20:34:36 I mean, at the end of the day we have to trust the packagers somewhat, but I'm sure there's something we can do.
20:35:05 abadger1999: so what are some bullet points we should hit?
20:35:35 trust? humans? mmcgrath are you sure you are a system administrator?
20:35:41 What's the policy we're trying to enforce? --dgilmore has ideas on that for EPEL.
20:35:55 Do we also need enforcement for Fedora?
20:36:15 ah
20:36:16 What apps need to change to enforce that workflow? -- for updates, that would be bodhi.
20:36:24 abadger1999, I have heard f13 would like it.. but I would take that to be a cultural thing
20:36:28 abadger1999: all packages but security updates need to go through testing
20:36:41 in EPEL we have a good track record of "build" -> "testing" -> "stable"
20:36:46 What UI do we want to put on top of that?
Probably has Bodhi and Fedora Community changes.
20:36:47 but that's largely because of the man behind the scenes.
20:37:13 mmcgrath: right it was easy to do with the old scripts
20:38:03 dgilmore: what did you have in mind for it?
20:38:10 we made packages lie in testing for between a week and just over a month
20:38:12 I mean, in EPEL, we could enforce "nothing but security" if we wanted
20:38:45 mmcgrath: automate moving things to testing if people try them in stable
20:39:08 lmacken: how painful would that be?
20:39:13 dgilmore: would that all be 'behind the scenes stuff'?
20:39:16 and enforcing at least two weeks in testing without autokarma moving to stable
20:39:24 and if they wanted something pushed to testing, would it stay there until they push it to stable?
20:39:26 mmcgrath: i think it needs to be
20:39:29 abadger1999: how's that work now, do you know?
20:40:01 mmcgrath: right, it would stay in testing until moved to stable
20:40:12 mmcgrath: to enforce testing except for security updates? bodhi /used/ to do that, but I made it configurable after many complaints
20:40:17 but we should make it in testing two weeks without karma
20:40:45 * mmcgrath suspects this doesn't seem to need any UI changes in bodhi
20:40:47 am I right on that?
20:41:09 dgilmore, your description of wait/karma was exactly what I was hoping for from bodhi
20:41:12 right, but it may need some backend tweaks depending on what policy we want to enforce
20:41:14 More error messages, maybe some documentation that there's limits in place
20:41:42 lmacken: i think we could do it all with the client when running admin tasks
20:41:55 dgilmore: cool
20:42:04 lmacken: to give us the list of packages that we want to push
20:42:06 Maybe take out the ability to select "stable" when creating an update.
20:42:47 abadger1999: unless its a security update
20:42:59 dgilmore: Isn't that what security is for?
20:43:13 ie: testing/security instead of testing/stable/security
20:43:35 abadger1999: well the bugzilla security update was only requested to go to testing
20:44:17 abadger1999: the type of update is different to the location
20:44:32 but maybe we can tweak that and add some policy support to bodhi
20:44:37 Hmm.. in Fedora, I know I've flagged an update as security and then the security team reviews it and pushes it to stable... no matter whether I selected testing or stable initially.
20:44:45 that way epel can have one set of rules and Fedora another
20:45:05 But perhaps that's the particular person who handled that security update rather than policy.
20:45:07 abadger1999: the security team doesnt look at epel
20:45:10 at least not yet
20:45:13
20:45:29 but the policies of epel and fedora only diverge when explicitly stated.
20:45:55 abadger1999: pushing policy is one of them
20:46:03
20:46:07 Ok, so I think we're generally in agreement about what all has to happen.
20:46:16 is this all blocking on luke to get it done?
20:46:17 mmcgrath: :) right
20:46:20 lmacken: what would you need?
20:46:31 mmcgrath: tickets saying exactly what you guys want to happen :)
20:46:36 mmcgrath: it will need his help. but im going to work on what i can
20:46:54 ok
20:47:06 well, not to take up the rest of the meeting with that, anything else to discuss on that topic?
20:47:39 lmacken, how can I help
20:48:05 ssmoogen: I'm not sure yet, we need to figure out exactly what needs to get done first
20:48:23 dgilmore: you ok being on ticket patrol? Getting which ones need to be created and creating them?
20:48:34 mmcgrath: :) sure
20:48:44 Ok, well if there's nothing else on that topic
20:48:50 anyone have anything else while the floor is open?
20:48:53 blogs.fp.org is going pretty well, basically everything is ready I believe except for we are blocking on FAS integration.
FWIW, I'd rather stick with our original solution of letting people sign up, as long as they use a @fedoraproject.org email address, as wordpress-mu is not that cooperative with authentication plugins, but we'll keep working on it.
20:48:57 lmacken, ok let me know how I can help test and document... and I will do the best I can
20:49:04 ssmoogen: will do :)
20:50:02 we have a plugin that nigel made, but for some reason it keeps letting everyone in as long as they have a cla_done username, no matter if the password is correct or not
20:50:09 nb: what seems to be the problem?
20:50:17 interesting
20:50:21 and hilarious ;)
20:50:25 yeah
20:50:35 is it ignoring the json response or something?
20:50:52 not sure, i havent looked into it much, nigel couldn't figure out what was wrong the other night
20:51:01 * nb hopes to look at it some today
20:51:16 nb: excellent. thanks for the roundup.
20:51:20 Anyone have anything else?
20:51:27 its /usr/share/wordpress-mu/wp-content/mu-plugins/fasauth.php if anyone knows much about php and fas
20:51:33 on publictest15
20:52:32 nb: might be good to hit up the f-i-l
20:52:39 good idea
20:52:41 Ok, if no one has anything else we'll close the meeting in 30
20:52:46 something fairly minor, we occasionally get folks having permission problems with git repos on hosted.
20:52:49 nb: I looked briefly the other day -- couldn't figure out which methods it was calling when it verified the password though.
20:53:24 my only statement is that I am still catching up and hopefully will be promoted to ricky's assistant soon
20:53:39 tmz: what was the latest?
20:53:40 I think I've found and fixed the last of the current problems causing those. but to ensure they don't come back, I think a cron job to check for a few common problems might be good.
20:53:45 ssmoogen: :)
20:53:55 mmcgrath: the latest one was docs/install-guide and docs/release-notes.
20:54:04 both lacked the core.sharedrepository setting.
20:54:13 were they created incorrectly?
20:54:17 and that caused the reflogs to have the wrong perms.
20:54:23
20:54:51 I'm guessing they were created without either --shared option to git init, or cloned and then the sharedrepository option never set.
20:55:06 the setup scripts should keep this from happening on new repos.
20:55:28
20:56:05 Ok, things are quiet :)
20:56:18 tmz: thanks for that.
20:56:24 if no one has anything else I'll close in 30
20:56:41 10
20:56:53 #endmeeting
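
Editor's note on the git permissions item: the cron check proposed at 20:53:40 was not spelled out in the meeting. What follows is only a minimal sketch of one possible check, assuming that a healthy shared repository has group-writable directories and group-writable reflog files under `logs/`. The function name `find_permission_problems` and the choice of what to inspect are hypothetical, not anything the team agreed on.

```python
import os
import stat


def find_permission_problems(repo_path):
    """Return paths under repo_path that a group member could not write to.

    Assumption (not from the meeting): directories and reflog files under
    logs/ should carry the group-write bit in a shared repository.
    """
    problems = []
    for dirpath, dirnames, filenames in os.walk(repo_path):
        # Directories need g+w so other committers can create refs and objects.
        if not os.stat(dirpath).st_mode & stat.S_IWGRP:
            problems.append(dirpath)
        # Reflog files are appended to in place, so they need g+w as well.
        rel = os.path.relpath(dirpath, repo_path)
        if rel == "logs" or rel.startswith("logs" + os.sep):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.stat(path).st_mode & stat.S_IWGRP:
                    problems.append(path)
    return problems
```

A cron wrapper could call this for each repository directory and mail any non-empty result to the admins. Object files are deliberately not checked, since git normally writes them read-only.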