20:00:28 <mmcgrath> #startmeeting
20:00:38 <mmcgrath> #topic Infrastructure -- Who's here?
20:00:57 <ggruener> ping
20:01:05 * tmz listens in for once
20:01:07 * johe waves
20:01:12 * SmootherFrOgZ is
20:01:19 * rjune_wrk lurks
20:02:00 * dgilmore 
20:02:22 <mmcgrath> k, lets get started
20:02:26 <mmcgrath> #topic Infrastructure -- tickets
20:02:28 <mmcgrath> .tiny https://fedorahosted.org/fedora-infrastructure/query?status=new&status=assigned&status=reopened&group=milestone&keywords=~Meeting&order=priority
20:02:34 <ssmoogen> here
20:02:35 <mmcgrath> so
20:02:38 <mmcgrath> .ticket 1503
20:02:40 <mmcgrath> abadger1999: around?
20:02:52 * nirik is here in the back.
20:03:50 * mmcgrath assumes he's not around
20:04:11 <mmcgrath> so
20:04:12 <abadger1999> mmcgrath: I'm here.  Nothing new on this since last week.
20:04:13 * herlo is here for the infra meeting
20:04:20 <mmcgrath> abadger1999: oh, k.  Sounds good.
20:04:30 <mmcgrath> #topic Infrastructure -- Proxy timeouts
20:04:32 <abadger1999> No one's objected, I guess we can take it off of the meeting for a bit
20:05:24 <mmcgrath> So, after much debugging, testing, re-debugging.  We determined the problem was a difference in how ProxyPass handles proxying and RewriteRule [P] handles proxying.
20:05:35 <mmcgrath> Annoying on all fronts
20:05:39 <mmcgrath> and was causing great issues.
20:05:57 <mmcgrath> but things seem to have calmed down quite a bit.  I'm going to re-run our metrics tests in a bit to see if we're back down to pre-merge levels.
20:06:06 <mmcgrath> anyone have any questions on that?
20:06:24 <ssmoogen> yes
20:06:28 <ssmoogen> what did we settle on
20:06:36 <ssmoogen> ProxyPass or RewriteRule
20:06:39 <mmcgrath> well, we're back to RewriteRule
20:06:43 <johe> what proxy do we use, if i may ask
20:06:44 <mmcgrath> which is what we were before the merge.
20:06:50 <abadger1999> Do we know what the difference is or just that they are different?
20:06:51 <mmcgrath> johe: apache + mod_proxy
20:07:01 <mmcgrath> abadger1999: well, according to a guy in #httpd there isn't one.
20:07:14 <abadger1999> hah
20:07:15 <mmcgrath> I suspect ProxyPass takes on some default values.
20:07:16 <ssmoogen> mmcgrath, hehehe
20:07:16 <johe> thought about nginx for proxy?
20:07:35 <mmcgrath> johe: people bring up $OTHER_PROXY all the time but no one's ever really been able to answer why it'd be worth moving to.
20:08:10 <mmcgrath> I think RewriteRule might either take on different timeout values than ProxyPass or maybe doesn't even have them and behaves in a more raw manner.
20:08:20 <mmcgrath> There were also some keepalive values we altered.
20:08:24 <johe> okay, we should discuss this later on :-)
20:08:43 <mmcgrath> johe: we can discuss it now if you want, if not now take it to the list.
20:09:04 <johe> i'll take it to the list
20:09:08 <dgilmore> johe: it is a fairly simple process we use
20:09:08 <mmcgrath> cool
20:09:24 <mmcgrath> Ok
20:09:28 <mmcgrath> sooo....
20:09:29 * mmcgrath thinks.
20:09:36 <mmcgrath> #topic Infrastructure -- Databases
20:09:42 <mmcgrath> So we continue to see some database issues
20:09:56 <mmcgrath> but the wiki outages seem to be gone
20:09:57 <mmcgrath> https://admin.fedoraproject.org/haproxy/proxy1/
20:10:10 <mmcgrath> shows 0 downtime since the last haproxy reboot.
20:10:14 <mmcgrath> that's a very good thing.
20:10:23 <mmcgrath> The last outstanding issue seems to be with smolt.
20:10:46 <mmcgrath> now, this is something that came about after the merge, but I actually think we were having it before.
20:10:56 <mmcgrath> I think ricky disabled some caching that we were doing against smolt.
20:11:11 <mmcgrath> so even when it went down, nagios didn't notice because it kept pulling up the cached version.
20:11:22 <mmcgrath> There's still a few options we're investigating
20:11:39 <mmcgrath> mostly based around switching to innodb, changing how we do backups, and using different queries for render-stats
20:11:45 <mmcgrath> as render-stats seems to be the thing taking everything down
20:12:34 <thekad> mmcgrath I was talking with ricky about myisam->innodb last night, specifically about the host_links table
20:12:37 <ssmoogen> what does render-stats do?
20:12:57 <mmcgrath> ssmoogen: it generates - http://smolts.org/static/stats/stats.html
20:13:12 <mmcgrath> thekad: you have some experience with this?
20:13:24 <mmcgrath> myisam vs innodb I mean
20:13:24 <dgilmore> mmcgrath: i think innodb would be a helpful move
20:13:43 <mmcgrath> dgilmore: everything we've heard has said it will be a good thing, except for when we actually go to do it.
20:13:53 <mmcgrath> importing from an innodb dump I think took 14 hours, vs less than 1 hour for myisam.
20:13:54 <thekad> mmcgrath, a bit, yeah, I was telling him that the amount of time that table takes to load doesn't sound like too far fetched
20:13:58 <mmcgrath> so we're worried about other impacts.
20:14:07 <thekad> load from a mysqldump I mean
20:14:15 <mmcgrath> thekad: will we see a similar slow down (more than an order of magnitude) for usage of the table?
20:14:36 <dgilmore> mmcgrath: you should be able to convert on the fly
20:14:43 <dgilmore> but not sure what effect that would have
20:14:53 <thekad> mmcgrath, that's possible, I was telling him to maybe run a couple tests with a tenth of the rows, then doubling it
20:15:10 <mmcgrath> dgilmore: we were trying, we never got through one, usually killed it around 20 hours
20:15:33 <thekad> mmcgrath, the thing is, all the sanity checks that you disable while loading it, are done during every transaction
20:15:36 <mmcgrath> thekad: so I think the general thought is we want to move to innodb, but we want to see what we're getting ourselves into :)
20:15:43 <dgilmore> mmcgrath: gahh ok
20:16:13 <mmcgrath> thekad: our understanding is innodb will be 'slower' than myisam.  Do you know exactly what that means?
20:16:22 <mmcgrath> I know it's larger than myisam.
20:16:32 <mmcgrath> so i'd just assumed that extra time was just because it's reading more from the disks.
20:16:48 <dgilmore> mmcgrath: in some ways it will be quicker
20:16:54 <thekad> mmcgrath, innodb does some data integrity checks on every operation, myisam doesn't support FKs for example, so that brings an overhead
20:17:05 <dgilmore> since you can do row level locking for updates
20:17:23 <mmcgrath> thekad: 'every operation' we talking writes or reads or both?
20:17:26 <dgilmore> so there could be multiple updates at once
20:17:36 <thekad> mmcgrath, writes, I mean update, insert, delete
20:17:39 <dgilmore> mmcgrath: i think he means writes here
20:17:46 <mmcgrath> thekad: how do its reads compare?
20:18:38 <thekad> mmcgrath, almost the same, it degrades a bit because of the IO operations, but makes up for it by using indexes and other stuff innodb has
20:18:47 <thekad> so it should be mostly the same, if not a lil' bit faster
20:19:26 <mmcgrath> thekad: are there different indexing options we should look at for our larger tables?  or do the types of indexes offered by myisam directly map to innodb indexes?
20:19:28 <ssmoogen> thekad my minimal reading was that if one needed to scale to mysql clustering.. innodb was needed.
20:19:39 <thekad> the clear advantage of using innodb is data integrity, if you don't mind about that too much, there's no real gain from innodb
20:19:56 <thekad> ssmoogen, that is correct
20:20:28 <thekad> mmcgrath, I think you can tweak them a little, may need to give some more reading about that though
20:20:38 <mmcgrath> k
20:20:49 <mmcgrath> thekad: well thanks for your help on that, please do stick around and help us through that transition.
20:21:08 <mmcgrath> right now ricky's got lead on that, i'm sort of just keeping tabs and testing from time to time.
20:21:19 <thekad> mmcgrath, there can be some more options such as clustering, sharding, master/slave, we can check them all and evaluate some more
20:21:21 <mmcgrath> Anyone have any questions on this before we move on?
20:21:30 <thekad> mmcgrath, ok
20:21:51 <mmcgrath> thekad: thanks
20:22:15 <mmcgrath> so with our merge outages and smolt, that's really about all I've been up to.
20:22:18 <mmcgrath> I'll open the floor
20:22:30 <mmcgrath> #topic Infrastructure -- Open Floor
20:22:30 <ssmoogen> my only question is how we can test to see how we spread the load
20:22:46 <mmcgrath> ssmoogen: well, I've done clustering in the past and we can do that if we have to :)
20:22:49 <mmcgrath> I'd like to avoid it though
20:23:35 <lmacken> So yeah, bodhi now supports EPEL.
20:23:41 <mmcgrath> lmacken: take it
20:23:45 <lmacken> It required more hacking than I expected, but the result makes bodhi much more flexible and gives me a much better idea of what is needed from the model in the upcoming bodhi rewrite.
20:23:48 <ssmoogen> mmcgrath, my only experience with mysql has been with spread out clusters :). If you have another app get another DB :).. which I usually thought was horse manure but was what the Mysql people I dealt with believed in
20:23:55 <ssmoogen> oh sorry..
20:23:58 <lmacken> I also fixed a lot of other bugs in the process.
20:24:00 <lmacken> Also, I just updated the Bodhi SOP with details on how to push updates as well: https://fedoraproject.org/wiki/Bodhi_Infrastructure_SOP
20:24:03 <mmcgrath> ssmoogen: :)
20:24:05 <lmacken> that's all I got :)
20:24:07 <ssmoogen> thanks lmacken and dgilmore
20:24:30 <dgilmore> lmacken: thanks.
20:24:33 <ssmoogen> lmacken, does bodhi have workflows?
20:24:37 <abadger1999> Yay!
20:24:45 <mmcgrath> dgilmore: how have the pushes been going?
20:24:48 <ssmoogen> as in forcing people to push things into testing versus publish to production?
20:24:49 <dgilmore> lmacken: did we get the bug fixed preventing the summary email of testing updates being sent?
20:24:54 <lmacken> ssmoogen: not exactly, but Fedora Community will be putting those in place soon
20:25:08 <lmacken> dgilmore: yeah, I sent those out by hand last night
20:25:13 <dgilmore> mmcgrath: I have one issue i need to fix.  all packages are getting touched on each update
20:25:21 <dgilmore> lmacken: ok ive not seen them
20:25:28 <mmcgrath> dgilmore: on the master mirror?
20:25:47 <mmcgrath> dgilmore: about how long does it take to do a push (both real time and actual person typing time)
20:25:53 <dgilmore> mmcgrath: yeah
20:26:10 <abadger1999> lmacken: Seems like the wrong layer... if it's just in Community, people can still circumvent it.
20:26:10 <dgilmore> mmcgrath: seems to take about 3-4 hours for a push. which is longer than the 30 minutes or so previously
20:26:32 <abadger1999> like cvs force tagging and our koji workflow
20:26:36 <lmacken> abadger1999: right, but the workflow layer has to encompass more than just bodhi (eg: cvs, koji, etc)
20:27:09 <mmcgrath> dgilmore: I've never actually gone through the bodhi process.  Is it pretty intensive work for 3-4 hours?  Or is it type a few things and go get some tea?
20:27:10 <abadger1999> <nod> As a whole... but the bottom layers have to support the locking down that we want to do at the top.
20:27:33 <lmacken> dgilmore: http://lists.fedoraproject.org/pipermail/epel-package-announce/2009-July/000004.html -- I may have accidentally sent it to the package-announce-list by hand instead
20:27:35 <dgilmore> lmacken: i have to say anything workflow wise in fedora community feels like the wrong place for it
20:27:54 <dgilmore> lmacken: yep wrong list
20:27:54 <lmacken> well, propose something better :)
20:28:09 <lmacken> the plan for fcomm v2.0 was initially: workflows
20:28:28 <lmacken> I wanted to tackle security bug tracking in there as well as others
20:28:39 <dgilmore> lmacken: it all has to work without fedora community
20:29:06 <dgilmore> it can happen in community  but it needs to work with koji,cvs,bodhi,pkgdb also
20:29:07 <lmacken> all of the workflows already work w/o it.
20:29:13 <lmacken> we're just making it easier
20:29:34 <dgilmore> lmacken: maybe i misunderstood what you're trying to do
20:29:36 <abadger1999> lmacken: Not in this case.  In this case, we're trying to prevent people from doing something.
20:29:47 <abadger1999> lmacken: So that has to be enabled at the bottom layer.
20:30:00 <dgilmore> abadger1999: which in preventing things it needs to happen at the lowest level not the highest
20:30:28 <lmacken> anyway, we can take the workflow tangent elsewhere :)
20:30:37 <mmcgrath> yeah
20:31:01 <dgilmore> mmcgrath: real world doing things is about 30 minutes or so a push
20:31:01 <ssmoogen> hi guys.. I need a picture on both sides ... I sense people may be in violent agreement but IRC is not the medium to convey the pictures
20:31:18 <mmcgrath> dgilmore: k.
20:31:46 <lmacken> ssmoogen: agreed.  we'll make sure to lay out a full roadmap for any workflow things we are thinking about attempting, and take everyones thoughts and suggestions into consideration beforehand
20:31:57 <dgilmore> a lot of it is checking on what people are trying to push into stable
20:32:11 <dgilmore> mmcgrath: biggest issue so far is people pushing straight to stable
20:32:13 <lmacken> I'm going to try and get EPEL integrated into Fedora Community soon as well
20:32:13 <ssmoogen> dgilmore, and hitting them with a stun gun when it isn't security :)
20:32:17 <dgilmore> and bypassing testing
20:32:47 <dgilmore> ssmoogen: right the only ones i've allowed through are security ones
20:32:56 <dgilmore> like mmcgrath's nagios packages
20:33:01 * nb|away is here but late
20:33:02 <mmcgrath> dgilmore: ah, the issue being that's not in the spirit of epel?
20:33:11 <dgilmore> mmcgrath: right
20:33:20 <abadger1999> These things need to get discussed here.  we're looking at how to enforce policy on all of our apps... that needs to be talked about as a fedora infrastructure issue.
20:33:29 <dgilmore> it was something much more easily controlled in the old setup
20:34:10 <dgilmore> mmcgrath: i think i can do some work on bodhi to support epel policies better
20:34:19 <mmcgrath> yeah.
20:34:36 <mmcgrath> I mean, at the end of the day we have to trust the packagers somewhat, but I'm sure there's something we can do.
20:35:05 <mmcgrath> abadger1999: so what are some bullet points we should hit?
20:35:35 <ssmoogen> trust? humans? mmcgrath are you sure you are a system administrator?
20:35:41 <abadger1999> What's the policy we're trying to enforce?  --dgilmore has ideas on that for EPEL.
20:35:55 <abadger1999> Do we also need enforcement for Fedora?
20:36:15 <mmcgrath> ah
20:36:16 <abadger1999> What apps need to change to enforce that workflow? -- for updates, that would be bodhi.
20:36:24 <ssmoogen> abadger1999, I have heard f13 would like it.. but I would take that to be a cultural thing
20:36:28 <dgilmore> abadger1999: all packages but security updates need to go through testing
20:36:41 <mmcgrath> in EPEL we have a good track record of "build" -> "testing" -> "stable"
20:36:46 <abadger1999> What UI do we want to put on top of that?  Probably has Bodhi and Fedora Community changes.
20:36:47 <mmcgrath> but that's largely because of the man behind the scenes.
20:37:13 <dgilmore> mmcgrath: right it was easy to do with the old scripts
20:38:03 <mmcgrath> dgilmore: what did you have in mind for it?
20:38:10 <dgilmore> we made packages lie in testing for between a week and just over a month
20:38:12 <mmcgrath> I mean, in EPEL, we could enforce "nothing but security" if we wanted
20:38:45 <dgilmore> mmcgrath: automate moving things to testing if people try them in stable
20:39:08 <mmcgrath> lmacken: how painful would that be?
20:39:13 <mmcgrath> dgilmore: would that all be 'behind the scenes stuff'?
20:39:16 <dgilmore> and enforcing at least two weeks in testing without autokarma moving to stable
20:39:24 <mmcgrath> and if they wanted something pushed to testing, would it stay there until they push it to stable?
20:39:26 <dgilmore> mmcgrath: i think it needs to be
20:39:29 <mmcgrath> abadger1999: how's that work now, do you know?
20:40:01 <dgilmore> mmcgrath: right, it would stay in testing until moved to stable
20:40:12 <lmacken> mmcgrath: to enforce testing except for security updates?  bodhi /used/ to do that, but I made it configurable after many complaints
20:40:17 <dgilmore> but we should make it in testing two weeks without karma
20:40:45 * mmcgrath suspects this doesn't seem to need any UI changes in bodhi
20:40:47 <mmcgrath> am I right on that?
20:41:09 <ssmoogen> dgilmore, your description of wait/karma was exactly what I was hoping for from bodhi
20:41:12 <lmacken> right, but it may need some backend tweaks depending on what policy we want to enforce
20:41:14 <abadger1999> More error messages, maybe some documentation that there's limits in place
20:41:42 <dgilmore> lmacken: i think we could do it all with the client when running admin tasks
20:41:55 <lmacken> dgilmore: cool
20:42:04 <dgilmore> lmacken: to give us the list of packages that we want to push
20:42:06 <abadger1999> Maybe take out the ability to select "stable" when creating an update.
20:42:47 <dgilmore> abadger1999: unless it's a security update
20:42:59 <abadger1999> dgilmore: Isn't that what security is for?
20:43:13 <abadger1999> ie: testing/security instead of testing/stable/security
20:43:35 <dgilmore> abadger1999: well the bugzilla security update was only requested to go to testing
20:44:17 <dgilmore> abadger1999: the type of update is different to the location
20:44:32 <dgilmore> but maybe we can tweak that and add some policy support to bodhi
20:44:37 <abadger1999> Hmm.. in Fedora, I know I've flagged an update as security and then the security team reviews it and pushes it to stable... no matter whether I selected testing or stable initially.
20:44:45 <dgilmore> that way epel can have one set of rules and Fedora another
20:45:05 <abadger1999> But perhaps that's the particular person who handled that security update rather than policy.
20:45:07 <dgilmore> abadger1999: the security team doesnt look at epel
20:45:10 <dgilmore> at least not yet
20:45:13 <abadger1999> <nod>
20:45:29 <abadger1999> but the policies of epel and fedora only diverge when explicitly stated.
20:45:55 <dgilmore> abadger1999: pushing policy is one of them
20:46:03 <abadger1999> <nod>
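Editor's sketch of the EPEL push policy discussed above: non-security updates go through testing, a request for stable is redirected to testing until the update has spent at least two weeks there, and security updates may go straight to stable. This is only an illustration under those assumptions, not bodhi's actual model or API; the `Update` class, its field names, and `resolve_request` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# "at least two weeks in testing", per the discussion above
MIN_DAYS_IN_TESTING = 14


@dataclass
class Update:
    """Hypothetical stand-in for an update request; not bodhi's real model."""
    name: str
    update_type: str                       # e.g. "security", "bugfix"
    requested: str                         # "testing" or "stable"
    entered_testing: Optional[date] = None  # None if never in testing


def resolve_request(update: Update, today: date) -> str:
    """Return the repository an update should actually be pushed to."""
    if update.requested != "stable":
        return "testing"
    # Security updates are the one exception allowed straight to stable.
    if update.update_type == "security":
        return "stable"
    # Anything else must have sat in testing for the minimum period.
    if update.entered_testing is not None:
        age = today - update.entered_testing
        if age >= timedelta(days=MIN_DAYS_IN_TESTING):
            return "stable"
    # Otherwise the stable request is quietly redirected to testing.
    return "testing"
```

Keeping the rule in one pure function would let EPEL and Fedora plug in different policies, which is the "one set of rules for epel and another for Fedora" idea raised above.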
20:46:07 <mmcgrath> Ok, so I think we're generally in agreement about what all has to happen.
20:46:16 <mmcgrath> is this all blocking on luke to get it done?
20:46:17 <dgilmore> mmcgrath: :)  right
20:46:20 <mmcgrath> lmacken: what would you need?
20:46:31 <lmacken> mmcgrath: tickets saying exactly what you guys want to happen :)
20:46:36 <dgilmore> mmcgrath: it will need his help.  but im going to work on what i can
20:46:54 <mmcgrath> ok
20:47:06 <mmcgrath> well, not to take up the rest of the meeting with that, anything else to discuss on that topic?
20:47:39 <ssmoogen> lmacken, how can I help
20:48:05 <lmacken> ssmoogen: I'm not sure yet, we need to figure out exactly what needs to get done first
20:48:23 <mmcgrath> dgilmore: you ok being on ticket patrol?  Getting which ones need to be created and creating them?
20:48:34 <dgilmore> mmcgrath: :) sure
20:48:44 <mmcgrath> Ok, well if there's nothing else on that topic
20:48:50 <mmcgrath> anyone have anything else while the floor is open?
20:48:53 <nb> blogs.fp.org is going pretty well, basically everything is ready I believe except for we are blocking on FAS integration.  FWIW, I'd rather stick with our original solution of letting people sign up, as long as they use a @fedoraproject.org email address, as wordpress-mu is not that cooperative with authentication plugins, but we'll keep working on it.
20:48:57 <ssmoogen> lmacken, ok let me know how I can help test and document... and I will do the best I can
20:49:04 <lmacken> ssmoogen: will do :)
20:50:02 <nb> we have a plugin that nigel made, but for some reason it keeps letting everyone in as long as they have a cla_done username, no matter if the password is correct or not
20:50:09 <mmcgrath> nb: what seems to be the problem?
20:50:17 <mmcgrath> interesting
20:50:21 <mmcgrath> and hilarious ;)
20:50:25 <nb> yeah
20:50:35 <mmcgrath> is it ignoring the json response or something?
20:50:52 <nb> not sure, i havent looked into it much, nigel couldn't figure out what was wrong the other night
20:51:01 * nb hopes to look at it some today
20:51:16 <mmcgrath> nb: excellent.  thanks for the roundup.
20:51:20 <mmcgrath> Anyone have anything else?
20:51:27 <nb> its /usr/share/wordpress-mu/wp-content/mu-plugins/fasauth.php if anyone knows much about php and fas
20:51:33 <nb> on publictest15
20:52:32 <mmcgrath> nb: might be good to hit up the f-i-l
20:52:39 <nb> good idea
20:52:41 <mmcgrath> Ok, if no one has anything else we'll close the meeting in 30
20:52:46 <tmz> something fairly minor, we occasionally get folks having permission problems with git repos on hosted.
20:52:49 <abadger1999> nb: I looked briefly the other day -- couldn't figure out which methods it was calling when it verified the password though.
20:53:24 <ssmoogen> my only statement is that I am still catching up and hopefully will be promoted to ricky's assistant soon
20:53:39 <mmcgrath> tmz: what was the latest?
20:53:40 <tmz> I think I've found and fixed the last of the current problems causing those.  but to ensure they don't come back, I think a cron job to check for a few common problems might be good.
20:53:45 <mmcgrath> ssmoogen: :)
20:53:55 <tmz> mmcgrath: the latest one was docs/install-guide and docs/release-notes.
20:54:04 <tmz> both lacked the core.sharedrepository setting.
20:54:13 <mmcgrath> were they created incorrectly?
20:54:17 <tmz> and that caused the reflogs to have the wrong perms.
20:54:23 <mmcgrath> <nod>
20:54:51 <tmz> I'm guessing they were either created without the --shared option to git init, or cloned and then the sharedrepository option was never set.
20:55:06 <tmz> the setup scripts should keep this from happening on new repos.
20:55:28 <mmcgrath> <nod>
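Editor's sketch of the kind of cron check tmz describes: list the bare repositories whose `core.sharedrepository` setting is missing or does not enable sharing. It assumes `git` is on the PATH and covers only this one problem; the function names are hypothetical, and treating every explicit `0xxx` octal mode as acceptable is a simplification of this sketch.

```python
import subprocess

# Named values that git documents as enabling group or world sharing.
SHARED_VALUES = {"group", "true", "all", "world", "everybody"}


def is_shared(value):
    """Return True if a core.sharedrepository value enables sharing."""
    if value is None:
        return False
    value = value.strip().lower()
    if value in SHARED_VALUES:
        return True
    # git also accepts an octal mode such as 0660; this sketch treats any
    # explicit 0xxx mode as a deliberate choice and does not flag it.
    return (len(value) == 4 and value.startswith("0")
            and all(c in "01234567" for c in value))


def read_shared_setting(git_dir):
    """Ask git for the setting; returns None when it is not set."""
    result = subprocess.run(
        ["git", "--git-dir=" + git_dir, "config", "--get",
         "core.sharedrepository"],
        capture_output=True, text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


def unshared_repos(git_dirs, reader=read_shared_setting):
    """Return the repositories a cron job should report."""
    return [d for d in git_dirs if not is_shared(reader(d))]
```

The `reader` argument exists only so the check can be exercised without touching real repositories; a cron job would call `unshared_repos` with the list of repository paths and mail any non-empty result.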
20:56:05 <mmcgrath> Ok, things are quiet :)
20:56:18 <mmcgrath> tmz: thanks for that.
20:56:24 <mmcgrath> if no one has anything else I'll close in 30
20:56:41 <mmcgrath> 10
20:56:53 <mmcgrath> #endmeeting