20:00:31 #startmeeting Infrastructure
20:00:32 :(
20:00:33 Meeting started Thu Mar 4 20:00:31 2010 UTC. The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:34 Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00:37 #topic Who's here?
20:00:41 * nirik is lurking.
20:00:45 * a-k is
20:01:19 * wzzrd is
20:01:22 * lmacken
20:01:27 Lets go over the release quickly, I have some policy and other fairly complicated issues to discuss after that.
20:01:35 here
20:01:40 awake
20:01:44 #topic Meeting Tickets
20:01:47 No tickets, as usual
20:01:52 yeah I said it before change of topic for once
20:01:58 so that's fine.
20:02:09 I will have one for next meeting
20:02:21 #topic Alpha Release - https://fedorahosted.org/fedora-infrastructure/report/9
20:02:26 .ticket 1996
20:02:27 mmcgrath: #1996 (Move archive to netapp) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1996
20:02:32 This is done, I'll close it now.
20:02:46 #topic 10944
20:03:01 oops
20:03:05 yea!
20:03:06 .ticket 1944
20:03:08 mmcgrath: #1944 (Fedora 13 Alpha Partial Infrastructure Freeze 16/Feb - 3/March) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1944
20:03:10 #topic Alpha Release - https://fedorahosted.org/fedora-infrastructure/report/9
20:03:19 The freeze has been moved back one week as I mentioned on the list
20:03:24 .ticket 1989
20:03:25 mmcgrath: #1989 (Verify Mirror Space) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1989
20:03:37 Mirror space is now good thanks to ticket 1996, I'll close that as well right now
20:03:47 .ticket 1990
20:03:49 mmcgrath: #1990 (Release Day Ticket) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1990
20:03:57 Just a tracking ticket, it'll be closed on release day.
20:04:01 .ticket 1991
20:04:02 mmcgrath: #1991 (Permissions and content verification) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1991
20:04:11 smooge: that one's yours, do you know when they'll be staging content and all that?
20:04:23 not yet. it's been pushed back
20:04:27 from when I knew
20:04:31 k
20:04:38 I know it's been signed so it's bound to be coming soon
20:04:43 ok
20:04:59 will talk with the dudes in charge after they get back from lunch
20:05:27 .ticket 1992
20:05:28 mmcgrath: #1992 (Lessons Learned) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1992
20:05:32 This is for the day after the release.
20:05:37 All in all I think we're in good shape.
20:05:53 So anyone have any questions or concerns related to the release?
20:06:21 the only one I have was speed between RDU and PHX
20:06:27 smooge: what's up?
20:06:28 but you tested that and it looks good
20:06:57 but I wanted to mention it in case someone looks at the logs and says "hey didn't we have this problem last release?"
20:07:05 :)
20:07:13 yeah right now, transfer times seem fairly reasonable.
20:07:21 Moving on
20:07:26 #topic Sporatic Outages
20:07:37 So, we've had just a lot of strange outages recently.
20:08:01 and I'm not sure what is causing it yet but I do want to make sure we take a moment to acknowledge it and start to look for what is going on.
20:08:13 sporaDic :P
20:08:14 I have some leads, but I don't want to go making changes until after the alpha.
20:08:21 yeah.. ricky has caught most of the night ones.
20:08:23 gholms: if I could spell, I would have become a writer.
20:08:40 instead he became an actor
20:08:46 smooge: I noticed I couldn't load smolt today and went to the haproxy page earlier and noticed mirrormanager
20:08:51 had just gone down.
20:08:54 ouch
20:09:02 it was down for literally 13 seconds before it went back up.
20:09:08 now, it was only down on app3 and app4.
20:09:13 and probably didn't impact any users.
20:09:19 but there's no reason even that should have happened.
20:09:20 gremlins
20:09:50 If anyone else has experienced this or does experience it, please let us know.
20:10:01 I really _really_ don't want that to become the norm.
20:10:23 some of the outages nagios has noticed, and really none of the recent ones required any manual intervention to fix.
20:10:30 which to me means we've got something misconfigured.
20:10:36 or a bum network somewhere.
20:10:48 or a bit of both
20:10:50 Anyone have any thoughts or comments on that?
20:11:29 it's actually happening right now - https://admin.fedoraproject.org/haproxy/proxy3/
20:11:36 notice mirror-lists OPEN time
20:11:48 anywho
20:11:51 moving on
20:11:55 #topic Hosting Facilities
20:12:03 So this is an old topic but not one that we finalized.
20:12:13 what's the idea here?
20:12:19 ok
20:12:29 We've gotten to the point where people want to do things that we technically have space for, but don't have the team to manage.
20:12:36 and I don't think scaling the team will fix it.
20:13:00 the needs vary so wildly, some are temporary.
20:13:10 like the kerberos and ldap servers we set up.
20:13:45 but others, like susmit's request, would be for long term hosting.
20:13:54 it would probably require a database, probably require username and login passwords.
20:13:59 and I'm just not sure how to fix that problem.
20:14:24 On the one hand we could just hand over hosts and start to budget for an actual long-term cloud.
20:14:31 but on the other hand, I'm not willing to be on call to fix that stuff.
20:14:48 I really like the way we've been scaling out web apps on app* servers
20:15:17 that concept, if not those same servers, for additional apps, would be better than "here's a cloud VM - enjoy!"
20:15:41 mdomsch: well, there's security concerns at that point as well as upstream closeness.
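The 13-second mirror-lists flap discussed above was spotted by watching the haproxy status page by hand; the same check can be scripted against haproxy's stats interface, which serves machine-readable output when fetched with a `;csv` suffix. A minimal sketch (the reduced header and sample rows below are fabricated for illustration; a real stats dump has many more columns, but `pxname`, `svname`, and `status` are standard haproxy field names):

```python
import csv
import io

def down_backends(stats_csv: str):
    """Return (proxy, server) pairs whose status is not UP.

    `stats_csv` is the body of haproxy's stats page fetched with the
    `;csv` suffix; haproxy prefixes the header row with '# '.
    """
    text = stats_csv.lstrip("# ")
    reader = csv.DictReader(io.StringIO(text))
    bad = []
    for row in reader:
        # Skip the aggregate FRONTEND/BACKEND rows; we want individual
        # servers, e.g. app3/app4 behind mirror-lists.
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue
        if row["status"] != "UP":
            bad.append((row["pxname"], row["svname"]))
    return bad

# Fabricated rows resembling the incident described in the meeting:
sample = (
    "# pxname,svname,status\n"
    "mirror-lists,app3,DOWN\n"
    "mirror-lists,app4,DOWN\n"
    "mirror-lists,app5,UP\n"
)
print(down_backends(sample))  # [('mirror-lists', 'app3'), ('mirror-lists', 'app4')]
```

Polling this from cron and alerting on transitions would catch sub-minute flaps that a nagios check interval can miss.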
20:15:57 the apps that work well are the apps that have their own upstream independent of Fedora, or those that are basically at this meeting right now.
20:16:00 yes.. the security issues would require security plans and such
20:16:02 * mmcgrath notes susmit's not here.
20:16:03 oi
20:16:17 well, it's 3am in bangalore
20:16:19 and not to pick on susmit in particular, it's just he's our most recent example.
20:16:22 especially dealing with any 'personal' data depending on country
20:16:41 So what do we do here?
20:17:01 do we push that stuff onto the value servers and not store anything important there?
20:17:25 and how do we tell people no?
20:17:46 free media sounds just at the edge of something that is worth a webapp, if it were any smaller how do we turn them down?
20:17:50 in a web app world, nothing important lands on an app
20:17:52 hmm, we usually found that we had to give costing estimates even in a 'Free' society
20:17:54 that's the thing I've struggled with.
20:17:55 it's all in the db
20:18:23 sorry, what is the 'it' in 'it's all in the db'
20:18:30 but users don't use the db, they use webapps
20:19:03 sure - just saying, you could run the webapp on the value servers if you wanted, as long as the data for such was on db*
20:19:12 eeek
20:19:22 FWIW I've sent an email to susmit stating we will host his app if he packages it but that they are responsible for package maintenance and stuff.
20:19:59 I would say if it's data outside of needing a FAS we would want it to be its own db outside of the current ones.
20:20:04 * mdomsch doesn't understand the point of 'value' servers I guess
20:20:09 smooge: why?
20:20:25 bad experience with data extraction lawsuit.
20:20:25 mdomsch: to keep non crit-path away from crit-path stuff.
20:20:43 oh, ok - that's clear
20:20:57 * a-k_afk will brb
20:21:01 a-k_afk: k
20:21:14 currently the CLA covers a lot of ass. But stuff outside of that comes under a different legal realm.
20:21:26 so we need a value db* server, value app* server(s), and a way to authenticate them to FAS
20:21:38 abadger1999: you around?
20:21:54 mmcgrath: What's up?
20:22:23 we talked about sso in the webapps, how feasible actually is that?
20:23:07 in a way that we can be sure $RANDOM_APP isn't stealing usernames and passwords?
20:23:54 Hosted on fedora infrastructure but not written/audited by Fedora Infrastructure you mean?
20:24:00 yeah
20:24:12 I'm just thinking that's a risk we can't ignore and I want to know how we can mitigate it.
20:24:16 Not very feasible to tell that it isn't stealing usernames/passwords.
20:24:29 Even openid would be subject to spoofing.
20:24:32 yeah.
20:24:40 SSL Certs would be okay.
20:24:58 and we're super far away from centralized auth ala something like google ehh?
20:25:07 But my objection to those is still that there isn't a good way to keep those encrypted on the harddrive.
20:25:30 mmcgrath: You mean how google does all their web apps?
20:25:49 mmcgrath: I think we're close to that -- but they're assuming an env where they write/audit all the web apps.
20:26:20 well, in theory we might be able to write/audit the auth layer of all the web apps.
20:26:24 we're not that far off from it now.
20:26:40 abadger1999, we had an SSO for web-apps at LANL that used a kerberos backend and a cookie front end. I think it was FLOSS but I will double check
20:26:56 smooge: Was it pubcookie?
20:27:00 it was pretty clean as it had to meet certain reviews
20:27:02 smooge: if it was shibboleth then it has patent issues
20:27:10 pubcookie?
20:27:10 what about CAS?
20:27:15 smooge: opensaml is encumbered by some crazy rules
20:27:18 NOO NOT CAS
20:27:24 sorry :)
20:27:28 http://www.pubcookie.org/
20:27:33 :)
20:27:53 BTW on the kerberos side, I have reluctantly stopped working on it. It's an option but for it to not be a really hacky job, we'd actually have to gut FAS to use kerberos for its username and password stuff.
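The cookie-fronted SSO smooge describes is the pubcookie idea: one trusted login service authenticates the user and mints a signed, expiring token, and the individual web apps only ever verify the token, never touching the password. That addresses mmcgrath's $RANDOM_APP concern for passwords, though not for the session token itself. A toy sketch of the token half using an HMAC-signed cookie (all names and the key are illustrative, not FAS's or pubcookie's actual API; a real deployment would also bind the key distribution and transport to TLS):

```python
import base64
import hashlib
import hmac
import time

# Illustrative shared key; held by the SSO service and verifying apps only.
SECRET = b"shared-only-with-the-sso-service"

def issue_ticket(username: str, ttl: int = 1200) -> str:
    """Mint a signed SSO cookie of the form user|expiry|signature.

    The 1200s default mirrors the 20-minute ticket lifetime mentioned
    for the LANL setup; apps holding the key can trust the cookie
    without ever seeing a credential.
    """
    payload = f"{username}|{int(time.time()) + ttl}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(f"{payload}|{sig}".encode()).decode()

def verify_ticket(cookie: str):
    """Return the username if the signature is valid and unexpired, else None."""
    try:
        raw = base64.urlsafe_b64decode(cookie.encode()).decode()
        user, expiry, sig = raw.rsplit("|", 2)
        payload = f"{user}|{expiry}"
        expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        if hmac.compare_digest(sig, expected) and int(expiry) >= time.time():
            return user
    except (ValueError, UnicodeDecodeError):
        pass  # malformed base64, encoding, or field count: reject
    return None

cookie = issue_ticket("susmit")
print(verify_ticket(cookie))  # susmit
```

The cookie-passing-between-domains problem abadger1999 raises is real: a scheme like this only works transparently within one cookie domain, which is why the LANL setup was single-domain.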
20:28:05 smooge: So with kerberos backend... did it still require you to get a kerberos ticket to do the initial auth?
20:28:07 which may very well be worth it some day, but I don't have the time to do it right now so it won't be the quick fix I was hoping for.
20:28:21 abadger1999: yeah, it would have had to
20:28:34 kerberos doesn't even know your password
20:28:57 smooge: And how was the cookie passed between domains? Or did you have everything on a single domain?
20:29:09 mmcgrath: ... which would be so nice...
20:29:11 abadger1999, you authenticated with an OTP and I think the ticket was on the backend webservers for 20 minutes versus longer. There was only one domain
20:29:21 but yeah, a lot of effort for one go
20:29:40 abadger1999, However the kerberos backend was more of a hammer that we had. something like FAS could probably be used
20:30:50 mmcgrath: didn't you also work on yubikey integration? wouldn't that prevent password theft?
20:30:56 wouldn't fix sso though...
20:31:20 wzzrd: yeah, and that's still in progress, I've been looking for budget and things. But I've been using yubikeys at home with much success.
20:31:30 but even that won't fix everything
20:31:31 *but*
20:31:39 if we get to a situation where we have func in place.
20:31:42 I will be less worried about it
20:31:43 wzzrd: problem w/ yubikeys is they aren't time dependent
20:32:16 G: that's a minor problem though, someone would have to nab a key before I used it.
20:32:23 and even then they'd only be able to use it once.
20:32:35 wzzrd: in normal usage, it's no problem, but if it falls into the wrong hands for 10 minutes and not used for a while afterwards you have a huge attack window
20:32:38 G: true, but the keys they generate have a serial number and using a serial number invalidates all keys with a lower number
20:33:01 wzzrd: how many keys could someone generate in 10 minutes?
20:33:04 but it's still better protection than just the regular passwords.
20:33:05 a lot
20:33:17 but if I use it *once*, after that they are all invalid
20:33:22 G: but still people would need to be smart about their keys.
20:33:31 and in combination with ssh keys it's a good system
20:33:42 but anyway, this has gotten a bit off topic.
20:33:52 Does anyone have a problem with us hosting stuff like susmit's app on value1?
20:34:00 I think we should probably create a sysadmin-value group
20:34:40 is that the same app he put in the mailing list?
20:34:49 sijis: yeah
20:35:28 anyone still around?
20:35:34 I guess it's ok then.
20:35:38 So back on to the topic of more general hosting.
20:35:42 * sijis shrugs
20:35:45 we already provide _lots_ of offerings.
20:35:47 1) fedorahosted
20:35:51 2) half rack for secondary arch
20:35:57 I'm ok with susmit's app on value*
20:36:04 once the app is somewhat usable
20:36:04 people want more permanent hosting solutions though.
20:36:06 * abadger1999 also okay
20:36:06 which yet it's not
20:36:09 do we want to get into that game?
20:36:10 mmcgrath, I am still around
20:36:18 mdomsch:
20:36:34 I have to say IBM's cloud interface is pretty slick.
20:36:52 mmcgrath, if they're directly related to delivering Fedora technologies or products in some way, yes.
20:37:01 mmcgrath, I think we would need to make a proposal and see how much work/cost/risk it would be and go from there
20:37:07 hosting DNS for domsch.com because I happen to be in sysadmin, no.
20:37:28 what types of things would we be hosting?
20:38:01 what's currently wrong with fedorahosted?
20:38:05 we could also broker "discounted?" hosting for Fedora contributors with Fedora sponsors
20:38:11 sijis: it only provides the website and git hosting.
20:38:16 well scm hosting.
20:38:18 no applications
20:38:28 mdomsch: that's true.
20:38:35 like susmit's app
20:38:38 I guess my hesitation with this is the following:
20:38:47 I've been trying to build this pseudo cloud thing for over a year now.
20:38:50 and it's a lot of work.
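The invalidation property wzzrd describes in the exchange above is how YubiKey OTPs resist replay despite not being time-dependent: each OTP embeds a monotonically increasing usage counter, and the validation server rejects any OTP whose counter is not strictly higher than the last one it accepted for that key, so one legitimate use voids every OTP harvested earlier. A toy model of just that check (real OTPs carry the counters inside an AES-128 blob that the validation server decrypts; the cleartext counter here is an illustrative simplification):

```python
class OtpValidator:
    """Toy YubiKey-style replay protection: track the highest counter
    accepted per key.  Crypto is omitted; only the counter logic that
    was discussed in the meeting is modeled.
    """

    def __init__(self):
        self.last_counter = {}  # key_id -> highest accepted counter

    def validate(self, key_id: str, counter: int) -> bool:
        # Reject replays and any OTP minted before the last-used one;
        # this is why stolen-but-unused OTPs die on the owner's next login.
        if counter <= self.last_counter.get(key_id, -1):
            return False
        self.last_counter[key_id] = counter
        return True

v = OtpValidator()
print(v.validate("cccjgjgkhcbb", 7))  # True: first use of this key
print(v.validate("cccjgjgkhcbb", 5))  # False: older OTP, invalidated by use of 7
print(v.validate("cccjgjgkhcbb", 7))  # False: straight replay
```

This also makes G's attack window concrete: OTPs generated from a borrowed key stay valid only until the owner next authenticates, at which point every lower-counter OTP is dead.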
20:38:53 I do not want us to become an app cloud for $RANDOMAPP, like google, ec2, salesforce, ...
20:38:58 because we don't have a valid manager at the moment.
20:39:12 so I've been writing my own and that's all good and well, but that is also a great deal of work.
20:40:19 mmcgrath, I know I haven't been of help with the cloud bits
20:40:30 what about the potential of any of the folks volunteering lately though?
20:40:41 mmcgrath, I would say that the first thing we should house on such a cloud is "An application to run clouds."
20:40:50 can we scale out using our volunteer base to say "this is yours, run with it?"
20:40:56 I've had some volunteers working on virt_web with me but I wouldn't say it's rapidly being developed besides the work I've done; they've been good about submitting patches though.
20:41:00 so what is Fedora's benefit for hosting these apps for individuals?
20:41:34 sijis: it'd be up to them, maybe someone needs a koji hub and can't afford it themselves.
20:41:44 sijis, in most cases it is to hope to build a community.
20:41:58 mmcgrath: I'd say a problem you have with your own manager is that you have to work with other projects which don't have the same priorities
20:42:19 of course we need to be careful that we don't promise too much. maybe call it FAILcloud
20:42:35 JoshBorke: that's certainly been part of it. libvirt-qpid is pretty immature at the moment and it's at the core of virt_web
20:43:57 smooge: yeah, understood. although it sounds like building an infrastructure for community folks. I do not know if that necessarily means 'building' a community. (thinking aloud)
20:44:22 Ok, well anyone have anything else on this?
20:44:33 if not we'll open the floor.
20:44:43 no I think it would need a proposal. if I become inspired I will write it up this weekend
20:44:46 I presume, based on your discussion, Eucalyptus isn't quite what you want.
20:45:11 gholms: I tried it once and it was extreme overkill. If it gets packaged though I might try it again.
20:45:27 Okee dokee
20:45:38 we were going to use ovirt but development on it has all but stopped.
20:45:41 anywho
20:45:44 #topic Open Floor
20:45:51 anyone have anything they'd like to discuss?
20:46:05 not me
20:46:06 bodhi upgrade went out today... fixing a couple of minor issues and will be doing another one today
20:46:15 I didn't break it!
20:46:20 lmacken: all's well so far?
20:46:20 JoshBorke: no, you didn't :)
20:46:24 mmcgrath: yep, all is well
20:46:38 lmacken: did relepel01 get updated?
20:46:45 dgilmore: I'm waiting for the push to finish
20:46:51 [bodhi.masher] INFO 2010-03-04 20:46:46,254 Waiting for updates to hit mirror...
20:46:54 lmacken: it's almost done
20:47:14 mmcgrath: do you want me to push out that fedoracommunity fix today as well?
20:47:28 just in case BU goes down again and fcomm takes our entire app servers with it :(
20:47:40 lmacken: naw we can wait on that one, it'll only be a couple of days now.
20:47:44 mmcgrath: ok, will do
20:47:49 We actually have community not running off all the app servers anyway
20:47:57 so it'll only take some of them down if BU goes down again.
20:48:02 oh ok, good
20:48:09 Ok, if no one has anything else we'll close the meeting in 30
20:48:22 oh
20:48:24 people1
20:48:46 smooge: was that a question or a comment? :)
20:48:53 speaking of which.. we could put people1 on osuosl if we moved some pt around
20:48:59 sorry slow typer
20:49:09 well, the thing about osuosl1 was that it was supposed to be the pt place
20:49:21 I was hoping we could move staging there as well.
20:49:27 ah we need another box somewhere then
20:49:28 ok
20:49:33 yup.
20:49:42 have we decided that though? we want people1 to be somewhere else at this point?
20:49:45 skvidal: ^^
20:49:47 I had just seen it had enough disk space if we needed it though
20:49:50 * skvidal looks up
20:50:01 well just enough
20:50:03 * lmacken wonders how many MB of RPMs people are hosting on people1
20:50:08 lmacken: a lot
20:50:19 50 GB of stuff
20:50:20 we're not using that much of the total disk we have available
20:50:35 skvidal: what do you think?
20:50:37 I figured a 75 GB system would be ok for now
20:50:37 I'd feel most comfortable with about 100gb of disk for that service
20:50:43 what does osuosl have available?
20:50:50 smooge: and what about memory?
20:50:55 I like that fedorapeople is a 0 cost solution at the moment.
20:51:03 yes
20:51:06 currently it has about 50 but a couple of PTs that have more than they need
20:51:18 oh osuosl isn't?
20:51:29 It's not, we own that box
20:51:33 ah
20:51:44 perhaps it's time to find another hosting provider.
20:51:58 I'm more wanting to know if we're done with the BU hosting or want to give them a second chance at it?
20:51:59 hey, I know one....
20:52:07 my main issue is that we are relying on it more now
20:52:14 if we're done, then we can look at solutions, but let's not try to find out where to put it if we don't need to.
20:52:31 what do you think? I'm mostly ambivalent about it
20:52:39 I think I am too at the moment.
20:52:52 we've had a couple of problems in the last few months, that made us find out they're still running F8
20:52:59 but other than that it's been a pretty solid service
20:53:00 I don't mind giving them a second chance if they can move it to CentOS-5.x
20:53:58 K, well let's hold off for now, I know they have some migration plans in their future already.
20:54:08 perhaps we can follow up and get a solid schedule, and re-evaluate then.
20:54:08 ok and I am done
20:54:09 sound good?
20:54:30 sounds fine to me
20:54:35 alrighty
20:54:44 well let's call the meeting, and don't forget: the cloud meeting is right after this one.
20:54:47 Thanks for coming everyone!
20:54:51 #endmeeting