14:00:45 #startmeeting OpenLMI (2013-11-11)
14:00:45 Meeting started Mon Nov 11 14:00:45 2013 UTC. The chair is sgallagh. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:45 Useful Commands: #action #agreed #halp #info #idea #link #topic.
14:00:49 #meetingname OpenLMI Public IRC Meeting
14:00:49 The meeting name has been set to 'openlmi_public_irc_meeting'
14:00:55 #chair sgallagh tsmetana jsafrane rdoty
14:00:55 Current chairs: jsafrane rdoty sgallagh tsmetana
14:01:01 #info Meetings are recorded and will be posted on www.openlmi.org. Opinions expressed do not necessarily reflect the views of the participant's employer.
14:01:11 #topic Roll Call
14:01:24 Who do we have here today?
14:01:38 Jan Safranek
14:02:47 Praveen Paladugu
14:03:12 Klaus Kämpf
14:03:25 Radek Novacek
14:03:38 Tomáš Bžatek
14:05:16 That looks like most of our regulars, sans tsmetana and rdoty.
14:06:10 If I recall correctly, we spent most of the previous meeting discussing access-control needs.
14:06:32 Do we want to pick up from there, or does anyone have other items they'd like to put on the agenda first?
14:07:45 Hmm, "auditing" was submitted as an agenda item for this week as well.
14:07:53 Maybe that would be a good place to start.
14:08:00 #topic Auditing Requirements
14:08:34 So, first order of business: what are our auditing requirements? What granularity do we expect to need here?
14:08:50 stefw: If you're around, I recall you had a bit to say on this last week
14:09:54 i would suggest talking with mitr about the auditing requirements
14:10:05 he has a good handle on what is necessary
14:10:15 Ok, I'll see if he's available to join us.
14:10:27 Hmm, he doesn't seem to be online
14:10:40 We'll wing it for now.
14:10:56 the thing is, i have certain assumptions, but someone like mitr or sgrubb would be able to respond based on firmer requirements.
14:11:52 in theory an architecture would need to be able to audit the user taking an action.
14:12:12 however i don't know if auditing at the CIM input level is enough?
14:12:24 that is, audit the incoming requests before handing them off to the system
14:12:26 What do you mean by the "input level"?
14:12:29 ah, right.
14:12:30 at the WBEM level
14:12:38 it seems it would be better than nothing
14:12:43 I was about to say that.
14:12:50 At minimum, we should be able to record all of that.
14:12:51 but would be prone to all sorts of oddities
14:13:07 such as the system not actually taking the action described in the CIM message
14:13:29 Well, CIM requires that an appropriate error be returned in that case.
14:13:38 Very few users will interact with openlmi on the cim/wbem protocol level. Most will use a management application.
14:13:40 Bugs can happen, but they're bugs.
14:14:01 Pegasus has some sort of audit logging, https://collaboration.opengroup.org/pegasus/pp/uploads/40/14428/PEP258_AuditLogging.htm
14:14:07 So we need to capture the user name at the management application and link this to the audit log at the cimom level.
14:14:09 so if two users make the same request simultaneously, what happens?
14:14:10 stefw: So auditing "this request was made" and "the system rejected it" should be sufficient (at that level)
14:14:17 kkaempf, this is 'user' in a security context
14:14:44 sgallagh, returning to the broader scale, this also makes the CIMOM a trusted part of the audit system
14:14:49 It's someone authenticated against the management application.
14:14:55 there may be hard requirements about that for certain use cases
14:15:05 I'm not sure that's necessarily true.
14:15:11 The trusted part needs to be the kernel auditing
14:15:19 I'd have to consult with mitr/sgrubb on that
14:15:24 kkaempf, in the use case i'm examining openlmi for, the management application uses the user's credentials to connect via CIM
14:15:34 so the user is in fact accessing CIM, albeit through software
14:15:40 * stefw notes that's always the case
14:16:22 well the fact that software is in use :)
14:16:44 sgallagh, yes, it's important to get real requirements here
14:16:59 because auditing is pretty much driven by them
14:17:24 * sgallagh nods
14:17:27 i do know that work on kdbus was partially driven by the desire to have dbus calls audited in the kernel
14:17:32 The pegasus audit logging pep seems to be a good start
14:18:35 Yes
14:18:40 * sgallagh reads up on it right now
14:18:43 https://collaboration.opengroup.org/pegasus/pp/uploads/40/14428/PEP258_AuditLogging.htm
14:19:09 stefw: This looks to be pretty much exactly the "input audit logging" we were talking about above.
14:19:20 (Yay, our work here is done! :-P)
14:19:23 right, so the question is if that is enough to satisfy requirements
14:19:30 not whether it's useful (logging usually is)
14:20:01 Certainly
14:20:20 I think the answer we'd hear from sgrubb would be pretty much "there's no such thing as too much"
14:20:53 And Jack Rieden would note that people turn most of it off after the first week...
14:21:22 Do we need to capture full requirements up front?
14:21:34 My suspicion is that if we take advantage of PolicyKit to do decision-making in the providers themselves (based on user identity) and log there, plus the kernel logging that will happen during any system call, that's probably going to paint a pretty complete picture.
14:22:18 sgallagh, i agree
14:23:10 sgallagh, the kernel logging is obviously less useful if we can't assume the loginuid
14:23:23 but i don't think any of the major linux system management services do that
14:23:42 sgallagh: are you proposing to add auditing/policy code to providers (rather than the cimom)?
14:23:56 kkaempf: not *exactly*
14:24:28 hmm, i thought you were proposing that
14:24:33 One of the things we discussed last week was to have the providers execute their requests in the context of the UNIX user that was performing them (either by forking a helper or using seteuid)
14:24:46 And then invoking PolicyKit to authorize individual decisions
14:24:56 Then the auditing would technically be done by PolicyKit.
14:25:13 that would be a complete solution
14:25:19 sgallagh: I'd rather see this in the cimom, not the provider.
14:25:35 kkaempf: See which?
14:25:40 sgallagh: that would be hard with the current design of providers
14:26:04 kkaempf: how would you see that working?
14:26:10 jsafrane: What's the limitation I'm missing?
14:26:34 sgallagh: we don't fork stuff, we just do it :)
14:27:19 jsafrane: Right, but that's not overly-difficult to change. And the alternative I mentioned would be to set the effective user ID (which doesn't require forking, but DOES necessitate locking)
14:27:40 rdoty: For sfcb, that would be some code in providerDrv.c. I don't know about Pegasus.
14:28:01 kkaempf: So you'd audit the request going into the provider?
14:28:50 I think what we're trying to say is that it's more complete to log whenever a privileged action is actually occurring on the system (i.e. "Can I format this partition?")
14:29:02 Rather than logging "The user has requested to format this partition"
14:29:25 s/"Can I format this partition?"/"PolicyKit granted the user permission to format a partition"/
14:29:32 (That's a more clear distinction)
14:29:34 sgallagh: That's an awful lot of code in the providers and puts a lot of responsibility on provider developers.
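The "forking a helper" option mentioned at 14:24:33 could look roughly like the sketch below. It is only an illustration under stated assumptions: `run_as_user` is a hypothetical name, not part of OpenLMI, Pegasus, or sfcb, and the PolicyKit check the discussion pairs with it is left out entirely.

```python
import json
import os


def run_as_user(uid, gid, action):
    """Run action() in a forked child that has switched to uid/gid.

    Hypothetical helper. Returns whatever action() returned, which must
    be JSON-serializable so it can travel back over the pipe.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: take on the requesting user's identity before acting.
        status = 1
        try:
            os.close(read_fd)
            if os.geteuid() == 0:
                # Only root is allowed to reset supplementary groups.
                os.setgroups([])
            os.setgid(gid)  # group first, while we are still allowed to
            os.setuid(uid)  # permanent for the rest of this child
            payload = json.dumps({"result": action()})
            os.write(write_fd, payload.encode("utf-8"))
            status = 0
        finally:
            os._exit(status)
    # Parent (the long-lived provider) keeps its own identity.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        data = pipe.read()
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0 or not data:
        raise RuntimeError("helper failed while running as uid %d" % uid)
    return json.loads(data.decode("utf-8"))["result"]
```

Because the identity change happens in a short-lived child, the parent never has to switch back. The seteuid alternative avoids the fork, but the effective UID is shared by the whole process, which is why it would need locking around every switch.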
14:30:14 Plus 'format partition' is the action to be logged, not 'called mkfs'.
14:30:43 kkaempf: I'm not sure I see the distinction there
14:31:21 sgallagh: with some providers, it could be easy to have policy/audit check in the provider - we just forward messages using dbus to systemd or network manager; but storage and/or software would be error prone
14:31:22 Auditing at the cimom level is guaranteed to capture all provider calls. Auditing at the provider level needs properly coded providers.
14:32:10 kkaempf: Sure, understood. I've never once said we *shouldn't* audit at the CIMOM level.
14:32:22 :-)
14:32:23 In fact, that's already covered with the Pegasus audit logging you mentioned.
14:32:36 ...and you could audit non-OpenLMI providers as well
14:32:43 We're talking about what we should do in addition
14:33:18 I can't imagine requirements on top of that currently.
14:33:45 kkaempf: Yes, but you're also planning to use primarily the read-only functions of OpenLMI if I remember our earlier discussions.
14:33:54 monitoring and querying.
14:34:21 At that level, certainly the CIMOM auditing is going to be enough, since there should not be any potential damage
14:37:14 Does anyone have anything to add on the topic of auditing, or shall I take an action item to drag mitr/sgrubb into our meeting next week to discuss it further?
14:37:51 I think we can at least all agree that we should be using the Pegasus audit logging, yes?
14:37:56 Can we invite someone from the OpenPegasus team to join the meeting?
14:38:07 rdoty: Of course
14:38:33 Several of them should be aware of this meeting, it was announced to the public list
14:39:06 It doesn't look like any of them are here; time for a specific invitation and telling them what the subject is.
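The point made at 14:31:22, that auditing at the CIMOM level captures every provider call no matter how the provider is written, can be illustrated with a small sketch. `AuditingDispatcher` and the logger name `cimom.audit` are hypothetical and chosen for illustration; this is not the Pegasus audit logger and does not use its record format.

```python
import logging

# Hypothetical logger name, not one used by any real CIMOM.
audit_log = logging.getLogger("cimom.audit")


class AuditingDispatcher:
    """Hypothetical request broker that logs every call it forwards."""

    def __init__(self):
        self._providers = {}

    def register(self, name, handler):
        # handler is any callable taking (method, **params).
        self._providers[name] = handler

    def invoke(self, user, provider, method, **params):
        # "This request was made": logged before the provider sees it.
        audit_log.info("request user=%s provider=%s method=%s params=%r",
                       user, provider, method, params)
        try:
            result = self._providers[provider](method, **params)
        except Exception as exc:
            # "The system rejected it": logged, then passed on unchanged.
            audit_log.warning("rejected user=%s provider=%s method=%s error=%s",
                              user, provider, method, exc)
            raise
        audit_log.info("completed user=%s provider=%s method=%s",
                       user, provider, method)
        return result
```

The providers contain no audit code at all here, which is the property kkaempf argues for. What this level cannot record is what the provider then did to the system, which is the gap the kernel and PolicyKit logging discussed earlier would cover.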
14:39:41 sgallagh, in general, the thing the provider is calling already does the access/policykit/auditing logic
14:39:56 if we manage to call it in the right loginuid/setuid context, then it'll just work
14:40:01 but not sure how easy that is
14:40:08 but that's why your solution above sounded so appealing
14:40:21 it would work, even without significant auditing changes and polkit calls in the providers
14:40:32 it just means running the providers in the right security context
14:40:40 * sgallagh nods
14:40:55 Ok, so we should probably investigate how to accomplish that
14:42:20 stefw: Do you have some thoughts on where to start?
14:42:34 side-note:
14:42:49 #agreed CIMOM-level audit logging is important. We should enable the built-in Pegasus audit log.
14:44:19 sgallagh, like which provider to start with?
14:44:29 or how to separate stuff into different security contexts?
14:44:34 stefw: the latter
14:44:40 And how to get it right
14:45:00 i guess for me, unfamiliar with the CIMOM internals, the question i would ask is: can we run providers in a child process of the cimom?
14:45:07 for a given user logged in
14:45:14 would fork a child process and assume appropriate security context
14:45:18 and then run requests through there.
14:45:26 stefw: We *can*, but do not currently.
14:45:27 obviously this is HTTP requests
14:45:37 Right now they are child processes running in the CIMOM's context
14:45:45 so mapping a login session start/end isn't as obvious as it is with other transports
14:46:07 kkaempf: are you familiar with the Microsoft WMI security model? How do they handle it?
14:46:12 stefw: sfcb runs every provider as a separate process
14:46:13 jsafrane: Please fact-check me if I'm mistaken here
14:46:21 rdoty: No, I'm not.
14:46:24 kkaempf: As does Pegasus
14:46:47 kkaempf: But the question is which session do they run in. In our current model, they run as part of the CIMOM's session/cgroup
14:47:02 Pegasus can run a provider with the UID of the logged-in user, I don't know what happens if two users use the same provider - does it run twice then?
14:47:04 I think what stefw is looking for is for this fork to become a new user session
14:47:08 stefw: But it's per-provider, not per-user. So if user A issues a cim request and a provider gets loaded due to that request, it would run as user A.
14:47:46 stefw: if user B issues another request for the same provider, the provider would need to switch its security context.
14:48:08 right
14:48:33 sgallagh: sfcb providers run in the cimom's session/cgroup.
14:48:46 We need to figure out if Pegasus would load a second copy as the other user, take over the first copy and change its session, or (bad) run as the first user.
14:49:15 sgallagh, i imagine forking a provider per user would be more appropriate
14:49:20 note that this happens after authentication
14:49:33 stefw: Yes, but that doesn't mean it's what currently happens :)
14:49:45 and the use case is to have N users be single digits
14:49:50 stefw: providers are not supposed to run concurrently afaik.
14:49:59 only one provider runs at a time?
14:50:06 do they exit after each request?
14:50:08 vcrhonek: If you're around, as our resident Pegasus maintainer, could you look into this for next week?
14:50:12 that seems pretty intense :)
14:50:19 stefw: one copy of each provider runs at any time.
14:50:25 stefw: They don't exit immediately
14:50:31 sfcb implements an LRU scheme
14:50:37 ah, okay, makes sense
14:50:38 They stay alive for a few (configurable) minutes IIRC
14:51:11 providers can also indicate 'don't unload' and can stay loaded/running as long as the cimom runs.
14:51:25 so is it currently only the assumption that only one copy of a provider is running on a single kernel?
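The provider lifecycle described between 14:50:19 and 14:51:11 (one copy per provider, least-recently-used unloading, and a 'don't unload' flag) can be sketched as a small cache. `ProviderCache` is a hypothetical class for illustration only and is not how sfcb or Pegasus implement it; the configurable idle timeout mentioned at 14:50:38 is left out to keep the example short.

```python
from collections import OrderedDict


class ProviderCache:
    """Hypothetical LRU cache holding at most one copy of each provider."""

    def __init__(self, capacity, loader):
        self._capacity = capacity
        self._loader = loader          # callable: name -> provider object
        self._loaded = OrderedDict()   # least recently used entry first
        self._pinned = set()           # providers that asked not to be unloaded

    def get(self, name, pin=False):
        if name in self._loaded:
            self._loaded.move_to_end(name)   # mark as most recently used
        else:
            self._loaded[name] = self._loader(name)
        if pin:
            self._pinned.add(name)
        self._evict(keep=name)
        return self._loaded[name]

    def _evict(self, keep):
        # Walk from the oldest entry and unload until we fit again.
        # Pinned providers and the one just requested are never unloaded,
        # so the cache may stay over capacity if everything is pinned.
        for victim in list(self._loaded):
            if len(self._loaded) <= self._capacity:
                break
            if victim != keep and victim not in self._pinned:
                del self._loaded[victim]

    def loaded(self):
        return list(self._loaded)
```

Keying the cache by provider name alone is what makes it per-provider rather than per-user. Keying it by a (user, provider) pair instead would give the fork-per-user behaviour stefw suggests at 14:49:15, at the cost of the synchronization questions raised in the discussion.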
14:51:25 tbzatek: Or if you want to look into it, that's cool too :)
14:51:29 or is it more of a CIMOM API limitation
14:52:43 stefw: a provider instruments a resource, you only want one copy to be running at any time.
14:52:57 Is there anything today that keeps two people from running fdisk at the same time from the command line?
14:52:58 stefw: I need to check with Pegasus what it really does, sfcb has only one process per provider
14:53:06 otherwise you'll get into all kinds of synchronization problems.
14:53:42 #action jsafrane to dig into Pegasus and see how it handles per-user provider invocation
14:53:43 rdoty: providers can spawn long running tasks, this is supported in cmpi.
14:53:44 sgallagh: umm, integrating PK into Pegasus? I'm not aware of any of its internals at all...
14:53:45 well if the provider is not syncing with the state of the system, we'll have more problems.
14:53:52 tbzatek: Not what I was talking about
14:54:08 I was talking about figuring out how Pegasus handles providers running as a user, but jsafrane just volunteered
14:54:14 ah
14:54:40 if a provider is implemented in such a way that it thinks it owns the managed resource, then that's a pretty brittle provider, and won't work well for Linux management, where you have many ways to manage a resource (for better or worse).
14:55:20 whereas if you have a provider implemented in such a way that it represents the state of the system, then it's not really a big deal to have two of them running, as they'll synchronize eventually/naturally.
14:55:22 stefw: Well, it's often fair to assume ownership while operating on it at least
14:55:27 (aka locking)
14:55:28 sgallagh, true
14:55:38 but that's expected with multiple users anyway
14:56:07 but that locking generally should happen in a way that other management methods also respect the lock.
14:56:15 ie: at a lower level than the provider itself
14:56:59 and it's totally expected that a second user performing an exclusive action on a resource already involved in such an action will get an error message.
14:57:37 stefw: It would be nice if every subsystem understood that.
14:58:57 stefw: that's impossible, anybody can format a disk or edit network-scripts... running any of our providers twice with different UIDs should be ok, but nobody has ever tested that
15:00:17 jsafrane, yeah it is more of a goal, rather than a hard requirement
15:00:37 but providers should be somewhat robust in such cases
15:00:45 and not segfault or deadlock if at all possible
15:00:59 and i imagine they already are
15:01:59 * sgallagh notes we're over time.
15:02:23 Shall we pick this up next week, hopefully with Jan providing us with details on Pegasus?
15:02:28 * stefw nods
15:03:52 ok
15:04:38 #action Resume discussion of process separation next week
15:04:47 Thank you for participating, everyone!
15:04:50 #endmeeting
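The locking behaviour described at 14:56:07 through 14:56:59, a lock taken below the provider where a second user gets an error instead of waiting, could be sketched with an advisory file lock. The `exclusive` helper and the idea of one lock file per managed resource are assumptions made for illustration; nothing in the discussion says OpenLMI providers work this way.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive(lock_path):
    """Hold an exclusive advisory lock on lock_path for the with-block.

    Hypothetical helper. Raises BlockingIOError at once if someone else
    already holds the lock, rather than waiting for them to finish.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB: fail immediately instead of blocking on the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        os.close(fd)  # closing the descriptor also releases the lock
```

A provider would wrap an exclusive operation in `with exclusive(path):` and turn `BlockingIOError` into an error for the second caller. The lock is advisory, so it only helps if other management tools take the same lock, which is the limitation jsafrane points out at 14:58:57.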