14:00:45 <sgallagh> #startmeeting OpenLMI (2013-11-11)
14:00:45 <zodbot> Meeting started Mon Nov 11 14:00:45 2013 UTC. The chair is sgallagh. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:45 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
14:00:49 <sgallagh> #meetingname OpenLMI Public IRC Meeting
14:00:49 <zodbot> The meeting name has been set to 'openlmi_public_irc_meeting'
14:00:55 <sgallagh> #chair sgallagh tsmetana jsafrane rdoty
14:00:55 <zodbot> Current chairs: jsafrane rdoty sgallagh tsmetana
14:01:01 <sgallagh> #info Meetings are recorded and will be posted on www.openlmi.org. Opinions expressed do not necessarily reflect the views of the participant's employer.
14:01:11 <sgallagh> #topic Roll Call
14:01:24 <sgallagh> Who do we have here today?
14:01:38 <jsafrane> Jan Safranek
14:02:47 <praveen_pk> Praveen Paladugu
14:03:12 <kkaempf> Klaus Kämpf
14:03:25 <rnovacek> Radek Novacek
14:03:38 <tbzatek> Tomáš Bžatek
14:05:16 <sgallagh> That looks like most of our regulars, sans tsmetana and rdoty.
14:06:10 <sgallagh> If I recall correctly, we spent most of the previous meeting discussing access-control needs.
14:06:32 <sgallagh> Do we want to pick up from there, or does anyone have other items they'd like to put on the agenda first?
14:07:45 <sgallagh> Hmm, "auditing" was submitted as an agenda item for this week as well.
14:07:53 <sgallagh> Maybe that would be a good place to start.
14:08:00 <sgallagh> #topic Auditing Requirements
14:08:34 <sgallagh> So, first order of business: what are our auditing requirements? What granularity do we expect to need here?
14:08:50 <sgallagh> stefw: If you're around, I recall you had a bit to say on this last week
14:09:54 <stefw> i would suggest talking with mitr about the auditing requirements
14:10:05 <stefw> he has a good handle on what is necessary
14:10:15 <sgallagh> Ok, I'll see if he's available to join us.
14:10:27 <sgallagh> Hmm, he doesn't seem to be online
14:10:40 <sgallagh> We'll wing it for now.
14:10:56 <stefw> the thing is, i have certain assumptions, but someone like mitr or sgrubb would be able to respond based on firmer requirements.
14:11:52 <stefw> in theory an architecture would need to be able to audit the user taking an action.
14:12:12 <stefw> however i don't know if auditing at the CIM input level is enough?
14:12:24 <stefw> that is audit the incoming requests before handing them off to the system
14:12:26 <sgallagh> What do you mean by the "input level"?
14:12:29 <sgallagh> ah, right.
14:12:30 <stefw> at the WBEM level
14:12:38 <stefw> it seems it would be better than nothing
14:12:43 <sgallagh> I was about to say that.
14:12:50 <sgallagh> At minimum, we should be able to record all of that.
14:12:51 <stefw> but would be prone to all sorts of oddities
14:13:07 <stefw> such as the system not actually taking the action described in the CIM message
14:13:29 <sgallagh> Well, CIM requires that an appropriate error be returned in that case.
14:13:38 <kkaempf> Very few users will interact with openlmi on the cim/wbem protocol level. Most will use a management application.
14:13:40 <sgallagh> Bugs can happen, but they're bugs.
14:14:01 <jsafrane> Pegasus has some sort of audit logging, https://collaboration.opengroup.org/pegasus/pp/uploads/40/14428/PEP258_AuditLogging.htm
14:14:07 <kkaempf> So we need to capture the user name at the management application and link this to the audit log at the cimom level.
14:14:09 <stefw> so if two users make the same request simultaneously, what happens?
14:14:10 <sgallagh> stefw: So auditing "this request was made" and "the system rejected it" should be sufficient (at that level)
14:14:17 <stefw> kkaempf, this is 'user' in a security context
14:14:44 <stefw> sgallagh, returning to the broader scale, this also makes the CIMOM a trusted part of the audit system
14:14:49 <kkaempf> It's someone authenticated against the management application.
14:14:55 <stefw> there may be hard requirements about that for certain use cases
14:15:05 <sgallagh> I'm not sure that's necessarily true.
14:15:11 <sgallagh> The trusted part needs to be the kernel auditing
14:15:19 <sgallagh> I'd have to consult with mitr/sgrubb on that
14:15:24 <stefw> kkaempf, in the use case i'm examining openlmi for, the management application uses the user's credentials to connect via CIM
14:15:34 <stefw> so the user is in fact accessing CIM, albeit through software
14:15:40 * stefw notes that's always the case
14:16:22 <stefw> well the fact that software is in use :)
14:16:44 <stefw> sgallagh, yes, it's important to get real requirements here
14:16:59 <stefw> because auditing is pretty much driven by them
14:17:24 * sgallagh nods
14:17:27 <stefw> i do know that work on kdbus was partially driven by the desire to have dbus calls audited in the kernel
14:17:32 <kkaempf> The pegasus audit logging pep seems to be a good start
14:18:35 <sgallagh> Yes
14:18:40 * sgallagh reads up on it right now
14:18:43 <sgallagh> https://collaboration.opengroup.org/pegasus/pp/uploads/40/14428/PEP258_AuditLogging.htm
14:19:09 <sgallagh> stefw: This looks to be pretty much exactly the "input audit logging" we were talking about above.
14:19:20 <sgallagh> (Yay, our work here is done! :-P)
14:19:23 <stefw> right, so the question is if that is enough to satisfy requirements
14:19:30 <stefw> not whether it's useful (logging usually is)
14:20:01 <sgallagh> Certainly
14:20:20 <sgallagh> I think the answer we'd hear from sgrubb would be pretty much "there's no such thing as too much"
14:20:53 <rdoty> And Jack Rieden would note that people turn most of it off after the first week...
14:21:22 <kkaempf> Do we need to capture full requirements up front?
14:21:34 <sgallagh> My suspicion is that if we take advantage of PolicyKit to do decision-making in the providers themselves (based on user identity) and log there, plus the kernel logging that will happen during any system call, that's probably going to paint a pretty complete picture.
14:22:18 <stefw> sgallagh, i agree
14:23:10 <stefw> sgallagh, the kernel logging is obviously less useful if we can't assume the loginuid
14:23:23 <stefw> but i don't think any of the major linux system management services do that
14:23:42 <kkaempf> sgallagh: are you proposing to add auditing/policy code to providers (rather than the cimom)?
14:23:56 <sgallagh> kkaempf: not *exactly*
14:24:28 <stefw> hmm, i thought you were proposing that
14:24:33 <sgallagh> One of the things we discussed last week was to have the providers execute their requests in the context of the UNIX user that was performing them (either by forking a helper or using seteuid)
14:24:46 <sgallagh> And then invoking PolicyKit to authorize individual decisions
14:24:56 <sgallagh> Then the auditing would technically be done by PolicyKit.
14:25:13 <stefw> that would be a complete solution
14:25:19 <kkaempf> sgallagh: I'd rather see this in the cimom, not the provider.
14:25:35 <sgallagh> kkaempf: See which?
14:25:40 <jsafrane> sgallagh: that would be hard with current design of providers
14:26:04 <rdoty> kkaempf: how would you see that working?
14:26:10 <sgallagh> jsafrane: What's the limitation I'm missing?
14:26:34 <jsafrane> sgallagh: we don't fork stuff, we just do it :)
14:27:19 <sgallagh> jsafrane: Right, but that's not overly-difficult to change. And the alternative I mentioned would be to set the effective user ID (which doesn't require forking, but DOES necessitate locking)
14:27:40 <kkaempf> rdoty: For sfcb, that would be some code in providerDrv.c. I don't know about Pegasus.
14:28:01 <sgallagh> kkaempf: So you'd audit the request going into the provider?
14:28:50 <sgallagh> I think what we're trying to say is that it's more complete to log whenever a privileged action is actually occurring on the system (i.e. "Can I format this partition?")
14:29:02 <sgallagh> Rather than logging "The user has requested to format this partition"
14:29:25 <sgallagh> s/"Can I format this partition?"/"PolicyKit granted the user permission to format a partition"/
14:29:32 <sgallagh> (That's a more clear distinction)
14:29:34 <kkaempf> sgallagh: That's an awful lot of code in the providers and puts a lot of responsibility on provider developers.
14:30:14 <kkaempf> Plus 'format partition' is the action to be logged, not 'called mkfs'.
14:30:43 <sgallagh> kkaempf: I'm not sure I see the distinction there
14:31:21 <jsafrane> sgallagh: with some providers, it could be easy to have policy/audit check in the provider - we just forward messages using dbus to systemd or network manager; but storage and/or software would be error prone
14:31:22 <kkaempf> Auditing at the cimom level is guaranteed to capture all provider calls. Auditing at the provider level needs properly coded providers.
14:32:10 <sgallagh> kkaempf: Sure, understood. I've never once said we *shouldn't* audit at the CIMOM level.
14:32:22 <kkaempf> :-)
14:32:23 <sgallagh> In fact, that's already covered with the Pegasus audit logging you mentioned.
14:32:36 <tbzatek> ...and you could audit non-OpenLMI providers as well
14:32:43 <sgallagh> We're talking about what we should do in addition
14:33:18 <kkaempf> I can't imagine requirements on top of that currently.
14:33:45 <sgallagh> kkaempf: Yes, but you're also planning to use primarily the read-only functions of OpenLMI if I remember our earlier discussions.
14:33:54 <sgallagh> monitoring and querying.
14:34:21 <sgallagh> At that level, certainly the CIMOM auditing is going to be enough, since there should not be any potential damage
14:37:14 <sgallagh> Does anyone have anything to add on the topic of auditing, or shall I take an action item to drag mitr/sgrubb into our meeting next week to discuss it further?
14:37:51 <sgallagh> I think we can at least all agree that we should be using the Pegasus audit logging, yes?
14:37:56 <rdoty> Can we invite someone from the OpenPegasus team to join the meeting?
14:38:07 <sgallagh> rdoty: Of course
14:38:33 <sgallagh> Several of them should be aware of this meeting, it was announced to the public list
14:39:06 <rdoty> It doesn't look like any of them are here; time for a specific invitation and telling them what the subject is.
14:39:41 <stefw> sgallagh, in general, the thing the provider is calling already does the access/policykit/auditing logic
14:39:56 <stefw> if we manage to call it in the right loginuid/setuid context, then it'll just work
14:40:01 <stefw> but not sure how easy that is
14:40:08 <stefw> but that's why your solution above sounded so appealing
14:40:21 <stefw> it would work, even without significant auditing changes and polkit calls in the providers
14:40:32 <stefw> it just means running the providers in the right security context
14:40:40 * sgallagh nods
14:40:55 <sgallagh> Ok, so we should probably investigate how to accomplish that
14:42:20 <sgallagh> stefw: Do you have some thoughts on where to start?
14:42:34 <sgallagh> side-note:
14:42:49 <sgallagh> #agreed CIMOM-level audit logging is important. We should enable the built-in Pegasus audit log.
14:44:19 <stefw> sgallagh, like which provider to start with?
14:44:29 <stefw> or how to separate stuff into different security contexts?
14:44:34 <sgallagh> stefw: the latter
14:44:40 <sgallagh> And how to get it right
14:45:00 <stefw> i guess for me unfamiliar with the CIMOM internals, the question i would ask is: can we run providers in a child process of the cimom?
14:45:07 <stefw> for a given user logged in
14:45:14 <stefw> would fork a child process and assume appropriate security context
14:45:18 <stefw> and then run requests through there.
14:45:26 <sgallagh> stefw: We *can*, but do not currently.
14:45:27 <stefw> obviously this is HTTP requests
14:45:37 <sgallagh> Right now they are child processes running in the CIMOM's context
14:45:45 <stefw> so mapping a login session start/end isn't as obvious as it is with other transports
14:46:07 <rdoty> kkaempf: are you familiar with the Microsoft WMI security model? How do they handle it?
14:46:12 <kkaempf> stefw: sfcb runs every provider as a separate process
14:46:13 <sgallagh> jsafrane: Please fact-check me if I'm mistaken here
14:46:21 <kkaempf> rdoty: No, I'm not.
14:46:24 <sgallagh> kkaempf: As does Pegasus
14:46:47 <sgallagh> kkaempf: But the question is which session do they run in. In our current model, they run as part of the CIMOM's session/cgroup
14:47:02 <jsafrane> Pegasus can run providers with the UID of the logged-in user, I don't know what happens if two users use the same provider - does it run twice then?
14:47:04 <sgallagh> I think what stefw is looking for is for this fork to become a new user session
14:47:08 <kkaempf> stefw: But it's per-provider, not per-user. So if user A issues a cim request and a provider gets loaded due to that request, it would run as user A.
14:47:46 <kkaempf> stefw: if user B issues another request for the same provider, the provider would need to switch its security context.
14:48:08 <sgallagh> right
14:48:33 <kkaempf> sgallagh: sfcb providers run in the cimom's session/cgroup.
14:48:46 <sgallagh> We need to figure out if Pegasus would load a second copy as the other user, take over the first copy and change its session, or (bad) run as the first user.
14:49:15 <stefw> sgallagh, i imagine forking a provider per user would be more appropriate
14:49:20 <stefw> note that this happens after authentication
14:49:33 <sgallagh> stefw: Yes, but that doesn't mean it's what currently happens :)
14:49:45 <stefw> and the use case is to have N users be single digits
14:49:50 <kkaempf> stefw: providers are not supposed to run concurrently afaik.
14:49:59 <stefw> only one provider runs at a time?
14:50:06 <stefw> do they exit after each request?
14:50:08 <sgallagh> vcrhonek: If you're around, as our resident Pegasus maintainer, could you look into this for next week?
14:50:12 <stefw> that seems pretty intense :)
14:50:19 <kkaempf> stefw: one copy of each provider runs at any time.
14:50:25 <sgallagh> stefw: They don't exit immediately
14:50:31 <kkaempf> sfcb implements an LRU scheme
14:50:37 <stefw> ah, okay, makes sense
14:50:38 <sgallagh> They stay alive for a few (configurable) minutes IIRC
14:51:11 <kkaempf> providers can also indicate 'don't unload' and can stay loaded/running as long as the cimom runs.
14:51:25 <stefw> so is it only currently the assumption that only one copy of a provider is running on a single kernel?
14:51:25 <sgallagh> tbzatek: Or if you want to look into it, that's cool too :)
14:51:29 <stefw> or is it more of a CIMOM API limitation
14:52:43 <kkaempf> stefw: a provider instruments a resource, you only want one copy to be running at any time.
14:52:57 <rdoty> Is there anything today that keeps two people from running fdisk at the same time from the command line?
14:52:58 <jsafrane> stefw: I need to check with Pegasus what it really does, sfcb has only one process per provider
14:53:06 <kkaempf> otherwise you'll get into all kinds of synchronization problems.
14:53:42 <sgallagh> #action jsafrane to dig into Pegasus and see how it handles per-user provider invocation
14:53:43 <kkaempf> rdoty: providers can spawn long running tasks, this is supported in cmpi.
14:53:44 <tbzatek> sgallagh: umm, integrating PK into Pegasus? I'm not aware of any of its internals at all...
14:53:45 <stefw> well if the provider is not syncing with the state of the system, we'll have more problems.
14:53:52 <sgallagh> tbzatek: Not what I was talking about
14:54:08 <sgallagh> I was talking about figuring out how Pegasus handles providers running as a user, but jsafrane just volunteered
14:54:14 <tbzatek> ah
14:54:40 <stefw> if a provider is implemented in such a way that it thinks it owns the managed resource, then that's a pretty brittle provider, and won't work well for Linux management, where you have many ways to manage a resource (for better or worse).
14:55:20 <stefw> whereas if you have a provider implemented in such a way that it represents the state of the system, then it's not really a big deal to have two of them running, as they'll synchronize eventually/naturally.
14:55:22 <sgallagh> stefw: Well, it's often fair to assume ownership while operating on it at least
14:55:27 <sgallagh> (aka locking)
14:55:28 <stefw> sgallagh, true
14:55:38 <stefw> but that's expected with multiple users anyway
14:56:07 <stefw> but that locking generally should happen in a way that other management methods also respect the lock.
14:56:15 <stefw> ie: at a lower level than the provider itself
14:56:59 <stefw> and it's totally expected that a second user performing an exclusive action on a resource already involved in such an action will get an error message.
14:57:37 <sgallagh> stefw: It would be nice if every subsystem understood that.
14:58:57 <jsafrane> stefw: that's impossible, anybody can format a disk or edit network-scripts... running any of our providers twice with different UIDs should be ok, but nobody has ever tested that
15:00:17 <stefw> jsafrane, yeah it is more of a goal, rather than a hard requirement
15:00:37 <stefw> but providers should be somewhat robust in such cases
15:00:45 <stefw> and not segfault or deadlock if at all possible
15:00:59 <stefw> and i imagine they already are
15:01:59 * sgallagh notes we're over time.
15:02:23 <sgallagh> Shall we pick this up next week, hopefully with Jan providing us with details on Pegasus?
15:02:28 * stefw nods
15:03:52 <jsafrane> ok
15:04:38 <sgallagh> #action Resume discussion of process separation next week
15:04:47 <sgallagh> Thank you for participating, everyone!
15:04:50 <sgallagh> #endmeeting