19:03:07 #startmeeting Ansible Core Meeting
19:03:07 Meeting started Tue Feb 19 19:03:07 2019 UTC.
19:03:07 This meeting is logged and archived in a public location.
19:03:07 The chair is nitzmahone. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:07 Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:03:07 The meeting name has been set to 'ansible_core_meeting'
19:03:20 #chair sdoran
19:03:20 Current chairs: nitzmahone sdoran
19:04:18 * shertel waves
19:04:19 * nitzmahone digs around for agenda ticket
19:04:31 https://github.com/ansible/community/issues/436
19:04:44 ty!
19:04:58 #info agenda https://github.com/ansible/community/issues/436
19:05:22 #chair shertel
19:05:22 Current chairs: nitzmahone sdoran shertel
19:05:55 * mattclay waves
19:06:02 #chair mattclay bcoca
19:06:02 Current chairs: bcoca mattclay nitzmahone sdoran shertel
19:06:48 OK, I think the first one is: https://github.com/ansible/ansible/pull/44067
19:06:51 #topic https://github.com/ansible/ansible/pull/44067
19:07:21 Thanks for bringing this one up!
19:07:50 Doesn't look like Alan's here - lemme see if I can get him
19:08:20 Alan is having problems
19:08:27 he is sitting in front of me
19:08:44 he is going to go back to his desk because the Durham wifi is busted
19:08:50 joy
19:09:01 OK, let's go to the next one and we'll come back to it
19:09:29 the one he cares about today is https://github.com/ansible/ansible/pull/52045
19:09:36 We were going to include a backport of ordereddict at one point
19:09:44 I'm gonna save Proposal 154 for the end, since I suspect it could be long
19:10:00 also, the controller only has to support python-2.7+ and I think ordereddict is available in python-2.7.
19:10:05 hello, this is actually Alan
19:10:23 Oh cool - let's talk over 44067 then
19:10:33 yeah, I didn't think OrderedDict would really work
19:11:04 Yeah, I suspect there are a number of places where it would *kinda* work, but then we'd transition it through something else and lose that
19:11:15 Yes, OrderedDict is available in Python 2.7+, so it will be available for use once we've dropped Python 2.6 testing for the controller.
19:11:34 It could probably be made to work, but we'd need a lot of test coverage to flush out all the places where we'd lose the order
19:12:15 sorry I'm late, long line
19:12:20 There are also some interesting corner cases that would need to be decided
19:12:36 bcoca: no worries, but you can take over for the next issue ;)
19:13:05 I'm also not really opposed to 44067 in mostly its current form; there are some things that could maybe be done better, and I could test the larger inventories. I understand that preserving order is important.
19:13:12 FYI, I had OrderedDict working at that point, but it didn't matter as we 'unordered' at subsequent steps due to 'set'
19:13:47 AlanCoding: the one that immediately comes to mind is "composed" inventory, where a host can be added more than once from different sources. Do we keep the original order, or reset on the second add?
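
(Sketch for context: the "composed inventory" case being asked about, assuming two YAML inventory sources; the file and host names are made up. Per the add_host behaviour described in the replies just below, a second sighting merges into the existing host rather than re-adding it.)

    ---
    # first.yml -- loaded first
    all:
      hosts:
        web1:   # first sighting of web1; this position is the one that sticks
        web2:
    ---
    # second.yml -- loaded second
    all:
      hosts:
        db1:
        web1:   # dupe: vars merge into the existing web1, it is not re-added

With the order-preserving behaviour this PR is about, `ansible-inventory -i first.yml -i second.yml --graph` should still show web1 where the first source put it.
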
19:14:13 But there are probably others as well
19:14:29 I'm seeing it only get added on first sighting
19:14:40 if the host already belongs to the group, it looks like add_host doesn't do anything
19:14:48 nitzmahone: add_host already deals with that; original order is kept as no 'dupe' is added
19:14:54 you merge into the existing host or create a new one
19:15:36 Right, and in most cases that's probably what people would expect - just wanting to make sure we're making that decision consciously and not by accident
19:16:12 * jimi|ansible here now too
19:16:14 I did that consciously on the inventory revamp .. also happened to work the same as it did before
19:16:43 Pro tip for finding the agenda: github.com/ansible/community/label/core (or other WG name) will take you to the right place. Also, the Core agenda is pinned from github.com/ansible/community/issues
19:16:49 * nitzmahone is talking specifically about the ordering thing, not the "only one host" - that's absolutely the right behavior ;)
19:17:06 #chair jimi|ansible mattclay gundalow AlanCoding
19:17:06 Current chairs: AlanCoding bcoca gundalow jimi|ansible mattclay nitzmahone sdoran shertel
19:18:02 https://github.com/ansible/community/issues?q=is%3Aopen+is%3Aissue+label%3Acore+label%3Ameeting_agenda
19:18:08 Well, big picture, I'm fine with the change, though I still suspect there are a number of factors downstream that could mangle the order at execution time
19:18:09 * alongchamps is here
19:18:10 ^ gundalow your URI doesn't work
19:18:57 nitzmahone: the change doesn't make ordering fixed, it only preserves it if it already was
19:19:42 AlanCoding: is the desire for this around predictably-ordered execution?
19:20:04 eg `hosts: myorderedgroup` will run in the order the hosts were added?
19:20:54 no, it just prevents mangling when that order is given, i.e. the ini plugin keeps order, yaml does not (Python dictionary)
19:20:56 (and with all the other caveats laid out in https://github.com/ansible/ansible/issues/44065#issuecomment-412515415)
19:21:14 I'm asking: why do we care?
19:21:30 users care and don't want to use the 'order:' keyword
19:22:06 I should point out that if we do nuke the ini plugin, we would lose that anyway
19:22:12 I don't think the others maintain order
19:22:22 I'm one of the watchers on this PR trying to encourage it through the review process - we're trying to upgrade from 2.4 to 2.7 but this is a blocker for us. I'm happy to give the example that we're facing?
19:22:31 depends on the plugin; I do have a fix for the yaml one to add order, but it requires OrderedDict
19:22:50 marcjay: have you tested the PR?
19:23:04 That's what I'm trying to figure out: I suspect there are *numerous* ways to screw up the order past this, so I'm trying to figure out if that's the goal. If so, I think there's probably a lot more work to do to actually make this work reliably in many cases
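
(Sketch for context: the play-level `order` keyword mentioned above; a minimal play, borrowing the group name from the inventory example later in the log.)

    - hosts: group_app
      # explicit, deterministic host ordering; the default is 'inventory',
      # i.e. whatever order the inventory source produced
      # (other accepted values: reverse_inventory, reverse_sorted, shuffle)
      order: sorted
      gather_facts: false
      tasks:
        - ping:
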
19:23:15 marcjay: I think I would be interested in what makes it a blocker
19:23:29 nitzmahone: yes there are; I have a 'final solution' but it requires storing inventory as names vs host objects, which means major changes in accessors
19:23:47 marcjay: yeah, and confirmation that what you actually care about is the host execution order at runtime
19:23:55 I admit I haven't tested the PR, but our inventory looks essentially the same as the test one in the linked issue https://github.com/ansible/ansible/issues/44065
19:24:19 So this is our inventory:
19:24:32 [group_app:children]
group_app-eu-west-1a
group_app-eu-west-1b

[group_app-eu-west-1a]
app-eu-west-1a ansible_host=10.xx.xx.xx

[group_app-eu-west-1b]
app-eu-west-1b ansible_host=10.xx.xx.xx
19:25:05 ^ which also touches on the 2nd issue, 'invalid group names'
19:25:07 bcoca: inventory isn't stored as names already? I thought that had already changed.
19:25:30 shertel: not yet, I need it also for 'intermediate source failure rollback'
19:25:40 (sorry it's mangled here). We run a singleton service on one of the x hosts in the group, and want to run changes against only the active host - the rest are warm standbys
19:25:42 due to serialization issues we cannot use deepcopy on host objects
19:26:07 marcjay: seems like there are much better ways to solve that than relying on the inventory order
19:26:24 run_once against the hosts works when order is deterministic, but with 2.7 it will pick random hosts to go first. hosts: group_app[0] has the same issue
19:26:26 ^ agreed, I would use an 'active=true/false' var
19:26:36 (eg a separate play targeting the active host once you've determined which one it is)
19:26:47 +1
19:26:55 or when: ismaster|bool
19:27:05 If it worked before, I think it was basically accidental
19:27:25 I did look at that, and at using host inventory vars to mark it active, but saw in the docs that if you use when: to conditionally run a play and the first host evaluates to false, it will give up and not try the rest
19:27:29 nitzmahone: it was, but when Alan introduced the _walk function it did break that behaviour
19:27:42 'undocumented' behaviour though ...
19:27:56 was not an expectation, though many users do want it to be 'predictable'
19:27:57 ... and I suspect there were numerous other ways to break it even before that
19:28:12 yes, even after this PR, there are still ways
19:28:44 I'm not against merging the PR; I was mostly trying to make time to revamp InventoryData to use names vs host objects
19:28:51 It seems like `order` should be our only supported mechanism for this
19:29:12 nitzmahone: officially, it is, but order defaults to 'inventory order', which this PR 'helps' preserve
19:29:27 Ok, if the answer is it's not necessarily meant to be as per that order, then we'll go away and find another approach. It was the when: only considering the first host that made me think another approach wouldn't suit, but we can give it some more thought
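
(Sketch for context: the "separate play targeting the active host" pattern suggested above, assuming a hypothetical probe command; the command path and group names are illustrative, not a prescribed solution.)

    # Play 1: find the active host and put it in a group
    - hosts: group_app
      gather_facts: false
      tasks:
        - command: /usr/local/bin/check_active   # hypothetical probe
          register: active_check
          changed_when: false
          failed_when: false

        - group_by:
            key: "{{ 'active' if active_check.rc == 0 else 'standby' }}"

    # Play 2: run the real work against the active host only
    - hosts: active
      tasks:
        - debug:
            msg: changes run here, the warm standbys are untouched

Because the second play's host list comes from the grouping rather than a play-level when:, the "first host evaluates to false" gotcha does not apply.
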
19:29:49 this PR fixes an 'unintended' issue introduced with the _walk function; it altered inventory order
19:30:08 Yeah, lots of ways to work around that; a separate play targeting only the master makes the most sense to me, but many other things could work too
19:30:12 as defined by the plugin; this function should not do that, irrespective of 'definition order' being followed by plugins
19:30:29 nitzmahone: yes, people should not rely on this; that does not mean the PR is wrong
19:30:40 I'm also not opposed to merging this, I just don't want to set the expectation that inventory order is fixed or guaranteed, as that may tie our hands on a lot of things
19:30:48 Gotcha, full understand
19:30:48 agreed
19:30:54 fully*
19:31:11 so rebuild_merge? anyone against?
19:31:41 sdoran: did you not cross out previous submissions?
19:31:48 I cannot tell where you left off / what was discussed
19:32:07 I checked them off and marked the comments as resolved.
19:32:42 eg, I *really* never want to see a doc page that outlines exactly what "inventory order" means; we should keep that as an implementation detail. Deterministic for a given configuration/version is OK, but shouldn't be set in stone.
19:33:05 nitzmahone: should be 'per inventory plugin' to define
19:33:10 yep
19:33:24 (well, and subject to munging by the manager as well)
19:33:27 but it should be preserved as per plugin output, which this PR does (since this function did mangle it)
19:33:38 +1
19:33:55 OK, sounds like we're good to merge that one
19:34:01 the inventory manager should not alter order; inventory data 'sets it' as plugins invoke add_host/group
19:34:04 k
19:34:20 * nitzmahone hands reins to bcoca
19:34:38 I think there was one other for Alan
19:34:45 sdoran: so dag's stuff was never discussed?
19:34:54 Many thanks for your time all
19:34:56 https://github.com/ansible/ansible/pull/50035
19:35:07 #topic https://github.com/ansible/ansible/pull/50035
19:35:38 No, that one wasn't discussed.
19:35:55 @dag?
19:36:17 yes, we are very strongly interested in https://github.com/ansible/ansible/pull/52045, because of the task of migrating from the contrib scripts to the plugins.
19:36:32 +1 to this; this is one of the primary problems with a hypothetical move to YAML 1.2
19:36:46 +1 to https://github.com/ansible/ansible/pull/52045
19:37:00 wait to vote until the topic is called
19:37:05 ack
19:37:08 we are on 50035 now
19:37:29 #topic https://github.com/ansible/ansible/issues/15484
19:37:43 ^ we started discussing this a while back, but never got resolution
19:37:58 I missed those in the meetings last week. Sorry about that.
19:38:19 sdoran: no worries, I do like your resolved/checkboxes better than 'strikethrough'
19:38:22 using that now
19:38:39 run_once behaviour on handlers
19:38:49 urg
19:38:50 anyone have new data to add?
19:39:10 nitzmahone: yep, this one is a doozy
19:39:15 Other than climbing in a window because no matter what we do, someone's going to be at our door with pitchforks?
19:39:57 nitzmahone: right now it's 'undefined' and that opens us to 'all pitchforks'. I think we should define AND document expected behaviour (and fix it if that isn't the current one)
19:40:04 then we can handle the 'pitchfork subset'
19:40:56 I'd tend to say the handler tasks/blocks should be run_once if you want that behavior, otherwise existing behavior
19:41:19 (ie the handler should specify directly)
19:41:42 what happens when the handler is run_once but you notify several times?
19:41:57 To do this "properly" really needs a very different notification/subscription style IMO
19:41:58 yes, this gets very complicated
19:42:02 multiple notifications don't stack, unless it runs between notifications
19:42:10 jimi|ansible: they do, for diff hosts
19:42:30 notify: for hostA and hostB, the handler runs for both, but if you have run_once: true on the handler?
19:42:42 But even if it preserved that, which host does it run under?
19:42:47 exactly
19:42:56 Is it the first host that notified? The last?
19:43:08 should be whatever's first in the list
19:43:10 nitzmahone: fun stuff, also run_once has 'secondary effects', like setting results/facts for 'all hosts'
19:43:14 (or "undefined", which I suspect is what would happen if you did the block approach)
19:43:23 I think the handler should only run on the host where the task that reported the change ran.
19:43:23 I say just document what it does right now :)
19:43:38 Which is basically "gonna run for all hosts that notified", right?
19:43:46 sdoran: that is normal handler behaviour, the problem is when you notify from multiple hosts and have run_once
19:44:02 default handler behaviour: run for hosts that notified it
19:44:24 task + run_once + notify => should it run for all hosts?
19:44:50 task + hostA => notify, task + hostB => notify, handler + run_once? should it run just for the first host notified?
19:45:19 run_once being a 'misnomer' .. it really is 'run_on_first_host_apply_to_all'
19:46:16 the problem being 'notify' is not applied to all, just register and ansible_facts and 'task status'
19:46:47 I can see choosing any of many behaviours, but I think we need to map them all out, then vote and decide, then clearly DOCUMENT
19:46:53 run_once on a handler is tough
19:47:24 sdoran: heh, hence me punting from the ticket to this meeting; I can see how expectations can go many ways, and even 'useful functionality' can go many ways
19:47:27 The case the OP brings up seems like another case for a second play that references output from the first, rather than a handler at all (since it's a 1:M action)
19:47:27 I was thinking run_once on a play task. That seems more straightforward, and the 2.0 behavior, as I understand it, seems correct.
19:47:57 It's like, "yeah, you *could* do that with a handler", but it sorta worked again by accident in 1.x
19:47:57 nitzmahone: yes, but it also opened my eyes to many behaviours we can have with run_once combos we have not defined
19:48:09 I think from the issue, I believe the current behavior to be correct. `run_once` only runs on 1 host, applies the result to all. And handlers inherit this from the calling task
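
(Sketch for context: the ambiguous run_once/notify combination being debated, in playbook form; the task and handler names are illustrative.)

    - hosts: all
      tasks:
        - command: /bin/true     # run_once: executes on one host and
          run_once: true         # applies the result to all of them ...
          notify: restart thing  # ... but what should the notify do?

      handlers:
        - name: restart thing
          # the open questions: which host(s) does this run for,
          # and what if run_once is set here as well?
          debug:
            msg: notified
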
19:48:16 nitzmahone: by accident or not, the problem is that it was and is 'undefined' behaviour
19:48:36 sivel: handlers don't inherit from the caller, they get notified
19:48:36 I'd just write a test for it, and close the issue as expected
19:48:44 I'm +1 for "document existing behavior, add tests to keep it that way"
19:48:45 sivel: i.e. the caller has become:false
19:49:00 It seems that if you want granular control over what task runs when and where, using regular tasks with conditionals is more appropriate.
19:49:13 sdoran: +1000
19:49:17 sdoran: totally agree
19:49:18 if the task only executed on 1 host, the handler should do the same
19:49:29 handlers are another one of those "cute" features that kinda suck in many real-world applications
19:49:39 so document, write an integration test, close the issue
19:49:40 Handlers are a nice convenience mechanism. I think it's OK for that convenience mechanism to cover small and targeted use cases.
19:49:42 nitzmahone: soooo true
19:50:08 +1 to document existing, add tests, and close the issue
19:50:13 k, we seem to have some consensus: keep things simple, document current behaviour
19:50:21 + tests to verify
19:50:38 9 minutes to close here
19:50:41 sdoran: you wanna write that up on that ticket and P3 it?
19:50:56 * nitzmahone suspects this one is going long; we should move on if dw is back
19:50:58 #topic https://github.com/ansible/proposals/issues/154
19:51:00 dw?
19:51:16 I'm here
19:51:44 we only have about 8 minutes left before we are needed elsewhere
19:51:55 sigh .. need 5h meetings for these ...
19:51:56 I can hang out longer
19:52:09 nitzmahone: you have 2 meetings
19:52:15 sure that 5h is enough? ;)
19:52:24 I know, but others can cover those
19:52:33 so, 2 things.. this issue of per-task variables somehow making it into the connection plugin, how should that work? and this issue of choosing which plugin to run within ansible-connection
19:52:38 got a meeting till 5pm, but I'm open to discussing this and other topics later
19:53:07 dw: the executor has a function that deals with that; it gets the 'plugin-specific vars' and hands them off
19:53:28 plugins (via config) now list the vars they care about
19:53:35 ansible_python_interpreter will also be an interesting problem, since we're adding interpreter discovery
19:53:55 ^ that one being an 'exception' as the action plugin/execute module deal with it directly
19:53:56 Yeah, the trick is if they change (which is legit and used by tests) from task to task when the interpreter is persistent
19:54:38 When we do persistent connections "properly", we need to just hash the connection vars when looking up an existing connection
19:54:38 nitzmahone: we can easily add an ansible_*_interpreter inclusion for the plugin vars filter (I believe we already do)
19:54:57 nitzmahone: that was the idea, just hash 'self._options'
19:55:21 but now it's 'self._options + self._become + self._shell + self._terminal (networking)'
19:55:23 That's not the only problem with interpreter discovery, though; we'd want that to work for mitogen as well, but I haven't come up with a great way to do that without hooking our connection setup into mitogen's bootstrap process
19:55:51 nitzmahone: is there a ticket I can read about interpreter discovery? presumably it's about this upcoming Red Hat no-/usr/bin/python thing
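
(Sketch for context: the "interpreter changing from task to task" case described above, which is legitimate playbook usage; the host name and paths are illustrative. With a persistent remote process, the cached connection can't blindly be reused across these two tasks, hence the idea of hashing the connection-relevant vars/options.)

    - hosts: testhost
      vars:
        ansible_python_interpreter: /usr/bin/python3
      tasks:
        - ping:   # runs under /usr/bin/python3

        - ping:   # same host, different interpreter for this task only
          vars:
            ansible_python_interpreter: /usr/bin/python2
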
19:55:54 that is why I thought mitogen should be a connection plugin itself
19:56:10 dw: related, but no, it's broader than that
19:56:15 dw: yeah, that's what pushed it over the edge, but sans py3 distros are getting pretty common now
19:56:32 s/py3/py2/
19:56:55 dw: here's the WIP PR for it https://github.com/ansible/ansible/pull/50163
19:57:00 salt-ssh has quite an elaborate snippet for handling this. mitogen's plan was to include such a snippet as a connection option (specify a list of paths), but maybe it can weave with ansible wants
19:57:16 *with what ansible needs/wants
19:57:33 dw: basically Ansible does similar, but not in the connection plugin itself
19:58:43 Yeah, since it can be set per-task, there are a couple of different code paths that might be involved if we want it to work transparently
19:59:11 I don't want to derail on that one though
19:59:33 I'll review the PR. Guess it just uses exec_command(), i.e. "raw-like", to run a snippet?
19:59:39 nitzmahone: it's pretty crucial to mitogen's reuse of the same instance
19:59:46 (I have ideas for how a mitogen-aware module wrapper could handle that pretty seamlessly for the "connection already exists" case)
20:00:00 bcoca: it's possible that if it's just "raw-like", then mitogen implementing raw will just fix it naturally
20:00:01 yes
20:00:13 if ansible sets a varying interpreter in follow-up tasks, it's just like the user doing the same
20:00:17 I don't see it as a big problem, but it is an issue that MUST be resolved
20:00:39 dw: Ansible should not; discovery only happens if the interpreter is not set already
20:00:45 once ansible sets it, it is cached
20:00:54 Yeah, it's discovering the initial interpreter that will be "fun" with mitogen
20:01:28 (basically we'd need raw to work one way if mitogen's not bootstrapped yet, and another way if it was)
20:01:45 nitzmahone: I already have 'implement raw' on the hit list, using a heuristic where it just formats command lines like ansible does if no connection exists, but done generally so it works with proxying (e.g. exec_command() on a mitogen_via= will run the plain old command using the parent context's interpreter, not directly, etc)
20:02:04 nitzmahone: worst case, autodiscovery just doesn't work with mitogen and the user must set it
20:02:17 so we can use the raw connection/command layer to figure out which interpreter mitogen should use, then spin it up; future changes would require a subprocess and/or a separate channel
20:02:23 nitzmahone: yep, you've had the same thoughts by the sound of it :)
20:02:54 that sounds like a good solution
20:02:55 I figure a similar approach would work for become; I think we can still keep the performance benefits of the persistent interpreter
20:03:09 (it'd just potentially be N of them)
20:03:26 but using a local transport so we only have one connection to the host instead of muxing
20:03:32 the existing extension already deals with varying interpreters; I guess for a first cut, all that logic will still live in the connection plugin until ansible has a general mechanism for something similar
20:03:58 Would probably be the simplest
20:04:03 cool
20:04:27 regarding the second issue.. bcoca had thoughts on restructuring connection plugins to make persistence a plugin-level property
20:04:44 is that something that can be avoided for the time being? it seems like a great way to snowball things
20:05:05 it would also require restructuring the executor; the 'hash' idea above was a way to do this: keep connection hashes in a dict cache pointing to the objects for reuse
20:05:15 I've had the same thought, and I think we can get away without it for now, but long-term, absolutely, that needs to be something that the connection (and other layers) can opt into/out of
20:06:04 ok. so, any thoughts on how we should select the mitogen plugin for now? I was thinking of adding a command line param to ansible-connection for it, and updating the persistent plugin 'somehow' to set that command line
20:06:15 nitzmahone: I think we all have a similar general approach, need a TARDIS and a cloning vat to get it done
20:06:20 eg, a persistent connection/interpreter needs to exec a module that is incompatible with that (think `pip`); ideally the module isn't responsible for respawning itself, that should be a service provided by the execution env
20:06:58 dw: for now I would ignore ansible-connection and try to get the implementation just working with the mitogen executor on the remote side; we can add persistence a bit later
20:06:59 (just a declarative "I don't work and play well with others" marker in the module metadata to trigger that under a persistent interpreter)
20:07:54 If we make it dependent on the threading branch, we can sidestep the whole ansible-connection mess
20:08:07 bcoca: that'll make one hell of a slow plugin :) but it does reduce the scope of the PR significantly. I like it
20:09:32 but yeah, our big things *other* than classic SSH all need persistence too (PowerShell, networking, etc), so that's a problem we should solve generally instead of hacking something together just for this
20:09:50 agreed
20:10:17 dw: baby steps make it a lot easier to integrate and flush out the issues
20:10:21 The ansible-connection way probably wouldn't be too hard to hack together anyway (as a stopgap) if you get the other stuff solved
20:10:35 But maybe don't start there
20:10:35 OK, the focus is /just/ on the executor 'short circuit' in this case. It sounds like, re: ansible_*_interpreter, the executor already knew about task_vars anyway, so there is little harm in including it as part of the interface. I think?
20:10:38 not saying it will be the final form, but good enough to get a 'working implementation'
20:11:13 I'll just leave this here then... https://github.com/ansible/ansible/pull/44280
20:11:31 ^ (threading branch, if anyone wants to test it out)
20:12:31 Who's on the hook for the connection exec_module thing; is dw doing it, or one of us?
20:12:57 * nitzmahone wants a lot of caution tape on that to dissuade others from relying on it
20:13:47 my hands are full for at least the coming weeks, but happy to look at it if there is a clear design (sounds like we have one here)
20:14:04 any other comments on the proposal? these are just the issues I knew about
20:14:23 dw: I might update later
20:14:38 for now I'm going to end the meeting, keeping this open for more discussion at a later date
20:14:40 bearing in mind it sounds like a lot of it can be deleted now :) which is great
20:14:51 * nitzmahone is pretty slammed until 2.8 freeze ~ March 14, but would like to work on this afterward
20:14:52 heh
20:16:28 dw: I think having something that lives only in the connection box with the changes we've discussed will be a great place to start figuring out what else we don't know (plus the "really raw" vs "mitogen-is-up-raw" thing we talked about earlier)
20:16:43 yes, it really is
20:16:57 raw support is easy to do in mitogen, but not without if/else spaghetti. It's pending a big refactor I don't want to touch
20:17:06 heh
20:17:11 story of our lives ;)
20:17:38 big feature, deadline, refactor or spaghetti for dinner?
20:18:06 :) all hell is going to break loose in a dev branch real soon; just avoiding it for as long as I can because the maintenance cost might double
20:18:29 * nitzmahone is sometimes envious of green fields ;)
20:18:54 OK, anything else we need to discuss right now?
20:18:54 heh
20:19:00 I'm happy
20:19:09 #endmeeting