19:07:12 <bcoca> #startmeeting ansible core irc public meeting 19:07:12 <zodbot> Meeting started Tue Jul 24 19:07:12 2018 UTC. 19:07:12 <zodbot> This meeting is logged and archived in a public location. 19:07:12 <zodbot> The chair is bcoca. Information about MeetBot at http://wiki.debian.org/MeetBot. 19:07:12 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic. 19:07:12 <zodbot> The meeting name has been set to 'ansible_core_irc_public_meeting' 19:07:41 <bcoca> #topic https://github.com/ansible/ansible/pull/42513 19:07:48 <jtanner> yo 19:08:03 <shertel> hi 19:08:06 <bcoca> i for one, would just remove it, if dev consideres traceback appropriate, he should include it in fail_json 19:08:14 <bcoca> #chair dag jtanner shertel 19:08:14 <zodbot> Current chairs: bcoca dag jtanner shertel 19:09:14 * ryansb waves 19:09:14 <abadger1999> I think the relative frequency is backwards. 19:09:26 <abadger1999> It's most often appropriate and sometimes inappropriate. 19:09:31 <bcoca> #chair ryansb abadger1999 19:09:31 <zodbot> Current chairs: abadger1999 bcoca dag jtanner ryansb shertel 19:10:04 <bcoca> i dont think it is inappropriate, when explicit, when implicit it is most of the time redundant 19:10:20 <bcoca> when not misleading as teh case dag is trying to avoid 19:10:25 <abadger1999> I do like the python3 check seems to be appropriate neraly 100% of the time but tracebacks are just too important for debugging for me to want to abandon them 19:10:49 <dag> o/ 19:10:57 <maxamillion> .hello2 19:10:58 <zodbot> maxamillion: maxamillion 'Adam Miller' <maxamillion@gmail.com> 19:11:12 <abadger1999> So I think I'm -1 on this change.... it takes us back to debugging being awful as before bcoca added this. 19:11:43 <bcoca> i added this? i was pretty sure i always thought it was not needed 19:11:47 <dag> bcoca: I don't disagree, in any case providing a traceback that is unrelated to fail_json() must be avoided, that's why I would disable it at least for Python2 19:11:51 <abadger1999> Yeah, you added it. 19:11:57 <abadger1999> We all were happy when you did. 19:12:15 <dag> abadger1999: This is not about removing it, it's about removing unrelated tracebacks (that don't even represent an error) 19:12:40 <dag> abadger1999: Currently on python2 you get the last (caught) exception being provided with every fail_json 19:12:42 <abadger1999> dag: You don't have a heuristic for when the traceback is unrelated. 19:12:51 <abadger1999> yes. 19:12:52 <dag> abadger1999: not in python2 19:13:17 <dag> abadger1999: in python3 the traceback is only from the exception-block, so it will always be related 19:14:06 <abadger1999> Right.... as I said, I like that yo've found that in python3 the traceback would always be related. 19:14:14 <abadger1999> But it still removed the traceback in python2. 19:14:40 <abadger1999> I think that the frequency of unrelated tracebacks :: related tracebacks is the opposite of what you think. 19:14:55 <abadger1999> So I'd rather have a few unrelated tracebacks than not having it at all. 19:15:01 <dag> abadger1999: I have been investigating module problems various times, and the incorrect traceback made it a confusing and timewasting experience 19:15:17 <dag> abadger1999: *every* fail_json will have an unrelated traceback 19:15:29 <dag> the traceback could be from anywhere in the previous code 19:15:32 <abadger1999> <nod> Yeah. But prior to bcoca adding this, we had much timewasting due to not having the traceback information. 19:15:34 <bcoca> abadger1999: it was trishna, not me that added it 19:15:40 <dag> so I would say, most will be unrelated 19:16:12 <abadger1999> bcoca: Welll..... git log gives you the credit... 19:16:18 <abadger1999> commit d5f4c2a54b713da84aa9a17eacb294b0e55cdc30 19:16:19 <abadger1999> Author: Brian Coca <bcoca@users.noreply.github.com> 19:16:21 <abadger1999> Date: Wed Jun 28 10:08:38 2017 -0400 19:16:22 <abadger1999> auto tb on vvv+ or debug (#26106) 19:16:24 <abadger1999> 19:16:26 <abadger1999> * auto tb on vvv+ or debug 19:16:27 <abadger1999> 19:16:28 <abadger1999> * updated as per feedback 19:16:57 <dag> fail_json() doesn't mean there was an actual problem, why then even add a traceback 19:17:08 <bcoca> that waas my point 19:17:12 <dag> I don't understand why we would be confusing users (on python2) on purpose 19:17:14 <bcoca> fail_json != exception 19:17:32 <dag> in most cases where the exception is relevant, we should add the traceback ourselves 19:17:38 <dag> and in various places that's done 19:17:47 <abadger1999> dag: The majority of tracebacks are correct, so it helps us debug much quicker. 19:17:59 <dag> abadger1999: I would contest that 19:18:05 <abadger1999> Yes, we should add the tracebacks ourselves... but module authors have not done that. 19:18:05 <dag> most of the tracebacks are incorrect 19:18:20 <abadger1999> Which is why we have problems. 19:18:34 <dag> which is why I have spend hours looking for the wrong thing 19:18:37 <Pilou> traceback isn't always added but only when debug is enabled or vvv 19:18:37 <dag> and gave up 19:18:50 <bcoca> neither of you knows the exact ration tb to false positive, so i don think either assertion is correct 19:19:05 <abadger1999> Pilou: that is true. I would be okay with a higher v level.... 19:19:08 <dag> Pilou: true, we're talking about those cases 19:19:21 <abadger1999> bcoca: rather... one of the assetions is correct but we don't know which one ;-) 19:19:57 <bcoca> ok, rephrase, neither assertion is based on facts we can check, i suspect they are mostly based on personal experience, which does not help for the broad decision to be made 19:20:16 <abadger1999> We could check. 19:20:16 <dag> bcoca: adding incorrect information to an error is the wrong thing to do 19:20:24 <Pilou> in the given example 'msg' allows to diagnose error 19:20:26 <bcoca> i dont disagree 19:20:36 <abadger1999> But someone would have to write the scripts. 19:20:51 <abadger1999> msg only helps when msg is non-generic. 19:21:08 <Pilou> using python2 a prefix could be added, something like: "this information might help to understand the failure" 19:21:20 <abadger1999> People like to write catch-all error handlers, for instance, which makes the msg meaningless. 19:21:27 <abadger1999> <nod> 19:21:31 <bcoca> abadger1999: i believe my contribution was to gate it behind vvv 19:21:46 * dag thinks this feature was designed with Python3 only 19:22:06 <bcoca> abadger1999: those are bad modules and clearly stated against in our guidelines, it should be part of review ot avoid catchalls 19:22:15 <dag> and there it works much better, because outside an except-statement it doesn't do anything 19:22:28 <abadger1999> @bcoca Perhaps... I'd have to find the PR to see if it only shows you because it was squashed... in git log it just shows that you added all the relevant code. 19:22:31 <bcoca> dag: iirc this was added mostly when py2 was our main target 19:22:42 <abadger1999> @bcocaand yet we have them and we add more all the time. 19:22:48 <dag> bcoca: that doesn't mean it was written and tested with python3 mostly 19:23:02 <bcoca> abadger1999: then that is a problem with our enforcement, we should fix that, not ignore it 19:23:07 <abadger1999> Getting tracebacks cuts debugging time down vastly. 19:23:32 <dag> abadger1999: well, in my cases it made it impossible to understand the error 19:23:37 <dag> so no, it does not cut down vastly 19:23:41 <bcoca> dag: yes, at least my side was, but only with exceptions i forced, so i did not check if 'unexpected exceptions' were added as i didnt know py2 had that particular behaviour 19:23:45 <dag> (unless on Python 3) 19:24:13 <bcoca> very hard to test a case you are unaware of 19:24:26 <abadger1999> dag: For the most part, it does cut down vastly. You were confused by the side effects in a few cases. 19:24:45 <abadger1999> That doesn't mean that we should throw the baby out with the bathwater. 19:24:55 <dag> abadger1999: I always look at all the information provided with the error, especially if messages are meaningless (which they often are- 19:25:01 * bcoca is not fan of babies, they cry and poop and cry again 19:25:22 <dag> abadger1999: I don't want to throw it away, in fact I keep it in Python 3 19:25:33 <bcoca> dag: again, that is a problem with the module quality and should be taken up in review, or even put as a bug against the module 19:25:39 <dag> abadger1999: but *any* fail_json() with -vvv will gave a traceback 19:25:40 <abadger1999> bcoca: https://github.com/ansible/ansible/pull/26106 so it was all you ;-) 19:25:41 <bcoca> messages should be clear and helpful 19:25:45 <dag> on python2 19:26:01 <abadger1999> dag: not any 19:26:10 <dag> abadger1999: try it 19:26:26 <dag> abadger1999: if there was an exception-handling block prior 19:26:37 <dag> and we have quite a few of them in module_utils code 19:26:51 <bcoca> abadger1999: i ended up writing the version we merged, but source was not me, started in networking, also in aws 'exceptions' and the first attempt was adding it to EVERY return, so i added the -vvv and put the pr up 19:26:55 <abadger1999> dag: ansible localhost -m command -a '/bin/false' -vvv 19:27:01 <dag> my first attempt to fix this was actually replacing try-except blocks with proper checks 19:27:18 <dag> abadger1999: that's too simple a case, there's no parameter-type checking etc. 19:27:35 <bcoca> ok, so im going to stop it now, i don think either of you is going to convince the other, lets hear the rest of the room 19:27:41 <shertel> Most AWS modules use traceback appropriately. The ones that use AnsibleAWSModule and fail_json_aws() do by default. 19:28:24 <dag> shertel: even if they do it properly, on python2 when you don't add traceback, the code will do it for you, something unrelated 19:28:25 <abadger1999> dag: ansible localhost -m ping -a 'unknown_param=test' 19:28:30 <abadger1999> dag: ansible localhost -m ping -a 'unknown_param=test' -vvv 19:28:37 <bcoca> yes, but that is specific call, vs this one that applies globally to all with the 'mandated exit error method' 19:29:03 <bcoca> ^ shertel, we are no discussing changing aws nor networking versions, just the general one 19:29:48 <abadger1999> I think shertel is properly pointing out that a large number of modules do set traceback explicitly. 19:30:19 * shertel nods 19:30:20 <bcoca> or they call a function that does it for them 19:30:30 <bcoca> vs fail_json which did not do this automatically until the change 19:30:35 <bcoca> fail_json_with_tb 19:30:44 <bcoca> ^ would be analogous in this case 19:31:06 <abadger1999> yes, I lump those two (calling a function which does it and calling fail_json directly and doing it) together. 19:31:37 <bcoca> im not and i dont think dag is either 19:32:04 <abadger1999> Heh 19:32:17 <abadger1999> Although, opened a random amazon module and it doesn't. 19:32:24 <bcoca> not all do 19:32:24 <abadger1999> except ClientError as e: 19:32:26 <abadger1999> module.fail_json(msg="Can't authorize connection - " + str(e)) 19:32:32 <abadger1999> data_pipekline.py 19:32:35 <bcoca> also many predate the function 19:32:45 <bcoca> i don think anyone has gone back and handled all of those 19:33:08 <abadger1999> Looks like that's been cargo culted around a lot. 19:33:14 <ryansb> abadger1999: FWIW modern amazon modules do use `fail_json_aws` which includes traceback 19:33:22 <ryansb> automatically 19:33:28 <abadger1999> However, in most aws modules, authorization only happens in one place so it's not too bad. 19:33:35 <abadger1999> ryansb: ahhh... 19:33:41 <mattclay> What if we differentiate between an explicitly set traceback in Python 2 vs one that may (or may not) be unrelated, so at least when viewing it you know if it's possibly suspect. 19:33:50 <ryansb> and in newer AnsibleAWSModule, the auth stuff is all in `module.client(....)` 19:33:57 <ryansb> so it's now centralized, thankfully 19:33:57 <bcoca> mattclay++ 19:34:01 <abadger1999> ryansb: Add... So that would mean that we're now relying on the automatic addition of traceback? 19:34:19 <abadger1999> s/Add/Ahh/ 19:34:21 <ryansb> nono, AnsibleAWSModule.fail_json_aws will add it 19:34:30 <ryansb> so we are not relying on it in fail_json_aws 19:34:34 <abadger1999> <nod> Cool. that's what we thought before. 19:34:41 <ryansb> but some other modules may, if they use fail_json, as you say 19:34:42 <abadger1999> ? 19:34:50 <abadger1999> Did you mean 19:35:01 <abadger1999> so we are not relying on it in fail_json() ? 19:35:16 <dag> abadger1999: ansible localhost -u dag -m stat -a 'path=/root/.bashrc' -vvvv 19:35:24 <dag> gives me 19:35:25 <dag> The full traceback is: 19:35:25 <dag> File "/tmp/ansible_qQWM79/ansible_module_stat.py", line 476, in main 19:35:25 <dag> st = os.lstat(b_path) 19:35:43 <ryansb> all AWS modules that use `AnsibleAWSModule` do not rely on it. There *may* be older AWS modules that use basic.AnsibleModule and expect the traceback included 19:35:58 <bcoca> abadger1999: the aws function predates the inclusion in fail_json, so it should have never relied on it 19:36:17 <abadger1999> dag: Yes. I know that you can get false poities... the question is whether the false positivies or the false negatives outweight each other. 19:36:42 <dag> abadger1999: if you're providing error information that may be false, I wouldn't add it 19:36:49 <dag> that's all I am saying 19:36:55 <bcoca> i think matt brought up great point, just add another field when the exception is auto added 'exception_auto_added' true/false 19:37:06 <bcoca> ^ that should at least HINT that it is not the dev doing explicitly 19:37:33 <bcoca> would that 'split the baby'? 19:38:16 <abadger1999> ryansb: Right... I think we're all saying the same thing but confusing each other.... If you call fail_json_aws() then it adds the traceback. If you call fail_json() then it depends o nthe automatic addition to add the traceback. 19:38:18 <abadger1999> Correct? 19:38:27 <ryansb> Yeah, that's right 19:38:38 <abadger1999> bcoca: yep, that would work for me. 19:38:42 <bcoca> dag? 19:40:02 <bcoca> anyone else have input? 19:40:03 <shertel> abadger1999 It may depend on the automatic addition in fail_json if it's an older module. Newer modules should not and I haven't noticed any that do. Newer modules use fail_json when there isn't a traceback. 19:40:20 <dag> ryansb: fine for me, but it may still be confusing, even if it adds 'hint' 19:40:22 <shertel> s/modules/AWS modules/ 19:40:33 <bcoca> shertel: older do so too, only some of fail_json calls correspond to a traceback having happened 19:41:07 <bcoca> dag: so if you add back the tb on py2 but use the code to add a field warning about the traceback might be unreleated, i think we can go forward 19:41:49 <bcoca> or just add the hint to the traceback itself? 19:41:59 <bcoca> to avoid 'non display of other field' issues 19:42:43 <bcoca> "ansible automatically added this traceback, it might not be related to the current failure:' 19:43:27 <abadger1999> shertel: <nod> 19:43:47 <bcoca> ok, if nothing else, i'll close this issue 19:44:24 <bcoca> #topic https://github.com/ansible/ansible/pull/39913 19:44:37 <bcoca> @jimi|ansible want to revisit the past? 19:44:48 <dag> bcoca: ok 19:45:25 <dag> bcoca: the advantage is that python3 users won't be living in uncertainty either ;-) 19:45:33 <abadger1999> #action PR to be updated to add warning that traceback may be irrelevant instead of omitting traceback 19:45:38 <bcoca> abadger1999, dag one thing to note, you both seem to consider the module authors not giving good quality error messages, i suggest that we concentrate on catching those things on review 19:45:46 <abadger1999> dag: <nod>, yeah, I like that. 19:45:53 <bcoca> dag: why i thought the py2 detection was good place for this warning 19:46:18 <bcoca> anyone want to review cobbler module? 19:47:24 <dag> it was already reviewed by jimi-c in a previous meeting, and he privately asked me to change something (see last commit) 19:48:46 <bcoca> dag: i'll try to get you hooked up, moving to next topic 19:49:01 <bcoca> #topic https://github.com/ansible/ansible/pull/38325 19:49:03 <bcoca> +1 from me 19:49:14 <bcoca> not sure how we did not have this already 19:49:27 <jimi|ansible> +1'd that cobbler_system module 19:49:40 <jimi|ansible> dag: i added an "official" review now 19:49:52 <dag> jimi|ansible: thanks ! 19:50:07 <bcoca> 'jimi-c approved these changes 12 hours from now' 19:50:16 <bcoca> gimme back my tardis!!! 19:50:39 <bcoca> anyone else on the 'finished' test? 19:51:09 <agaffney> +1 on 'finished' 19:52:15 <jhawkesworth_> I still think it looks good to merge :-) 19:52:24 <bcoca> closgin vote in 10s +4,-0 19:52:40 <bcoca> done! 19:52:50 <bcoca> will merge once tests pass 19:53:07 <bcoca> #topic https://github.com/ansible/ansible/pull/43128 19:53:08 <dag> thanks everyone ! 19:53:13 <bcoca> thank you 19:53:36 <bcoca> ^ PR that removes the need to use until+register and allow to do 'retries' directly 19:54:09 <bcoca> mostly want to vote on functionality, we have some comments on implementation but not worth giving until we first decide on wanting teh feature at all 19:54:28 <bcoca> https://github.com/ansible/proposals/issues/132 <= related proposasl 19:54:52 <abadger1999> +1 finished 19:55:19 <bcoca> +5, -0 19:56:10 <jhawkesworth_> yeah I'd like this, I recall getting stuck on this same issue long ago (until sivel pointed out fix). 19:56:11 <agaffney> being able to use 'retries' by itself makes a lot of sense 19:56:35 <bcoca> anyone against? 19:56:40 <jhawkesworth_> trying to think if it might break things 19:56:53 <jimi|ansible> I'm +1 to the idea, though I said earlier I wasn't a fan of some of the internal changes to the retries counter 19:56:55 <bcoca> my only concern is if people rely on until: '{{{ |default(omit)}}' might break 19:57:00 <agaffney> one problem I see with that implementation is that anyone currently relying on the default of 'retries: 3' will no longer get retries 19:57:02 <jimi|ansible> it was already hard to understand a bit, and this makes it worse 19:57:19 <jimi|ansible> agaffney: i don't think so, because it still defaults to 3 if the value is None 19:57:27 <bcoca> ^ i did check taht 19:57:29 <agaffney> ah, yeah, just saw that 19:57:34 <bcoca> but implementation aside, anyone against the 'feature'? 19:57:51 <maxamillion> nope, I like it 19:58:20 <bcoca> ok, giving featuer a go ahead (i'll update proposal_) and we can now nitpick the PR 19:58:42 <dag> I am surprised nobody thinks this is magic, or too implicit ;-) 19:58:45 <ryansb> 2 minutes until Windows Working Group is scheduled to meet 19:59:17 <bcoca> abadger1999: want to announce ? 19:59:18 <jimi|ansible> dag: well you still have to specify `retries:` on the task, but now people are going to want that on blocks too... 19:59:22 <agaffney> dag: eh, I've always through it weird that you had to do 'until: foo is not failed', when that's a perfectly reasonably default for retrying a task 19:59:28 <jimi|ansible> one of the few downsides I've thought of 19:59:39 <dag> agaffney: I don't disagree :-) 19:59:41 <bcoca> #topic schedule announce ment by abadger1999 19:59:54 <dag> jimi|ansible: ;-) 19:59:58 <abadger1999> Ansible-2.7 is going to come out on 10-04 20:00:21 <agaffney> is that ~6 weeks? 20:00:25 <abadger1999> We had a tentative schedule in the roadmap with freeze dates and such but it did not have an room for slippage. 20:00:39 <agaffney> oh, no, 10 weeks 20:01:20 <bcoca> @sylr? 20:01:23 <abadger1999> I've now updated the dates to give us some time for extra rcs... unfortunately, this cuts two weeks off before the first freeze date. 20:01:35 <dag> it's a problem for $CUSTOMER :-( 20:01:43 <bcoca> ^ #ansible-devel and the ML should get notice also 20:01:44 <abadger1999> https://github.com/ansible/ansible/blob/devel/docs/docsite/rst/roadmap/ROADMAP_2_7.rst#expected 20:01:52 <dag> it means some stuff will have to wait for v2.8 20:01:58 <abadger1999> 2018-08-23 Core Freeze (Engine and Core Modules/Plugins) 20:02:09 <abadger1999> 2018-08-30 Community Freeze (Non-Core Modules/Plugins) 20:02:21 <dag> due to holidays-season etc. 20:02:39 <abadger1999> Some features that we've proposed for the 2.7 release may slip to 2.8 as well due to this. 20:02:41 <bcoca> hopefully by then using custom content will also be much easier 20:03:14 <abadger1999> We are going to have an internal meeting on friday to see if we can change the release date but so far all induications are that the releasze date is set in stone. 20:03:48 <abadger1999> So look for Core Freeze on 8-23 and Community Freeze on 8-30. 20:05:26 <abadger1999> That's all from me. 20:05:28 <bcoca> with that, calling meeting to end, one item left, but it will be aved for thursday 20:05:31 <bcoca> #endmeeting