19:00:01 <nitzmahone> #startmeeting Ansible Core Public IRC Meeting
19:00:01 <zodbot> Meeting started Tue Apr  6 19:00:01 2021 UTC.
19:00:01 <zodbot> This meeting is logged and archived in a public location.
19:00:01 <zodbot> The chair is nitzmahone. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:01 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 <zodbot> The meeting name has been set to 'ansible_core_public_irc_meeting'
19:00:10 <nitzmahone> #chair mattclay
19:00:10 <zodbot> Current chairs: mattclay nitzmahone
19:00:24 * mattclay waves
19:00:29 <nitzmahone> #info agenda: https://github.com/ansible/community/issues/605
19:01:18 <nitzmahone> #info ansible-core 2.11.0rc1 went out last night, rc2 is likely to follow today for a bugfix
19:01:45 <nitzmahone> still looking on track for late April 2.11.0 GA
19:02:21 <sdoran> \o
19:02:40 <nitzmahone> There's one item on the agenda- not sure if we've got a quorum to discuss
19:03:06 <nitzmahone> #topic https://github.com/ansible/ansible/issues/32379 (reconsider host failure behavior for delegated host)
19:04:20 <nitzmahone> bcoca documented some initial thoughts- I could see people expecting any number of different behaviors for this, personally...
19:04:25 <sdoran> Is this asking to _not_ delegate to hosts we know have failed/are unreachable?
19:05:26 <nitzmahone> Yeah, IIUC basically if the delegation target host has failed, we'll keep trying for other hosts/tasks that delegate to the same host
19:05:39 <sdoran> Delegated hosts are usually outside the play hosts, but I guess checking if the delegated host is in the current list of hosts we know is unreachable does seem reasonable.
19:05:43 <nitzmahone> Which doesn't match the direct exec behavior
19:05:54 <nitzmahone> Yeah
19:06:51 <nitzmahone> But that depends on the failure- if we know it was a transport-level failure, that makes sense, but if there's some other failure that might've been caused by the source host config or something, penalizing all the other delegators doesn't make sense IMO... It's a tricky one
19:07:03 <sdoran> But I can see why it is the way it is currently: delegated hosts were meant to be outside the play.
19:07:27 <sdoran> For connection errors, I like the idea of handling that.
19:07:55 <sdoran> But for tasks that failed on a delegated host... that should not preclude all subsequent tasks delegated to that host from running.
19:07:58 <nitzmahone> Yeah, if we limited it to just `unreachable`, I'd be +1
19:08:01 <nitzmahone> agreed
19:08:05 <sdoran> Yeah, I like that.
19:08:28 <sdoran> Currently the behavior is just pretty naive. It should get a little more robust.
19:08:40 <sdoran> But it needs docs. :)
19:08:59 <nitzmahone> and for the last point bcoca mentions in https://github.com/ansible/ansible/issues/32379#issuecomment-340809171, I'd say "yes"- resetting the host states should also reset for delegated hosts, but I bet that gets "interesting" for the bookkeeping
19:09:00 <sdoran> To prevent confusion-derived issues.
19:09:21 <nitzmahone> There's also the problem that we often can't tell the difference between some transport/non-transport errors
19:09:40 <nitzmahone> (so things are improperly marked `unreachable` when they shouldn't, and vice versa)
19:10:02 <nitzmahone> A lot of it is up to the connection and/or where in the process it blew up
19:10:40 <sdoran> Yeah, the bookkeeping will be very interesting. False positives (connection reporting failure and host being marked unreachable) would be annoying.
19:10:48 <nitzmahone> But I'm not sure that should affect this behavior- basically if we marked the delegated host unreachable, it shouldn't be used for further delegated tasks
19:11:14 <nitzmahone> If that exposes other bugs in the transports or payload infra, we should fix them
19:11:40 <nitzmahone> @bcoca (or anyone else)- anything to add here?
19:12:18 <nitzmahone> I'd have to go look, but I suspect `meta: clear_host_errors` will need some special-casing to deal with it
19:12:40 <sdoran> Yes, but it should clear errors for delegated hosts as well.
19:12:46 <sdoran> That makes the most sense to me.
19:12:53 <sdoran> (At this moment in time ;) )
19:13:05 <nitzmahone> yeah- that'd be the part that'd probably need to be added, since I *think* it only uses the current play's host list
19:13:37 <nitzmahone> OK, unless there's anything else on that one...
19:13:42 <nitzmahone> #topic open floor
19:14:03 <sdoran> I'll leave a comment on the issue.
19:14:40 <nitzmahone> cool, thanks
19:14:58 <nitzmahone> We'll close in 3min if no new topics...
19:16:42 <felixfontein> oh, there's actually a topic, and I miss it :D
19:17:04 <nitzmahone> Anything to add on it? ;)
19:17:52 <felixfontein> no, I didn't really think about it yet, so I don't have anything to say :)
19:18:20 <nitzmahone> Feel free to comment on the issue if you come up with anything later :)
19:18:27 <nitzmahone> With that, we'll close til next week. Thanks all!
19:18:31 <nitzmahone> #endmetting
19:18:33 <nitzmahone> #endmeeting