17:25:11 #startmeeting Testing Working Group
17:25:11 Meeting started Thu Mar 30 17:25:11 2017 UTC. The chair is gundalow. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:25:11 Useful Commands: #action #agreed #halp #info #idea #link #topic.
17:25:11 The meeting name has been set to 'testing_working_group'
17:25:20 #chair mattclay
17:25:20 Current chairs: gundalow mattclay
17:25:24 * mattclay waves
17:25:30 hey :)
17:25:37 bjornar_: You got anything?
17:25:59 What happened with testing procedures wrt the 2.2.2.0 release?
17:26:18 #topic Testing Procedures
17:26:22 In what regard?
17:26:50 What testing procedures are run before an Ansible release?
17:27:02 ah, OK
17:27:16 Tests are run on PRs and after each commit, rather than prior to a release.
17:27:49 So there is no extended testing before a stable release? We are basically the testers?
17:28:23 bjornar_: tests run when we cut a release also
17:28:24 RCs are also tested by the Ansible Tower team
17:28:59 Sorry, I didn't release you were referring to manual testing. I was thinking purely of the automated tests.
17:29:07 s/release/realize/
17:29:34 I am for example thinking about the 2.2.1.0 release, where expansion of the vars[] var changed entirely.
17:29:54 Also, the 2.2.2.0 release had issues with executing the simplest playbook (no groups)
17:30:34 We would like to contribute tests for these and perhaps other issues, but need to know how the process is supposed to work.
17:30:43 bjornar_: we dont test 'all behaviours possible'
17:30:43 Do you have links to specific issues so we can look at the details?
17:30:56 first case was a CVE and passed tests, we just did not have a test for your use case
17:30:57 Thanks for your offer to help
17:31:00 At the moment, it feels like we need to both write playbooks/roles and tests for same
17:31:09 bjornar_: it would be fantastic if you could help coverage with automated tests
17:31:12 2nd release also passed tests, we did not have inventories in tests that had the 'ungrouped' problem
17:31:33 as far as procedure is concerned, the tests are submitted as PRs like anything else, on a continuous basis
17:32:35 I am for example talking about this issue from 2.2.0.0 -> 2.2.1.0 (https://github.com/ansible/ansible/issues/22913)
17:32:37 had a test case for the 2nd in devel but did not 'backport the tests' to 2.2
17:33:37 bjornar_: tests live in the test/ subdirectory of the repo. The units and integration subdirectories are where you'd likely be concentrating on adding more.
17:33:42 bjornar_: understood, but that was a security bugfix that happened to clobber a 'feature' we were not aware of; that will happen when you use it in ways we dont expect
17:34:20 bcoca, I also find it hard to find documentation about how vars[] is supposed to work
17:34:22 ^ recursive templating, being that 'feature'
17:35:10 bjornar_: because 'vars' is internal and was not meant for users, hostvars and 'bar' were
17:35:40 that some people figured out 'vars' works, fine, but i've warned plenty of times to not rely on that when i encountered it
17:36:04 ^ again, we cannot test usage of 'features' we did not document nor expect
17:36:37 bcoca, sometimes you need to use vars.. and anyway this was a behavior change. I am now wondering: what should the functionality be?
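For readers following along, the pattern bjornar_ is describing looks roughly like the sketch below. This is a minimal, hypothetical illustration, not taken from the meeting: the play and the variable names (ip_mysql, ip_rabbitmq) are invented, and it relies on the undocumented exposure of vars that bcoca cautions against.

```yaml
# Hypothetical sketch of the vars[] indirection under discussion:
# looking up a variable whose name is constructed inside a loop.
# All names are invented; 'vars' exposure is the undocumented
# behaviour being debated.
- hosts: localhost
  gather_facts: false
  vars:
    ip_mysql: 10.0.0.10
    ip_rabbitmq: 10.0.0.20
  tasks:
    - name: look up a variable by a dynamically built name
      debug:
        msg: "{{ vars['ip_' + item] }}"
      with_items:
        - mysql
        - rabbitmq
```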
17:37:07 bjornar_: that is a better question; i would argue you dont really need to use vars, but that is another issue
17:37:31 bcoca, when in a loop and so on, the only place a var is available, unfortunately, is vars
17:38:15 https://gist.github.com/bcoca/ddae02808f5cb778351bf2dfeb0b76f2 <= same play as bug, working!
17:38:30 bjornar_: show me an example, as i've never encountered that
17:38:43 bjornar_: actually belay that, tangent on testing
17:38:54 diff topic, bring up in core meeting or create a proposal
17:39:01 yeah, sure
17:39:14 dont want to hijack this, testing group has plenty on its plate
17:39:52 But its related to testing: how can I write and contribute tests that are vital to our quite complex suite of roles/playbooks if we really dont know how it is supposed to work, just that it does ;)
17:40:07 bjornar_: It sounded like you may have some spare cycles to help us extend our testing, which would be great. What areas were you interested in?
17:40:32 bcoca, seems they ended the meeting before it started, so I dont think its that much, unfortunately
17:40:51 gundalow, My only interest is keeping my own playbooks safe to play, really..
17:40:51 bjornar_: The meeting is running again now
17:40:56 * gundalow points to the topic
17:41:11 It was killed while I was sat in this room talking to myself for 15 minutes
17:41:16 And if that means we need to contribute tests, so be it.
17:41:39 OK
17:42:02 bjornar_: tests are one thing, unsupported functionality is another
17:42:03 If you look at test/integration/targets you will see various integration tests
17:42:10 Yup,
17:42:16 docs & tests need to lign up
17:42:20 lighn up*
17:42:22 ffs
17:42:23 line
17:42:26 thanks
17:42:34 The easiest way to add integration tests is to create test roles in `test/integration/targets` as gundalow mentioned.
17:42:34 gundalow: been there, do that every 10s, for once ... ITS NOME!!?!?!
17:43:00 Or add to existing roles if appropriate.
17:43:23 but please, dont add tests for functionality that we dont support https://xkcd.com/1172/
17:43:47 bcoca, thats the big question, then .. support..
17:43:50 For example, if you're testing a specific module, you'll want to look for an existing `test/integration/targets/MODULE`.
17:44:58 bjornar_: this was a bug fix, it changed 'unwanted' behaviour, you were just relying on it; that will always happen
17:45:16 ^ the issue is identifying wanted/unwanted and then supporting what is left
17:45:30 bcoca, I would say one could write tests for anything, more or less, as long as it does not fail at the time of writing the test. If it fails at a later time, its a behavior change, documented or not, and should be dealt with either by announcing the change and deleting the test or by modifying behavior. At least you would know
17:46:11 One must not forget that the stability of ansible is vital to many datacenters these days
17:46:18 i would argue that decision should be made when accepting the test: 'is this behaviour we want'
17:46:30 otherwise you have most people 'obeying the test' even if it is unwanted behaviour
17:47:06 and we protect against that by the fact that the module maintainers usually review tests
17:47:07 if i test that ansible deletes /dev when run as root, should that test be accepted just cause it works?
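To make mattclay's suggestion above concrete: a new integration test target is essentially a role under test/integration/targets/. The sketch below is a minimal, hypothetical example; the target name my_feature and the behaviour it asserts are invented, and a real target would also need to be wired into the integration test runs before CI executes it.

```yaml
# Hypothetical file: test/integration/targets/my_feature/tasks/main.yml
# The target name and the behaviour asserted are invented for
# illustration only.
- name: exercise the behaviour under test
  set_fact:
    greeting: "{{ 'hello' | upper }}"

- name: assert the current, agreed-upon behaviour
  assert:
    that:
      - greeting == 'HELLO'
```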
17:47:14 bcoca, yeah -- perhaps, I dont know. If you change the behavior of something that not only I use (vars[]), then that should qualify for a changelog entry and a warning
17:47:19 gundalow: in this case its core
17:47:32 aye, was about to say, for core features that's a bit harder
17:47:49 bcoca, because all my mysql-servers, rabbitmq-servers and so on ended up without an ip and with the string "{{ some_variable_that_used_to_expand }}"
17:47:52 bjornar_: we do that for supported functionality, you just happened to use a hack; we cannot guarantee all of these work the same as we are not aware of them
17:48:46 bcoca, I'll send you the playbook that shows what I do privately. its basically some hacks around lacking func atm, yes, but I mean, its not ugly.
17:49:01 bjornar_: i cannot be responsible for you freezing cause i fixed the 'keyboard overheating' problem
17:49:12 bcoca, cut the crap
17:49:50 its not crap, behaviour WILL change with every patch; we can only be responsible for the behaviour we support, not for unknown hacks X and Y are using
17:50:36 so if you say you dont support vars[] you should not expose it
17:50:43 its not reasonable for us to know in advance all possible permutations of hacks others are engaged in, to properly document changes in behaviour we dont know about
17:50:59 bcoca, thats where the tests and this meeting come in
17:51:22 goes back to my point: that it 'works' does not mean it 'should work'
17:51:44 so accepting the tests blindly is also not an option, we need to decide IF we want it to keep working that way
17:51:46 bcoca, does it really matter if I use functionality that is exposed "correctly" according to the developer's mind, or does it matter that the behavior of the exposed func does not change?
17:52:31 bcoca, Yeah, sure -- I agree to some extent. But still, ...behavior change
17:52:31 if i change a screwdriver grip to deal with sweaty hands, but you complain it doesnt work like a hammer anymore, what do you want me to do?
17:52:40 i think every case will be unique and there's no point in arguing the general case
17:52:45 bcoca, again, cut the crap
17:53:29 either you need to unexpose vars and say its not supported, or you need to keep its behavior unchanged, thats my pov
17:53:34 ok
17:53:39 so how can we move this on
17:53:47 again, you need to remember that stability of ansible over time is the prime target
17:53:49 dunno what "this" is
17:54:20 1) Ansible may be exposing things that people shouldn't use, we should maybe look at better naming, such as "_"
17:54:21 At this point, I would claim that this has exceeded the purpose of the ansible-testing meeting, and we should move on and propose a specific discussion for the core meeting
17:54:30 Yup
17:54:39 the vars[] issue is just an example here, so dont get too hung up on it.
17:54:39 * gundalow invokes chair powers
17:55:01 So there are a few issues here
17:55:11 bjornar_: do you have a PR or a proposal open?
I can't make a whole lot of sense of what just transpired
17:55:27 1) Why is "{{vars}}" accessible if it isn't meant to be used
17:55:34 jtanner: https://github.com/ansible/ansible/issues/22913
17:55:36 iirc
17:55:44 thanks sivel
17:56:35 2) what can we write tests for, current behavior or the behavior that the devs have inside their heads
17:57:11 - I vote current behavior, as long as its exposed without plugins
17:57:46 Answer 2: Tests get created and as part of the review process they get reviewed by the relevant people (so module maintainers, or developers that work on that area of the code)
17:57:54 During the process of creating the test you may find
17:57:59 b) Bugs in the code
17:58:07 a) Bugs in the documentation
17:58:22 Before we can decide how to test the behavior, we have to determine what the correct behavior is. In this specific case, we need to determine that, and that is likely to be done in the core dev meeting
17:58:29 bjornar_: i think tests are always welcome, as long as they assert desired functionality (like bcoca was mentioning). In this case, it seems like the "desired functionality" is the current debate, which isn't really within scope of the testing meeting.
17:58:33 gundalow, but not being allowed to write tests on current behavior would imply unknown behavior?
17:58:43 sivel jinx
17:58:48 And it's fine to fix (a) and (b) in the same PR that adds tests, and get them reviewed as a block
17:59:38 jtanner, I think it is in scope: current behavior vs some undocumented "someone might know what they thought when they programmed it" behavior
17:59:43 Personally I write tests based on current behaviour. If I spot something that doesn't look right I look at the code and/or speak to someone else
18:00:11 bjornar_: yes, I believe at the moment, all we can say is that it is unknown behavior. But this is not the correct meeting to come to a decision on what the correct behavior is
18:00:12 This is perhaps the political discussion of this
18:00:37 a bug is reported, and no decision has been made around what the correct behavior is at this time
18:00:57 bjornar_: yeah, there's nothing wrong with that statement
18:01:20 once we have come to that decision, we can discuss adding tests to effectively document and ensure the behavior
18:01:21 when important bugs are fixed we try to add in tests to defend that logic
18:01:23 sivel, its not about that single issue, its the general idea of it
18:01:53 I have not seen a discussion topic proposed that is unlinked from that issue
18:02:05 can you clarify exactly what the discussion topic should be?
18:02:09 bjornar_: in general you write a test and the person who knows that area of the code reviews and decides whether to merge it.
18:02:18 yup
18:02:34 bjornar_: that's the general procedure... no need to ask permission in general.
18:02:48 What's the Python thing?
18:02:56 best to do, then ask for forgiveness
18:03:20 Ok, that's the hour
18:03:26 and I have one other topic on the agenda
18:03:58 ask for forgiveness, not for permission?
18:04:11 sivel: That's the one
18:05:31 ok, as I don't think this topic is progressing I'm moving onto the next item
18:05:36 sounds like time to move on to that next topic?
18:05:42 .............................
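As a rough illustration of gundalow's "write tests based on current behaviour" approach, a behaviour-pinning test for the symptom bjornar_ described (a value arriving as a literal, unexpanded template string) might look like the sketch below. The play and variable names are hypothetical, not taken from issue #22913.

```yaml
# Hypothetical behaviour-pinning test: assert that one level of
# variable indirection actually expands, instead of leaving the raw
# template string behind. All names are invented for illustration.
- hosts: localhost
  gather_facts: false
  vars:
    real_ip: 10.0.0.10
    server_ip: "{{ real_ip }}"  # indirection similar to the reported bug
  tasks:
    - name: pin the currently observed (expanded) behaviour
      assert:
        that:
          - server_ip == '10.0.0.10'
```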
18:06:08 #topic How to deal with PRs that fail CI *after* merge (but passed PR CI)
18:06:16 #info https://github.com/ansible/community/issues/114#issuecomment-289768234
18:06:26 Consider the case:
18:06:26 PR created
18:06:27 Reviews/updates
18:06:27 Still passes CI
18:06:27 Time passes
18:06:29 CI is made stricter (e.g. pep8 or validate-modules)
18:06:31 PR merged (as it's green)
18:06:34 Merge now fails updated tests
18:06:36 ..
18:06:39 Seen that a few times
18:06:41 mattclay: ^
18:06:43 The short answer is to fix them quickly. There isn't anything we can do with GH to prevent this from happening.
18:06:59 #info The short answer is to fix them quickly. There isn't anything we can do with GH to prevent this from happening
18:07:23 We can reduce the chances of it occurring by periodically re-running CI if it's "too old", but that only reduces the problem. It won't eliminate it.
18:07:37 yeah, this problem is endemic to how GH works
18:07:53 we can narrow (and mattclay has) the window, but not eliminate it
18:08:36 I have on my list to look at automating re-running of CI for PRs which meet some criteria for CI status being too old. It's not very high on my list though.
18:09:11 Apparently there are 829 PRs with status:success
18:09:34 hum, which includes some Travis ones
18:09:36 and a lot of those probably have merge conflicts =)
18:09:40 Every time we add a new test, there's a chance it can cause new failures on any open PR. We have too many PRs to just run CI again every time tests change.
18:10:06 Yeah, if they have merge conflicts, we can't re-run CI.
18:10:23 take away needs_rebase and it's 615
18:10:43 If they're pre-Shippable we can't re-run CI either.
18:10:57 needs_rebase should cover those
18:11:07 Though rerunning CI, people looking at it, and no one having the time to get it through the PR process isn't going to be any good
18:12:07 Still seeing PRs from Nov 2014 without needs_rebase
18:12:24 oh, they've been updated
18:12:47 some people do care
18:12:54 ah, it's good all the Travis ones seem to have needs_rebase
18:12:56 jtanner: I know
18:13:10 if you find any that have travis but not needs_rebase, that's a bot bug
18:13:15 We could have the bot label "stale CI" issues with a `stale_ci` label which would block auto-merge. Humans could use that as a cue to re-run CI before manually merging PRs too.
18:13:49 the only problem with that is how many times it'll get a re-run before someone actually decides to merge it
18:13:56 like that 2014 one ...
18:13:58 i would argue that we should rerun ci before merging if the PR is older than a week
18:14:04 ^ just in general
18:14:05 Given how few automerges there currently are I'd be tempted to get the bot to rerun CI before merging
18:14:08 bcoca: +1
18:14:31 is the re-run sufficient if they don't also rebase?
18:14:37 +1 to bot rerunning before merge, i would also put a time period on it
18:14:38 jtanner: That's why it should just be a label, rather than actually running CI. Re-running CI would require contributor intervention (new commits) or be done manually prior to merging (ie: there's already intent to merge, pending CI passing again).
18:14:40 to bring in current tests
18:15:21 ^ applies to devel of course, as we rarely port tests back to older branches
18:15:32 It's not obvious on GH how old CI is, so having a bot-applied label after a set time (bcoca suggests a week) would make it easier to make that determination.
18:15:58 mattclay: it's the date of the last commit
18:16:08 ^ well, normally, unless someone reran manually
18:16:15 yes, manual is the norm
18:16:20 jtanner: Shippable rebases when running CI, so a rebase is only needed for pre-Shippable PRs and merge conflict PRs.
18:16:29 if it's a problem, the bot can paste in the ci_status data
18:16:34 jtanner: shouldnt be, sans OSX acting up
18:17:27 i don't normally shy away from bot work ... but i'd like to see this one thought out more before i start writing it
18:17:39 gundalow: Are you thinking that the bot should trigger CI on stale PRs before it auto merges?
18:18:26 mattclay: It was an idea off the top of my head, I hadn't worked the idea through in my mind yet
18:19:41 jtanner: nod, I wouldn't want you to end up implementing something that doesn't make sense
18:20:51 If the main take away from this discussion is "Rerun CI on any PR that's older than a week before hitting merge", then I think that's a good step forwards
18:20:56 oh + "Pay attendion
18:21:08 + "Pay attention to #shippable"
18:21:11 gundalow: +1
18:21:48 s/PR that's older than a week/PR with last commit older than a week/
18:21:54 ah, yes
18:22:11 Which is a quick way to approximate when CI last ran.
18:22:19 write up a feature request for the bot please
18:22:39 the bot will know the last ci run
18:22:52 and the last commit date
18:23:39 #action gundalow to email Core and tell them to check date of last commit before merging, if > 1w, rerun CI. Add to triage
18:24:10 I think we should see how well that manual process works before having the bot help out.
18:24:17 Sure
18:24:33 #agreed rerun CI if last commit is over a week old
18:25:00 I'm happy with that
18:25:00 the list of triage responsibilities is already so long that i don't think you are going to see much change
18:25:09 and more people than just the core team can merge PRs
18:25:28 hum
18:25:28 It shouldn't really be a triage thing, since it needs to be done when merging, not triaging.
18:25:33 true
18:25:41 #action gundalow ignore triage comment
18:25:52 gundalow ignores gundalow
18:26:25 jtanner: plan?
18:26:28 adding a stale_ci label is fairly easy
18:26:36 One way to make it more automated would be to use bot commands like "merge". The bot could re-run CI if stale, and then merge for you.
18:26:36 stale_ci
18:26:44 Then you could "fire and forget" on a merge.
18:27:01 Instead of re-running CI and coming back to the PR later to make sure it passed before merging.
18:27:05 mattclay: that is true ... it should probably be integrated with "shipit" though
18:27:13 jtanner: Yeah.
18:27:21 stab over IP if CI fails on the merge?
18:27:37 wut?
18:28:55 lol
18:29:13 If you tell the bot to merge and the merge causes CI to fail it should notify you
18:29:15 Sounds like the next hot April 1 RFC that turns into a real product.
18:29:19 though I think that's future work
18:29:32 after merge, the PR is closed ... bot will no longer look at it
18:29:46 oh
18:29:52 SoIP
18:29:53 jtanner: Is there any reason why we couldn't (or shouldn't) support using "shipit" for any PR?
18:29:53 not right now at least
18:30:13 mattclay: i dont understand the question
18:30:17 SAAS: Stabbing as a Service
18:30:35 I believe currently shipit only works on modules, as we know ownership
18:30:56 will expand to plugins once we add ownership/classification
18:30:58 that is true ...
18:31:02 s/will/can/
18:31:04 jtanner: Could the "shipit" bot command be used to merge non-module/plugin PRs?
18:31:16 it could be, but it's not currently
18:32:04 So if the bot supported "shipit" for all PRs (not just modules/plugins), that would be a start. Then it could auto re-run CI if it's stale before it does a merge. Those two together would help alleviate the problem.
18:33:50 With that you can "shipit" on a PR and forget about it, unless CI fails. In that case the bot will comment on the CI failures.
18:34:04 So you don't have to come back and check on the PR later unless there was a problem.
18:35:01 write it up
18:35:14 it'll still require "education" for our team
18:35:36 Any votes from folks on that approach?
18:35:42 +1
18:36:49 +1 though some people say they already get too much GH email, so may need to look at Slack poking or somesuch, though that's a future feature
18:37:30 take their merge access away?
18:37:38 >=)
18:38:22 Well, that's +3, -0 (counting me)
18:38:37 bcoca: thoughts?
18:42:16 I can write this up. I think it's two separate bot features: 1) re-run stale CI before auto-merge and 2) extend "shipit" to support all PRs
18:42:25 We'd want to have #1 before #2
18:42:34 Yup, splitting it into two bits would be good
18:42:42 I think more thought needs to go into (2)
18:42:45 it'll be a stale_ci label first
18:43:02 identify first, act second
18:43:04 I think the cost vs benefit is better on 1
18:43:20 OK, so 3 features: 1) label stale_ci, 2) gate auto-merge on stale_ci, 3) extend "shipit" to all PRs
18:44:33 #action mattclay to submit bot feature requests for 1) labeling stale_ci, 2) gating auto-merge on stale_ci, 3) extending "shipit" to all PRs
18:46:06 As gundalow suggests, we'll need to work out some details on how we want #3 to work. Perhaps at the next TWG.
18:46:33 nod
18:46:44 Thanks for offering to write that up, mattclay
18:47:34 #info We'll need to work out some details on how we want to extend "shipit" to support all PRs. Perhaps at the next TWG.
18:47:54 Anyone got anything else?
18:53:58 #endmeeting