13:53:44 <jimi|ansible> #startmeeting AnsibleFest Developer Conference - Zuul
13:53:44 <zodbot> Meeting started Wed Jun 21 13:53:44 2017 UTC. The chair is jimi|ansible. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:53:44 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
13:53:44 <zodbot> The meeting name has been set to 'ansiblefest_developer_conference_-_zuul'
13:54:17 <jimi|ansible> #chairs jimi|ansible mordred thaumos gundalow
13:54:23 <jimi|ansible> #chair jimi|ansible mordred thaumos gundalow
13:54:23 <zodbot> Current chairs: gundalow jimi|ansible mordred thaumos
13:54:35 <jimi|ansible> ^ if anyone else needs it ping me
13:55:24 <jimi|ansible> #topic Zuul deep dive
13:56:16 <jeblair> #link https://public.etherpad-mozilla.org/p/ansible-summit-june-2017-Zuul
13:56:33 <jimi|ansible> #info zuul components - scheduler / nodepool / executors / nodes
13:56:57 <jtanner> i'm listening
13:56:59 <jimi|ansible> #info content is normally run on nodes, not executors
13:57:07 <jimi|ansible> #chair jtanner
13:57:07 <zodbot> Current chairs: gundalow jimi|ansible jtanner mordred thaumos
13:57:14 <jimi|ansible> #chair samdoran
13:57:14 <zodbot> Current chairs: gundalow jimi|ansible jtanner mordred samdoran thaumos
13:57:36 <jimi|ansible> #info jobs can run in two modes: trusted and untrusted
13:57:40 <P-NuT> Any idea if they are going to announce open tower tomorrow at ansiblefest?
13:57:42 <gundalow> For people following remotely https://bluejeans.com/2413805790
13:59:46 <jtanner> P-NuT: i doubt anyone knows that now, if it's true
14:00:46 <jimi|ansible> #info there's a github app to add the zuul integration
14:01:54 <gundalow> Out of interest who is following this remotely?
14:02:06 <jimi|ansible> #info lots of things can trigger zuul jobs - commits, comments, etc
14:03:03 <shertel> gundalow I'm following remotely
14:03:04 <abadger1999> /me hopes that mattclay is following remotely :-)
14:03:17 <jimi|ansible> #info job definitions are stored in the repo, and can be run via any instance of zuul
14:03:41 <gundalow> shertel: :)
14:04:53 <abadger1999> #info design philosophy is that you should be able to configure so that only the system merges, not people.
14:05:19 <jlk> #info a combination of human approval and tests passing result in changes merging.
14:06:05 <abadger1999> The more that we can configure into zuul wrt automerge is probably better.
14:06:13 <Shrews> gundalow: i am following, but mostly out of curiosity to see if mordred can limit himself to an hour
14:06:25 <gundalow> Shrews: haha
14:06:34 <jlk> So we can make a "gate" requirement be a review specifically from the ansibot user.
14:06:38 <abadger1999> Shrews: Which bets are you covering?
14:06:40 <abadger1999> ;-)
14:06:53 <jimi|ansible> hah
14:07:23 <bcoca> ~ +20mins .. as long as no one points it out
14:07:31 <bcoca> ^ down for $20
14:08:42 <abadger1999> jlk: what happens when a trigger fails to be sent?
14:08:45 <jtanner> i didn't hear why my name was mentioned
14:09:12 <abadger1999> jtanner: mckerr was asking who on the zuul side he should make sure was talking with you
14:09:18 <jlk> abadger1999: well, if zuul misses an event that github sends, it obviously doesn't do the work
14:09:23 <abadger1999> About how ansibullbot would integrate with zuul
14:09:33 <abadger1999> jlk: k. So there's no way that it catches up?
14:09:37 <jlk> abadger1999: but, we have that comment trigger of "recheck" so that any human can kick it into responding
14:09:40 <gundalow> #action McKerr to get Jessy and jtanner to speak to each other
14:09:46 <abadger1999> jtanner: tentative conclusion was that your counterpart is jlk.
14:09:50 <jlk> abadger1999: not really.
Github is a try once and fail.
14:10:04 <jlk> oh we've spoken numerous times about github things :D
14:10:35 <abadger1999> jlk: <nod> Yeah ... I believe tanner has to do a bunch of polling to make sure that we catch up on missed events.
14:10:44 <jlk> abadger1999: the model here is to allow "potential" trigger by many things. Comment, status, review, etc.. But also have a pipeline requirement that certain things are in place.
14:10:50 <jtanner> if zuul remains 100% event/hook based, it'll have many of the same problems the other ci systems have
14:10:59 <jtanner> hooks don't fire or get lost A LOT
14:11:04 <jeblair> and if something systemic goes wrong, there are admin commands to out-of-band trigger events; so you could retrigger all open prs or something...
14:11:15 <jlk> so if Zuul was blocked by having a positive review, and we miss the positive review event, a comment can cause zuul to re-evaluate the change and potentially let it in
14:11:50 <jimi|ansible> as long as zuul triggers on comments and other things, it shouldn't be a problem to retrigger jobs on an as-needed basis
14:11:59 <jimi|ansible> so it must do some kind of polling?
14:12:03 <jimi|ansible> ^ jlk?
14:12:07 <jlk> doesn't poll
14:12:11 <jlk> a comment generates an event
14:12:25 <jimi|ansible> ahh, didn't realize comments generated events
14:12:32 <abadger1999> #info migration proposal: Step 1: have openstack zuul trigger some tests off of ansible/ansible commits.
14:12:32 <jlk> an event causes zuul to query the change
14:13:07 <abadger1999> #info migration proposal Step 1: Jobs would be defined in ansible/ansible repo
14:13:41 <jlk> jtanner: short term, ansibot could evaluate time between something happening in a PR and zuul responding to it, and if the time is too long, it can issue a 'recheck' comment.
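A minimal sketch of the fallback jlk describes at 14:13:41: a bot compares the time of the newest PR event with the time of Zuul's last response and decides whether to post a 'recheck' comment. The function name, the 30-minute threshold, and the way timestamps are obtained are all assumptions for illustration; nothing here is ansibullbot's or Zuul's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold; the meeting did not name a value.
RECHECK_AFTER = timedelta(minutes=30)


def needs_recheck(
    last_pr_event: datetime,
    last_zuul_response: Optional[datetime],
    now: datetime,
    threshold: timedelta = RECHECK_AFTER,
) -> bool:
    """Return True when Zuul has not reacted to the newest PR event in time."""
    # Zuul already responded at or after the newest event: nothing to do.
    if last_zuul_response is not None and last_zuul_response >= last_pr_event:
        return False
    # Otherwise only nudge once the event has gone unanswered for too long.
    return now - last_pr_event > threshold


event = datetime(2017, 6, 21, 14, 0, tzinfo=timezone.utc)
if needs_recheck(event, None, event + timedelta(hours=1)):
    print("post a 'recheck' comment on the PR")
```

Recording how often this fires would also give the delivery-reliability metric jeblair mentions at 14:14:32.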
14:13:57 <jtanner> we do that now with shippable
14:14:02 <jtanner> it adds a needs_ci label
14:14:13 <jlk> yup, zuul could react to that
14:14:28 <jtanner> but someone still has to go fire a new hook in shippable's case
14:14:32 <jeblair> that could also help us collect metrics on event delivery reliability
14:14:39 <jtanner> click rebuild in the UI OR close/reopen PR
14:14:42 <jlk> ah
14:14:49 <jlk> so the label being applied creates an event
14:14:53 <jlk> zuul could react to the event to retrigger
14:14:57 <jtanner> bot will do it at some point, but requires my free time first
14:15:05 <abadger1999> #info migration proposal: Step 1.5: ansible-container can use bonnieCI (zuul v3 running for ibm) instead of travis.
14:15:52 <jtanner> so one aspect of ansible-test + shippable is that the community can pretty much run an identical test path locally
14:16:00 <abadger1999> #info migration proposal: Step 2: Operations: Who runs zuul instance? Where, zuul control? Where, zuul build resources? When: timeline for migration?
14:16:17 <jtanner> if we add/switch to zuul, will we be able to maintain that local testing path?
14:16:27 <abadger1999> #info migration proposal: Step 2: existing repos for shared jobs
14:16:32 <abadger1999> #undo
14:16:38 <jlk> jtanner: as I understand it yes? Locally people aren't touching shippable, right?
14:16:41 <abadger1999> #info migration proposal: Step 2: existing repos for shared jobs which can help us get started
14:16:45 <jlk> they're just running the test script?
14:16:59 <abadger1999> jimi|ansible: Shoot -- I'm not chaired.
14:17:02 <jlk> depending on _how_ you design your jobs, you should be able to do that with zuul as well.
14:17:02 <jtanner> the test script(s) spin up the same containers + env that shippable uses
14:17:06 <jimi|ansible> gah
14:17:07 <abadger1999> jimi|ansible: so all my #infos haven't been recorded
14:17:08 <jimi|ansible> #chair abadger1999
14:17:08 <zodbot> Current chairs: abadger1999 gundalow jimi|ansible jtanner mordred samdoran thaumos
14:17:22 <jlk> jtanner: that model can be carried over to zuul
14:17:24 <jimi|ansible> copy/paste?
14:17:26 <jtanner> k
14:17:30 <abadger1999> #info design philosophy is that you should be able to configure so that only the system merges, not people.
14:17:37 <abadger1999> #info migration proposal: Step 1: have openstack zuul trigger some tests off of ansible/ansible commits.
14:17:41 <jlk> jtanner: Zuul can give you a VM that has docker in it, and your job is just executing the script
14:17:43 <abadger1999> #info migration proposal Step 1: Jobs would be defined in ansible/ansible repo
14:17:49 <abadger1999> #info migration proposal: Step 1.5: ansible-container can use bonnieCI (zuul v3 running for ibm) instead of travis.
14:17:50 <misc> (can't hear anything from the room)
14:17:57 <jimi|ansible> it'd be nice if zodbot sent you a message when you try and do a # command and aren't chaired
14:17:58 <misc> (well, the person who asked a question)
14:18:00 <abadger1999> #info migration proposal: Step 2: Operations: Who runs zuul instance? Where, zuul control? Where, zuul build resources? When: timeline for migration?
14:18:04 <abadger1999> #info migration proposal: Step 2: existing repos for shared jobs which can help us get started
14:18:20 <jimi|ansible> misc: sorry i missed the question too as i was typing here
14:18:33 <jlk> misc: it was a clarification of the "who" bits
14:18:35 <jlk> which monty restated
14:18:36 <abadger1999> #info Who? Ansible or RH Software Factory team or partner with IBM (use Bonnie CI)?
14:21:56 <gundalow> misc: Can you hear now?
14:22:47 <misc> gundalow: yup, that was people not speaking in the mic
14:22:52 <misc> I can hear monty fine
14:23:02 <misc> (and yanis I guess before ?)
14:23:10 <gundalow> misc: ah, OK
14:25:28 <misc> lost video :)
14:25:49 <jeblair> so did we
14:25:54 <jeblair> in room
14:26:26 <jtanner> and sound
14:26:33 <jlk> network was dropped
14:26:51 <jtanner> bluejeans rate limiting =P
14:27:11 <jimi|ansible> presenting laptop lost network (and/or bluejeans is having issues)
14:27:21 <misc> yeah, that happen
14:28:44 <Shrews> bummer. the dirty hacks part was the most interesting portion
14:29:02 <jeblair> we are paused while we try to resolve the av issues
14:29:10 <misc> back \o/
14:29:22 <jtanner> magic
14:29:24 <jlk> can you hear us now?
14:29:30 <shertel> yes
14:29:31 <misc> yes
14:29:31 <Shrews> we hear
14:29:31 <jtanner> yes
14:29:32 <misc> I can hear you
14:29:46 <misc> we do not see
14:30:02 <jlk> the screen sharing appears to be back on
14:30:20 <misc> yeah
14:30:35 <gundalow> .
14:30:50 <gundalow> Can you hear and see now?
14:31:05 <abadger1999> #info dirty hacks (1): log streaming for command/shell tasks
14:31:11 <bcoca> no, but i just closed browser
14:31:35 <jtanner> hah
14:31:39 <misc> meeting end
14:31:39 <jtanner> time's up
14:31:43 <jlk> what the hell...
14:31:45 <misc> because moderator crashed...
14:31:50 <bcoca> ha
14:31:51 <misc> so it stop after 5 minutes
14:32:05 <jlk> finding Robyn
14:32:41 <jtanner> "How would you rate the overall quality of this meeting?" ... "meeting was affected"
14:32:50 <jimi|ansible> rejoin
14:33:19 <gundalow> Can you hear and see now?
14:33:22 <misc> nope
14:33:29 <misc> I can see, but not hear
14:33:40 <misc> you are on mute
14:34:10 <misc> now I can hear
14:34:15 <abadger1999> #info monty explaining that they have to fork a daemon process to stream command output
14:34:59 <abadger1999> #info monty explaining that the command module has been forked to stream data to zuul_console (the daemon process)
14:36:21 <abadger1999> #info monty explaining that controller side, there's a zuul-stream callback plugin that intercepts stdout from the command and spawns streaming client thread, logs lines.
14:36:50 <bcoca> DO NOT CROSS THE STREAMS!
14:36:57 <abadger1999> #info explaining that the logs can then be streamed to a client via a finger protocol.
14:37:00 <bcoca> MT
14:37:36 <jtanner> "forked the command module" ... still not as bad as "we flipped the module + connection plugin relationship" =P
14:37:51 * jtanner looks at networking folks
14:37:55 <jlk> hahaha
14:38:19 <bcoca> jtanner: also flipped action plugin
14:38:30 <jtanner> it's flips all the way down
14:38:53 <jtanner> new module: realtime_command
14:39:21 <jtanner> run_command_live ...
used to be a thing
14:40:03 <jtanner> p.communicate() is the first hurdle
14:40:24 <jtanner> p = Popen()
14:40:31 <jtanner> (so, se) = p.communicate()
14:42:03 <jeblair> http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/library/command.py?h=feature/zuulv3
14:42:11 <jeblair> is the source for the current forked module
14:42:30 <jeblair> searching for 'zuul' should find all the places where things have been changed
14:44:47 <abadger1999> #info log streaming brainstorming: For streaming, implement update_json
14:45:09 <abadger1999> #info log streaming brainstorming: For the foring of comman module, add a parameter to allow run_command to use a single pip for stdout and stderr
14:45:24 <abadger1999> #undo
14:45:24 <zodbot> Removing item from minutes: INFO by abadger1999 at 14:45:09 : log streaming brainstorming: For the foring of comman module, add a parameter to allow run_command to use a single pip for stdout and stderr
14:45:32 <abadger1999> #info log streaming brainstorming: For the forking of command module, add a parameter to allow run_command to use a single pipe for stdout and stderr
14:46:09 <abadger1999> #info log streaming brainstorming: Perhaps can implement streaming by modifying what's done with async instead of implementing update_json.
14:46:51 <abadger1999> #info Ansilbe restricted environment
14:46:55 <abadger1999> #undo
14:46:55 <zodbot> Removing item from minutes: INFO by abadger1999 at 14:46:51 : Ansilbe restricted environment
14:47:00 <abadger1999> #info Ansible restricted environment
14:47:36 <jtanner> chroot -> proot -> bubblewrap
14:47:54 <abadger1999> #info zuul uses "bubblewrap" which is a user-space lightweight container without needing to have root to create them.
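On the p.communicate() hurdle raised at 14:40 and the single-pipe idea recorded at 14:45:32: communicate() returns output only after the process exits, so nothing can be streamed while the command runs. A minimal sketch, assuming plain Python subprocess rather than Ansible's run_command or Zuul's forked command module, of reading line by line from one pipe that carries both stdout and stderr. The function name stream_command and the on_line callback are hypothetical.

```python
import subprocess
import sys


def stream_command(argv, on_line):
    """Run argv, passing each output line to on_line as soon as it is read."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same single pipe
        text=True,
        bufsize=1,  # line-buffered reads on our side of the pipe
    )
    with proc.stdout:
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
    # All output has been consumed; collect the exit status.
    return proc.wait()


collected = []
exit_code = stream_command(
    [sys.executable, "-c",
     "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"],
    collected.append,
)
```

Merging the two streams loses the stdout/stderr distinction but keeps the reader simple, since there is only one pipe to drain.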
14:47:54 <jeblair> https://github.com/projectatomic/bubblewrap
14:48:09 <abadger1999> #link https://github.com/projectatomic/bubblewrap
14:48:55 <abadger1999> #chair jeblair jlk bcoca
14:48:55 <zodbot> Current chairs: abadger1999 bcoca gundalow jeblair jimi|ansible jlk jtanner mordred samdoran thaumos
14:48:56 <jtanner> someone in the room needs to turn down their speakers
14:48:57 <jlk> somebody is echoing things back into BJ
14:49:11 <jlk> bcoca: we muted you
14:49:28 <bcoca> im muted on my side?
14:49:38 <jlk> you weren't on the bluejeans side at that time
14:50:53 <abadger1999> #info Look in https://git.openstack.org/cgit/openstack-infra/zuul/zuul/ansible for some of the hacks that are being used.
14:51:15 <bcoca> ^ that and the 'firewall strategy' ...
14:51:53 <jeblair> #undo
14:51:53 <zodbot> Removing item from minutes: INFO by abadger1999 at 14:50:53 : Look in https://git.openstack.org/cgit/openstack-infra/zuul/zuul/ansible for some of the hacks that are being used.
14:52:08 <jeblair> #info Look in http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible?h=feature/zuulv3 for some of the hacks that are being used.
14:52:22 <jeblair> (other url was lacking the v3 branch selection)
14:52:45 <jimi|ansible> boaty mcboatface waits for no one
15:01:54 <bcoca> do untrusted jobs allow envvars/ansible.cfg?
15:05:01 <jlk> #info PLEASE think of novel ways to break out of an Ansible environment, so that we can evaluate them against zuul protections.
15:05:58 * misc will do
15:06:01 <bcoca> get_url + shell + ansible-playbook ...
15:07:02 <jlk> I'm pretty sure we prevent you from downloading new content
15:07:12 <jlk> in an untrusted run
15:07:36 <pabelanger> right, it won't work in untrusted
15:08:00 <jlk> unless you can come up with ways we aren't blocking for downloading content
15:08:01 <pabelanger> but we need to write a test in zuul for it still
15:08:01 <bcoca> unarchive from tarball that is part of job
15:08:09 <abadger1999> jlk: how are you blocking it?
15:08:15 <misc> use dns tunneling
15:08:17 <jeblair> i think we permit downloading content
15:08:33 <misc> but you do not get out of the containment
15:08:34 <jimi|ansible> bcoca: i'm assuming that all jobs use a non-privileged user, so anything run would be the same priv level as the playbook
15:08:43 <jimi|ansible> but that's a good point, testing become escalation
15:08:46 <misc> provided we can already run arbitrary content
15:08:49 <pabelanger> http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/action/unarchive.py?h=feature/zuulv3
15:08:52 <pabelanger> for unarchive
15:08:59 <bcoca> bitcoin does not need privs to run!
15:09:15 <jlk> everything is timed
15:09:21 <misc> yeah, but if the job are killed after some time
15:09:25 <jeblair> pabelanger: yeah, that's just checking the local path i believe
15:09:31 <pabelanger> yup
15:09:52 <jimi|ansible> does a bitcoin miner need a large block of time? could just test as many hashes as possible and exit
15:09:54 <misc> now, if you get auditing of what happen, people will abuse it, but you will know how
15:09:57 <jtanner> we're gonna need a bigger boat
15:10:12 <bcoca> jlk: not really looking at breaking your security, but looking at what would be nice to provide in core for 'secure setups'
15:10:24 <misc> would something like mirai need more of resources ?
15:10:37 <bcoca> ^ i.e running only 'signed' plugins
15:10:40 <jlk> nod
15:10:41 <misc> would a oneoff exploit that do pown a wordpress remotely be blocked ?
15:10:51 <jlk> we have a list of things we allow or disallow running
15:10:53 <misc> (since that's basically "uri" )
15:11:25 <bcoca> jlk: once 'signed' you can avoid having cp ../url library/copy
15:11:28 <jimi|ansible> i think we're proposing things that you could do on travis/shippable/et. al now
15:11:38 <jlk> well
15:11:49 <jlk> the difference is that on travis, all those things happen in the VM they give you
15:12:02 <jlk> we're talking about the things that run ON our control plane, not in a VM we launch for you
15:12:09 <bcoca> well, these features would be for 'centralized/controlled' environments
15:12:18 <misc> (and on travis, that's SEP)
15:12:33 <jlk> ansible-playbook runs on the control node, with the VM as the target
15:12:36 <jimi|ansible> well they're in the bubblewrap environment right? and they're things that impact remotes
15:12:38 <jlk> so we want in untrusted runs to prevent local execution of things
15:12:48 <jimi|ansible> though honestly with the travis/shippable file you can exec commands just as easily
15:12:51 <jlk> yeah they're in bubble wrap
15:13:49 <misc> so, if that run on amazon, it has access to the api ?
15:14:43 <misc> (kinda like https://media.ccc.de/v/33c3-7865-gone_in_60_milliseconds )
15:15:37 <jlk> so
15:15:39 <jlk> secrets
15:16:04 <jlk> we would design the pipelines so that the secret is not available in the "check" pipeline (the one that automatically runs on PR open)
15:16:32 <misc> (also related: https://hackernoon.com/capturing-all-the-flags-in-bsidessf-ctf-by-pwning-our-infrastructure-3570b99b4dd0 )
15:16:32 <jlk> and the secret is only available in a pipeline that requires human review before starting
15:16:39 <jlk> so if a human reviews it and missed the fact that it uses the secret to eat all your $$, then sure.
15:16:51 <jlk> anyway we're disconnecting to go drinking.
15:16:54 <bcoca> said human gets his pay docked?
15:17:05 <bcoca> jlk: that is always the correct reason
15:27:35 <gundalow> #endmeeting
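The secrets rule jlk outlines at 15:16 (no secret in the automatic "check" pipeline, secrets only in a pipeline that starts after human review) can be sketched as a small decision function. This only illustrates the rule as stated in the log; the Pipeline class, its field names, the "gate" pipeline name, and secrets_for are hypothetical and are not Zuul's configuration model or API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical stand-in for a pipeline definition."""
    name: str
    requires_human_review: bool


def secrets_for(
    pipeline: Pipeline, approved_by_human: bool, secrets: Dict[str, str]
) -> Dict[str, str]:
    """Return the secrets a job may see, following the rule from the log."""
    # Secrets are exposed only when the pipeline itself demands review
    # AND a human has actually approved this change.
    if pipeline.requires_human_review and approved_by_human:
        return dict(secrets)
    return {}


check = Pipeline(name="check", requires_human_review=False)
gate = Pipeline(name="gate", requires_human_review=True)
vault = {"cloud_api_key": "example-value"}

visible_in_check = secrets_for(check, approved_by_human=True, secrets=vault)
visible_in_gate = secrets_for(gate, approved_by_human=True, secrets=vault)
```

As jlk notes, this moves the risk to the reviewer: a change that misuses the secret still gets it once a human approves.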