#ansible-meeting log

13:53:44 <jimi|ansible> #startmeeting AnsibleFest Developer Conference - Zuul
13:53:44 <zodbot> Meeting started Wed Jun 21 13:53:44 2017 UTC.  The chair is jimi|ansible. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:53:44 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
13:53:44 <zodbot> The meeting name has been set to 'ansiblefest_developer_conference_-_zuul'
13:54:17 <jimi|ansible> #chairs jimi|ansible mordred thaumos gundalow
13:54:23 <jimi|ansible> #chair jimi|ansible mordred thaumos gundalow
13:54:23 <zodbot> Current chairs: gundalow jimi|ansible mordred thaumos
13:54:35 <jimi|ansible> ^ if anyone else needs it ping me
13:55:24 <jimi|ansible> #topic Zuul deep dive
13:56:16 <jeblair> #link https://public.etherpad-mozilla.org/p/ansible-summit-june-2017-Zuul
13:56:33 <jimi|ansible> #info zuul components - scheduler / nodepool / executors / nodes
13:56:57 <jtanner> i'm listening
13:56:59 <jimi|ansible> #info content is normally run on nodes, not executors
13:57:07 <jimi|ansible> #chair jtanner
13:57:07 <zodbot> Current chairs: gundalow jimi|ansible jtanner mordred thaumos
13:57:14 <jimi|ansible> #chair samdoran
13:57:14 <zodbot> Current chairs: gundalow jimi|ansible jtanner mordred samdoran thaumos
13:57:36 <jimi|ansible> #info jobs can run in two modes: trusted and untrusted
13:57:40 <P-NuT> Any idea if they are going to announce open tower tomorrow at ansiblefest?
13:57:42 <gundalow> For people following remotely https://bluejeans.com/2413805790
13:59:46 <jtanner> P-NuT: i doubt anyone knows that now, if it's true
14:00:46 <jimi|ansible> #info there's a github app to add the zuul integration
14:01:54 <gundalow> Out of interest who is following this remotely?
14:02:06 <jimi|ansible> #info lots of things can trigger zuul jobs - commits, comments, etc
14:03:03 <shertel> gundalow I'm following remotely
14:03:04 <abadger1999> /me hopes that mattclay is following remotely :-)
14:03:17 <jimi|ansible> #info jobs definitions are stored in the repo, and can be run via any instance of zuul
14:03:41 <gundalow> shertel: :)
14:04:53 <abadger1999> #info design philosophy is that you should be able to configure so that only the system merges, not people.
14:05:19 <jlk> #info a combination of human approval and tests passing result in changes merging.
14:06:05 <abadger1999> The more that we can configure into zuul wrt automerge is probably better.
14:06:13 <Shrews> gundalow: i am following, but mostly out of curiosity to see if mordred can limit himself to an hour
14:06:25 <gundalow> Shrews: haha
14:06:34 <jlk> So we can make a "gate" requirement be a review specifically from the ansibot user.
14:06:38 <abadger1999> Shrews: Which bets are you covering?
14:06:40 <abadger1999> ;-)
14:06:53 <jimi|ansible> hah
14:07:23 <bcoca> ~ +20mins .. as long as no one points it out
14:07:31 <bcoca> ^ down for $20
14:08:42 <abadger1999> jlk: what happens when a trigger fails to be sent?
14:08:45 <jtanner> i didn't hear why my name was mentioned
14:09:12 <abadger1999> jtanner: mckerr was asking who on the zuul side he should make sure was talking with you
14:09:18 <jlk> abadger1999: well, if zuul misses an event that github sends, it obviously doesn't do the work
14:09:23 <abadger1999> About how ansibullbot would integrate with zuul
14:09:33 <abadger1999> jlk: k.  So there's no way that it catches up?
14:09:37 <jlk> abadger1999: but, we have that comment trigger of "recheck" so that any human can kick it into responding
14:09:40 <gundalow> #action McKerr to get Jessy and jtanner to speak to each other
14:09:46 <abadger1999> jtanner: tentative conclusion was that your counterpart is jlk.
14:09:50 <jlk> abadger1999: not really. Github is a try once and fail.
14:10:04 <jlk> oh we've spoken numerous times about github things :D
14:10:35 <abadger1999> jlk: <nod> Yeah ... I believe tanner has to do a bunch of polling to make sure that we catch up on missed events.
14:10:44 <jlk> abadger1999: the model here is to allow "potential" trigger by many things. Comment, status, review, etc..  But also have a pipeline requirement that certain things are in place.
14:10:50 <jtanner> if zuul remains 100% event/hook based, it'll have many of hte same problems the other ci system have
14:10:59 <jtanner> hooks don't fire or get lost A LOT
14:11:04 <jeblair> and if something systemic goes wrong, there are admin commands to out-of-band trigger events; so you could retrigger all open prs or something...
14:11:15 <jlk> so if Zuul was blocked by having a positive review, and we miss the positive review event, a comment can cause zuul to re-evaluate the change and potentially let it in
14:11:50 <jimi|ansible> as long as zuul triggers on comments and other things, it shouldn't be a problem to retrigger jobs on an as-needed basis
14:11:59 <jimi|ansible> so it must do some kind of polling?
14:12:03 <jimi|ansible> ^ jlk?
14:12:07 <jlk> doesn't poll
14:12:11 <jlk> a comment generates an event
14:12:25 <jimi|ansible> ahh, didn't realize comments generated events
14:12:32 <abadger1999> #info migration proposal: Step 1: have openstack zuul trigger some tests off of ansible/ansible commits.
14:12:32 <jlk> an event causes zuul to query the change
14:13:07 <abadger1999> #info migration proposal Step 1: Jobs would be defined in ansible/ansible repo
14:13:41 <jlk> jtanner: short term, ansibot could evaluate time between something happening in a PR and zuul responding to it, and if the time is too long, it can issue a 'recheck' comment.
14:13:57 <jtanner> we do that now with shippable
14:14:02 <jtanner> it adds a needs_ci label
14:14:13 <jlk> yup, zuul could react to that
14:14:28 <jtanner> but someone still has to go fire a new hook in shippable's case
14:14:32 <jeblair> that could also help us collect metrics on event delivery reliability
14:14:39 <jtanner> click rebuild in the UI OR close/reopen PR
14:14:42 <jlk> ah
14:14:49 <jlk> so the label being applied creates an event
14:14:53 <jlk> zuul could react to the event to retrigger
14:14:57 <jtanner> bot will do it at some point, but requires my free time first
14:15:05 <abadger1999> #info migration proposal: Step 1.5: ansible-container  can use bonnieCI (zuul v3 running for ibm) instead of travis.
14:15:52 <jtanner> so one aspect of ansible-test  + shippable is that the community can pretty much run an identical test path locally
14:16:00 <abadger1999> #info migration proposal: Step 2: Operations: Who runs zuul instance?  Where, zuul control? Where, zuul build resources?  When: timeline for migration?
14:16:17 <jtanner> if we add/switch to zuul, will we be able to maintain that local testing path?
14:16:27 <abadger1999> #info migration proposal: Step 2: existing repos for shared jobs
14:16:32 <abadger1999> #undo
14:16:38 <jlk> jtanner: as I understand it yes? Locally people aren't touching shippable, right?
14:16:41 <abadger1999> #info migration proposal: Step 2: existing repos for shared jobs which can help us get started
14:16:45 <jlk> they're just running the test script?
14:16:59 <abadger1999> jimi|ansible: Shoot -- I'm not chaired.
14:17:02 <jlk> depending on _how_ you design your jobs, you should be able to do that with zuul as well.
14:17:02 <jtanner> the test script(s) spin up the same containers + env that shippable uses
14:17:06 <jimi|ansible> gah
14:17:07 <abadger1999> jimi|ansible: so all my #infos haven't been recorded
14:17:08 <jimi|ansible> #chair abadger1999
14:17:08 <zodbot> Current chairs: abadger1999 gundalow jimi|ansible jtanner mordred samdoran thaumos
14:17:22 <jlk> jtanner: that model can be carried over to zuul
14:17:24 <jimi|ansible> copy/paste?
14:17:26 <jtanner> k
14:17:30 <abadger1999> #info design philosophy is that you should be able to configure so that only the system merges, not people.
14:17:37 <abadger1999> #info migration proposal: Step 1: have openstack zuul trigger some tests off of ansible/ansible commits.
14:17:41 <jlk> jtanner: Zuul can give you a VM that has docker in it, and your job is just executing the script
14:17:43 <abadger1999> #info migration proposal Step 1: Jobs would be defined in ansible/ansible repo
14:17:49 <abadger1999> #info migration proposal: Step 1.5: ansible-container  can use bonnieCI (zuul v3 running for ibm) instead of travis.
14:17:50 <misc> (can't hear anything from the room)
14:17:57 <jimi|ansible> it'd be nice if zodbot sent you a message when you try and do a # command and aren't chaired
14:17:58 <misc> (well, the person who asked a question)
14:18:00 <abadger1999> #info migration proposal: Step 2: Operations: Who runs zuul instance?  Where, zuul control? Where, zuul build resources?  When: timeline for migration?
14:18:04 <abadger1999> #info migration proposal: Step 2: existing repos for shared jobs which can help us get started
14:18:20 <jimi|ansible> misc: sorry i missed the question too as i was typing here
14:18:33 <jlk> misc: it was a clarification of the "who" bits
14:18:35 <jlk> which monty restated
14:18:36 <abadger1999> #info Who?  Ansible ir RH Software Factory team or partner with IBM (use Bonnie CI)?
14:21:56 <gundalow> misc: Can you hear now?
14:22:47 <misc> gundalow: yup, that was people not speaking in the mic
14:22:52 <misc> I can hear monty fine
14:23:02 <misc> (and yanis I guess before ?)
14:23:10 <gundalow> misc: ah, OK
14:25:28 <misc> lost video :)
14:25:49 <jeblair> so did we
14:25:54 <jeblair> in room
14:26:26 <jtanner> and sound
14:26:33 <jlk> network was dropped
14:26:51 <jtanner> bluejeans rate limiting =P
14:27:11 <jimi|ansible> presenting laptop lost network (and/or bluejeans is having issues)
14:27:21 <misc> yeah, that happen
14:28:44 <Shrews> bummer. the dirty hacks part was the most interesting portion
14:29:02 <jeblair> we are paused while we try to resolve the av issues
14:29:10 <misc> back \o/
14:29:22 <jtanner> magic
14:29:24 <jlk> can you hear us now?
14:29:30 <shertel> yes
14:29:31 <misc> yes
14:29:31 <Shrews> we hear
14:29:31 <jtanner> yes
14:29:32 <misc> I can hear you
14:29:46 <misc> we do not see
14:30:02 <jlk> the screen sharing appears to be back on
14:30:20 <misc> yeah
14:30:35 <gundalow> .
14:30:50 <gundalow> Can you hear and see now?
14:31:05 <abadger1999> #info dirty hacks (1): log streaming for command/shell tasks
14:31:11 <bcoca> no, but i just  closed browser
14:31:35 <jtanner> hah
14:31:39 <misc> meeting end
14:31:39 <jtanner> time's up
14:31:43 <jlk> what the hell...
14:31:45 <misc> because moderator crashed...
14:31:50 <bcoca> ha
14:31:51 <misc> so it stop after 5 minutes
14:32:05 <jlk> finding Robyn
14:32:41 <jtanner> "How would you rate the overall quality of this meeting?" ... "meeting was affected"
14:32:50 <jimi|ansible> rejoin
14:33:19 <gundalow> Can you hear and see now?
14:33:22 <misc> nope
14:33:29 <misc> I can see, but not hear
14:33:40 <misc> you are on mute
14:34:10 <misc> now I can hear
14:34:15 <abadger1999> #info monty explaining that they have to fork a saemon process to stream command output
14:34:59 <abadger1999> #info monty explaining tha the command module has been forked to stream data to zuul_console (the daemon process)
14:36:21 <abadger1999> #info monty explaining that controller side, there's a zuul-stream callback plugin that intercepts stdout from the command and spawns streaming client thread, logs lines.
14:36:50 <bcoca> DO NOT CROSS THE STREAMS!
14:36:57 <abadger1999> #info explaining that the logs can then be streamed to a clientvia a finger protocol.
14:37:00 <bcoca> MT
14:37:36 <jtanner> "forked the command module" ... still not as bad as "we flipped the module + connection plugin relationship" =P
14:37:51 * jtanner looks at networking folks
14:37:55 <jlk> hahaha
14:38:19 <bcoca> jtanner: also flipped action plugin
14:38:30 <jtanner> it's flips all the way down
14:38:53 <jtanner> new module: realtime_command
14:39:21 <jtanner> run_command_live ... used to be a thing
14:40:03 <jtanner> p.communicate() is the first hurdle
14:40:24 <jtanner> p = Popen()
14:40:31 <jtanner> (so, se) = p.communicate()
14:42:03 <jeblair> http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/library/command.py?h=feature/zuulv3
14:42:11 <jeblair> is the source for the current forked module
14:42:30 <jeblair> searching for 'zuul' should find all the places where things have been changed
14:44:47 <abadger1999> #info log streaming brainstorming: For streaming, implement update_json
14:45:09 <abadger1999> #info log streaming brainstorming: For the foring of comman module, add a parameter to allow run_command to use a single pip for stdout and stderr
14:45:24 <abadger1999> #undo
14:45:24 <zodbot> Removing item from minutes: INFO by abadger1999 at 14:45:09 : log streaming brainstorming: For the foring of comman module, add a parameter to allow run_command to use a single pip for stdout and stderr
14:45:32 <abadger1999> #info log streaming brainstorming: For the forking of command module, add a parameter to allow run_command to use a single pip for stdout and stderr
14:46:09 <abadger1999> #info log streaming brainstorming: Perhaps can implement streaming by modifying what's done with async instead of implementing update_json.
14:46:51 <abadger1999> #info Ansilbe restricted environment
14:46:55 <abadger1999> #undo
14:46:55 <zodbot> Removing item from minutes: INFO by abadger1999 at 14:46:51 : Ansilbe restricted environment
14:47:00 <abadger1999> #info Ansible restricted environment
14:47:36 <jtanner> chroot -> proot -> bubblewrap
14:47:54 <abadger1999> #info zuul uses "bubblewrap" which is a user-space lightweight container without needing to have root to create them.
14:47:54 <jeblair> https://github.com/projectatomic/bubblewrap
14:48:09 <abadger1999> #link https://github.com/projectatomic/bubblewrap
14:48:55 <abadger1999> #chair jeblair jlk bcoca
14:48:55 <zodbot> Current chairs: abadger1999 bcoca gundalow jeblair jimi|ansible jlk jtanner mordred samdoran thaumos
14:48:56 <jtanner> someone in the room needs to turn down their speakers
14:48:57 <jlk> somebody is echoing things back into BJ
14:49:11 <jlk> bcoca: we muted you
14:49:28 <bcoca> im muted on my side?
14:49:38 <jlk> you weren't on the bluejeans side at that time
14:50:53 <abadger1999> #info Look in https://git.openstack.org/cgit/openstack-infra/zuul/zuul/ansible for some of the hacks that are being used.
14:51:15 <bcoca> ^ that and the 'firewall strategy' ...
14:51:53 <jeblair> #undo
14:51:53 <zodbot> Removing item from minutes: INFO by abadger1999 at 14:50:53 : Look in https://git.openstack.org/cgit/openstack-infra/zuul/zuul/ansible for some of the hacks that are being used.
14:52:08 <jeblair> #info Look in http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible?h=feature/zuulv3 for some of the hacks that are being used.
14:52:22 <jeblair> (other url was lacking the v3 branch selection)
14:52:45 <jimi|ansible> boaty mcboatface waits for no one
15:01:54 <bcoca> do untrusted jobs allow envvvars/ansible.cfg?
15:05:01 <jlk> #info PLEASE think of novel ways to break out of an Ansible environment, so that we can evaluate them against zuul protections.
15:05:58 * misc will do
15:06:01 <bcoca> get_url + shell + ansbile-playbook ...
15:07:02 <jlk> I'm pretty sure we prevent you from downloading new content
15:07:12 <jlk> in an untrusted run
15:07:36 <pabelanger> right, it won't work in untrusted
15:08:00 <jlk> unless you can come up with ways we aren't blocking for downloading content
15:08:01 <pabelanger> but we need to write a test in zuul for it still
15:08:01 <bcoca> unarchive from tarball athat is part of job
15:08:09 <abadger1999> jlk: how are you blocking it?
15:08:15 <misc> use dns tunneling
15:08:17 <jeblair> i think we permit downloading content
15:08:33 <misc> but you do not get out of the containment
15:08:34 <jimi|ansible> bcoca: i'm assuming that all jobs use a non-privileged user, so anything run would be the same priv level as the playbook
15:08:43 <jimi|ansible> but that's a good point, testing become escalation
15:08:46 <misc> provided we can already run arbitrary content
15:08:49 <pabelanger> http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/action/unarchive.py?h=feature/zuulv3
15:08:52 <pabelanger> for unarchive
15:08:59 <bcoca> bitcoin does not need privs to run!
15:09:15 <jlk> everything is timed
15:09:21 <misc> yeah, but if the job are killed after some time
15:09:25 <jeblair> pabelanger: yeah, that's just checking the local path i believe
15:09:31 <pabelanger> yup
15:09:52 <jimi|ansible> does a bitcoin miner need a large block of time? could just test as many hashes as possible and exit
15:09:54 <misc> now, if you get auditing of what happen, people will abuse it, but you will know how
15:09:57 <jtanner> we're gonna need a bigger boat
15:10:12 <bcoca> jlk: not really looking at breaking yoru security, but looking at what would be nice to provide in core for 'secure setups'
15:10:24 <misc> would something like mirai need more of ressources ?
15:10:37 <bcoca> ^ i.e running only 'signed' plugins
15:10:40 <jlk> nod
15:10:41 <misc> would a oneoff exploit that do pown a wordpress remotely be blocked ?
15:10:51 <jlk> we have a list of things we allow or disallow running
15:10:53 <misc> (since that's basically "uri" )
15:11:25 <bcoca> jlk: once 'signed' you can avoid havin cp ../url library/copy
15:11:28 <jimi|ansible> i think we're proposing things that you could do on travis/shippable/et. al now
15:11:38 <jlk> well
15:11:49 <jlk> the difference is that on travis, all those things happen in the VM they give you
15:12:02 <jlk> we're talking about the things that run ON our control plane, not in a VM we launch for you
15:12:09 <bcoca> well, these features would be for 'centralized/controlled' environments
15:12:18 <misc> (and on travis, that's SEP)
15:12:33 <jlk> ansible-playbook runs on the control node, with the VM as the target
15:12:36 <jimi|ansible> well they're in the bubblewrap environment right? and they're things that impact remotes
15:12:38 <jlk> so we want in untrusted runs to prevent local execution of things
15:12:48 <jimi|ansible> though honestly with the travis/shippable file you can exec commands just as easily
15:12:51 <jlk> yeah they're in bubble wrap
15:13:49 <misc> so, if that run on amazon, it has access to the api ?
15:14:43 <misc> (kinda like https://media.ccc.de/v/33c3-7865-gone_in_60_milliseconds )
15:15:37 <jlk> so
15:15:39 <jlk> secrets
15:16:04 <jlk> we would design the pipelines so that the secret is not available in the "check" pipeline (the one that automatically runs on PR open)
15:16:32 <misc> (also related: https://hackernoon.com/capturing-all-the-flags-in-bsidessf-ctf-by-pwning-our-infrastructure-3570b99b4dd0 )
15:16:32 <jlk> and the secret is only available in a pipeline that requires human review before starting
15:16:39 <jlk> so if a human reviews it and missed the fact that it uses the secret to eat all your $$, then sure.
15:16:51 <jlk> anyway we're disconnecting to go drinking.
15:16:54 <bcoca> said human gets his pay docked?
15:17:05 <bcoca> jlk: that is always the correct reason
15:27:35 <gundalow> #endmeeting