18:00:57 <nirik> #startmeeting Infrastructure (2016-04-07)
18:00:57 <zodbot> Meeting started Thu Apr  7 18:00:57 2016 UTC.  The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:57 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:57 <zodbot> The meeting name has been set to 'infrastructure_(2016-04-07)'
18:00:57 <nirik> #meetingname infrastructure
18:00:57 <nirik> #topic aloha
18:00:57 <nirik> #chair smooge relrod nirik abadger1999 lmacken dgilmore threebean pingou puiterwijk pbrobinson
18:00:57 <nirik> #topic New folks introductions / Apprentice feedback
18:00:57 <zodbot> The meeting name has been set to 'infrastructure'
18:00:57 <zodbot> Current chairs: abadger1999 dgilmore lmacken nirik pbrobinson pingou puiterwijk relrod smooge threebean
18:01:37 <puiterwijk> Hi
18:01:42 * aikidouke here
18:02:05 <dgilmore> heya
18:02:07 <nirik> any new folks like to give a short one line intro? (I think there was a new person on the list, but not sure if they were going to make the meeting)
18:02:16 <nirik> or apprentices with any questions or comments?
18:02:21 <smdeep> .hello smdeep
18:02:22 <zodbot> smdeep: smdeep 'Sudeep Mukherjee' <smdeep@gmail.com>
18:02:38 * doteast present
18:02:43 <threebean> hey hey :)
18:03:30 * mirek-hm is here
18:04:06 * lmacken 
18:04:57 * nirik will wait a minute or two more for new folks/apprentice questions.
18:05:25 <nirik> alright, let's go on to status/info. hold on to your irc client...
18:05:35 <nirik> #topic announcements and information
18:05:35 <nirik> #info Production to staging FAS has been synced - kevin
18:05:35 <nirik> #info retrace and faf staging hosts are live - kevin
18:05:35 <nirik> #info postgresql tuning on non dedicated db hosts done - kevin
18:05:36 <nirik> #info bunch of ansible cleanup patches from misc. Thanks misc!
18:05:37 <nirik> #info kojira/koji db issues continuing, needs more investigation - kevin
18:05:38 <nirik> #info epel5 repodata issue, hopefully solved now - kevin
18:05:40 <nirik> #info Outages next week likely tue/wed for updates/reboot cycle - kevin
18:05:42 <nirik> #info outage friday for openstack cloud - patrick
18:05:46 <nirik> #info Basset updated with more rules - patrick
18:05:55 <nirik> anything there anyone wants to expand on or add?
18:06:26 <nirik> #topic new rsync setup soon - smooge / kevin
18:06:46 <nirik> so, someone on the mirror list pointed out a nice way to do rsyncs for mirrored content that we want to try out.
18:07:01 <nirik> Hopefully we can make it easy and get everyone to do things that way
18:07:22 <nirik> basically we make a file that has timestamp info and files
18:07:39 <nirik> mirrors then check when they last synced, and only sync files newer than that from the file list
18:07:52 <nirik> which avoids the massive metadata overhead.
18:07:59 <aikidouke> nice
18:08:00 <threebean> oh, nice!
18:08:15 <nirik> anyhow, I just thought I would mention it and if anyone wants to help test, smooge was going to work on it I think...
18:08:21 <puiterwijk> That does mean though that the mirrors need to run a script written by us... I wonder if we have any idea on how much people are going to have a problem with that?
18:08:34 <nirik> puiterwijk: well, it's a pretty basic shell script...
18:09:07 <puiterwijk> nirik: sure. I'm not saying they have a problem with it. Just asking if we tried asking some if they do
18:09:11 <nirik> I think most will like how much faster it is. Of course they will have to run hardlink on their own after
18:09:14 <smooge> well I don't know if it is a basic shell script :)
18:09:28 <nirik> I cannot imagine it being that complex.
18:09:54 <nirik> The main thing would be date math...
18:10:10 <aikidouke> i think what smooge is saying is that he put some elbow grease to it ;)
18:10:35 <aikidouke> so it is of course an amazingly thought out algorithm ;)
18:11:48 <nirik> anyhow, hopefully it will work out and be nice. ;)
18:12:00 <nirik> anything else on this?
18:12:01 <puiterwijk> Anyway, the idea is nice
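The file-list idea nirik describes under this topic can be sketched roughly as follows. This is not smooge's script, which is not shown in the meeting; the file-list format (one "<epoch-seconds><TAB><path>" entry per line) and both function names are assumptions made only for illustration.

```python
def parse_file_list(text):
    """Parse a hypothetical file list: '<epoch-seconds>\\t<relative-path>' per line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        stamp, path = line.split("\t", 1)
        entries.append((int(stamp), path))
    return entries


def files_to_sync(entries, last_sync):
    """Return the paths whose timestamp is newer than the mirror's last sync."""
    return sorted(path for stamp, path in entries if stamp > last_sync)
```

A mirror could write the resulting paths to a file and hand it to rsync's --files-from option, so only the changed files are examined instead of walking the whole tree.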
18:12:40 <nirik> #topic $PS1 testing change - aikidouke
18:12:52 <nirik> aikidouke: I sent my feedback to the list. ;)
18:13:01 <nirik> I think it looks good, but perhaps adjust the colors
18:13:05 <aikidouke> thanks nirik - ack'd
18:13:28 <nirik> and will definitely save me time at least... I often 'hostname' to make sure I am in stg or the like...
18:13:43 <aikidouke> i haven't tried csh, but I can't imagine it wouldn't work in just about any other shell
18:13:58 <nirik> well, due to the way our account system works, everyone gets bash
18:14:09 <aikidouke> i will change the colors - maybe wait a day to see if anyone else has a comment?
18:14:23 <aikidouke> then should i just add that logic to the base role or someplace else?
18:14:30 <puiterwijk> aikidouke: I was also thinking that we might want to put it between the hostname and pwd instead of up front?
18:14:40 * aikidouke nods
18:14:44 <nirik> yeah, base role.
18:14:47 <aikidouke> we can do that
18:14:51 <puiterwijk> So that it looks more like it's part of the hostname, which it actually is
18:15:18 <threebean> :)
18:15:19 <nirik> oh, another thought: if we can identify cloud, we could add a [CLOUD] there?
18:15:21 <aikidouke> anyone else have strong feelings about the positioning one way or the other?
18:15:42 * nirik agrees with puiterwijk that the bikeshed should be green. ;)
18:15:45 <threebean> nirik: maybe the net too.  so we could see if we're on the qa net, or elsewhere?
18:15:51 <puiterwijk> nirik: you mean for instances? Or for fed-cloud01 - 15?
18:15:54 <aikidouke> sure - if there is a variable for it, we can change the prompt
18:16:18 <nirik> puiterwijk: instances... but not sure we have an easy var... would have to look for hostname of cloud.fedoraproject.org or fedorainfracloud.org
18:16:26 <puiterwijk> If you mean 01 - 15, I would vote that we just use [STG] and [PROD] just like all others
18:16:54 <aikidouke> cloud is an easy awk/grep - whatever you want to do
18:17:11 <threebean> nirik: there's a 'persistent-cloud' group in ansible we could look for membership in.
18:17:12 <puiterwijk> I think there's a way to find out if you're in openstack.. I'd have to check, but there's an easy http url to get to get that info
18:17:13 <threebean> aikidouke: ^^
18:17:30 <aikidouke> sounds good
18:17:38 <puiterwijk> threebean: though I'm not sure all cloud instances are in there. And it might be nice if transient also gets the tag...
18:17:45 * threebean nods
18:17:46 <puiterwijk> Maybe even [TRANSIENT] or something
18:17:58 <puiterwijk> so [CLOUD] for persistent, [TRANSIENT] for transient
18:18:06 <nirik> sure
18:18:09 <misc> puiterwijk: using the metadata url of cloud-init
18:18:16 <puiterwijk> misc: yes, that's what I meant
18:18:23 <puiterwijk> misc: I just need to look up the exact URL :)
18:18:29 <nirik> but, we don't want that to delay normal logins elsewhere.
18:18:32 <aikidouke> ok - thanks.....
18:18:38 <puiterwijk> I *think* it's something like http://192.168.122.100/metadata, but I'd need to check
18:18:55 <misc> GET http://169.254.169.254/2009-04-04/meta-data/ ?
18:19:01 <puiterwijk> nirik: we could add this as part of the cloud-init metadata
18:19:02 <aikidouke> i'll b around if you have time to look it up
18:19:08 <puiterwijk> misc: yes, that's it.
18:19:11 <nirik> also actually many of the cloud instances don't get base role either
18:19:22 <puiterwijk> nirik: right. That's why I'm suggesting to use cloud-init data for them
18:19:32 <aikidouke> it could be declared as a variable when a cloud instance gets created, right?
18:19:34 <misc> any
18:19:37 <nirik> sure, if we can do that that would be nice
18:19:37 <misc> rah
18:19:55 <misc> wouldn't it be better to force cloud instances to have the base role ?
18:20:14 <puiterwijk> aikidouke: well, not all cloud instances get created via ansible....regardless of how hard I'm fighting against it, people just use horizon...
18:20:17 <nirik> well, there's things in the current base role that won't make sense for them
18:20:24 <nirik> so it would require some reworking
18:20:36 <nirik> 2fa for example
18:20:37 <puiterwijk> misc: I would totally agree with you, and I'm fighting so hard to get everyone to use ansible to create their instance... but people don't
18:20:52 <aikidouke> puiterwijk - if someone is creating an instance w/o ansible, how important/used would it be?
18:20:55 * puiterwijk considers just not deploying horizon with the new version
18:21:03 <puiterwijk> aikidouke: for consistency, I'd say quite
18:21:09 * aikidouke nods
18:21:23 <aikidouke> ok - any other suggestions?
18:21:25 <misc> puiterwijk: even worse, deploy horizon, but make it crash
18:21:39 <puiterwijk> misc: oh, I don't have to do anything for that one
18:21:47 * nirik notes the cloud thing is an extra... stg/prod are the main things to get done.
18:21:54 <aikidouke> yup
18:21:55 <misc> puiterwijk: or make it too slow to be usable
18:21:57 <nirik> but nothing more for me on this.
18:22:09 <aikidouke> ok - thanks for the feedback everyone
18:22:32 <puiterwijk> nirik: agreed.
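The prompt tagging discussed under this topic could look roughly like the sketch below. It is not aikidouke's actual change (that was sent to the list and is not shown here); the function names, the rule that a "stg" label in the hostname means staging, and the fallback to [PROD] for everything else are assumptions. The tag is placed between the hostname and the working directory, as puiterwijk suggested.

```python
def env_tag(fqdn):
    """Pick a prompt tag from a fully qualified hostname (assumed naming rules)."""
    name = fqdn.lower()
    # Cloud domains mentioned by nirik in the discussion.
    if name.endswith(("cloud.fedoraproject.org", "fedorainfracloud.org")):
        return "[CLOUD]"
    # Assumption: staging hosts carry a 'stg' label, e.g. basset01.stg.phx2...
    if "stg" in name.split("."):
        return "[STG]"
    # Assumption: anything else is treated as production.
    return "[PROD]"


def prompt(user, fqdn, cwd):
    """Build a bash-like prompt string with the tag right after the short hostname."""
    short = fqdn.split(".", 1)[0]
    return f"[{user}@{short}{env_tag(fqdn)} {cwd}]$ "
```

Because the tag is derived from the hostname alone, this needs no network lookup and would not delay logins, which was nirik's concern about the metadata URL approach.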
18:22:58 <nirik> hum, was just about to paste the next topic, but scollier seems to have deleted it. ;)
18:23:08 <puiterwijk> No, I have after discussing with him :)
18:23:21 <nirik> ah ha. ;) anything you want to share to the meeting? or should we move on?
18:23:44 <puiterwijk> Basically, there are plans to deploy a test instance of openshift in the Fedora Infra cloud, and me and misc need to get together to set that up
18:24:11 <nirik> yeah, thats waiting on the move to RHOSP 7?
18:24:34 <puiterwijk> Not per se, that was mostly delayed by me having to fight with spam at the time we started that project
18:24:43 <nirik> ok
18:25:00 <nirik> any other discussion items? or shall we move on to puiterwijk talking to us about Basset?
18:25:14 <aikidouke> ok for a rando question?
18:25:50 <nirik> sure
18:26:34 <aikidouke> i noticed threebean had started work on a testing instance for the hubs concept - and someone else was setting up a docker machine
18:26:40 <aikidouke> (or maybe it was the same thing)
18:27:11 <aikidouke> would it be a concern to run non-free software in a container in our infra?
18:27:18 <nirik> The hubs test was/is a cloud instance... I don't know that it has any docker in it
18:27:26 <threebean> yeah, no docker stuff here:
18:27:28 <puiterwijk> We cannot run non-free software in our infrastructure, anywhere
18:27:28 <threebean> https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/playbooks/hosts/fedora-hubs-dev.yml
18:27:30 <nirik> yes, very much so.
18:27:33 * threebean nods
18:27:45 <nirik> well, without an exception for it...
18:27:54 <threebean> and that would have to go to FESCo, correct?
18:27:56 <aikidouke> ok - i thought so, must've been something else someone was doing
18:27:57 <nirik> which we are all very unlikely to want to do. ;)
18:28:05 <aikidouke> :) understood
18:28:07 <threebean> :)
18:28:31 <puiterwijk> But yes, Adam is setting up a docker registry, since koji will be building docker containers sometime in the future
18:28:41 <aikidouke> ahh thats what it was then
18:28:43 <puiterwijk> That's probably what you've seen
18:29:03 <nirik> threebean: depends I guess. We don't have a good process, but I guess the council...
18:29:11 * threebean nods
18:29:16 <nirik> we may be forced to run some non free stuff before long sadly. ;(
18:29:43 <puiterwijk> nirik: the mac for OS X builds, you mean, right? Or is there anything else?
18:29:53 <aikidouke> i kind of wondered if that was a possibility
18:29:56 <nirik> yeah, mac and windows builds for fedora media creator.
18:29:56 <misc> hsm stuff ?
18:30:04 <aikidouke> ok - thank you
18:30:14 <nirik> which means we would need osx and win builders/images. ;(
18:30:27 <nirik> but we will see when we come to it.
18:30:27 <sgallagh> nirik: We can't cross-compile?
18:30:32 <nirik> nope.
18:30:43 <sgallagh> What's wrong with mingw?
18:30:59 <nirik> there was a list from them, I don't have it handy...
18:31:12 <sgallagh> ok
18:31:14 <aikidouke> what about some of those build once - run anywhere things like umm...crud - they show up in news feeds here and there...anyway - I'm straying OT some
18:31:20 <nirik> no qt5 support, can't get officially signed, etc
18:31:29 <aikidouke> (not nec xdg-apps)
18:31:58 <nirik> the build once run anywhere mirage has been around for a long time. ;)
18:32:04 <nirik> anyhow...
18:32:09 <nirik> let's move on
18:32:18 <nirik> #topic Learn About: basset - puiterwijk
18:32:24 <nirik> take it away puiterwijk
18:32:47 <puiterwijk> Okay, so some people might have seen info about Basset in the last few weeks, so I figured I'd give a high-level overview
18:33:02 <puiterwijk> People might have seen that our wiki and trac have been plagued by spam recently, and they're quite persistent
18:33:34 <puiterwijk> Basset is our anti-spam system. It gets messages from FAS, wiki and trac about things that are changed, like new accounts registered, people signing FPCA, new pages, new trac tickets, etc, etc
18:34:00 <puiterwijk> Then it runs a set of rules to determine a score of how likely that specific content (account/page/ticket/...) is spam, and then takes actions based on that.
18:34:38 <puiterwijk> For example, if a newly created FAS account is deemed spam, it will mark the account as spamcheck_denied, and the user can't login. If they created a spam page, it will delete the page, block their wiki account and block their FAS account.
18:35:36 <puiterwijk> It also ingests data from FAS and wiki and other sources to help it in determining whether something is spam, and keeps track of its previous decisions, to make better decisions in the future.
18:36:29 <puiterwijk> It is running on a single server at the moment (basset01.phx2 for prod, basset01.stg.phx2 for staging), but could scale out if needed in due time.
18:37:43 <puiterwijk> In the case of FAS, it's even so that until Basset deemed an account non-spam, the account is inactive (status awaiting_spamcheck), so that possible spammers can't abuse the possible delay in processing.
18:38:13 <puiterwijk> Are there any questions about this general working so far?
18:39:36 <nirik> it needs a FAS account right? so we will not be able to use it on say ask... but possibly on pagure?
18:40:08 <misc> what happens when basset is down ?
18:40:10 <puiterwijk> Well, we could use it on Askbot, it just won't be able to block the FAS account. It could still delete spam questions and block the account itself.
18:40:15 <puiterwijk> And I'm adding Pagure indeed
18:40:55 <puiterwijk> misc: the frontend is about 50 lines of code. As long as that and the message broker are up, all messages will be queued, and processed as soon as the worker is back up.
18:41:34 <aikidouke> is there any way to know/tell how many accounts in FAS were set up by spammers before basset was deployed?
18:41:42 <puiterwijk> This is guaranteed by Redis with highly consistent messaging QOS
18:41:53 * aikidouke guesses the answer is no..
18:42:18 <nirik> aikidouke: not off hand, but I'd guess 1000-2000 or so. They were making them at a pretty good rate toward the end.
18:42:27 <puiterwijk> aikidouke: I have recently set the ones that I'm aware of  (because they created spam pages)  to spamcheck_denied, so they're deemed in the same stats as accounts picked by Basset
18:42:42 <puiterwijk> But yeah, other than that, we have no idea.
18:43:11 <aikidouke> ok - tough problem - add that to my someday stack
18:43:23 <aikidouke> ty
18:43:48 <puiterwijk> misc: to get back to your question: in case the frontend and/or broker are offline, it will not be able to queue messages, and lose them. But in the case of FAS, we can get them back by checking all accounts in status awaiting_spamcheck, and for other things we can go back to fedmsg and ingest those.
18:44:21 <misc> puiterwijk: so it's more that this is not going to prevent people from creating accounts or editing the wiki ?
18:44:42 <puiterwijk> misc: if Basset is down, new account creations are queued for spamcheck.
18:45:03 <puiterwijk> the wiki will continue to work as it normally is, as it's not "gated" on Basset
18:45:14 <misc> so just accounts are gated, ok
18:45:31 <nirik> yeah, the gated accounts don't even get the welcome email with their initial password until they pass
18:45:33 <puiterwijk> The goal from the outset was to block the accounts at the FAS level, and all the other input is mostly to allow Basset to get more learning input
18:46:48 <puiterwijk> Note that while all the sources for Basset are public, and so is the code for the rules, the values that are the input for the rules are not
18:47:25 <puiterwijk> So it's public information that we have a scoring plugin that gives a score to IP addresses owned by certain companies, but the "greylisted" or "blacklisted" companies themselves are not public, as that would give the spammers info on how to avoid the system
18:48:13 <puiterwijk> The content for the rules is in the private repo, since I have seen reasons to believe the spammers are reading IRC conversations.
18:48:46 <misc> puiterwijk: so now, you will trick spammer into becoming fedora admins and contribute to read the repo :p
18:49:14 <puiterwijk> misc: it's in the private repo.. So they'd need to be a -main'er.. And I pretty much trust the people in there.. :)
18:49:44 <puiterwijk> If they're in there, all is lost anyway
18:50:27 <puiterwijk> Until now, most of the rules have been manually configured, but I'm now testing the first automatically learning rulesets, so hopefully in due time we will have less and less manual work on tweaking the Basset rules
18:51:33 <puiterwijk> I think that that's the most important high-level info for Basset. If there are any questions, feel free to ask them or ping me whenever
18:52:18 <nirik> any upcoming plans? I guess pagure support...
18:52:18 <puiterwijk> Oh, one other thing: because of the nature of the information stored in the Basset servers, they are only accessible for sysadmin-main, not for fi-apprentice or any other group :)
18:52:27 <misc> puiterwijk: a link to the source code ?
18:53:19 <puiterwijk> nirik: yeah, Pagure support, askbot support, and finish the automatically learning rules
18:53:28 <puiterwijk> Basset's source code is available at https://pagure.io/basset
18:54:26 <nirik> cool. Thanks puiterwijk!
18:54:28 <threebean> puiterwijk++ thanks for all the work on this.  :)
18:54:36 <aikidouke> indeed
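The rule-scoring flow puiterwijk outlines in this topic (run a set of rules, sum a score, act on it) can be illustrated with a toy sketch. This is not Basset's code, which lives at the pagure.io link he gave; the two rules, the weights, the threshold, and the documentation-range IP address are all invented for illustration.

```python
# Hypothetical rules: each takes an event dict and returns a numeric score.
def many_links(event):
    """Score content that contains several links (made-up heuristic)."""
    return 2.0 if event.get("text", "").count("http") >= 3 else 0.0


def flagged_network(event, flagged=frozenset({"203.0.113.7"})):
    """Score events from a flagged address (placeholder value, not real data)."""
    return 3.0 if event.get("ip") in flagged else 0.0


RULES = [many_links, flagged_network]


def spam_score(event, rules=RULES):
    """Sum the scores of all rules for one event."""
    return sum(rule(event) for rule in rules)


def classify(event, threshold=3.0, rules=RULES):
    """Label an event once its total score reaches the (assumed) threshold."""
    return "spam" if spam_score(event, rules) >= threshold else "not_spam"
```

In the real system the verdict would drive the actions described above, such as setting a FAS account to spamcheck_denied; here it is only returned as a label.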
18:54:39 <nirik> #topic Open Floor
18:54:52 <threebean> I have an item to throw out there..
18:54:52 <nirik> anyone have any items for open floor? or shall we call it a meeting?
18:54:56 <threebean> real quick
18:54:56 <nirik> shoot...
18:55:01 <threebean> in pkgdb and dist-git we have namespaces now
18:55:11 <threebean> there's rpms/ and docker/ (for rpms and dockerfiles, which we don't use yet)
18:55:27 <threebean> we're going to be adding namespaces for taskotron checks too (so that packagers can add their own checks for their own packages)
18:55:37 * nirik nods.
18:55:49 <threebean> in the modularity working group meeting earlier today, we came up with the idea to store the definitions of 'modules' in dist-git as well (even though we don't 100% know what those definitions are going to look like yet)
18:55:59 <threebean> and then manage ACLs for those modules in pkgdb too, like we do for everything else.
18:56:02 <nirik> makes sense.
18:56:09 <threebean> i think this will involve very very little code change, and hopefully it will all just 'click'.
18:56:18 <threebean> anyone have any objections to pursuing that down the road?  alternative ideas?
18:56:32 <puiterwijk> Maybe we need to rename pkgdb? Otherwise it all makes sense to me
18:56:36 <nirik> I think it makes sense and is fine...
18:56:37 <threebean> hehehe
18:56:42 <threebean> aclDB would maybe be a better name.
18:56:43 <puiterwijk> how about "alldb"? :)
18:56:44 <nirik> artifactsdb
18:57:16 <puiterwijk> Oh, and are there any plans for sync between namespaces?
18:57:27 <nirik> are there still plans to put some kind of pagure like frontend over dist-git? hopefully that will still be possible with namespaces.
18:57:31 <tflink> https://github.com/fedora-infra/pkgdb2/issues/329
18:57:32 <threebean> not for modules, but we do have such plans for taskotron checks <-> rpms namespace.
18:57:33 <puiterwijk> For example, if we have pkgs/whatever, would ACLs for that sync to tests/whatever?
18:57:44 <puiterwijk> threebean: yeah, that's what I meant. Okay
18:57:55 <threebean> nirik: yes - pingou and puiterwijk have been working on namespace support for pagure for just that.
18:58:13 <nirik> great! I love it when things are all just already thought of. ;)
18:58:26 * nirik sits back on the beach with the umbrella drink.
18:58:34 <threebean> heheh :)
18:58:54 <nirik> ok, anything else for open floor?
18:58:59 <threebean> (thanks!)
18:59:48 <nirik> ok, thanks for coming everyone. Do continue in #fedora-admin, #fedora-noc and #fedora-apps!
18:59:51 <nirik> #endmeeting