12:03:03 #startmeeting
12:03:03 Meeting started Wed Feb 18 12:03:03 2015 UTC. The chair is ndevos. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:03:03 Useful Commands: #action #agreed #halp #info #idea #link #topic.
12:03:09 hello all!
12:03:30 The agenda can be found here:
12:03:31 https://public.pad.fsfe.org/p/gluster-community-meetings
12:03:36 hello :) \o
12:03:37 #topic Roll Call
12:04:10 * msvbhat is present
12:04:13 o/
12:04:14 we have raghu and hchiramm, anyone else?
12:04:19 aha!
12:04:19 * Debloper is present
12:04:31 * overclk is there
12:05:09 #topic Last week's action items
12:05:21 #info subtopic: ndevos should publish an article on his blog
12:05:30 * ndevos still needs to do that
12:05:35 thanks ndevos :)
12:05:58 * JustinClift here now too (sorry a bit late)
12:05:58 #info subtopic: hchiramm will try to fix the duplicate syndication of posts from ndevos
12:06:16 yes, that can only be checked when I have posted something....
12:06:29 #info subtopic: hchiramm will share the outcome of the non-mailinglist packaging discussions on the mailinglist (including the Board)
12:06:42 hchiramm: was a final result reached?
12:06:53 ndevos, yet to conclude the solution..
12:07:06 discussion is 'on' ..
12:07:08 Is this the thing that was fixed by Spot and JMW?
12:07:16 Ahhh no, that was multiple Tweets...
12:07:29 this is about the blog duplication
12:07:44 Yeah, sorry
12:07:48 * JustinClift should get coffee
12:07:50 * ndevos silences JustinClift
12:07:53 * hchiramm :)
12:08:02 #info subtopic: hagarth to open a feature page for (k)vm hyperconvergence
12:08:13 ndevos, he is on pto
12:08:23 I wonder if I won my bet? did hagarth create a page for it?
12:08:31 afaik, he has not :)
12:08:45 hchiramm: yeah, I know he's out, that's why I'm in charge :D
12:08:48 We get to find out next week? <-- you have to imagine me saying this, as I'm currently silent :)
12:09:08 ndevos, that info is for others, not for you :)
12:09:21 ah, ok :)
12:09:22 #info subtopic: spot to reach out to community about website messaging
12:09:44 I did not see an email about this?
12:09:53 did someone else see one?
12:10:05 I didn't
12:10:22 okay, I'll remind him
12:10:34 * msvbhat brb
12:10:44 #action ndevos will contact spot about open standing action items on the weekly agenda
12:10:57 #info subtopic: hagarth to carry forward discussion on automated builds for various platforms in gluster-infra ML
12:11:21 I also did not notice anything about that
12:11:43 #info subtopic: ndevos should send out a reminder about Maintainer responsibilities to the -devel list
12:12:00 argh, forgot about that, will try to get that done later today
12:12:18 #info subtopic: telmich will send an email to the gluster-users list about Gluster support in QEMU on Debian/Ubuntu
12:12:37 telmich: are you paying attention?
12:13:07 was this email sent? I can not remember seeing one
12:13:48 I guess I did not miss it then :-/
12:13:57 #info subtopic: jimjag to engage the board, asking for their direction and input for both 3.7 and 4.0 releases
12:14:34 jimjag: has there been any discussion on the (private) board list about it?
12:14:44 or JustinClift, could you chime in?
12:15:26 * ndevos feels a little lonely - is there anyone reading this?
12:15:38 yes :)
12:15:54 well, at least I'm not alone :D
12:15:54 yes.
12:16:14 #topic GlusterFS 3.6
12:16:21 ndevos: nope
12:16:24 raghu: you're up!
12:16:36 jimjag: okay, maybe something by next week?
12:16:44 ndevos: +1
12:17:08 jimjag: okay, thanks!
12:17:16 Not much progress this week. I wanted to take some patches in, but many patches were failing regression tests (don't know whether it is spurious or not).
12:17:35 yeah, regression tests were a pain these last days
12:17:51 but I think JustinClift spent some time on checking it out
12:18:40 So as of now, in this list of 3.6 patches (http://review.gluster.org/#/q/status:open+branch:release-3.6,n,z), there is no patch which has both a +1 from a reviewer and passed regression tests
12:19:16 apart from that I closed (changed the status properly, to be precise) many bugs from the release-3.6 branch
12:19:31 Sorry, just had to fix something. Back now.
12:19:41 ndevos: I can ping jimjag about the board
12:19:55 ndevos: yeah. I retriggered some tests thinking the failures might be spurious. But they failed again.
12:19:56 JustinClift: jimjag ponged already
12:19:57 I haven't seen anything about stuff mentioned on it though (yet)
12:20:06 * JustinClift reads that ;)
12:20:24 JustinClift: how are the regression tests looking?
12:20:38 Not in a good state
12:20:46 We have failures in the 3.6 release branch
12:20:51 I've pinged Avra about one of them
12:20:57 I need to ping Jeff about the other
12:21:33 * JustinClift still needs to test our other branches more, and figure out what's a "real" failure vs something wrong with the slave nodes that needs fixing
12:21:37 So... "in progress". :/
12:21:48 raghu: were those two tests the issues for the failures you saw, or were there more?
12:22:02 * partner late :/
12:22:36 raghu: http://www.gluster.org/pipermail/gluster-devel/2015-February/043882.html
12:22:49 This gives the two failure names^
12:22:57 tests/bugs/bug-1045333.t
12:23:05 tests/features/ssl-authz.t
12:23:06 ndevos: I saw different failures
12:23:29 #info the release of 3.6 is getting delayed because of regression test failures
12:23:49 sometimes it was in ec
12:23:52 Yeah. We're probably right to delay it too, until we know the cause.
12:24:26 Also saw a failure in tests/bugs/bug-1176062.t, but it wasn't consistent
12:24:39 I'll have a better idea of what's a real failure vs not a real failure later today
12:24:46 hmm, yes, but figuring out what suddenly causes this should be top priority, right?
12:24:49 * JustinClift is going to run some bulk regression tests again
12:25:09 I'm happy to spin up VMs in Rackspace for people to investigate as needed
12:25:10 ndevos: Some of the tests that failed were the ones that you mentioned now.
12:25:16 #action JustinClift keeps on investigating the regression test failures
12:25:17 from ssl-authz.t
12:25:52 raghu: and you did not merge any ssl changes?
12:26:27 * raghu checking
12:26:47 I kinda hope it's some new dependency that needs adding to the slaves, and not a real failure
12:27:13 It's possible
12:27:36 ndevos: not recently. I think the last ssl patch was from October 2014.
12:27:52 raghu: okay, thanks
12:28:21 well, we'll leave it in the hands of JustinClift to track down and involve developers
12:28:36 raghu: anything else for 3.6?
12:28:50 ndevos: nope.
12:29:06 #topic GlusterFS 3.5
12:29:07 Failures in ssl-authz.t are almost certain to be because of increased parallelism, not SSL per se.
12:29:29 jdarcy: I do not think the epoll changes are in 3.6?
12:30:01 anyway, very little progress on 3.5 too
12:30:36 jdarcy: Just emailed you. Would you have time to log in to slave31 and take a look or something?
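
[Editor's note] For reference, the two failing tests named above can be re-run individually on a developer machine to check whether a failure is spurious. This is a minimal sketch only, assuming a built GlusterFS source tree, root privileges (the tests create and mount volumes), and the Perl 'prove' TAP harness; it is not the official regression procedure, and the paths are placeholders.

    # Re-run the two release-3.6 failures reported on gluster-devel:
    cd /path/to/glusterfs              # assumption: source tree already built and installed
    prove -vf tests/bugs/bug-1045333.t
    prove -vf tests/features/ssl-authz.t

    # Repeating a test a few times helps separate spurious failures from real ones:
    for i in 1 2 3; do
        prove -f tests/features/ssl-authz.t || echo "run $i failed"
    done
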
12:30:37 mainly due to the failing regression tests and the increased time they are waiting in the queue before they get run
12:30:48 ndevos: They're not, but socket multi-threading was implemented when SSL was, and gets turned on by default when SSL is. A first test might be to turn off own-thread even when SSL is turned on.
12:30:59 JustinClift: Sure.
12:31:29 Both slave30 and slave31.cloud.gluster.org. They're using our standard Jenkins user/pw (I can dig it up for you if needed)
12:31:40 slave31 has only ever run the one regression test
12:31:43 jdarcy: hmm, right, but I wonder why things started to fail only just recently - but we'll follow the conversation on the -devel list
12:32:26 jdarcy: Also, keep an eye out for a "these newly setup slaves are not set up correctly" problem. Just in case. They're new slaves, and the configuration script needed some updates.
12:32:33 I *think* it's good... but keep it in mind.
12:32:34 I still hope to do a beta1 for 3.5 later this week, but only if the testing becomes more stable
12:33:05 #info beta1 for 3.5.4 might be delayed a week due to regression test issues
12:33:05 Hmmmm.... I wonder if our memory usage in Gluster has grown or something?
12:33:23 #topic GlusterFS 3.4
12:33:29 kkeithley: please have a go at it
12:33:30 Maybe there's a memory leak in our code base or something that's causing us to run out of RAM on the nodes, so weird behaviour...
12:33:39 nothing, still working through permutations of perf xlators
12:33:53 trying to find a combination short of *all* that leaks
12:33:57 * gothos hasn't seen any problematic memory behavior
12:34:11 In the client
12:34:37 some people see the client glusterfs daemon grow until it trips the OOM killer
12:34:57 And I see it growing continuously in my test harness
12:35:29 is that with the default configuration?
12:35:33 anyway, if I get ahead of my work for 3.7, I'll take a look at other patches for 3.4.7. IIRC I've seen a couple of patches go by for 3.5
12:35:35 3.4
12:35:48 yes, that's the out-of-the-box defaults
12:36:08 hmm, ok
12:36:38 Is there a way to grab the entire host memory for a VM at a given time, and then analyse it to find out what's going on where?
12:36:59 not that I'm aware of
12:37:00 *cough* vmcore *cough*
12:37:04 kkeithley: just rechecked all our servers, we actually have one where the gluster process is using about 8GB RES, the other servers are around 1.5GB
12:37:16 kkeithley: I'm pretty sure I saw a debugging tool that might suit this actually. Not open source, but they did offer us a license.
12:37:38 not servers, client-side fuse mounts. The glusterfs daemon on the clients
12:37:40 Let's discuss in -dev later
12:37:55 sure
12:38:05 yes, that is what I meant, since we have that running on our servers
12:38:19 I think you would be able to use gcore to capture a coredump of the glusterfs process too?
12:38:47 * ndevos just does not have an idea where to look into such a core and find the leaking structures...
12:39:02 kkeithley: anything else for 3.4?
12:39:23 I'm running a debug build, so mem-pools are disabled. Should see it with valgrind, but so far no smoking gun
12:39:27 no, nothing else
12:39:39 #topic Gluster.next
12:39:48 #info subtopic: 3.7
12:40:00 I have a couple of questions
12:40:01 jdarcy: are you following the 3.7 progress?
12:40:25 ndevos: Cache tiering somewhat, the rest hardly at all.
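
[Editor's note] On the client-side memory-leak hunt discussed under the 3.4 topic above, a rough sketch of the two approaches mentioned (gcore and valgrind). The volume name 'myvol', server 'server1', mount point and output paths are placeholders, not values from the meeting.

    # Capture a core of a running glusterfs fuse client without killing it,
    # using gdb's gcore, so the heap can be inspected offline later:
    pid=$(pgrep -x glusterfs | head -n 1)       # assumption: only one glusterfs client process on this host
    gcore -o /var/tmp/glusterfs-client "$pid"   # writes /var/tmp/glusterfs-client.<pid>

    # Or start the client in the foreground under valgrind to look for leaks;
    # a debug build with mem-pools disabled makes the allocations visible to valgrind:
    valgrind --leak-check=full --log-file=/var/tmp/glusterfs-client.vg \
        glusterfs -N --volfile-server=server1 --volfile-id=myvol /mnt/myvol
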
12:40:26 bene2: sure, ask them, but hagarth isn't here, so I'm not sure if we have all the answers :)
12:41:04 1) is there anyone working on md-cache with the listxattr capability described in http://www.gluster.org/community/documentation/index.php/Features/stat-xattr-cache
12:41:57 Vijay told me that someone was, in the form of using md-cache on the server, but the patches were still private.
12:42:15 jdarcy, myself and raghu have the initial patchset ready for BitRot
12:42:53 overclk: Ahh, good. It would be nice to get those into the pipeline.
12:42:56 overclk: ah, nice, did you send out an email with a pointer to the patches or git repo?
12:43:27 2) is anyone working on cluster.lookup-unhashed? I met with Raghavendra G in Bengaluru and he did not see any obstacle other than possible problems with clients caching out-of-date directory layouts. Any other thoughts on that?
12:43:33 bene2: yes, I also only heard about the plan to put md-cache on the bricks, but do not know who would be working on that
12:43:41 jdarcy, ndevos actually I was planning to send out the patch to review.g.o before this meeting started..
12:43:58 .. but checkpatch.pl prevented me from doing so :)
12:44:14 #action overclk will send out initial BitRot patches for review this week
12:44:31 ndevos, make that today (max tomorrow) :)
12:44:35 bene2: AFAIK nobody's actively working on that. I think you're right that there's little controversy, though. Just needs more eyeballs.
12:44:39 can we disable checkpatch.#!/usr/bin/env python
12:44:39 that are sent as rfc?
12:45:14 oops sorry.... I meant disabling checkpatch.pl checking for patches that are sent as rfc
12:45:31 I think that would make sense, but I don't know how to disable checkpatch :-/
12:45:51 raghu, if that's possible, I will send the patches right away.
12:46:04 for rfc patches, a BUG id would not be given. We can use that info
12:46:20 Some rfc.sh trickery could achieve that
12:46:29 yeah.
12:46:57 raghu: ah, if it is called from rfc.sh, you could use 'git-review' to push the change ;)
12:47:06 jdarcy: is the controversy on lookup-unhashed about the need for it or the method of implementation? Because with JBOD support we are going to need something similar to this patch very soon IMHO.
12:47:26 According to Vijay in the last 4.0 meeting, the person working on server-side md-cache is himself.
12:48:10 okay, I guess that makes a nice topic to move to
12:48:15 #info subtopic: 4.0
12:48:16 ndevos: ok. We can try it
12:48:45 bene2: The only thing really approaching controversy is some implementation details. Nothing that should take long to resolve if we can tear people away from $thisweekscrisis.
12:49:28 Not much going on for 4.0, mostly some talk about multiple networks and my own continuing struggle to revive NSR from the near-dead.
12:49:47 We agreed on a meeting time. Is that progress?
12:50:00 yeah, I'd call that progress
12:50:16 4.0 is scheduled for 2016, isn't it?
12:50:27 I believe so.
12:50:50 and 3.7 feature freeze is at the end of this month, so I expect to see more work on 4.0 soon
12:51:09 Has there been any talk at all about 3.8?
12:51:22 oh, the "Gluster Design Summit" proposal has been sent ouy
12:51:24 *out
12:51:33 that would encourage working on 4.0 too :)
12:51:41 Indeed.
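
[Editor's note] The checkpatch.pl idea raised above (skip the check for RFC patches, which carry no BUG id in the commit message) could look roughly like the following inside rfc.sh. This is a sketch of the proposal only, not the script's actual code; the checkpatch.pl path and temporary file name are assumptions.

    # Only run checkpatch.pl when the top commit references a BUG id;
    # commits without one are treated as RFC patches and skip the check.
    if git log -n 1 --format=%B | grep -q '^BUG: '; then
        git format-patch -1 --stdout > /tmp/rfc-check.patch
        ./extras/checkpatch.pl /tmp/rfc-check.patch ||   # assumed location of the script in the tree
            echo "checkpatch.pl reported issues, please fix them before pushing for review"
    else
        echo "no BUG id in the commit message, assuming an RFC patch: skipping checkpatch.pl"
    fi
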
12:52:12 #topic Other Agenda Items
12:52:24 #info REMINDER: Upcoming talks at conferences: https://public.pad.fsfe.org/p/gluster-events
12:52:41 please update that etherpad with anything you think Gluster should be present at
12:52:58 #info subtopic: GSOC 2015
12:53:07 kshlm, JustinClift: anything to report?
12:53:48 Unfortunately no progress has been made yet.
12:53:56 :-/
12:54:16 I need to get hold of spot
12:54:28 Have the Red Hat Summit folks made their decisions yet?
12:55:06 I do not know if the presentation proposals have already been accepted/rejected?
12:55:24 I haven't heard anything from them yet.
12:55:43 I don't remember getting my annual "ha ha you loser" email either.
12:55:53 kshlm: if spot does not respond to email/irc, I suggest you call him
12:56:02 I'm preparing a proposal myself, and will have it on the mailing lists soon.
12:56:30 kshlm: really, give the guy a call :)
12:57:20 #topic Open Floor
12:57:52 nobody added a topic to the agenda, but maybe someone has an item to discuss?
12:57:58 ndevos, yes
12:58:04 * ndevos \o/
12:58:24 ndevos, would it help to have a hangout session on bitrot (at least the flow and a small demo) with the community?
12:58:38 overclk: +1, yes please
12:58:44 overclk: yes, that would be AMAZING
12:59:07 overclk: when I presented at FOSDEM, BitRot was one of the topics that were most interesting for users
12:59:08 overclk: Yes, please
12:59:11 ndevos, how about early next week?
12:59:25 overclk: sure, whatever works for you?
12:59:46 ndevos, Tuesday (24th Feb)
13:00:09 overclk: sounds good to me, but I don't think I can really help with that :)
13:00:17 JustinClift: you know about hangouts?
13:00:41 Not much
13:00:42 ndevos, well, someone just needs to send out a hangout invite to the community..
13:00:47 overclk: or, I think hchiramm can organize things like that?
13:00:54 Have used them before, but never recorded them
13:01:12 overclk: is there a reason you would not send the invite yourself?
13:01:20 ndevos, np. I guess I'll send it myself.
13:01:44 ndevos, yeah, but last time I guess davemc used to initiate it and chair the meeting..
13:01:55 overclk: I'm happy to help you out with that, but I also have never done that before :)
13:02:10 ndevos, NP. I'll take care of that.
13:02:17 spot is probably overloaded --- just do it
13:03:00 #action overclk will schedule a BitRot Google Hangout with some technical details and a small demo
13:03:02 but let him know you're doing it, just so he knows
13:03:43 -- any other topics?
13:03:48 ndevos, sure.
13:03:59 ndevos, nothing more from me as of now..
13:04:27 ndevos: do we want to mention our travel plans?
13:04:41 kkeithley: sure, remind me?
13:05:02 ndevos and I are coming to BLR April 6-10.
13:05:36 oh, yes, *those* travel plans :)
13:05:42 ;-)
13:06:31 #info kkeithley and ndevos will be visiting Bangalore, 6-10 April
13:06:50 I do not think many are impressed, kkeithley
13:06:55 lol
13:07:34 I guess that's it for today, thanks for joining!
13:07:34 ugh, I wrote "lol"
13:07:44 #endmeeting