22:07:47 #startmeeting
22:07:47 Meeting started Sat Dec 5 22:07:47 2009 UTC. The chair is smooge. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:07:47 Useful Commands: #action #agreed #halp #info #idea #link #topic.
22:08:10 Mike McGrath talking
22:08:23 Metrics: The difference between problems and measurements
22:09:02 Brief introduction, New Repo is Slow, Lunch? Nope. Nagios
22:09:35 When you have an issue, what do you do to get rid of the symptoms, and what are the real problems?
22:10:10 if your eye is red, it's a symptom. if you have a car in your eye, that's a problem
22:10:34 examples of measurements are given: load, temperature, hits/second, page swaps per second, 24 lbs of ribs are all measurements
22:11:10 problems: OOM Killer, bad offsite mirroring, and Bernie Madoff are all problems
22:11:52 Example: koji is slow. Why? Because it shells out to createrepo, which was slow
22:12:12 first tool was sar and all the lines looked fine.
22:12:23 second tool was iperf and the network looked fine
22:12:46 after this the next step was to get a profiling tool for python
22:13:48 this required source changes to get the profiler called, which meant he wrapped various 'def's
22:14:09 the tool he used was hotshot though there are faster ones..
22:14:40 this gave out dumps which gprof2dot turned into pictures of where the time went.
22:16:26 mike then showed the picture output, which is a pretty picture showing that it spends 87.5% of its time doing a posixpath exists 44192 times
22:17:57 root cause turns into NFS is too slow and we go over how we could make it better
22:19:20 looking at the code it turned out the call was redundant and could be removed.
22:19:42 the call being an os.path.exists(foobar)
22:20:00 nagios and the pain that it is
22:20:40 mirror manager is known to be our most redundant thing.
22:20:53 one day we got a nagios alert saying it was down.
22:21:25 checked to see if it was down (which it was)..
then go to sar for load/disk/cpu/memory/swapping and then iperf for network
22:22:00 next steps were looked at (load? apache-status showed a lot of mirror-list requests)
22:22:19 why and where were these waiting mirror-lists coming from?
22:26:10 * ricky is on the edge of his seat :-)
22:26:48 looked at what changes were made and found that a commit to mirror-manager had happened
22:27:34 Mike got Matt D to help him go through it and found a line that did readlines() versus other commands. This caused a huge increase in the amount of memory used
22:28:00 Options: revert code, add more servers, more proxy caching, use something other than readlines???
22:28:32 what will they do? tune in tomorrow.. same McGrath channel same McGrath time
22:28:41 * ricky falls off :-)
22:29:07 actually the last option was taken.. it was to use f.readline versus f.readlines.. using xreadlines might be done in the future
22:30:01 * ricky remembers this issue - xreadlines is deprecated now
22:30:11 ah
22:30:22 but the code is python 2.4 where it's not :/
22:30:27 anyway... back to the lecture
22:30:31 broad stroke
22:30:40 [or maybe it is and I missed it.]
22:30:42 I think it was deprecated in 2.4+
22:31:19 Broad Strokes: Clearly identify the problem, Clearly identify what the solution should be, Baseline, Baseline, Baseline!
22:33:26 Questions:
22:33:43 What tools do you use to profile code?
22:34:11 Hotshot was what was used.. and how to do wrapper functions.
22:36:02 function maps can then be used to find your performance issues
22:38:02 we need to do more profiling and will put ricky on doing it after the move
22:38:37 ... or at least someone :)
22:40:19 After the move = winter break, so happy to help then :-D
22:41:02 question #2 ... (real question... ) where is ricky
22:41:35 * ricky is at CMU procrastinating writing an essay
22:42:12 they are not impressed with your excuses.. you could have procrastinated here.
22:42:38 Heh
22:42:42 they really hope you will do well, and make them proud.
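[Editor's note on the profiling answer above] The wrapper-function approach from the talk might look roughly like the sketch below. This is not the code that was shown: hotshot only exists in Python 2, so the sketch swaps in cProfile from the standard library, and the names `profiled` and `check_paths` are hypothetical stand-ins, not koji or createrepo code.

```python
import cProfile
import functools
import io
import os.path
import pstats


def profiled(func):
    """Hypothetical helper: run every call to func under cProfile."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        try:
            # runcall() profiles this single call and returns its result.
            return profiler.runcall(func, *args, **kwargs)
        finally:
            # Keep a short text report, sorted by cumulative time.
            out = io.StringIO()
            stats = pstats.Stats(profiler, stream=out)
            stats.sort_stats("cumulative").print_stats(5)
            wrapper.last_report = out.getvalue()
    wrapper.last_report = ""
    return wrapper


@profiled
def check_paths(paths):
    # Stand-in for the hot spot in the talk: many os.path.exists() calls.
    return sum(1 for p in paths if os.path.exists(p))


if __name__ == "__main__":
    check_paths(["."] * 1000)
    print(check_paths.last_report)
```

The report plays the role of the "function map": the entries with the largest cumulative time are the first places to look.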
22:42:48 anyway lecture over.
22:42:49 Thanks :-)
22:42:54 #endmeeting
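[Editor's note on the readlines() fix discussed in the lecture] A minimal sketch of the difference, under the assumption that the file is processed one line at a time. The function names and the sample data are made up for illustration and are not mirror-manager code. Iterating over the file object is the usual replacement for xreadlines in later Python versions.

```python
import os
import tempfile


def count_all_at_once(path):
    # readlines() builds a list of every line before any work starts,
    # so memory use grows with the size of the file.
    with open(path) as f:
        lines = f.readlines()
    return len(lines)


def count_one_at_a_time(path):
    # Iterating over the file object yields lines one by one, so the
    # whole file is never held in memory as a list.
    count = 0
    with open(path) as f:
        for _line in f:
            count += 1
    return count


def make_sample_file(n_lines):
    # Hypothetical test data: a temporary file with n_lines short lines.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(n_lines):
            f.write("mirror-%d.example.org\n" % i)
    return path


if __name__ == "__main__":
    sample = make_sample_file(1000)
    try:
        print(count_all_at_once(sample), count_one_at_a_time(sample))
    finally:
        os.remove(sample)
```

Both functions return the same count; the difference only shows up in memory use when the file is large and many requests read it at once.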