06:03:12 #startmeeting i18n
06:03:12 Meeting started Thu Jun 27 06:03:12 2013 UTC. The chair is tagoh_. Information about MeetBot at http://wiki.debian.org/MeetBot.
06:03:12 Useful Commands: #action #agreed #halp #info #idea #link #topic.
06:03:13 #meetingname i18n
06:03:13 The meeting name has been set to 'i18n'
06:03:14 #topic agenda and roll call
06:03:14 #link https://fedoraproject.org/wiki/I18N/Meetings/2013-06-27
06:03:24 Hi!
06:03:27 hi
06:03:40 hi, long time no meeting. shall we have one today
06:04:17 Hi!
06:05:12 hi
06:05:12 hi
06:06:06 hi
06:06:15 hi
06:06:37 hi
06:08:55 okay, let's get started.
06:09:21 #topic Upcoming schedule
06:09:22 #info 2013-07-02 Fedora 19 Final Release
06:09:48 RC2/2.2 is out and on testing now
06:11:23 go/no-go meeting will be this night in our timezone...
06:11:52 (I think 2.2 is just a respin for kde live)
06:11:56 right
06:12:01 No delta isos from rc2 to rc2.2?
06:12:12 no deltas for live
06:12:13 juhp: Ah, so no big difference to rc2.
06:12:58 I am not quite sure what is in 2.2 - there was some proposed blocker about kdepim iirc
06:13:33 * juhp is re-downloading desktop live...
06:15:07 okay, move on then
06:15:15 #topic New topics
06:15:16 #info #20: Fedora 20 Planning (tagoh)
06:15:16 #link https://fedorahosted.org/i18n/ticket/20
06:15:18 I have been seeing so strange behaviour with gnome search field not getting focus trying harder to reproduce
06:15:24 so = some
06:15:40 in overview, but anyway
06:16:26 so it's time to think about f20 planning. just fyi the planning process has been changed since f20. please read http://fedoraproject.org/wiki/Changes/Policy if you are planning to change something in f20.
06:17:38 sure
06:18:40 mfabian, rc2.2 is just kde respin
06:18:50 are there any plans so far? :)
06:22:01 okay, it's still early so just keep it in mind.
06:22:16 #info #21: Compare different Desktop Environments in Fedora 19 (pnemade)
06:22:16 #link https://fedorahosted.org/i18n/ticket/21
06:22:37 just thinking to have this post GA
06:24:06 aha
06:24:35 is there any new DE in f19?
06:26:28 I did test MATE and Cinnamon to support for imsettings in f18 time frame btw.
06:26:44 I don't think anything is new. all DE are also in F18
06:26:57 No enlightenment 17 yet ...
06:27:26 but versions updated etc
06:28:53 paragan: yes, idea is worth. Having comparison chart will help users when migrating from one DE to other
06:29:24 but what things we should include?
06:29:31 1. How to change IME?
06:30:24 yes
06:31:26 change?
06:31:28 if these DE differs in ime available, ime setup, fonts default, app names
06:31:49 Shouldn’t the default fonts be the same?
06:31:51 tagoh_: means add, new IME
06:32:36 mfabian: i think it should be same based on fontconfig priorities.
06:33:36 btw who's the target for this information?
06:34:57 people who want to find what changes are there between any DE
06:37:41 any other comments?
06:37:43 how about making some mini-draft to get a better idea?
06:38:05 maybe help page for IM etc might be useful dunno
06:38:15 ok
06:40:25 anything else?
06:41:24 Has anybody tried ibus-typing-booster 1.1.0?
06:41:50 It can learn from a text file now.
06:42:05 #topic Open Floor
06:42:47 mfabian: not yet - aha
06:43:13 mfabian: i am using ibus-typing-booster for Marathi language.
06:43:32 nope, dont have 1.1.0 yet
06:43:51 mfabian: that is an excellent feature.
06:43:57 By reading files, it is easy to get a huge database which makes typing slow. I need to think how to prune statistically irrelevant entries from the database ...
06:44:32 mfabian, still python right?
06:44:32 Learning from a text file is done via the setup tool, there is a button to read a text file.
06:44:33 mfabian: so words from text files will get added to Hunspell dictionaries? or i-t-b specific database?
06:44:52 It is still python, yes, but the speed is limited by the SELECT statements.
06:44:58 hmm
06:45:17 so maybe a better db format is needed perhaps?
06:45:18 The words from the text file get added to the user database of ibus-typing-booster.
06:45:23 Actually trigrams.
06:45:33 Yes, I am thinking whether a different database format is needed.
06:45:42 cool
06:45:56 how about bi-gram, which can also improve performance?
06:46:16 it stores trigram, bigram, and unigram in the database.
06:46:49 Not separately, the bigrams are just two words of a trigram, the unigram is just the first word of a trigram.
06:47:29 I.e. they are all read from the same table, only the select statements differe.
06:47:30 looks fair enough to have them in local database rather than pushing to hunspell
06:47:30 differ.
06:47:57 Hunspell doesn’t do the trigram stuff anyway.
06:48:12 pravins: did you notice that it learns better now when using it for Marathi?
06:48:16 mfabian, so stored in reverse word order?
06:48:43 yes. It should my added word nicely
06:48:47 Something like this is a database row:
06:48:49 juhp, what is reverse word order?
06:48:49 s/should/shows
06:48:50 14284418|s|suit|space|the|5
06:49:04 I see
06:49:05 The text is "the space suit"
06:49:28 The last word was inserted when the user typed only "s" and selected "suit" from the lookup table.
06:49:41 I.e. the 2nd column is the last user input.
06:50:06 5 is the number of times this particular combination of trigram and last user input happened.
06:50:25 aha
06:51:29 Even if the select statements can be made faster, some limit of the database size is probably a good thing.
06:51:35 and the first number?
06:51:43 The first number is just a rowid.
06:51:48 ok
06:51:54 Maybe not needed. Was always there so far.
06:52:21 is it used?
06:52:30 No, not used a all.
06:52:32 at all.
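
Note on the row format described above: a minimal sketch, assuming a single sqlite table whose fields follow the example row (rowid, typed input, selected word, previous word, word before that, count). The table name `phrases`, the column names, and the helper `candidates` are hypothetical and only illustrate how unigram, bigram and trigram lookups can share one table and differ only in the SELECT statement.

```python
import sqlite3

# Hypothetical table and column names; the log only shows the field order
# rowid|typed input|selected word|previous word|word before that|count.
db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE phrases (
           id INTEGER PRIMARY KEY,  -- alias for sqlite's implicit rowid
           input_phrase TEXT,       -- what the user actually typed
           phrase TEXT,             -- word selected from the lookup table
           p_phrase TEXT,           -- previous word
           pp_phrase TEXT,          -- word before the previous word
           user_freq INTEGER        -- how often this combination occurred
       )"""
)
# The example row from the log: the text is "the space suit", typed as "s".
db.execute(
    "INSERT INTO phrases VALUES (?, ?, ?, ?, ?, ?)",
    (14284418, "s", "suit", "space", "the", 5),
)


def candidates(db, typed, p_phrase=None, pp_phrase=None):
    """Return (phrase, summed count) pairs for words starting with `typed`.

    Without context this is a unigram lookup, with p_phrase a bigram
    lookup, and with both a trigram lookup; only the WHERE clause changes.
    pp_phrase is ignored unless p_phrase is also given.
    """
    sql = "SELECT phrase, SUM(user_freq) FROM phrases WHERE phrase LIKE ?"
    args = [typed + "%"]
    if p_phrase is not None:
        sql += " AND p_phrase = ?"
        args.append(p_phrase)
        if pp_phrase is not None:
            sql += " AND pp_phrase = ?"
            args.append(pp_phrase)
    sql += " GROUP BY phrase ORDER BY SUM(user_freq) DESC"
    return db.execute(sql, args).fetchall()
```

Calling `candidates(db, "s", "space", "the")` performs the trigram lookup for the example row, while `candidates(db, "s")` falls back to the unigram case on the same table.
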
06:52:34 ok
06:52:41 we took initial database from ibus-tables, so it might be used in ibus-tables
06:53:08 I changed the database a lot from ibus-tables, but so far I left the rowid alone.
06:53:34 I believe ibus-table doesn’t use the rowid either.
06:53:44 maybe it's used to set primary key
06:54:23 sqlite implicitly creates rowids anyway, one would probably not save anything by not having it explicitly in a table row.
06:55:01 http://www.sqlite.org/autoinc.html
06:55:24 “If a table contains a column of type INTEGER PRIMARY KEY, then that column becomes an alias for the ROWID”
06:55:37 So in some form, that column would always be there.
06:55:39 i see
06:56:28 To limit the database size, I think about introducing another column with a timestamp when this row was last used.
06:56:50 And then delete rows from the database which have not been used for a long time *and* have a low count.
06:57:33 sounds reasonable
06:57:51 Timestamps might also be nice to display something like: “You have saved 82% of the keystrokes today and 79% yesterday” like swiftkey does on Android.
06:58:55 if you want to scale out, better stop use of sqlite I guess...
06:59:11 tagoh_: can you recommend anything faster?
06:59:54 I also thought of writing python dictionaries directly to disk, even that would probably be faster than sqlite already.
07:00:01 hmm, do you want SQL-like ?
07:00:25 The select statements are rather simple, a simple key-value store is probably enough.
07:01:57 sqlite also has a strange problem that the -wal file gets huge when a huge number of inserts is done.
07:02:25 hmm
07:03:28 mfabian: yes, simple key-value store is more faster. :)
07:03:34 When learning from a file with 3 million words, the -wal file grows to 1.7 GB.
07:03:47 s/more/much/
07:04:13 It is then merged into the main database during a checkpoint, but temporarily it gets huge.
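
Note on the pruning idea from 06:56 above: a minimal sketch, assuming the timestamp is stored as seconds since the epoch next to the count. The table name `phrases`, the column `last_used`, the helper `prune` and its default thresholds are all hypothetical; the log only proposes deleting entries that are both long unused and have a low count.

```python
import sqlite3
import time

# Hypothetical schema: the fields from the example row plus a timestamp.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE phrases (id INTEGER PRIMARY KEY, input_phrase TEXT, "
    "phrase TEXT, p_phrase TEXT, pp_phrase TEXT, user_freq INTEGER, "
    "last_used REAL)"
)


def prune(db, max_age_days=90, min_count=2, now=None):
    """Delete rows that are both old *and* rarely used.

    A row is removed only if it was last used more than max_age_days ago
    and its count is below min_count. Returns the number of deleted rows.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    cur = db.execute(
        "DELETE FROM phrases WHERE last_used < ? AND user_freq < ?",
        (cutoff, min_count),
    )
    db.commit()
    return cur.rowcount
```

Requiring both conditions keeps frequently used old entries and rarely used recent ones, which matches the "long time *and* low count" rule; suitable thresholds would have to be found by experiment.
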
07:05:22 mfabian, lol
07:05:40 probably sql is too heavy
07:06:02 The main database is only 38MB then, so 1.7GB for the WAL (Write-Ahead-Log) is surprisingly huge.
07:06:10 nod
07:06:32 mfabian: does more commit transactions help?
07:06:39 * epico_ just a guess.
07:06:56 epico_: You mean for the huge WAL?
07:07:00 yes
07:07:09 Apparently not.
07:07:30 np, just guess it.
07:07:31 Fastest is inserting the whole data from a text file with a single commit.
07:07:50 self.db.executemany(sqlstr, sqlargs)
07:07:50 self.db.commit()
07:07:50 self.db.execute('PRAGMA wal_checkpoint;')
07:07:50
07:08:05 The executemany inserts everything, then one commit.
07:08:17 mfabian, ask sqlite upstream?
07:08:51 Doing the insert in groups of say 10000 records and then a commit and a checkpoint does not stop the WAL file from growing to 1.7GB, it just makes it a lot slower.
07:09:11 mfabian, I see.
07:09:16 epico_: Yes, maybe I should ask upstream
07:09:24 :)
07:09:47 Also strange is that the WAL file does not seem to grow above 1.7GB even if I use much larger texts.
07:14:08 okay, anything else we want to discuss in the meeting?
07:17:14 let's stop here then.
07:17:23 thanks everyone for the meeting!
07:17:27 #endmeeting
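
Note on the bulk insert discussed in the Open Floor: a minimal, self-contained sketch of the three calls quoted at 07:07:50 (one `executemany`, one `commit`, then `PRAGMA wal_checkpoint`). The table name `phrases`, its columns, and the helpers `open_wal_db` and `learn_from_words` are hypothetical, and counting trigrams with `collections.Counter` before inserting is a simplification chosen for this example, not something the log describes. The sketch shows the insert pattern only; it does not address the WAL growth reported above.

```python
import os
import sqlite3
import tempfile
from collections import Counter


def open_wal_db(path):
    """Open (or create) a database file in WAL mode with a phrases table."""
    db = sqlite3.connect(path)
    db.execute("PRAGMA journal_mode=WAL;")
    db.execute(
        "CREATE TABLE IF NOT EXISTS phrases (id INTEGER PRIMARY KEY, "
        "input_phrase TEXT, phrase TEXT, p_phrase TEXT, pp_phrase TEXT, "
        "user_freq INTEGER)"
    )
    return db


def learn_from_words(db, words):
    """Insert all trigrams of `words` with one executemany and one commit.

    Returns the number of distinct trigrams inserted.
    """
    # Count each (word before previous, previous, current) triple once.
    counts = Counter(zip(words, words[1:], words[2:]))
    sqlstr = (
        "INSERT INTO phrases "
        "(input_phrase, phrase, p_phrase, pp_phrase, user_freq) "
        "VALUES (?, ?, ?, ?, ?)"
    )
    # Assumption: when learning from a file, the typed input is the full word.
    sqlargs = [(w, w, p, pp, n) for (pp, p, w), n in counts.items()]
    db.executemany(sqlstr, sqlargs)
    db.commit()
    # Ask sqlite to merge the write-ahead log into the main database file.
    db.execute("PRAGMA wal_checkpoint;")
    return len(sqlargs)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        db = open_wal_db(os.path.join(tmpdir, "user.db"))
        print(learn_from_words(db, "the space suit is white".split()))
        db.close()
```

A file-backed database is used here because WAL mode is not available for in-memory databases.
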