00:59:44 | * | q66 quit (Quit: Quit) |
03:02:38 | * | fowl quit (Quit: Leaving) |
03:14:59 | * | Amrykid is now known as Phalanxia |
03:19:12 | * | Phalanxia is now known as Amrykid |
04:03:29 | * | fowl joined #nimrod |
05:12:58 | * | zahary_ quit (Read error: Connection reset by peer) |
05:47:50 | * | fowl quit (Read error: Connection reset by peer) |
05:50:42 | * | fowl joined #nimrod |
06:25:39 | * | fowl quit (Ping timeout: 248 seconds) |
07:04:32 | * | Boscop quit (Read error: Connection reset by peer) |
07:05:34 | * | Boscop joined #nimrod |
07:20:14 | * | fowl joined #nimrod |
08:17:04 | * | XAMPP_ joined #nimrod |
08:20:36 | * | XAMPP quit (Ping timeout: 264 seconds) |
08:21:14 | * | Araq_ joined #nimrod |
08:37:04 | * | Boscop quit (Disconnected by services) |
08:37:06 | * | Boscop joined #nimrod |
08:38:10 | * | Araq_ quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.1/20130116073211]) |
09:13:43 | * | Araq_ joined #nimrod |
10:22:21 | * | Araq_ quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.1/20130116073211]) |
10:39:58 | * | zahary_ joined #nimrod |
10:53:57 | * | fowl quit (Quit: Leaving) |
11:17:24 | * | Araq_ joined #nimrod |
11:17:36 | Araq_ | ping zahary, zahary_ |
11:33:59 | * | q66 joined #nimrod |
11:38:24 | zahary_ | hi Araq |
11:44:43 | Araq_ | oh hi; bad connection here, as usual ... |
11:45:16 | Araq_ | I've implemented Lins' algorithm for the cycle collector which basically does: |
11:45:37 | Araq_ | for root: handleSingleRoot(root) |
11:45:55 | * | Araq_ quit (Read error: Operation timed out) |
11:46:34 | * | Araq_ joined #nimrod |
11:47:22 | Araq_ | proc handleSingleRoot(r) = decRefGraph(r); if r.rc > 0: freeGraph(r); else: restoreRCs(r) |
11:48:23 | Araq_ | where both restoreRCs() and freeGraph() only do a linear traversal over the computed graph |
11:48:41 | Araq_ | so that only decRefGraph() performs the expensive graph traversal |
11:49:32 | Araq_ | however, we get back the undesirable O(n²) behaviour of Lins' algorithm ... |
11:51:56 | Araq_ | however, it doesn't work yet for bootstrapping so I have no numbers ... |
11:53:22 | Araq_ | but I'd guess that for the compiler itself most of the cycle collector's time is spent trying to prove graphs to be dead which are alive; is that correct? |
11:57:27 | Araq_ | (er, proc handleSingleRoot(r) = decRefGraph(r); if r.rc == 0: freeGraph(r); else: restoreRCs(r) of course) |
12:49:27 | * | Araq_ quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.1/20130116073211]) |
17:34:31 | * | zahary quit (Quit: Leaving.) |
17:35:58 | * | zahary joined #nimrod |
17:39:25 | zahary | hi Araq, why did you decide to try Lin's algorithm? our current algorithm (the synchronous cycle collector) is supposed to be a significant improvement over Lin's |
17:40:00 | zahary | With Lin's, you should pay the big graph traversal price numerous times in each collection instead of just once |
19:07:33 | Araq | ping zahary |
19:10:04 | zahary | hi |
19:12:22 | Araq | well I still think it's possible to modify Lins' algorithm so that it only takes 1 pass over the graph |
19:12:51 | Araq | you can't do that if you traverse the whole transitive closure at once like in Bacon's algorithm |
19:13:03 | Araq | but I may be wrong ;-) |
19:13:34 | Araq | in my pseudo-code the problem is the check 'rc(r) > 0' |
19:14:06 | Araq | unfortunately it's not true that the processed graph is dead iff rc(r) == 0 |
19:14:39 | Araq | (note that I only perform the recDecRef pass in a traditional way) |
19:15:17 | Araq | however, the graph is dead if: for every processed node n: rc(n) == 0 |
19:15:32 | zahary | you decref the transitive closure, then what? |
19:15:32 | * | shevy quit (Ping timeout: 248 seconds) |
19:16:06 | Araq | I keep a data structure that I called "speculative table" |
19:16:36 | Araq | why for every processed node stores what it's RC would be should the graph turn out to be dead |
19:16:40 | zahary | it's a list of visited nodes |
19:16:41 | zahary | ? |
19:16:52 | Araq | and the original RC should the graph turn out to be alive |
19:17:02 | Araq | yeah it's a list |
19:17:22 | Araq | well actually it's an array that works as a hash table |
19:17:30 | Araq | but these details don't matter |
19:17:38 | zahary | well, the real problem is that the transitive closure is some 1 million edges in the final collections |
19:17:50 | zahary | edge = ref |
19:18:49 | zahary | I don't think the problem is the complexity of traversing itself, it's the mere size |
19:21:09 | Araq | hm quite possible |
19:21:20 | Araq | but it doesn't help you have to traverse it 2-3 times |
19:22:00 | Araq | a <-> b -> c |
19:22:23 | zahary | well, with bacon's algorithm you have 2 traversals. as far as I understand you, with Lin's you'll still have to walk over it for each root |
19:22:52 | Araq | really, how is it 2 traversals? |
19:23:00 | Araq | decref, incref, free |
19:23:58 | Araq | and yeah with Lins' you have to walk over each root, I some ideas how to fix that though |
19:24:33 | Araq | however, the real puzzle for now is: what's the condition that the traversed graph is dead? |
19:24:51 | zahary | freeing is somewhat cheaper, because it traverses only the dead objects |
19:25:40 | zahary | well, what's your idea about fixing it? |
19:27:27 | Araq | well it's only an issue if the graphs are connected |
19:27:56 | * | shevy joined #nimrod |
19:27:57 | zahary | and they are |
19:28:06 | Araq | so when I traverse the root 'r' I may keep a bit set of processed nodes |
19:28:14 | Araq | and check if the other root is in that set |
19:28:25 | Araq | if so, I can skip that root |
19:28:28 | Araq | well not really |
19:28:30 | Araq | unfortunately |
19:29:05 | Araq | but it could be good enough for a start |
19:29:21 | zahary | well, isn't that what bacon's algorithm is doing after all? it uses efficient coloring scheme to mark processed nodes |
19:29:38 | Araq | I know ;-) |
19:30:38 | Araq | I don't think I'll beat bacon's algorithm |
19:30:50 | Araq | but I like to know whether my idea can work out |
19:31:27 | Araq | so back to the more interesting problem: a <-> b -> c <- d |
19:31:39 | Araq | 'a' is processed as root |
19:31:46 | Araq | 'd' keeps 'c' alive |
19:32:00 | Araq | however a and b are dead |
19:32:27 | Araq | so 'for every processed node n: rc(n) == 0' is wrong too |
19:32:55 | Araq | as we touch 'c' in the traversal but it's RC is 1 after the traversal |
19:33:06 | zahary | you are about to rediscover the markAlive pass :) |
19:33:17 | Araq | that's what I fear ... |
19:33:38 | Araq | that's it's not possible and *requires* a pass |
19:37:32 | Araq | the condition 'rc(root) == 0' is wrong too: a <-> b <- e |
19:38:06 | Araq | whould free 'a', but it's still alive |
19:44:07 | Araq | for every inner node n: RC(n) == 0 ? |
19:44:37 | Araq | could work |
19:49:38 | zahary | what is inner node here? |
19:50:34 | Araq | nodes that have at least 1 child |
19:51:16 | Araq | from the perseptive of the traversal |
19:51:24 | Araq | *perspective |
19:51:57 | zahary | doesn't matter, `a` is inner node in your previous example |
19:53:44 | Araq | yeah and its RC is 0 after recDecRef |
19:53:54 | Araq | but b's is still 1 so the graph is alive |
19:54:42 | Araq | the point is that leaves don't matter but the rest needs to have RC == 0 |
19:55:07 | Araq | of course you can't free leaves blindly should the graph be dead |
19:55:47 | zahary | © -> a <-> b <- [e] |
19:56:57 | Araq | well? |
19:57:01 | zahary | (cycle candidate) [stack root] |
19:57:31 | zahary | `a` is inner here and it will have rc=0 after the decref, so what's the algorithms exactly? |
19:57:48 | Araq | 'a' is the root? |
19:57:54 | Araq | or 'c'? |
19:57:58 | zahary | ( c ) |
19:58:22 | zahary | brackets mark roots according to the legend above |
19:58:31 | Araq | ok good |
19:58:51 | Araq | well 'b' keeps it alive |
19:58:59 | Araq | as it's an inner node with RC > 0 |
19:59:00 | zahary | after the mark alive pass, yes |
19:59:16 | Araq | no after the decref pass it's RC is 1 |
19:59:24 | Araq | before the pass it's 2 |
20:00:00 | zahary | er, what is exactly is happening when we reach `b` in the decref pass? |
20:00:14 | Araq | we process its children |
20:00:14 | zahary | don't we decref the edge towards `a`? |
20:00:21 | Araq | yes |
20:00:29 | zahary | so, a becomes 0 |
20:00:55 | zahary | we started from `c` |
20:00:57 | Araq | yeah but the condition is: for every inner node n: RC(n) == 0 |
20:01:08 | Araq | and 'b' is an inner node too |
20:01:38 | Araq | because it has 'a' as a child |
20:02:42 | zahary | well, how does `b` affect `a` after the decref pass? you notice that `a` is reachable from `b`? |
20:03:26 | Araq | no I notice within the decref pass that 'b' is an inner node |
20:03:46 | Araq | as 'forAllChildren' doesn't produce an empty list for it |
20:04:20 | zahary | so? what we are discussing here is how `a` will get collected |
20:04:41 | Araq | well 'a' is not dead in your example |
20:05:42 | zahary | exactly, but its RC reached 0 so I'm asking what will prevent collecting it |
20:07:07 | Araq | well the RCs will be restored after the decRefPass |
20:07:11 | zahary | anyway, I'll be more interested if you try something more radical instead - like precise stack marking that will allow moving collectors, etc |
20:07:34 | Araq | but I guess 'c' is dead and not collected with this rule |
20:08:38 | Araq | precise stack marking is not necessary for a moving collector |
20:08:54 | Araq | you have to 'pin' the objects referenced from the stack though |
20:10:36 | zahary | it's not very interesting if you can't move short-lived objects (which are more likely to be on the stack) |
20:11:49 | zahary | but it was just an example, tracing collector will be another |
20:12:40 | Araq | alright, well I gave up since the inner node rule doesn't work either |
20:14:05 | Araq | btw I thought your "is bit set" template is wrong |
20:14:20 | Araq | but then I figured you use it to conflate different colors |
20:15:08 | Araq | and then I wondered whether you can't achieve the same with fewer bits if you try to use <= for color checks |
20:16:17 | zahary | well, there are colors and there are flags that are a bit independent from them |
20:16:59 | zahary | I'm a bit more conservative than the original paper when it comes to dead objects. they are reusing the black color for them |
20:17:08 | zahary | i.e. the "normal" color |
20:20:18 | Araq | I had another unrelated idea: the RC value is a probability for "the node is dead" |
20:20:41 | Araq | so instead of generations, we could try to group cycle roots by their RC value |
20:20:51 | Araq | and process nodes with a low RC value more often |
20:21:50 | zahary | what you mean is to use some kind of bucketing scheme when we add cycle roots? |
20:22:13 | Araq | or if we process them |
20:22:50 | Araq | if I'm not mistaken you can always process a subset of all cycle roots to find some garbage |
20:23:56 | zahary | you grokked why I needed this "trimming" feature, right? |
20:24:18 | Araq | I still haven't looked at it ;-) |
20:24:35 | zahary | the problem is that I cannot free objects if they are cycle roots - I don't want to search the cycleRoots array for them, so I just mark them dead and wait for the next collection |
20:24:56 | zahary | so if CollectCycles is disabled, cycleRoots accumulate |
20:25:02 | Araq | oh that's what you mean, yeah I know |
20:25:14 | Araq | you could easily 'excl' cycle roots though |
20:25:24 | Araq | if you use the TCellSet for them |
20:25:59 | zahary | yes, but my bet is that will slow things down. we should try it do to see how it will affect memory |
20:26:46 | zahary | try it *tho* to see |
20:33:24 | Araq | I noticed that you mark the graphs referenced by stack roots alive |
20:33:42 | Araq | and don't run the cycle collector on them |
20:34:30 | Araq | however, it's then a small step to make the codegen register globals to the GC |
20:34:41 | Araq | and do the same for graphs referenced by globals |
20:36:27 | zahary | could help, but this heuristic wasn't particularly successful - it hardly changed anything |
20:37:07 | Araq | well then I'd remove it completely; it's dangerous |
20:37:21 | zahary | how so? |
20:37:31 | Araq | the sweet thing about the current GC is that doesn't trace what hasn't changed |
20:39:03 | Araq | it's annoying we have to add the stack roots back to cycleRoots |
20:39:14 | zahary | I see your point. I tried it because it was suggested in one of the papers. there is a switch to disable it iirc |
20:42:11 | Araq | so ... if we have cyclic graphs that span millions of nodes |
20:42:40 | Araq | how can we ever deal with them in a realtime setting? |
20:43:14 | Araq | it can perform the scans incrementally |
20:43:23 | Araq | so that the deadline is never missed |
20:43:44 | Araq | but it can only free things after the whole graph has been traversed |
20:44:35 | Araq | which is exactly what tracing/copying GCs offer |
20:44:43 | zahary | unfortunately, to do that you need to be able to traverse the heap as it was when you started the traversal (heap shapshots or sliding views) |
20:45:15 | Araq | yeah that makes it even worse |
20:45:38 | Araq | but if even we ignore that for a moment |
20:45:56 | Araq | I can't imagine this to really help in a realtime setting |
20:46:26 | zahary | why not? |
20:46:31 | Araq | the nice thing about RC is that it's local and it proves things to be dead without needing the "whole picture" |
20:47:32 | Araq | so you can allocate&free within a cycle |
20:48:01 | Araq | but with the other scheme you have a long period where you can't free |
20:48:24 | Araq | so my fear would be that it can't keep up with the mutator |
20:48:44 | Araq | it doesn't miss a deadline, but the memory usage keeps growing |
20:49:44 | Araq | so yeah, I quite like the current setting where you run the cycle collector in batch when you can afford it |
20:49:47 | zahary | I see, but this probably won't be true for many projects |
20:51:36 | Araq | perhaps; we'll see |
20:55:06 | Araq | btw I also thought about implementing a simple mark&sweep GC |
20:55:35 | Araq | but the sweep phase is: "free every cell that is not marked alive" |
20:56:17 | Araq | and I'm too stupid to see how to implement that efficiently |
20:56:41 | Araq | with the current allocator |
20:57:14 | Araq | I guess I could use another datastructure to easily traverse over *every* cell |
20:57:32 | Araq | without touching the whole heap |
20:59:52 | Araq | any better ideas? |
21:01:16 | reactormonk | Araq, crash once in a while and let the kernel clean up |
21:01:35 | Araq | Ruby style, huh? |
21:01:44 | reactormonk | unicorn style. |
21:16:33 | zahary | well, I think I remember seeing intrusive doubly linked lists in the original lua GC (the most naive way do it) |
21:17:01 | zahary | the bitmap sweeping in the upcoming luajit GC sounds quite cool. |
21:17:45 | Araq | bitmap sweeping is an old hat though |
21:17:50 | zahary | otherwise, it shouldn't be too expensive to traverse the free list for a page populating a map in temporary stack space that indicates with are the allocated slots |
21:18:05 | zahary | * which are the * |
21:18:26 | Araq | however Mike Pall also invented a new write barrier |
21:18:33 | Araq | which I don't get ;-) |
21:19:13 | zahary | I haven't put much thought in it, but isn't it supposed to be simple? |
21:20:20 | Araq | write barriers are never simple :P |
21:24:50 | Araq | well actually it's pretty easy to traverse every allocated object with the current allocator |
21:25:30 | Araq | but most of them should be alive |
21:25:46 | Araq | and you don't want to touch these |
21:27:16 | Araq | however 2 TCellSets would work |
21:27:16 | zahary | the most simple mark and sweep probably won't be competitive either |
21:27:27 | Araq | one set for the 'alive' |
21:27:32 | Araq | one set for 'allocated' |
21:30:24 | Araq | hrm I should really implement this; we could pick on a per thread basis which GC to use ... |
21:34:56 | zahary | implement more infrastructure for GCs - marking global roots, precise marking, etc - then we can put a competition for GC researches for the best implementation :) |
21:42:15 | * | FreeArtMan joined #nimrod |
21:42:46 | Araq | true :D |
21:52:22 | dom96 | Instacode is pretty nice, and supports Nimrod: http://instacod.es/58918 :D |
22:04:44 | * | gradha joined #nimrod |
22:10:54 | * | FreeArtMan quit (Ping timeout: 264 seconds) |
22:16:57 | gradha | new forum software http://www.codinghorror.com/blog/2013/02/civilized-discourse-construction-kit.html which will likely be used due to fame/inertia from stackoverflow |
22:17:18 | gradha | not implemented in nimrod though |
22:33:58 | q66 | dom96, haha nice |
22:37:45 | dom96 | gradha: This new forum's popularity already makes me jealous. |
22:39:19 | q66 | dom96, can one add their own language to instacode? |
22:39:51 | dom96 | q66: I think it uses pygments as the backend, so you simply need to add support for it to pygments. |
22:40:04 | q66 | aw, python |
23:01:14 | * | gradha quit (Quit: bbl, have youtube videos to watch) |
23:21:07 | * | XAMPP-8 joined #nimrod |
23:33:47 | * | XAMPP-8 quit (Ping timeout: 255 seconds) |
23:50:09 | reactormonk | looks like openid is officially dead |
23:59:18 | reactormonk | oh wait, they require it in the Gemfile |