#nimrod logs for 05-02-2013

<<05-02-2013>>

00:59:44	*	q66 quit (Quit: Quit)
03:02:38	*	fowl quit (Quit: Leaving)
03:14:59	*	Amrykid is now known as Phalanxia
03:19:12	*	Phalanxia is now known as Amrykid
04:03:29	*	fowl joined #nimrod
05:12:58	*	zahary_ quit (Read error: Connection reset by peer)
05:47:50	*	fowl quit (Read error: Connection reset by peer)
05:50:42	*	fowl joined #nimrod
06:25:39	*	fowl quit (Ping timeout: 248 seconds)
07:04:32	*	Boscop quit (Read error: Connection reset by peer)
07:05:34	*	Boscop joined #nimrod
07:20:14	*	fowl joined #nimrod
08:17:04	*	XAMPP_ joined #nimrod
08:20:36	*	XAMPP quit (Ping timeout: 264 seconds)
08:21:14	*	Araq_ joined #nimrod
08:37:04	*	Boscop quit (Disconnected by services)
08:37:06	*	Boscop joined #nimrod
08:38:10	*	Araq_ quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.1/20130116073211])
09:13:43	*	Araq_ joined #nimrod
10:22:21	*	Araq_ quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.1/20130116073211])
10:39:58	*	zahary_ joined #nimrod
10:53:57	*	fowl quit (Quit: Leaving)
11:17:24	*	Araq_ joined #nimrod
11:17:36	Araq_	ping zahary, zahary_
11:33:59	*	q66 joined #nimrod
11:38:24	zahary_	hi Araq
11:44:43	Araq_	oh hi; bad connection here, as usual ...
11:45:16	Araq_	I've implemented Lins' algorithm for the cycle collector which basically does:
11:45:37	Araq_	for root: handleSingleRoot(root)
11:45:55	*	Araq_ quit (Read error: Operation timed out)
11:46:34	*	Araq_ joined #nimrod
11:47:22	Araq_	proc handleSingleRoot(r) = decRefGraph(r); if r.rc > 0: freeGraph(r); else: restoreRCs(r)
11:48:23	Araq_	where both restoreRCs() and freeGraph() only do a linear traversal over the computed graph
11:48:41	Araq_	so that only decRefGraph() performs the expensive graph traversal
11:49:32	Araq_	however, we get back the undesirable O(n²) behaviour of Lins' algorithm ...
11:51:56	Araq_	however, it doesn't work yet for bootstrapping so I have no numbers ...
11:53:22	Araq_	but I'd guess that for the compiler itself most of the cycle collector's time is spent trying to prove graphs to be dead which are alive; is that correct?
11:57:27	Araq_	(er, proc handleSingleRoot(r) = decRefGraph(r); if r.rc == 0: freeGraph(r); else: restoreRCs(r) of course)
12:49:27	*	Araq_ quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.1/20130116073211])
17:34:31	*	zahary quit (Quit: Leaving.)
17:35:58	*	zahary joined #nimrod
17:39:25	zahary	hi Araq, why did you decide to try Lin's algorithm? our current algorithm (the synchronous cycle collector) is supposed to be a significant improvement over Lin's
17:40:00	zahary	With Lin's, you should pay the big graph traversal price numerous times in each collection instead of just once
19:07:33	Araq	ping zahary
19:10:04	zahary	hi
19:12:22	Araq	well I still think it's possible to modify Lins' algorithm so that it only takes 1 pass over the graph
19:12:51	Araq	you can't do that if you traverse the whole transitive closure at once like in Bacon's algorithm
19:13:03	Araq	but I may be wrong ;-)
19:13:34	Araq	in my pseudo-code the problem is the check 'rc(r) > 0'
19:14:06	Araq	unfortunately it's not true that the processed graph is dead iff rc(r) == 0
19:14:39	Araq	(note that I only perform the recDecRef pass in a traditional way)
19:15:17	Araq	however, the graph is dead if: for every processed node n: rc(n) == 0
19:15:32	zahary	you decref the transitive closure, then what?
19:15:32	*	shevy quit (Ping timeout: 248 seconds)
19:16:06	Araq	I keep a data structure that I called "speculative table"
19:16:36	Araq	why for every processed node stores what it's RC would be should the graph turn out to be dead
19:16:40	zahary	it's a list of visited nodes
19:16:41	zahary	?
19:16:52	Araq	and the original RC should the graph turn out to be alive
19:17:02	Araq	yeah it's a list
19:17:22	Araq	well actually it's an array that works as a hash table
19:17:30	Araq	but these details don't matter
19:17:38	zahary	well, the real problem is that the transitive closure is some 1 million edges in the final collections
19:17:50	zahary	edge = ref
19:18:49	zahary	I don't think the problem is the complexity of traversing itself, it's the mere size
19:21:09	Araq	hm quite possible
19:21:20	Araq	but it doesn't help you have to traverse it 2-3 times
19:22:00	Araq	a <-> b -> c
19:22:23	zahary	well, with bacon's algorithm you have 2 traversals. as far as I understand you, with Lin's you'll still have to walk over it for each root
19:22:52	Araq	really, how is it 2 traversals?
19:23:00	Araq	decref, incref, free
19:23:58	Araq	and yeah with Lins' you have to walk over each root, I some ideas how to fix that though
19:24:33	Araq	however, the real puzzle for now is: what's the condition that the traversed graph is dead?
19:24:51	zahary	freeing is somewhat cheaper, because it traverses only the dead objects
19:25:40	zahary	well, what's your idea about fixing it?
19:27:27	Araq	well it's only an issue if the graphs are connected
19:27:56	*	shevy joined #nimrod
19:27:57	zahary	and they are
19:28:06	Araq	so when I traverse the root 'r' I may keep a bit set of processed nodes
19:28:14	Araq	and check if the other root is in that set
19:28:25	Araq	if so, I can skip that root
19:28:28	Araq	well not really
19:28:30	Araq	unfortunately
19:29:05	Araq	but it could be good enough for a start
19:29:21	zahary	well, isn't that what bacon's algorithm is doing after all? it uses efficient coloring scheme to mark processed nodes
19:29:38	Araq	I know ;-)
19:30:38	Araq	I don't think I'll beat bacon's algorithm
19:30:50	Araq	but I like to know whether my idea can work out
19:31:27	Araq	so back to the more interesting problem: a <-> b -> c <- d
19:31:39	Araq	'a' is processed as root
19:31:46	Araq	'd' keeps 'c' alive
19:32:00	Araq	however a and b are dead
19:32:27	Araq	so 'for every processed node n: rc(n) == 0' is wrong too
19:32:55	Araq	as we touch 'c' in the traversal but it's RC is 1 after the traversal
19:33:06	zahary	you are about to rediscover the markAlive pass :)
19:33:17	Araq	that's what I fear ...
19:33:38	Araq	that's it's not possible and requires a pass
19:37:32	Araq	the condition 'rc(root) == 0' is wrong too: a <-> b <- e
19:38:06	Araq	whould free 'a', but it's still alive
19:44:07	Araq	for every inner node n: RC(n) == 0 ?
19:44:37	Araq	could work
19:49:38	zahary	what is inner node here?
19:50:34	Araq	nodes that have at least 1 child
19:51:16	Araq	from the perseptive of the traversal
19:51:24	Araq	*perspective
19:51:57	zahary	doesn't matter, `a` is inner node in your previous example
19:53:44	Araq	yeah and its RC is 0 after recDecRef
19:53:54	Araq	but b's is still 1 so the graph is alive
19:54:42	Araq	the point is that leaves don't matter but the rest needs to have RC == 0
19:55:07	Araq	of course you can't free leaves blindly should the graph be dead
19:55:47	zahary	© -> a <-> b <- [e]
19:56:57	Araq	well?
19:57:01	zahary	(cycle candidate) [stack root]
19:57:31	zahary	`a` is inner here and it will have rc=0 after the decref, so what's the algorithms exactly?
19:57:48	Araq	'a' is the root?
19:57:54	Araq	or 'c'?
19:57:58	zahary	( c )
19:58:22	zahary	brackets mark roots according to the legend above
19:58:31	Araq	ok good
19:58:51	Araq	well 'b' keeps it alive
19:58:59	Araq	as it's an inner node with RC > 0
19:59:00	zahary	after the mark alive pass, yes
19:59:16	Araq	no after the decref pass it's RC is 1
19:59:24	Araq	before the pass it's 2
20:00:00	zahary	er, what is exactly is happening when we reach `b` in the decref pass?
20:00:14	Araq	we process its children
20:00:14	zahary	don't we decref the edge towards `a`?
20:00:21	Araq	yes
20:00:29	zahary	so, a becomes 0
20:00:55	zahary	we started from `c`
20:00:57	Araq	yeah but the condition is: for every inner node n: RC(n) == 0
20:01:08	Araq	and 'b' is an inner node too
20:01:38	Araq	because it has 'a' as a child
20:02:42	zahary	well, how does `b` affect `a` after the decref pass? you notice that `a` is reachable from `b`?
20:03:26	Araq	no I notice within the decref pass that 'b' is an inner node
20:03:46	Araq	as 'forAllChildren' doesn't produce an empty list for it
20:04:20	zahary	so? what we are discussing here is how `a` will get collected
20:04:41	Araq	well 'a' is not dead in your example
20:05:42	zahary	exactly, but its RC reached 0 so I'm asking what will prevent collecting it
20:07:07	Araq	well the RCs will be restored after the decRefPass
20:07:11	zahary	anyway, I'll be more interested if you try something more radical instead - like precise stack marking that will allow moving collectors, etc
20:07:34	Araq	but I guess 'c' is dead and not collected with this rule
20:08:38	Araq	precise stack marking is not necessary for a moving collector
20:08:54	Araq	you have to 'pin' the objects referenced from the stack though
20:10:36	zahary	it's not very interesting if you can't move short-lived objects (which are more likely to be on the stack)
20:11:49	zahary	but it was just an example, tracing collector will be another
20:12:40	Araq	alright, well I gave up since the inner node rule doesn't work either
20:14:05	Araq	btw I thought your "is bit set" template is wrong
20:14:20	Araq	but then I figured you use it to conflate different colors
20:15:08	Araq	and then I wondered whether you can't achieve the same with fewer bits if you try to use <= for color checks
20:16:17	zahary	well, there are colors and there are flags that are a bit independent from them
20:16:59	zahary	I'm a bit more conservative than the original paper when it comes to dead objects. they are reusing the black color for them
20:17:08	zahary	i.e. the "normal" color
20:20:18	Araq	I had another unrelated idea: the RC value is a probability for "the node is dead"
20:20:41	Araq	so instead of generations, we could try to group cycle roots by their RC value
20:20:51	Araq	and process nodes with a low RC value more often
20:21:50	zahary	what you mean is to use some kind of bucketing scheme when we add cycle roots?
20:22:13	Araq	or if we process them
20:22:50	Araq	if I'm not mistaken you can always process a subset of all cycle roots to find some garbage
20:23:56	zahary	you grokked why I needed this "trimming" feature, right?
20:24:18	Araq	I still haven't looked at it ;-)
20:24:35	zahary	the problem is that I cannot free objects if they are cycle roots - I don't want to search the cycleRoots array for them, so I just mark them dead and wait for the next collection
20:24:56	zahary	so if CollectCycles is disabled, cycleRoots accumulate
20:25:02	Araq	oh that's what you mean, yeah I know
20:25:14	Araq	you could easily 'excl' cycle roots though
20:25:24	Araq	if you use the TCellSet for them
20:25:59	zahary	yes, but my bet is that will slow things down. we should try it do to see how it will affect memory
20:26:46	zahary	try it tho to see
20:33:24	Araq	I noticed that you mark the graphs referenced by stack roots alive
20:33:42	Araq	and don't run the cycle collector on them
20:34:30	Araq	however, it's then a small step to make the codegen register globals to the GC
20:34:41	Araq	and do the same for graphs referenced by globals
20:36:27	zahary	could help, but this heuristic wasn't particularly successful - it hardly changed anything
20:37:07	Araq	well then I'd remove it completely; it's dangerous
20:37:21	zahary	how so?
20:37:31	Araq	the sweet thing about the current GC is that doesn't trace what hasn't changed
20:39:03	Araq	it's annoying we have to add the stack roots back to cycleRoots
20:39:14	zahary	I see your point. I tried it because it was suggested in one of the papers. there is a switch to disable it iirc
20:42:11	Araq	so ... if we have cyclic graphs that span millions of nodes
20:42:40	Araq	how can we ever deal with them in a realtime setting?
20:43:14	Araq	it can perform the scans incrementally
20:43:23	Araq	so that the deadline is never missed
20:43:44	Araq	but it can only free things after the whole graph has been traversed
20:44:35	Araq	which is exactly what tracing/copying GCs offer
20:44:43	zahary	unfortunately, to do that you need to be able to traverse the heap as it was when you started the traversal (heap shapshots or sliding views)
20:45:15	Araq	yeah that makes it even worse
20:45:38	Araq	but if even we ignore that for a moment
20:45:56	Araq	I can't imagine this to really help in a realtime setting
20:46:26	zahary	why not?
20:46:31	Araq	the nice thing about RC is that it's local and it proves things to be dead without needing the "whole picture"
20:47:32	Araq	so you can allocate&free within a cycle
20:48:01	Araq	but with the other scheme you have a long period where you can't free
20:48:24	Araq	so my fear would be that it can't keep up with the mutator
20:48:44	Araq	it doesn't miss a deadline, but the memory usage keeps growing
20:49:44	Araq	so yeah, I quite like the current setting where you run the cycle collector in batch when you can afford it
20:49:47	zahary	I see, but this probably won't be true for many projects
20:51:36	Araq	perhaps; we'll see
20:55:06	Araq	btw I also thought about implementing a simple mark&sweep GC
20:55:35	Araq	but the sweep phase is: "free every cell that is not marked alive"
20:56:17	Araq	and I'm too stupid to see how to implement that efficiently
20:56:41	Araq	with the current allocator
20:57:14	Araq	I guess I could use another datastructure to easily traverse over every cell
20:57:32	Araq	without touching the whole heap
20:59:52	Araq	any better ideas?
21:01:16	reactormonk	Araq, crash once in a while and let the kernel clean up
21:01:35	Araq	Ruby style, huh?
21:01:44	reactormonk	unicorn style.
21:16:33	zahary	well, I think I remember seeing intrusive doubly linked lists in the original lua GC (the most naive way do it)
21:17:01	zahary	the bitmap sweeping in the upcoming luajit GC sounds quite cool.
21:17:45	Araq	bitmap sweeping is an old hat though
21:17:50	zahary	otherwise, it shouldn't be too expensive to traverse the free list for a page populating a map in temporary stack space that indicates with are the allocated slots
21:18:05	zahary	* which are the *
21:18:26	Araq	however Mike Pall also invented a new write barrier
21:18:33	Araq	which I don't get ;-)
21:19:13	zahary	I haven't put much thought in it, but isn't it supposed to be simple?
21:20:20	Araq	write barriers are never simple :P
21:24:50	Araq	well actually it's pretty easy to traverse every allocated object with the current allocator
21:25:30	Araq	but most of them should be alive
21:25:46	Araq	and you don't want to touch these
21:27:16	Araq	however 2 TCellSets would work
21:27:16	zahary	the most simple mark and sweep probably won't be competitive either
21:27:27	Araq	one set for the 'alive'
21:27:32	Araq	one set for 'allocated'
21:30:24	Araq	hrm I should really implement this; we could pick on a per thread basis which GC to use ...
21:34:56	zahary	implement more infrastructure for GCs - marking global roots, precise marking, etc - then we can put a competition for GC researches for the best implementation :)
21:42:15	*	FreeArtMan joined #nimrod
21:42:46	Araq	true :D
21:52:22	dom96	Instacode is pretty nice, and supports Nimrod: http://instacod.es/58918 :D
22:04:44	*	gradha joined #nimrod
22:10:54	*	FreeArtMan quit (Ping timeout: 264 seconds)
22:16:57	gradha	new forum software http://www.codinghorror.com/blog/2013/02/civilized-discourse-construction-kit.html which will likely be used due to fame/inertia from stackoverflow
22:17:18	gradha	not implemented in nimrod though
22:33:58	q66	dom96, haha nice
22:37:45	dom96	gradha: This new forum's popularity already makes me jealous.
22:39:19	q66	dom96, can one add their own language to instacode?
22:39:51	dom96	q66: I think it uses pygments as the backend, so you simply need to add support for it to pygments.
22:40:04	q66	aw, python
23:01:14	*	gradha quit (Quit: bbl, have youtube videos to watch)
23:21:07	*	XAMPP-8 joined #nimrod
23:33:47	*	XAMPP-8 quit (Ping timeout: 255 seconds)
23:50:09	reactormonk	looks like openid is officially dead
23:59:18	reactormonk	oh wait, they require it in the Gemfile