One thing I didn’t see here that might be hurting your performance is a lack of semantic chunking. It sounds like you’re embedding entire docs, which breaks down when a doc contains multiple concepts. A better approach for recall is to use a chunking library to get semantic chunks (I like spaCy, though it takes some configuring). Then, once you have your chunks, prepend context describing how each chunk relates to the rest of the doc before you do your embedding. I’ve found Anthropic’s contextual retrieval approach (https://www.anthropic.com/engineering/contextual-retrieval) to be very performant in my RAG systems; you can just use gpt-oss-20b as the model for generating the context.
Unless I’ve misunderstood your post and you’re already doing some form of this in your pipeline, you should see a dramatic improvement in performance once you implement it.
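To make the recipe concrete, here’s a rough sketch of the contextual-retrieval step, assuming a `generate_context()` helper that wraps whatever local model you use for context generation (e.g. gpt-oss-20b); it’s stubbed out here so the flow is self-contained, and all the names are mine, not from the Anthropic post:

```python
# Sketch: prepend document-level context to each chunk before embedding,
# per the contextual-retrieval idea. generate_context() is a stand-in
# for an LLM call like "Situate this chunk within the overall document."

def generate_context(doc: str, chunk: str) -> str:
    # Placeholder for the LLM call; a real version would prompt the model
    # with the full doc and the chunk and return a one-sentence summary.
    return f"From a document beginning: {doc[:60]}..."

def contextualize_chunks(doc: str, chunks: list[str]) -> list[str]:
    # The contextualized string (context + chunk) is what gets embedded,
    # so retrieval can match on document-level concepts too.
    return [f"{generate_context(doc, c)}\n\n{c}" for c in chunks]

doc = "Quarterly report. Revenue grew 3%. Churn fell. New market entered."
chunks = ["Revenue grew 3%.", "Churn fell."]
for c in contextualize_chunks(doc, chunks):
    print(c)
```

You’d then embed each contextualized string instead of the raw chunk; the original post also pairs this with BM25 over the same contextualized text.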
hey, author (not op) here. we do do semantic chunking! I think I gave the impression that we don't because of the mention of aggregating context, but I tested this with questions that require aggregating context from 15+ documents (meaning 2x that in chunks), hence the comment in the post!
Is there a way to convert documents into a hierarchical, connected graph data structure where they reference each other, similar to how personal knowledge tools like Obsidian work, with the ability to traverse the graph? Is the GraphRAG technique trying to do exactly this?
Not exactly what you’re looking for, but Wilson Lin’s search engine creates a graph from the DOM for context. Here’s his write-up: https://blog.wilsonl.in/search-engine/
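For what it’s worth, the Obsidian-style structure from the question can be sketched with nothing fancier than an adjacency dict built from wiki-style `[[links]]` (the link syntax and the note titles here are my own illustration, not from any of the tools mentioned):

```python
# Sketch: build a traversable note graph from cross-references, the way
# Obsidian-style vaults link notes. Links use the [[Title]] convention.

import re
from collections import deque

docs = {
    "Index": "See [[Chunking]] and [[Retrieval]].",
    "Chunking": "Feeds into [[Retrieval]].",
    "Retrieval": "Terminal note.",
}

# One edge per [[link]] found in each note's body.
graph = {title: re.findall(r"\[\[(.+?)\]\]", body) for title, body in docs.items()}

def traverse(start: str) -> list[str]:
    # Breadth-first walk over the note graph from a starting note.
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

print(traverse("Index"))  # prints ['Index', 'Chunking', 'Retrieval']
```

GraphRAG goes further than this (it uses an LLM to extract entities and relations, then builds community summaries over the resulting graph), but the retrieval-time traversal is the same basic idea.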
I mean, as long as they're not too long, I suppose you could use just about any heuristic for grouping sources. It just seems like it would be hard to generate succinct context if you mess it up.