The one spot I'd actually push on is the idea that you have a menu of choices about where to process the entropy. I'd argue those places aren't really interchangeable, because where you process it decides how much of it survives. The earlier you resolve the ambiguity, whether in a design-time schema or baked into a graph at ingestion, the more you've quietly answered the question before anyone got around to asking it.
That's what keeps nagging me about the graph-versus-vector question specifically. We spent a long time down the knowledge-graph road, and I keep seeing teams do a version of it now: a hand-built graph per project, often borrowing from Karpathy's LLM-wiki idea (someone at McKinsey described exactly this to me last week). The moment you commit to a graph, you've frozen its semantics and ontology in place, and entropy doesn't sit still. The graph goes stale on contact and never compounds. And you're still building a map out of artifacts that are already heavily compressed: the ticket, the approval, the log.
Which is the real reason precision is hard, and probably where I'd extend your closing the most. Precision in an entropic domain doesn't come from cleaner structure downstream; it comes from not throwing the entropy away too early. The context that makes a decision legible (the offhand comment, who was worried, what precedent got invoked) lives upstream of the artifact, and a schema imposed too soon destroys the exact signal you were trying to keep. So I've landed on holding the raw substrate and deriving the semantics and ontology late, once you actually know the question. A system of memory rather than a system of record, if I can lean on my own framing.
You close by telling people to find where the entropy sits and choose where to process it. The only place I'd amend it: I'm not sure the choosing is as open as it sounds. My bet is the value concentrates in processing it as late and as close to the question as possible. But that's the part I'd most enjoy being argued out of.
Justin — you've sharpened something I left implicit: resolution point affects signal survival, not just processing cost. That's a genuine extension, and a better framing of why the choice matters.
The menu I described has rungs, not just two endpoints. The banking example is already a hybrid: agents hold raw source documents and transcripts (no early schema imposed), derive structure at ingestion (knowledge graph), and re-derive by meaning at query time (vector search). The schema isn't a prison — it's a fast path for questions you can anticipate, layered on a substrate that still answers questions you can't. Schemas can evolve, and the real question isn't whether to commit but which parts of the question distribution are stable enough to justify it.
Full late-binding has real costs too. Query-time semantic derivation is slower and more expensive, and it still requires a well-posed question when it arrives — which in enterprise settings isn't guaranteed. So I'd amend your amendment slightly: the right discipline isn't "process it as late as possible." It's "how stable is your question distribution?" Pre-commit on the known questions, hold the substrate open for the unknown ones. Harder than either the Cartesian grid or the pure system of memory — but that's where the actual design work lives.
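To make "pre-commit on the known questions, hold the substrate open for the unknown ones" concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `HybridStore` class, the "TICKET-<id> approved by <name>" pattern standing in for an ingestion-time schema, and the word-overlap ranking standing in for real vector search. It is not the banking system described above, only the shape of the idea.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HybridStore:
    """Hypothetical store: raw substrate plus one early-bound index."""

    raw: list[str] = field(default_factory=list)             # full text, nothing discarded
    approvers: dict[str, str] = field(default_factory=dict)  # early-bound: ticket -> approver

    def ingest(self, text: str) -> None:
        # Keep the raw text first, so unanticipated questions stay answerable.
        self.raw.append(text)
        # Assumed ingestion-time schema: "TICKET-<id> approved by <name>".
        words = text.split()
        for i, word in enumerate(words):
            if (
                word.startswith("TICKET-")
                and words[i + 1 : i + 3] == ["approved", "by"]
                and i + 3 < len(words)
            ):
                self.approvers[word] = words[i + 3].strip(".,")

    def who_approved(self, ticket: str) -> Optional[str]:
        """Fast path: a stable, anticipated question answered from the index."""
        return self.approvers.get(ticket)

    def search(self, question: str, k: int = 1) -> list[str]:
        """Late-bound path: rank raw text against the question at query time.

        Word overlap is only a stand-in for embedding similarity.
        """
        q = set(question.lower().split())
        ranked = sorted(
            self.raw,
            key=lambda t: len(q & set(t.lower().split())),
            reverse=True,
        )
        return ranked[:k]


store = HybridStore()
store.ingest("TICKET-12 approved by Dana. Sam was worried about the rollback precedent.")
store.ingest("Weekly log: deploy completed without incident.")

print(store.who_approved("TICKET-12"))                 # anticipated question, indexed answer
print(store.search("who was worried about rollback"))  # unanticipated question, raw substrate
```

The index only ever answers the question it was built for. Who was worried never made it into the schema, yet it is still recoverable because the raw text was kept.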