Heuri: Hic Sunt Leones
The ethical guidelines of Heuri are established. I’ve also explained why having a substitute for merchant math is a good idea. Recently, I started developing the very first version of the app. Only then did I realise how much work it entails.
1. Knowledge Representation: Scrapping the Graph
At least to my knowledge, there is no machine-readable definition of a Czech high-school curriculum. The Ministry of Education rolling out a competence-based framework for schools (RVP) further complicates the situation. You might now be wondering:
“You’ve just spent many articles elaborating on how horrible merchant math is, just to complain the Ministry of Education gave a free hand to schools on how to educate children?”
There are two factors to consider here. Firstly, the inertia of pedagogical practices is quite high, and that’s no coincidence. Secondly, if you read the curriculum carefully, it’s not that different from what Czech schools were mandated to align with before. To be crystal clear, this is not just hollow theorizing: I still see this when I’m tutoring students.
Notwithstanding the materializing hangover of education revolution No. 123, let’s give it the benefit of the doubt and presume that schools will be flexible, responsive, modern and will have all the capacity and resources to educate students according to their specific desires and needs. Even in that environment, we still don’t have to throw away the whole math curriculum: it’s just about the way we present it to students.
That’s exactly where the first snag arose. Filled with excitement, I started modelling a graph database and learning its quirks, only to realise that learning a completely new database paradigm is not a very good idea given my limited resources, and that a graph database is complete overkill for my use case.
There won’t be millions of rules and edges in the knowledge base, and I don’t need to traverse a complicated dynamic structure, at least not yet. I quickly concluded that a machine-readable text document loaded into memory is more than enough for my first MVP.
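To make the "text document loaded into memory" idea concrete, here is a minimal sketch. The JSON schema, the concept ids and both helper functions are my own assumptions for illustration; Heuri's actual file format may look quite different.

```python
import json
from typing import Dict, List

# Hypothetical knowledge-base document. The schema ("concepts", "id",
# "requires") is an assumption for illustration, not Heuri's real format.
KNOWLEDGE_BASE_TEXT = """
{
  "concepts": [
    {"id": "coordinate-plane", "requires": []},
    {"id": "linear-function", "requires": ["coordinate-plane"]},
    {"id": "linear-coefficient", "requires": ["linear-function"]}
  ]
}
"""


def load_knowledge_base(text: str) -> Dict[str, List[str]]:
    """Parse the document into an in-memory map: concept id -> direct prerequisites."""
    document = json.loads(text)
    return {c["id"]: list(c["requires"]) for c in document["concepts"]}


def prerequisites(kb: Dict[str, List[str]], concept: str) -> List[str]:
    """Collect all direct and indirect prerequisites of a concept."""
    found: List[str] = []
    stack = list(kb[concept])
    while stack:
        current = stack.pop()
        if current not in found:
            found.append(current)
            stack.extend(kb[current])
    return found


kb = load_knowledge_base(KNOWLEDGE_BASE_TEXT)
print(prerequisites(kb, "linear-coefficient"))
# ['linear-function', 'coordinate-plane']
```

With a few dozen or a few hundred concepts, a plain dictionary walk like this is likely all the "graph traversal" an MVP needs.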
Knowledge base building without prior experience? Bring it on! Image source: Picryl
2. Frontend Architecture & Cognitive Links
Use Claude Code, they said! So I did: I started vibecoding the front-facing app based solely on what Claude Code gave me, as my knowledge of React was very superficial. Since React is a very popular framework, I figured Claude had a good chance of getting the basics right, given proper guardrails. After two weeks of developing the app, I noticed that the codebase had started to resemble the infamous tangled and overengineered spaghetti code.
One of the biggest fails was that Claude started dumping the whole context into web storage, completely tangling component state management. If you have no idea what I’m talking about, don’t worry: I did manage to restructure the code. I’ve learned my lesson: when I insist that Claude ask me for additional details whenever it is in doubt, it is able to deliver semi-usable code that I can take over.
I was also able to distill a very simple domain-specific protocol between the frontend (FE) and the BFF (backend-for-frontend service). Conceptually, when a user is solving a problem, the frontend asks for the next step, and the BFF sends what should be rendered.
Let’s say Heuri asks a user what changing the linear coefficient does to a linear function. The BFF sends the step definition, which covers the what; the how is the frontend’s job. Suppose the student observes that the line tilts. As soon as the step is over (the completion conditions are part of the step definition), the frontend asks for the next step, and so forth. We proceed until the user is given a summary of what they’ve learned and created.
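The exchange described above could be sketched roughly as follows. This is only an illustration under my own assumptions: the `Step` fields, the widget names, the condition keys and the three-step lesson are all hypothetical, and the real protocol runs between a React frontend and the BFF rather than inside one Python script.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Step:
    step_id: str
    prompt: str   # the "what": content the frontend should present
    widget: str   # a hint only; the "how" of rendering is the frontend's job
    done_when: Dict[str, int] = field(default_factory=dict)  # completion conditions


# Hypothetical lesson; ids, widgets and thresholds are made up for illustration.
LESSON = [
    Step("explore", "What happens to the line when you change the linear coefficient?",
         "slider-graph", {"slider_moves": 3}),
    Step("name-effect", "How would you name what you just saw?",
         "free-text", {"answers": 1}),
    Step("summary", "Here is what you have learned and created.", "summary"),
]


def next_step(completed_step_id: Optional[str]) -> Optional[Step]:
    """BFF side: return the step following the completed one, or None at the end."""
    if completed_step_id is None:
        return LESSON[0]
    position = [s.step_id for s in LESSON].index(completed_step_id)
    return LESSON[position + 1] if position + 1 < len(LESSON) else None


def is_done(step: Step, observed: Dict[str, int]) -> bool:
    """FE side: a step is over once every condition in its definition is met."""
    return all(observed.get(key, 0) >= needed for key, needed in step.done_when.items())


# Walk through the lesson the way the frontend would: ask, render, ask again.
shown = []
step = next_step(None)
while step is not None:
    shown.append(step.step_id)
    step = next_step(step.step_id)
print(shown)  # ['explore', 'name-effect', 'summary']
```

Keeping the completion conditions inside the step definition means the frontend never has to know anything about the lesson's content; it only checks counters against thresholds.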
Given enough time, Claude Code typically produces code architecture similar to such cable artistry. Image source: Picryl
One concept that’s already quite well established is what I call a cognitive link: it’s a part of the constructivist philosophy I want Heuri to be aligned with. When students learn new things, the last thing we want to do is to ply them with terminology and definitions that are at this point still hanging in the void: Heuri embraces the moment by letting the student name the effect first, under the supervision of language models.[1]
Only then do we present the established terminology so that the student knows how to communicate the concept to others. In the background, the student knows there is something called a linear coefficient, but what they’ve learned, and what’s more valuable, is that this coefficient tilts the line. tilt <–> linear coefficient is an example of a cognitive link.
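As a data structure, a cognitive link can be as small as a pair of terms. The sketch below is an assumption about how it might be stored; the class name, the field names and the `reveal` helper are hypothetical and not taken from Heuri's codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CognitiveLink:
    student_term: str  # the name the student came up with first
    formal_term: str   # the established terminology, presented only afterwards


def reveal(link: CognitiveLink) -> str:
    """Build the message that connects the student's own word to the official one."""
    return f'What you called "{link.student_term}" is known as the {link.formal_term}.'


link = CognitiveLink(student_term="tilt", formal_term="linear coefficient")
print(reveal(link))
# What you called "tilt" is known as the linear coefficient.
```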
3. LLM Latency
For prototyping, I settled on Weights & Biases, which offers a free tier of LLM inference. My first “architecture” was two-tiered: a sentinel tier, backed by an 8B-parameter reasoning LLM, that filtered out malicious inputs, and a cognition tier, backed by another 8B-parameter reasoning LLM. Anyone who has worked seriously with language models already knows how naive and hopeless this setup is.
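To illustrate why this setup is slow, here is a minimal sketch of a sequential two-tier pipeline. Both tiers are stand-in functions that merely sleep; they are my assumptions and do not call any real inference API. The point is only that when the tiers run one after the other, their latencies add up.

```python
import time
from typing import Callable, Tuple


def fake_sentinel(user_input: str) -> bool:
    """Stand-in for the sentinel LLM call: True means the input looks safe."""
    time.sleep(0.02)  # pretend inference time
    return "ignore previous instructions" not in user_input.lower()


def fake_cognition(user_input: str) -> str:
    """Stand-in for the cognition LLM call that evaluates the student's answer."""
    time.sleep(0.03)  # pretend inference time
    return f"Feedback on: {user_input}"


def validate(
    user_input: str,
    sentinel: Callable[[str], bool],
    cognition: Callable[[str], str],
) -> Tuple[str, float]:
    """Run both tiers sequentially and return the reply plus total latency in seconds."""
    started = time.perf_counter()
    reply = cognition(user_input) if sentinel(user_input) else "Input rejected."
    return reply, time.perf_counter() - started


reply, elapsed = validate("The line tilts", fake_sentinel, fake_cognition)
print(reply, f"({elapsed:.3f} s)")
```

With two reasoning models in a row, every accepted input pays for both calls before the student sees anything.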
Currently, any interaction that requires validating user input takes more than five seconds (so, one whole TikTok reel) and works only for English input. That’s simply unacceptable and this needs to change as soon as possible, even in the MVP phase. I’m on my way towards a more performant architecture but it’s still a work in progress I’ll report on in upcoming articles.
Happy learning!
-
[1] Yes, that certainly has its risks, but there are ways to make it more robust and predictable; more about this later.