Why we built Polmi
Every transcription app is built for boardrooms. We built one for the back row of a 9 a.m. lecture, and made a few quiet decisions to keep it calm.

A friend of ours, a sophomore studying biochemistry, once described her notebook from organic chemistry as “a transcript of panic.” Half-written structures. Arrows that pointed to nothing. A long horizontal line where the lecturer had said something fast and she had given up. We looked at the audio apps on her phone and noticed something. Every one of them was built for meetings. The icons were grey rectangles. The pricing pages talked about sales calls and stand-ups and team alignment. None of them knew what an enzyme was, and none of them seemed to care.
That gap is why Polmi exists. Lectures are not meetings. The room is different, the audio is different, the words a student needs to keep are different. We built a small mobile app that catches an hour of lecture and tidies it into a transcript, a summary, and a set of flashcards before the next class. It is in early access on iOS, through TestFlight, and on Android. This is the story of how we got here and why we shaped it the way we did.
The market built itself a boardroom
Otter charges $16.99 a month for its Pro plan and spends most of its product copy on sales pipelines and CRM integrations. Notta sells itself to project managers and consultants. The open-source wave of Whisper apps trends toward podcast workflows: clean two-voice studios, transcript editors built for trimming, export to Descript. These are good products. They are good products for a person who sits in a chair at a clean desk with a single microphone and a quiet calendar invite.
That is not the room a student is in. A student is in the back of a lecture hall with a phone in their pocket. The room has 200 other people. The lecturer paces. The slide deck has terms the transcript will mangle on the first pass. The lecture is 80 minutes long, and the student has another one starting in fifteen minutes across campus. Sales-call apps do not lose much if they mishear “Q3.” A student who misses the spelling of “phosphofructokinase” is going to feel it next Tuesday.
We are not picking a fight with Otter. They built a careful product for the customer they understood. We chose a different customer.

Record now, transcribe later
The first big decision we made was to leave live transcription on the table. Polmi does not stream words to the screen while you record. You press record, attend the lecture, press stop, and the transcript appears a few minutes later. Some users have asked us why, and the answer is boring on purpose.
Streaming transcription in a classroom is fragile. The lecture hall has bad acoustics and intermittent Wi-Fi. The phone is in a backpack pocket. A streaming pipeline keeps a socket open for 80 minutes, drains the battery, and fails noisily when the network dips. Every failure mode in that pipeline becomes a failure mode in the student’s study workflow.
Batch is sturdy. The app records to local storage, finalizes a single audio file, and uploads it once over whatever network the student walks into next. The transcription happens on our backend with a calm budget and a single retry. If the upload fails, we try again from local storage. If the worker fails, we refund the quota. The student sees a result a few minutes after the lecture ends, not a live ticker that might drop the only sentence she needed.
This is a small decision in code and a load-bearing one in philosophy. The studio runs on the idea that calm beats busy, and this is the first place in Polmi where that shows up. The app does the work after class, the way a stenographer might tidy shorthand on the train home.
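For readers who want to see what "small in code" can look like, here is a minimal sketch of the record-then-upload flow described above. It is an illustration under stated assumptions, not Polmi's actual code: the names `Account`, `upload_from_local`, and `transcribe_with_refund`, the minute-based quota, and the choice of three upload attempts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Account:
    quota_minutes: int  # hypothetical: quota tracked in minutes of audio


def upload_from_local(path: str, upload: Callable[[str], None],
                      max_attempts: int = 3) -> bool:
    """Upload the finalized file. The local copy stays on disk until an attempt succeeds."""
    for _ in range(max_attempts):
        try:
            upload(path)
            return True
        except ConnectionError:
            continue  # the network dipped; the file is still local, so try again
    return False


def transcribe_with_refund(account: Account, minutes: int,
                           transcribe: Callable[[], str],
                           retries: int = 1) -> Optional[str]:
    """Charge quota up front, allow a single retry, refund if the worker still fails."""
    account.quota_minutes -= minutes
    for _ in range(1 + retries):
        try:
            return transcribe()
        except RuntimeError:
            continue  # worker failed; use the retry if one is left
    account.quota_minutes += minutes  # every attempt failed: give the quota back
    return None
```

The point of the sketch is that nothing depends on a connection staying open during the lecture. A real client would likely add backoff between attempts and resume after the app restarts.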
The hard part: flashcards that teach
Generating a summary from a lecture transcript is a solved problem. Ask a large model to summarize a paragraph and it will give you something passable. Generating flashcards a student should actually study is a different job, and it is the engineering tentpole of Polmi.
The naive approach is what most AI tools ship. Walk the transcript, pick a sentence that looks important, put it on the front of a card, put a paraphrase on the back. Ship it. The output looks like flashcards. They are not flashcards. A card that reads “What did the lecturer say about glycolysis in the second half of the hour?” is a search query, not a study prompt. A useful card is small, specific, and atomic: one term, one definition, one mechanism. It teaches one thing.
Getting there takes more work than summarization. A lecture is long, and the relevant terms are scattered. The model has to read an hour of audio transcript without losing the thread across chunk boundaries. The mid-lecture aside about a midterm date is not a flashcard. The five-minute walkthrough of the Krebs cycle is. Distinguishing one from the other is the problem.
We chunk the transcript in overlapping windows, pass each window through a structured-extraction prompt, then reconcile the candidates. We ask the model to name the term, write the definition the lecturer used, and surface a short example only when one was given in class. We score and merge duplicates. We drop cards that read as paraphrases of other cards. The output is small. A 50-minute lecture might produce 12 flashcards, not 40. We would rather ship 12 cards a student will actually flip through than 40 she will skim and abandon.
We will get this wrong sometimes. A card will be off. A term will be missed. The promise we make is that the cards are shaped like study material, not pulled-out sentences with a question mark glued to the front.
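The chunk-and-reconcile steps described above can be sketched in a few lines. This is a simplified illustration, not the production generator: the `Card` fields, the `score` value, the window sizes, and the `max_cards` cap are assumptions, and the model call that turns each window into candidate cards is left out entirely. The merge here only catches cards that share a term. Spotting true paraphrases would take more than this.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


def overlapping_windows(words: List[str], size: int, overlap: int) -> Iterator[List[str]]:
    """Yield windows of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    for start in range(0, len(words), step):
        yield words[start:start + size]
        if start + size >= len(words):
            break  # this window already reached the end of the transcript


@dataclass
class Card:
    term: str
    definition: str
    score: float  # hypothetical: how confident the extractor was in this card


def reconcile(candidates: List[Card], max_cards: int = 12) -> List[Card]:
    """Merge cards that name the same term, keep the best-scored one, return the top few."""
    best: Dict[str, Card] = {}
    for card in candidates:
        key = " ".join(card.term.lower().split())  # normalize case and whitespace
        if key not in best or card.score > best[key].score:
            best[key] = card
    ranked = sorted(best.values(), key=lambda c: c.score, reverse=True)
    return ranked[:max_cards]
```

The overlap is there so that a definition straddling a window boundary appears whole in at least one window. The cost is that the same term shows up in neighboring windows, which is why the merge step exists.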

The shape of Polmi
Everything above points at one workflow. A student opens the app, taps a Marigold button, attends the lecture, and taps stop. Polmi uploads the audio in the background, hands it to a transcription model, hands the transcript to a notes model, and hands the transcript to a flashcard generator. A few minutes later the phone has three things the student did not have before: a clean transcript she can search, a short summary she can scan before the next class, and a small deck of flashcards she can flip through on the bus.
From there, the artifact travels. A student can share a single lecture with a friend who missed it. A study group can buy a five-seat bundle and pool their lectures across a semester, so the chemistry major’s recording shows up alongside the literature major’s notes. The transcript is searchable across the whole library. The flashcards can be reviewed inside the app on a spaced-repetition schedule. Nothing here is novel on its own. The thing that is new, we hope, is that it is one app, shaped for one workflow, end to end.
We watched a beta tester this past month catch a recording, tidy it without opening the app again, and pass it to a friend in her seminar. The whole loop took her less than a minute of actual attention. That moment, of a student handing a folded lecture to a friend, is the product.
The shape of the app is the same shape as the student’s week. A lecture happens. A few minutes pass. The notes are ready. The group sees the notes. The flashcards come back on review days. We did not invent any of these moments. We just kept asking, at every step, what a student in a 9 a.m. lecture would want to be true on a Tuesday afternoon.

Where we are
Polmi is in early access on TestFlight, with an Android build running through Play Internal Testing. The product home is polmi.app, where pricing lives and the early access link sits. If you are a college student who would like to try it, or a professor who would like to know what we are doing in your classroom before we do it, the studio inbox is hi@murakamilabs.com. We read everything.
There is no newsletter. There is no launch countdown. The work continues on a quiet cadence, and the next note will appear here when there is something specific to say. We expect that note to be about flashcards, because the next batch of testers is going to push the generator harder than anything we have seen, and we want to write down what it teaches us before the lesson fades.
Until then, thanks for reading. If you have a lecture you would let us test on, even an old one sitting on your phone, we would love a chance to listen.