Morphology of a Marvel Movie

https://github.com/dhealy05/morphology_of_a_marvel_movie

3•higuidebot•8mo ago

Comments

PaulHoule•8mo ago

Cosine similarity works for this but the right way to think about it is as a classical ML classification problem with all the tools from

https://scikit-learn.org/stable/supervised_learning.html

For instance, you will probably get better results with an SVM, a not-so-deep perceptron, or maybe a random forest than you will with cosine similarity. You can also calibrate the probabilities of such a model

https://scikit-learn.org/stable/modules/calibration.html

which is quite useful.
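A rough sketch of that comparison, assuming each script segment has already been turned into an embedding vector. The data here is synthetic (random clusters standing in for embeddings), the class count and dimensions are made up, and `cosine_predict` is a hypothetical helper, not anything from the linked repo. It only shows the mechanics: a cosine-to-centroid baseline next to a linear SVM wrapped in scikit-learn's `CalibratedClassifierCV`.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in for segment embeddings: 3 classes, 16 dimensions (assumed values).
rng = np.random.default_rng(0)
n_classes, dim, per_class = 3, 16, 60
centers = rng.normal(size=(n_classes, dim))
X = np.vstack([c + 0.8 * rng.normal(size=(per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), per_class)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

def cosine_predict(X_fit, y_fit, X_new):
    """Baseline: assign each row to the class whose centroid is most cosine-similar."""
    centroids = np.vstack([X_fit[y_fit == k].mean(axis=0) for k in range(n_classes)])
    a = X_new / np.linalg.norm(X_new, axis=1, keepdims=True)
    b = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return (a @ b.T).argmax(axis=1)

cosine_acc = (cosine_predict(X_train, y_train, X_test) == y_test).mean()

# Linear SVM with sigmoid (Platt) calibration fitted by 5-fold cross-validation.
clf = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=5)
clf.fit(X_train, y_train)
svm_acc = clf.score(X_test, y_test)
proba = clf.predict_proba(X_test)  # one row per segment, one column per class

print("cosine baseline accuracy:", cosine_acc)
print("calibrated SVM accuracy:", svm_acc)
```

On real embeddings the ranking could go either way; the point is that both get scored on the same held-out segments.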

higuidebot•8mo ago

What do you think a "better" result would be here? Better by what metric?

PaulHoule•8mo ago

Accuracy.

If you got N people (say N=10) to classify different segments of the script you'd find that they'd mostly agree about how to classify them but they wouldn't agree perfectly. You can get closer to a "gold truth" if you sit people together to discuss the difficult cases.
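A small sketch of how you might measure that agreement and flag the cases that need discussion. The labels and class names below are invented for illustration, and `consensus` and `pairwise_agreement` are hypothetical helpers.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotations: one row per script segment, one entry per annotator.
labels = [
    ["setup", "setup", "setup", "setup"],
    ["battle", "battle", "setup", "battle"],
    ["battle", "return", "return", "battle"],
]

def consensus(votes):
    """Return the majority label, or None when the top two labels tie."""
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear majority: a case for people to discuss
    return counts[0][0]

def pairwise_agreement(votes):
    """Fraction of annotator pairs that gave the same label to this segment."""
    pairs = list(combinations(votes, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

for i, votes in enumerate(labels):
    print(i, consensus(votes), round(pairwise_agreement(votes), 2))
```

Segments where `consensus` returns `None` or agreement is low are the ones worth sitting down over.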

Any given classifier is going to be like one individual: if it is any good, it will mostly agree with the gold truth, but sometimes it won't. Some classifications will also be ambiguous, because a segment of the script may have some characteristics of one class and some of another, or may just not fit rationally into the schema.

This toolbox

https://scikit-learn.org/stable/model_selection.html

is helpful for testing a number of different models over a range of parameters and deciding what works best. A classifier that is calibrated (returns a probability of class membership) can skip cases where it knows it doesn't know what it is talking about. In the financial world, a calibrated model plus a Kelly bettor can make money trading; an uncalibrated model will almost always lose money.
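To make the last two points concrete, here is a minimal sketch of both ideas: abstaining below a confidence threshold, and sizing a bet with the Kelly criterion. The 0.7 threshold, the class names, and the function names are assumptions for illustration. The Kelly formula used is the standard one for a simple bet, f = p - (1 - p) / b, where p is the win probability and b is the net payout per unit staked.

```python
def decide(probabilities, threshold=0.7):
    """Return the most probable label, or None to abstain when confidence is low.

    `probabilities` maps label -> calibrated probability; the threshold is an
    arbitrary illustrative value, not a recommendation.
    """
    label = max(probabilities, key=probabilities.get)
    return label if probabilities[label] >= threshold else None

def kelly_fraction(p, net_odds):
    """Fraction of bankroll to stake on a bet won with probability p.

    `net_odds` is the profit per unit staked on a win. Returns 0 when the
    bet has no positive edge.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if net_odds <= 0:
        raise ValueError("net_odds must be positive")
    return max(0.0, p - (1.0 - p) / net_odds)

print(decide({"setup": 0.9, "battle": 0.1}))    # confident enough to answer
print(decide({"setup": 0.55, "battle": 0.45}))  # abstains
print(kelly_fraction(0.6, 1.0))                 # stake about a fifth at even odds
```

Both functions are only as good as the probabilities fed to them, which is why calibration matters: if `p` is overconfident, the Kelly stake is too large.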
