Ask HN: Better approach for plagiarism detection in self-hosted LMS?

1•pigon1002•2h ago
I'm building an open-source LMS and added plagiarism detection using OpenSearch's more_like_this query plus character n-grams for similarity scoring.

Basically when a student submits an answer, I search for similar answers from other students on the same question. Works decently but feels a bit hacky - just reusing the search engine I already had.

Current setup:

  search = cls.search().filter(
      "nested", path="answers", 
      query={"term": {"answers.question_id": str(question_id)}}
  )
  search = search.query(
      "nested",
      path="answers",
      query={
          "more_like_this": {
              "fields": ["answers.answer"],
              "like": text,
              "min_term_freq": 1,
              "minimum_should_match": "1%",
          }
      },
  )
  
  # get top 10, then re-rank in Python
  import re

  def normalize(t):
      # drop all whitespace so spacing changes don't affect the score
      return re.sub(r"\s+", "", t.strip())
  
  def char_ngrams(t, n=3):
      # set of overlapping character n-grams
      return {t[i:i+n] for i in range(len(t) - n + 1)}
  
  norm_text = normalize(text)
  text_ngrams = char_ngrams(norm_text)
  
  for hit in response.hits:
      norm_answer = normalize(hit.answer)
      answer_ngrams = char_ngrams(norm_answer)
      
      intersection = len(text_ngrams & answer_ngrams)
      union = len(text_ngrams | answer_ngrams)
      # Jaccard similarity as a percentage; avoid dividing by zero
      # when both texts are shorter than n characters
      ratio = int((intersection / union) * 100) if union else 0
      
      if ratio >= 60:
          pass  # flag as similar
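
For anyone who wants to try the re-ranking step without an OpenSearch cluster, here is a minimal standalone sketch of the same character n-gram Jaccard scoring. The function names `similarity_percent` and `flag_similar`, and the `candidates` dict mapping an ID to answer text, are hypothetical names chosen for illustration; they are not part of the setup above.

```python
import re


def normalize(text):
    """Remove all whitespace so spacing differences do not change the score."""
    return re.sub(r"\s+", "", text.strip())


def char_ngrams(text, n=3):
    """Return the set of overlapping character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def similarity_percent(a, b, n=3):
    """Jaccard similarity of the two n-gram sets, as an integer from 0 to 100."""
    grams_a = char_ngrams(normalize(a), n)
    grams_b = char_ngrams(normalize(b), n)
    union = grams_a | grams_b
    if not union:  # both texts are shorter than n characters
        return 0
    return int(len(grams_a & grams_b) / len(union) * 100)


def flag_similar(submission, candidates, threshold=60):
    """Return (candidate_id, score) pairs at or above threshold, best first.

    `candidates` is a hypothetical {id: answer_text} mapping standing in
    for the top hits returned by the search engine.
    """
    scored = [(cid, similarity_percent(submission, text))
              for cid, text in candidates.items()]
    flagged = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)
```

Calling `flag_similar(new_answer, {"student_1": "...", "student_2": "..."})` returns only the answers scoring at or above the threshold. Because whitespace is stripped before scoring, two answers that differ only in spacing score 100.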

Constraints:
- Self-hosted only, no external APIs
- Few thousand students
- Want simple operations, already running OpenSearch anyway

Questions:
- Is this approach reasonable or am I missing something obvious?
- What do other self-hosted systems use? Checked Moodle docs but their plagiarism plugins mostly call external services
- Anyone tried lightweight ML models for this that don't need GPU?

The search engine approach works but curious if there's a better way that fits our constraints.

I Built an Open Source Deep Research Engine Which Beats Google and OpenAI

https://github.com/IamLumae/Project-Lutum-Veritas
1•LutumVeritas•1m ago•1 comments

Search for America – Progress with Reinhold Niebuhr [video]

https://www.youtube.com/watch?v=93EJJVAinRc
1•baxtr•2m ago•0 comments

Wikipedia Faces a Generational Disconnect Crisis

https://spectrum.ieee.org/wikipedia-at-25
1•jnord•3m ago•0 comments

Neural networks and deep learning (2019)

http://neuralnetworksanddeeplearning.com/index.html
1•vinhnx•4m ago•0 comments

SanDisk laughs all the way to the bank as memory price hike drives $3B revenue

https://www.neowin.net/news/sandisk-laughs-to-the-bank-as-memory-price-hike-drives-3b-revenue-in-...
1•bundie•7m ago•0 comments

Ask HN: Future of dev experience is control center for coding agents?

3•nemath•8m ago•0 comments

Show HN: NovaEngine v4.0 – High-speed data deduplication for cloud logs

https://github.com/NovaCompress-dev/NovaEngine-v4
1•nova_engine_dev•10m ago•0 comments

Apple Almost Chose Anthropic Before Google Gemini

https://www.macrumors.com/2026/01/30/apple-almost-chose-different-siri-partner/
2•tosh•11m ago•0 comments

Classic 7 and Project Luna, Near-Perfect Mods of Windows 7/XP GUI for Windows 10

https://trackerninja.codeberg.page/post/classic-7-and-project-luna-are-nice-near-perfect-recreati...
1•XzetaU8•13m ago•0 comments

Church of Molt – Crustafarianism

https://molt.church/
1•_____k•14m ago•0 comments

Scrobble-CLI: log your vinyl record listens from terminal

https://github.com/weisserj/scrobble-cli
1•weisser•16m ago•0 comments

FOSDEM 2026 Live Streaming

https://fosdem.org/2026/schedule/streaming/
1•weinzierl•17m ago•0 comments

I built Spaceship – a minimal browser – macOS for now – pay what you want

https://healthytransition.replit.app/spaceship
1•ray_•21m ago•0 comments

Why AI coding agents feel powerful at first, then become harder to control

2•hoangnnguyen•28m ago•2 comments

A high mountain lizard from Peru: the highest-altitude reptile

https://herpetozoa.pensoft.net/article/61393/
1•thunderbong•38m ago•0 comments

The Mind of a Crypto Portfolio Manager: A Game Plan for $1000 in 2026

https://altcoindesk.com/perspectives/expert-opinions/crypto-portfolio-allocation-for-2026/article...
1•CapricornQueen•39m ago•0 comments

Self-Improving AI Skills

https://dri.es/self-improving-ai-skills
1•7777777phil•39m ago•0 comments

Claude 4.5 converted the PDF into a medium-length SKILL.md

https://github.com/featbit/featbit-skills/blob/main/.claude/skills/claude-skills-best-practices/S...
1•mikasisiki•40m ago•0 comments

Clawk.ai – Twitter for AI Agents

https://www.clawk.ai/
1•jurajmasar•54m ago•1 comments

Ask HN: What's so special about Sam Altman?

5•chirau•55m ago•3 comments

Show HN: Government Contracts API – Unified REST API for Federal Contract Data

https://govcontracts-beige.vercel.app
1•jaxmercer•1h ago•1 comments

Show HN: A Slack bot that summarizes decisions and ignores lunch talk

https://thread-sweeper.vercel.app
1•noruya•1h ago•1 comments

Starlink updates privacy policy to allow consumer data to train

https://finance.yahoo.com/news/musks-starlink-updates-privacy-policy-230853500.html
12•malchow•1h ago•1 comments

From HashHop to Memory-Augmented Language Models

https://huggingface.co/blog/codelion/reverse-engineering-magic-hashhop
2•codelion•1h ago•0 comments

I spent 5 years learning how to code, made real projects, only to be called AI slop?

1•butanol•1h ago•9 comments

Reference Target: having your encapsulation and eating it too

https://blogs.igalia.com/alice/reference-target-having-your-encapsulation-and-eating-it-too/
1•todsacerdoti•1h ago•0 comments

Moltbook: A social network where 32,000 AI agents interact autonomously

https://curateclick.com/blog/2026-moltbook-ai
3•czmilo•1h ago•1 comments

Show HN: I built COON, a code compressor that saves 30-70% on AI API costs

https://github.com/AffanShaikhsurab/COON
2•affanshaiksurab•1h ago•0 comments

Show HN: Mic Preamp Build with Cheap ECM

https://mubaraknative.github.io/build_instruction.html
1•nativeforks•1h ago•0 comments

A Sudden BeckerCAD 3D Pro Review (2021)

https://www.keypressure.com/blog/a-sudden-beckercad-review/
1•kenshoen•1h ago•1 comments