SymbolicAI: A neuro-symbolic perspective on LLMs

https://github.com/ExtensityAI/symbolicai

119•futurisold•7h ago

Comments

sram1337•5h ago

This is the voodoo that excites me.

Examples I found interesting:

Semantic map lambdas

  S = Symbol(['apple', 'banana', 'cherry', 'cat', 'dog'])
  print(S.map('convert all fruits to vegetables'))
  # => ['carrot', 'broccoli', 'spinach', 'cat', 'dog']

comparison parameterized by context

  # Contextual greeting comparison
  greeting = Symbol('Hello, good morning!')
  similar_greeting = 'Hi there, good day!'

  # Compare with specific greeting context
  result = greeting.equals(similar_greeting, context='greeting context')
  print(result) # => True

  # Compare with different contexts for nuanced evaluation
  formal_greeting = Symbol('Good morning, sir.')
  casual_greeting = 'Hey, what\'s up?'

  # Context-aware politeness comparison
  politeness_comparison = formal_greeting.equals(casual_greeting, context='politeness level')
  print(politeness_comparison) # => False

bitwise ops

  # Semantic logical conjunction - combining facts and rules
  horn_rule = Symbol('The horn only sounds on Sundays.', semantic=True)
  observation = Symbol('I hear the horn.')
  conclusion = horn_rule & observation # => Logical inference

`interpret()` seems powerful.

OP, what inspired you to make this? Where are you applying it? What has been your favorite use case so far?

futurisold•4h ago

That's gonna be a very, very, long answer. What's funny is that not much changed since 2022 (eoy) when the projected started; the models just got better, but we had a good chunk of the primitives since gpt-3.

What's more recent is the DbC contribution which I think is unique. It literally solved anything agent related I've thrown at it -- especially because I can chain contracts together and the guardrails propagate nicely.

I've built most the custom tools myself. For instance, not only perplexity was rendered useless by openai's web search, but openai's web search itself is not good enough for what you can customize yourself. To this end, I've built my own deep research agent. Here's a thread with some results from the first day it was working: https://x.com/futurisold/status/1931751644233945216

I'm also running a company, and we've built an e2e document generation just from contracts (3 contracts chained together in this case). Here's an output (sorry about the PDF rendering, that's not what we serve, it's just what I've quickly hacked for local dev): https://drive.google.com/file/d/1Va7ALq_N-fTYeumKhH4jSxsTrWD...

This was the input:

---

Prompt:

> I want the files to be analyzed and I am interested in finding patterns; feel free to make suggestions as well. I want to understand how different providers use their system prompts, therefore things like: what kind of tags do they use – are they XML, markdown, etc, are they prone toward sycophancy or trying to manipulate the user, are they using tools and if so how, etc. I want the tech report to deconstruct and synthesize and compare the information, find interesting patterns that would be hard to spot.

Generated instructions:

(a) Query: Conduct a comparative analysis of system prompts across major AI providers (OpenAI, Google, Anthropic, xAI, etc.) to identify structural patterns, linguistic frameworks, and operational constraints that shape AI behavior and responses.

(b) Specific Questions:

1. What syntactic structures and formatting conventions (XML, markdown, JSON, etc.) are employed across different AI system prompts, and how do these technical choices reflect different approaches to model instruction?

2. To what extent do system prompts encode instructions for deference, agreeability, or user manipulation, and how do these psychological frameworks vary between commercial and research-focused models?

3. How do AI providers implement and constrain tool usage in their system prompts, and what patterns emerge in permission structures, capability boundaries, and function calling conventions?

4. What ethical guardrails and content moderation approaches appear consistently across system prompts, and how do implementation details reveal different risk tolerance levels between major AI labs?

5. What unique architectural elements in specific providers' system prompts reveal distinctive engineering approaches to model alignment, and how might these design choices influence downstream user experiences?

---

Contracts were introduced in March in this post: https://futurisold.github.io/2025-03-01-dbc/

They evolved a lot since then, but the foundation and motivation didn't change.

futurisold•4h ago

Btw, besides the prompt, the other input to the technical report (the gdrive link) was this repo: https://github.com/elder-plinius/CL4R1T4S/tree/main

futurisold•4h ago

One last comment here on contracts; an excerpt from the linked post I think it's extremely relevant for LLMs, maybe it triggers an interesting discussion here:

"The scope of contracts extends beyond basic validation. One key observation is that a contract is considered fulfilled if both the LLM’s input and output are successfully validated against their specifications. This leads to a deep implication: if two different agents satisfy the same contract, they are functionally equivalent, at least with respect to that specific contract.

This concept of functional equivalence through contracts opens up promising opportunities. In principle, you could replace one LLM with another, or even substitute an LLM with a rule-based system, and as long as both satisfy the same contract, your application should continue functioning correctly. This creates a level of abstraction that shields higher-level components from the implementation details of underlying models."

haileys•2h ago

Why is carrot the vegetablefication of apple?

herval•2h ago

Also if you run it twice, is it gonna be a carrot again?

pfdietz•1h ago

Are you asking for the root cause?

HappMacDonald•1h ago

I think it's interpreting the command as "replace each fruit with a vegetable", and it might intuit "make the resulting vegetables unique from one another" but otherwise it's not trying to find the "most similar" vegetable to every fruit or anything like that.

lmeyerov•26m ago

You might enjoy Lotus: https://github.com/lotus-data/lotus

It takes all the core relational operators and makes an easy semantic version of each as a python dataframe library extension . Each call ends up being a 'model' point in case you also want to do fancier things later like more learning based approaches. Afaict, snowflake and friends are moving in this direction for their cloud SQLs as well.

We ended up doing something similar for louie.ai , where you use AI notebooks/dashboards/APIs (ex: MCP) to talk to your data (splunk, databricks, graph db, whatever), and it'll figure out symbolic + semantic operators based on the context. Super helpful in practice.

My 80% case here is:

- semantic map: "get all the alerts from splunk index xyz, add a column flagging anything suspicious and another explaining why" <--- generates an enriched dataframe

- semantic map => semantic reduce: "... then summarize what you found" <--- then tells you about it in natural text

robertkrahn01•5h ago

Probably linking the paper and examples notebook here makes sense as they are pretty explanatory:

https://github.com/ExtensityAI/symbolicai/blob/main/examples...

https://arxiv.org/pdf/2402.00854

futurisold•4h ago

Wanted to do just that, thank you

futurisold•4h ago

I didn't expect this -- I was supposed to be sleeping now, but I guess I'll chat with whoever jumps in! Good thing I've got some white nights experience.

b0a04gl•4h ago

this works like functional programming where every symbol is a pure value and operations compose into clean, traceable flows. when you hit an ambiguous step, the model steps in. just like IO in FP, the generative call is treated as a scoped side effect. this can engage your reasoning graph stays deterministic by default and only defers to the model when needed. crazy demo though, love it

futurisold•4h ago

Yes, pretty much. We wanted it be functional from the start. Even low level, everything's functional (it's even called functional.py/core.py). We're using decorators everywhere. This helped a lot with refactoring, extending the framework, containing bugs, etc.

nbardy•4h ago

I love the symbol LLM first approaches.

I built a version of this a few years ago as a LISP

https://github.com/nbardy/SynesthesiaLisp

futurisold•4h ago

Very nice, bookmarked for later. Interestingly enough, we share the same timeline. ~2yo is when a lot of interesting work spawned as many started to tinker.

jaehong747•4h ago

great job! it reminds me genaiscript. https://microsoft.github.io/genaiscript/

// read files

const file = await workspace.readText("data.txt");

// include the file

content in the prompt in a context-friendly way def("DATA", file);

// the task

$`Analyze DATA and extract data in JSON in data.json.`;

futurisold•4h ago

Thank you! I'm not familiar with that project, will take a look

krackers•3h ago

Some of this seems a bit related to Wolfram Mathematica's natural language capabilities.

https://reference.wolfram.com/language/guide/FreeFormAndExte...

It can (in theory) do very similar things, where natural-language input is a first class citizen of the language and can operate on other objects. The whole thing came out almost a decade before LLMs, I'm surprised that they haven't revamped it to make it really shine.

futurisold•3h ago

> I'm surprised that they haven't revamped it

No worries! I can't find it right now, but Wolfram had a stream (or short?) where he introduced "Function". We liked it so much we implemented it after one day. Usage: https://github.com/ExtensityAI/symbolicai/blob/main/tests/en...

futurisold•3h ago

Wolfram's also too busy running his TOE exps to focus on LLMs (quite sadly if you ask me).

bjt12345•3h ago

How did you sort out mapping python constructs to their semantic equivalents?

I hope you keep at this, you may be in the right place at the right time.

It's getting to the point where some of the LLMs are immediately just giving me answers in Python, which is a strong indication of what the future will look like with Agents.

futurisold•2h ago

I'm struggling to understand the question. I'll revisit this when I wake up since it's quite late here.

VinLucero•3h ago

Nice! But have you considered a Neurosymbolic AI that can Evolve?

https://deepwiki.com/dubprime/mythral/3.2-genome-system

Or feel Emotion? https://deepwiki.com/search/how-do-emotives-work_193cb616-54...

Have you read Marvin Minsky’s Society of Mind?

afxjzs•3h ago

But is it also explainable or a magic black box?

futurisold•3h ago

Shortly, yes to all. We actually had an experiment going from theory of mind to emotion, but it's hanging right now since I feel the models aren't quite there yet and it yields diminish returns relative to effort. But could easily be revived. Minsky isn't my fav though, I'm leaning more toward Newell/Simon and friends from that generation.

VinLucero•2h ago

@futurisold, would love to collaborate with your team on running experiments. We have $300k of GPU credits to burn in the next 2 months.

There’s only so many cat videos my Agentic AI Army can create:

https://youtu.be/upVY1QioDeY?si=H566-_JIm7FmW4u0

futurisold•2h ago

That's very kind of you, thank you. Let's sync and see if we can align on something. You can find me on X, or shoot me an email at leo@extensity.ai

xpitfire•3h ago

We've been working on some exciting things with SymbolicAI and here a few things which might interest the HN community.

Two years ago, we built a benchmark to evaluate multistep reasoning, tool use, and logical capabilities in language models. It includes a quality measure to assess performance and is built on a plugin system we developed for SymbolicAI.

- Benchmark & Plugin System: https://github.com/ExtensityAI/benchmark

- Example Eval: https://github.com/ExtensityAI/benchmark/blob/main/src/evals...

We've also implemented some interesting concepts in our framework: - C#-style Extension Methods in Python: Using GlobalSymbolPrimitive to extend functionalities.

    - https://github.com/ExtensityAI/benchmark/blob/main/src/func.py#L146

- Symbolic <> Sub-symbolic Conversion: And using this for quality metrics, like a reward signal from the path integral of multistep generations. - https://github.com/ExtensityAI/benchmark/blob/main/src/func....

For fun, we integrated LLM-based tools into a customizable shell. Check out the Rick & Morty-styled rickshell:

- RickShell: https://github.com/ExtensityAI/rickshell

We were also among the first to generate a full research paper from a single prompt and continue to push the boundaries of AI-generated research:

- End-to-End Paper Generation (Examples): https://drive.google.com/drive/folders/1vUg2Y7TgZRRiaPzC83pQ...

- Recent AI Research Generation:

    - Three-Body Problem: https://github.com/ExtensityAI/three-body_problem  

    - Primality Test: https://github.com/ExtensityAI/primality_test 

    - Twitter/X Post: https://x.com/DinuMariusC/status/1915521724092743997

Finally, for those interested in building similar services, we've had an open-source, MCP-like API endpoint service available for over a year:

- SymbolicAI API: https://github.com/ExtensityAI/symbolicai/blob/main/symai/en...

GZGavinZhao•3h ago

not to be confused with symbolica.ai

futurisold•2h ago

bionhoward•1h ago

FYI, there’s a correctness issue in the part about correctness contracts: valid_opts = ['A', 'B', 'C'] if v not in valid_sizes:

valid_sizes is undefined

nickysielicki•46m ago

I spent some time toying around with LLM-guided "symbolic regression", basically having an LLM review documents in order to come up with primitives (aka operators) that could be fed into github.com/MilesCranmer/PySR

I didn't get very far because I had difficulty piping it all together, but with something like this I might give it another go. Cool stuff.

pkkkzip•32m ago

what are the implicaitons and actual real world application of this? better agents? more accurate, debuggable LLM answers?

krackers•29m ago

One question, OP, how does cost for this work? Do you pay the LLM inference cost (quite literally if using an external API) every time you run a line that involves natural language computation? E.g. what happens if you call a "symbolic" function in a loop.

Show HN: I'm an airline pilot – I built interactive graphs/globes of my flights

Normalizing Flows Are Capable Generative Models

Learn OCaml

SymbolicAI: A neuro-symbolic perspective on LLMs

James Webb Space Telescope Reveals Its First Direct Image of an Exoplanet

Structuring Arrays with Algebraic Shapes

C compiler for Web Assembly (c4wa)

Reinforcement learning, explained with a minimum of math and jargon

Multi-Stage Programming with Splice Variables

Qwen VLo: From "Understanding" the World to "Depicting" It

10 Years of Pomological Watercolors

nimbme – Nim bare-metal environment

bootc-image-builder: Build your entire OS from a Containerfile

Transmitting data via ultrasound without any special equipment

Theoretical Analysis of Positional Encodings in Transformer Models

Rust in the Linux kernel: part 2

Facebook is starting to feed its AI with private, unpublished photos

Spark AI (YC W24) is hiring a full-stack engineer in SF (founding team)

New Process Uses Microbes to Create Valuable Materials from Urine

The Journey of Bypassing Ubuntu's Unprivileged Namespace Restriction

A Brief History of Children Sent Through the Mail (2016)

Weird Expressions in Rust

Whitesmiths C compiler: One of the earliest commercial C compilers available

Glass nanostructures reflect nearly all visible light, challenging assumptions

Does a Focus on Royalty Obscure British History?

A New Kind of Computer (April 2025)

Parameterized types in C using the new tag compatibility rule

Slightly better named character reference tokenization than Chrome, Safari, FF

PJ5 TTL CPU

Project Vend: Can Claude run a small shop? (And why does that matter?)

SymbolicAI: A neuro-symbolic perspective on LLMs

Comments

Show HN: I'm an airline pilot – I built interactive graphs/globes of my flights

Normalizing Flows Are Capable Generative Models

Learn OCaml

SymbolicAI: A neuro-symbolic perspective on LLMs

James Webb Space Telescope Reveals Its First Direct Image of an Exoplanet

Structuring Arrays with Algebraic Shapes

C compiler for Web Assembly (c4wa)

Reinforcement learning, explained with a minimum of math and jargon

Multi-Stage Programming with Splice Variables

Qwen VLo: From "Understanding" the World to "Depicting" It

10 Years of Pomological Watercolors

nimbme – Nim bare-metal environment

bootc-image-builder: Build your entire OS from a Containerfile

Transmitting data via ultrasound without any special equipment

Theoretical Analysis of Positional Encodings in Transformer Models

Rust in the Linux kernel: part 2

Facebook is starting to feed its AI with private, unpublished photos

Spark AI (YC W24) is hiring a full-stack engineer in SF (founding team)

New Process Uses Microbes to Create Valuable Materials from Urine

The Journey of Bypassing Ubuntu's Unprivileged Namespace Restriction

A Brief History of Children Sent Through the Mail (2016)

Weird Expressions in Rust

Whitesmiths C compiler: One of the earliest commercial C compilers available

Glass nanostructures reflect nearly all visible light, challenging assumptions

Does a Focus on Royalty Obscure British History?

A New Kind of Computer (April 2025)

Parameterized types in C using the new tag compatibility rule

Slightly better named character reference tokenization than Chrome, Safari, FF

PJ5 TTL CPU

Project Vend: Can Claude run a small shop? (And why does that matter?)