sematicstackdfh•58m ago
SFH (Semantic First-Hop) is a tiny, deterministic semantic layer for the public web. It gives machines (LLMs, crawlers, agents) a stable first hop for any topic—no central authority, no ranking, no guessing.
It’s the same idea as DFH (Deterministic First-Hop) but named for clarity: the first semantic step when resolving “what is X?” online.
0. Problem
The web has DNS (where), HTTP (how), HTML/JSON (what), but no standard for the semantic starting point of a topic. Today, LLMs and search engines guess using embeddings, PageRank, or ads. Meaning is implicit and mutable.
SFH adds one explicit file telling machines where the canonical map of a domain/topic lives.
1. Goals
Minimal: one well-known JSON file.
Deterministic: same input → same first hop.
Decentralized: anyone can publish.
Composable: works with sitemaps, JSON-LD, schema.org.
LLM-friendly: mirrors how models canonicalize concepts.
2. Concepts
SFH Node
A domain/subdomain providing an SFH descriptor at /.well-known/sfh.json.
Examples: example.com, cars.example.com, googlesitemap.com, moneysitemap.com.
SFH Descriptor
JSON (or JSON-LD) document describing:
what the node represents
its canonical URL
sitemap
mirrors
metadata
SFH Resolution
Map topic → candidate domain.
Fetch /.well-known/sfh.json (fallback: stack.json).
Parse anchors.
Build stable graph from canonical URL + sitemap.
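The last step above can be sketched roughly as follows. This is only an illustration, assuming the sitemap is a standard sitemaps.org <urlset> document that has already been fetched as a string. The function name build_graph and the flat one-level graph shape are hypothetical; SFH as described here does not define a graph format.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org <urlset> documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def build_graph(canonical_url, sitemap_xml):
    """Return {canonical_url: [urls listed in the sitemap]} (hypothetical shape)."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        # Skip empty entries, duplicates, and the root itself.
        if url and url != canonical_url and url not in urls:
            urls.append(url)
    # Sort so the same inputs always produce the same graph.
    return {canonical_url: sorted(urls)}


SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/b</loc></url>
  <url><loc>https://example.com/a</loc></url>
</urlset>"""

graph = build_graph("https://example.com/", SAMPLE_SITEMAP)
print(graph)
```

Sorting the edges is one simple way to keep the output stable when a publisher reorders sitemap entries without changing them.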
3. Spec (condensed)
Descriptor MUST be at:
https://<domain>/.well-known/sfh.json
Top-level JSON MUST include:
spec_version
node.id
node.type
anchors.canonical_url
anchors.sitemap_url SHOULD be included.
Example minimal descriptor:
    {
      "spec_version": "1.0",
      "node": {
        "id": "https://example.com",
        "type": "Topic",
        "label": "Example"
      },
      "anchors": {
        "canonical_url": "https://example.com/",
        "sitemap_url": "https://example.com/sitemap.xml"
      },
      "meta": {
        "updated_at": "2025-12-07T00:00:00Z"
      }
    }
Clients MUST accept application/json or application/ld+json.
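A client-side check of these requirements might look like the sketch below. The helper names missing_fields and is_accepted_type are hypothetical, and treating the required fields as dotted paths into the JSON is an assumption about how to read the list above.

```python
# Required fields from the condensed spec, written as dotted paths (an assumption).
REQUIRED_FIELDS = ["spec_version", "node.id", "node.type", "anchors.canonical_url"]
ACCEPTED_TYPES = {"application/json", "application/ld+json"}


def missing_fields(descriptor):
    """Return the required dotted paths that are absent from the descriptor."""
    missing = []
    for path in REQUIRED_FIELDS:
        current = descriptor
        for key in path.split("."):
            if not isinstance(current, dict) or key not in current:
                missing.append(path)
                break
            current = current[key]
    return missing


def is_accepted_type(content_type_header):
    """Check a Content-Type header value, ignoring parameters such as charset."""
    media_type = content_type_header.split(";")[0].strip().lower()
    return media_type in ACCEPTED_TYPES


descriptor = {
    "spec_version": "1.0",
    "node": {"id": "https://example.com", "type": "Topic", "label": "Example"},
    "anchors": {"canonical_url": "https://example.com/"},
}
print(missing_fields(descriptor))  # []
print(missing_fields({"spec_version": "1.0", "node": {"id": "x"}}))
print(is_accepted_type("application/ld+json; charset=utf-8"))  # True
```

Returning the list of missing paths, rather than a bare boolean, makes it easier for a crawler to log why a descriptor was rejected.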
4. Resolution Algorithm
For a domain:
GET /.well-known/sfh.json (then stack.json).
Validate required fields.
Use:
node.id → stable ID
canonical_url → first hop
sitemap_url → expansion
Cache according to HTTP headers.
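As a rough sketch of the mapping and caching steps above: the Resolution class and both function names are hypothetical, and the cache handling is deliberately simplified. It reads only max-age and no-store from Cache-Control and ignores everything else (Expires, no-cache, validators), which a real client would need to handle.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resolution:
    stable_id: str            # from node.id
    first_hop: str            # from anchors.canonical_url
    expansion: Optional[str]  # from anchors.sitemap_url, if present
    ttl_seconds: int          # how long the result may be cached


def cache_ttl(cache_control, default=0):
    """Extract max-age from a Cache-Control value; no-store disables caching."""
    value = (cache_control or "").lower()
    if "no-store" in value:
        return 0
    match = re.search(r"max-age\s*=\s*(\d+)", value)
    return int(match.group(1)) if match else default


def to_resolution(descriptor, cache_control=None):
    """Map an already-validated descriptor onto the three roles listed above."""
    anchors = descriptor["anchors"]
    return Resolution(
        stable_id=descriptor["node"]["id"],
        first_hop=anchors["canonical_url"],
        expansion=anchors.get("sitemap_url"),
        ttl_seconds=cache_ttl(cache_control),
    )


example = {
    "spec_version": "1.0",
    "node": {"id": "https://example.com", "type": "Topic"},
    "anchors": {
        "canonical_url": "https://example.com/",
        "sitemap_url": "https://example.com/sitemap.xml",
    },
}
result = to_resolution(example, "public, max-age=3600")
print(result)
```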
5. Example External Anchor
    {
      "spec_version": "1.0",
      "node": {
        "id": "https://googlesitemap.com",
        "type": "Topic",
        "label": "Google Sitemaps"
      },
      "anchors": {
        "canonical_url": "https://googlesitemap.com/",
        "sitemap_url": "https://www.google.com/sitemap.xml"
      },
      "meta": {
        "updated_at": "2025-12-07T00:00:00Z"
      }
    }
This doesn’t impersonate Google; it simply publishes a stable map pointing to real URLs.
6. DFH Compatibility
If a domain already serves a DFH descriptor at /.well-known/stack.json, that file is SFH-compatible.
Client rule: try sfh.json first, then fall back to stack.json.
7. Security Notes
SFH doesn’t solve trademark disputes; it makes anchors transparent. Domains MUST use HTTPS. Clients SHOULD combine SFH with other trust signals.
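One possible extra signal, sketched below, is to surface anchors that are not HTTPS or that point to a different host than the node itself. Both checks are assumptions added for illustration: the notes above only require HTTPS for the domain serving the descriptor and do not say which trust signals to combine. The name check_anchors is hypothetical.

```python
from urllib.parse import urlparse


def check_anchors(node_domain, descriptor):
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    for name, url in descriptor.get("anchors", {}).items():
        if not isinstance(url, str):
            continue
        parsed = urlparse(url)
        if parsed.scheme != "https":
            warnings.append(f"{name}: not HTTPS")
        host = (parsed.hostname or "").lower()
        # An off-domain anchor is not forbidden, but it is worth surfacing.
        if host != node_domain and not host.endswith("." + node_domain):
            warnings.append(f"{name}: points off-domain to {host}")
    return warnings


external = {
    "anchors": {
        "canonical_url": "https://googlesitemap.com/",
        "sitemap_url": "https://www.google.com/sitemap.xml",
    }
}
print(check_anchors("googlesitemap.com", external))
```

The warnings are informational: a client would still need its own policy for what to do with an off-domain anchor.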
8. Implementation
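Publisher: serve a static JSON at /.well-known/sfh.json.
Client pseudo-code:
    import requests

    def resolve(domain):
        """Return the first descriptor found for domain, or None."""
        # Try the SFH path first, then the DFH-compatible fallback.
        for path in ["/.well-known/sfh.json", "/.well-known/stack.json"]:
            try:
                r = requests.get("https://" + domain + path, timeout=3)
                if r.status_code == 200:
                    return r.json()
            except (requests.RequestException, ValueError):
                continue  # network error or invalid JSON: try the next path
        return None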
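The publisher side can be just as small. A minimal sketch, assuming the site is served from a local directory: the function name publish and the web_root argument are hypothetical, and a temporary directory stands in for the real web root.

```python
import json
import tempfile
from pathlib import Path


def publish(web_root, descriptor):
    """Write the descriptor to <web_root>/.well-known/sfh.json and return its path."""
    target = Path(web_root) / ".well-known" / "sfh.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys keeps the file byte-stable across runs for the same input.
    text = json.dumps(descriptor, indent=2, sort_keys=True) + "\n"
    target.write_text(text, encoding="utf-8")
    return target


descriptor = {
    "spec_version": "1.0",
    "node": {"id": "https://example.com", "type": "Topic", "label": "Example"},
    "anchors": {
        "canonical_url": "https://example.com/",
        "sitemap_url": "https://example.com/sitemap.xml",
    },
    "meta": {"updated_at": "2025-12-07T00:00:00Z"},
}

# A temporary directory stands in for the real web root here.
with tempfile.TemporaryDirectory() as web_root:
    path = publish(web_root, descriptor)
    print(path.read_text(encoding="utf-8"))
```

The web server still has to serve the file over HTTPS with a JSON content type; that part depends on the server and is not shown.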
9. Why Now?
AI already behaves as if this layer exists internally. The web never standardized it. SFH is the smallest workable patch: one file per domain. It gives machines a deterministic, public starting point instead of hidden heuristics.
10. TL;DR
SFH is a tiny /.well-known/sfh.json file that tells machines where the canonical map of a topic lives.
Publish one file → AI stops guessing.
Deterministic grounding for the public web.