frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Patent Array Analysis Using a Combination of ClickHouse and HDFS

https://link.springer.com/chapter/10.1007/978-3-031-67685-7_3
2•teleforce•3mo ago

Comments

teleforce•3mo ago
Abstract:

For users, patent search is provided by large companies such as: Yandex. Patents, Google Patents, Espacenet, United States Patent and Trademark Office (USPTO), Dimensions, which contain a large number of patent documents. Many search engines present various search criteria, for example, by date, title, applicant and category. The problem with such services is that they do not provide access to their databases, but only offer a web interface for viewing information. This chapter discusses the process of developing a module for the formation of a patent sample based on data (natural language and metadata) of the US patent array (USPTO) for solving various analysis tasks, such as: building a patent landscape through clustering procedures, identifying patent trends, etc. To solve this problem, a patent filtering algorithm has been developed that allows you to check the basic patent class, determining whether it is included in the clarifying list from the configuration file, an algorithm for parsing patent documents that allows you to extract the necessary elements of the description of the sources under consideration and the clustering algorithm of the patent sample. The parsing module is developed using the lxmllibrary and Beautiful Soup. For clustering, evaluation of the accuracy and completeness of clustering, the sklearn library was used, a database management system (DBMS) was selected for the organization of information storage Clickhouse and HDFS distributed file system. Thus, as a result of testing the developed software, it was determined that the division of the patent sample based on meta-information (IPC classes) with it coincides with the results of clustering carried out on the basis of the analysisof textual patent information with great accuracy.

Logic Puzzles: Why the Liar Is the Helpful One

https://blog.szczepan.org/blog/knights-and-knaves/
1•wasabi991011•1m ago•0 comments

Optical Combs Help Radio Telescopes Work Together

https://hackaday.com/2026/02/03/optical-combs-help-radio-telescopes-work-together/
1•toomuchtodo•6m ago•1 comments

Show HN: Myanon – fast, deterministic MySQL dump anonymizer

https://github.com/ppomes/myanon
1•pierrepomes•12m ago•0 comments

The Tao of Programming

http://www.canonical.org/~kragen/tao-of-programming.html
1•alexjplant•13m ago•0 comments

Forcing Rust: How Big Tech Lobbied the Government into a Language Mandate

https://medium.com/@ognian.milanov/forcing-rust-how-big-tech-lobbied-the-government-into-a-langua...
1•akagusu•13m ago•0 comments

PanelBench: We evaluated Cursor's Visual Editor on 89 test cases. 43 fail

https://www.tryinspector.com/blog/code-first-design-tools
2•quentinrl•15m ago•1 comments

Can You Draw Every Flag in PowerPoint? (Part 2) [video]

https://www.youtube.com/watch?v=BztF7MODsKI
1•fgclue•21m ago•0 comments

Show HN: MCP-baepsae – MCP server for iOS Simulator automation

https://github.com/oozoofrog/mcp-baepsae
1•oozoofrog•24m ago•0 comments

Make Trust Irrelevant: A Gamer's Take on Agentic AI Safety

https://github.com/Deso-PK/make-trust-irrelevant
2•DesoPK•28m ago•0 comments

Show HN: Sem – Semantic diffs and patches for Git

https://ataraxy-labs.github.io/sem/
1•rs545837•30m ago•1 comments

Hello world does not compile

https://github.com/anthropics/claudes-c-compiler/issues/1
13•mfiguiere•35m ago•1 comments

Show HN: ZigZag – A Bubble Tea-Inspired TUI Framework for Zig

https://github.com/meszmate/zigzag
2•meszmate•38m ago•0 comments

Metaphor+Metonymy: "To love that well which thou must leave ere long"(Sonnet73)

https://www.huckgutman.com/blog-1/shakespeare-sonnet-73
1•gsf_emergency_6•40m ago•0 comments

Show HN: Django N+1 Queries Checker

https://github.com/richardhapb/django-check
1•richardhapb•55m ago•1 comments

Emacs-tramp-RPC: High-performance TRAMP back end using JSON-RPC instead of shell

https://github.com/ArthurHeymans/emacs-tramp-rpc
1•todsacerdoti•59m ago•0 comments

Protocol Validation with Affine MPST in Rust

https://hibanaworks.dev
1•o8vm•1h ago•1 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
3•gmays•1h ago•0 comments

Show HN: Zest – A hands-on simulator for Staff+ system design scenarios

https://staff-engineering-simulator-880284904082.us-west1.run.app/
1•chanip0114•1h ago•1 comments

Show HN: DeSync – Decentralized Economic Realm with Blockchain-Based Governance

https://github.com/MelzLabs/DeSync
1•0xUnavailable•1h ago•0 comments

Automatic Programming Returns

https://cyber-omelette.com/posts/the-abstraction-rises.html
1•benrules2•1h ago•1 comments

Why Are There Still So Many Jobs? The History and Future of Workplace Automation [pdf]

https://economics.mit.edu/sites/default/files/inline-files/Why%20Are%20there%20Still%20So%20Many%...
2•oidar•1h ago•0 comments

The Search Engine Map

https://www.searchenginemap.com
1•cratermoon•1h ago•0 comments

Show HN: Souls.directory – SOUL.md templates for AI agent personalities

https://souls.directory
1•thedaviddias•1h ago•0 comments

Real-Time ETL for Enterprise-Grade Data Integration

https://tabsdata.com
1•teleforce•1h ago•0 comments

Economics Puzzle Leads to a New Understanding of a Fundamental Law of Physics

https://www.caltech.edu/about/news/economics-puzzle-leads-to-a-new-understanding-of-a-fundamental...
3•geox•1h ago•1 comments

Switzerland's Extraordinary Medieval Library

https://www.bbc.com/travel/article/20260202-inside-switzerlands-extraordinary-medieval-library
2•bookmtn•1h ago•0 comments

A new comet was just discovered. Will it be visible in broad daylight?

https://phys.org/news/2026-02-comet-visible-broad-daylight.html
4•bookmtn•1h ago•0 comments

ESR: Comes the news that Anthropic has vibecoded a C compiler

https://twitter.com/esrtweet/status/2019562859978539342
2•tjr•1h ago•0 comments

Frisco residents divided over H-1B visas, 'Indian takeover' at council meeting

https://www.dallasnews.com/news/politics/2026/02/04/frisco-residents-divided-over-h-1b-visas-indi...
5•alephnerd•1h ago•5 comments

If CNN Covered Star Wars

https://www.youtube.com/watch?v=vArJg_SU4Lc
1•keepamovin•1h ago•1 comments