Patent Array Analysis Using a Combination of ClickHouse and HDFS

https://link.springer.com/chapter/10.1007/978-3-031-67685-7_3
2•teleforce•2h ago

Comments

teleforce•2h ago
Abstract:

For users, patent search is provided by large services such as Yandex.Patents, Google Patents, Espacenet, the United States Patent and Trademark Office (USPTO), and Dimensions, which contain a large number of patent documents. Many search engines offer various search criteria, for example date, title, applicant, and category. The problem with such services is that they do not provide access to their databases and offer only a web interface for viewing information. This chapter discusses the development of a module that forms a patent sample from the data (natural language and metadata) of the US patent array (USPTO) for various analysis tasks, such as building a patent landscape through clustering procedures and identifying patent trends. To solve this problem, three algorithms have been developed: a patent filtering algorithm that checks whether a patent's basic class is included in the clarifying list from the configuration file; an algorithm for parsing patent documents that extracts the necessary elements of the description of the sources under consideration; and a clustering algorithm for the patent sample. The parsing module is developed using the lxml library and Beautiful Soup. The sklearn library was used for clustering and for evaluating the accuracy and completeness of the clustering; the ClickHouse database management system (DBMS) and the HDFS distributed file system were selected for information storage. Testing of the developed software showed that the division of the patent sample based on meta-information (IPC classes) coincides with great accuracy with the results of clustering based on the analysis of textual patent information.
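For anyone wondering what the filtering step in the abstract might look like, here is a minimal Python sketch. It is not the chapter's code: the JSON config layout, the `allowed_classes` key, the `main_class` field, and prefix matching on IPC codes are all assumptions made for illustration.

```python
import json


def load_allowed_classes(config_text):
    """Parse a JSON config string and return the allowed IPC class prefixes.

    The "allowed_classes" key is a hypothetical name, not from the chapter.
    """
    config = json.loads(config_text)
    # Normalize case and whitespace; drop empty entries so they cannot match everything.
    return {c.strip().upper() for c in config["allowed_classes"] if c.strip()}


def is_selected(main_class, allowed):
    """Return True if the patent's main class starts with any allowed prefix."""
    code = main_class.strip().upper()
    return any(code.startswith(prefix) for prefix in allowed)


def filter_patents(patents, allowed):
    """Keep only patents whose (hypothetical) "main_class" field passes the check."""
    return [p for p in patents if is_selected(p["main_class"], allowed)]
```

Usage would be along the lines of `filter_patents(records, load_allowed_classes(config_text))`, where each record is a dict produced by whatever parser extracts the metadata.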

Dabous Giraffes

https://en.wikipedia.org/wiki/Dabous_Giraffes
1•gametorch•8m ago•0 comments

NTSB – Hull Failure and Implosion of Submersible Titan [pdf]

https://www.ntsb.gov/investigations/AccidentReports/Reports/MIR2536.pdf
1•twalichiewicz•9m ago•0 comments

What Does George Orwell's '1984' Mean in 2024?

https://www.smithsonianmag.com/history/what-does-george-orwells-1984-mean-in-2024-180984468/
10•KnuthIsGod•16m ago•0 comments

KH3: A Frugal Trajectory Indexing System

https://ieeexplore.ieee.org/abstract/document/11104509
1•teleforce•18m ago•0 comments

The State of PHP in 2025

https://blog.jetbrains.com/phpstorm/2025/10/state-of-php-2025/
2•brentroose•19m ago•0 comments

Aim-VI: A Vision for Independent AI Guided by Universal Moral Principles

https://github.com/Bongikk/AIM-VI
2•bongik•23m ago•1 comment

Printing Money: Paxos Mints, Then Burns $300T in PayPal Stablecoins

https://decrypt.co/344463/printing-money-paxos-mints-burns-300-trillion-paypal-stablecoins
1•shscs911•27m ago•0 comments

Computerized Cognitive Training Improved Acetylcholine Transporter Levels

https://games.jmir.org/2025/1/e75161
1•jbotz•30m ago•0 comments

Many developers leave GZDoom due to leader conflicts and fork it into UZDoom

https://www.gamingonlinux.com/2025/10/many-developers-leave-gzdoom-due-to-leader-conflicts-and-fo...
2•MallocVoidstar•34m ago•0 comments

D'Angelo's Genius Was Pure, and Rare

https://www.newyorker.com/culture/postscript/dangelos-genius-was-pure-and-rare
2•tintinnabula•36m ago•0 comments

Nvidia Sued for Scraping YouTube

https://www.404media.co/nvidia-sued-for-scraping-youtube-after-404-media-investigation/
2•JumpCrisscross•36m ago•0 comments

Extracting Physical and Technical Structured Info from Natural Language Document

https://ieeexplore.ieee.org/document/10803712
1•teleforce•40m ago•0 comments

Australian wet rainforests may be switching from absorbing carbon to emitting it

https://www.abc.net.au/news/science/2025-10-16/australian-rainforest-trees-carbon-storage-produce...
3•nreece•40m ago•0 comments

Sanitized SQL

https://ardentperf.com/2025/10/15/sanitized-sql/
2•qianli_cs•41m ago•0 comments

Do we still have the spark gap in our rearview mirror?

https://www.amateurradio.com/do-we-still-have-the-spark-gap-in-our-rearview-mirror/
1•iamhamm•54m ago•0 comments

Ollama Rolls Out Experimental Vulkan Support for AMD and Intel

https://www.phoronix.com/news/ollama-Experimental-Vulkan
3•geerlingguy•55m ago•0 comments

New Relic's compute based pricing creates unpredictable costs

https://signoz.io/blog/new-relic-ccu-pricing-unpredictable-costs/
1•ak_builds•57m ago•0 comments

OpenAI Build Hour: Responses API [video]

https://www.youtube.com/watch?v=hNr5EebepYs
1•handfuloflight•58m ago•0 comments

Household upgrades can meet 100 percent of data center demand growth

https://www.rewiringamerica.org/research/homegrown-energy-report-ai-data-center-demand
4•zekrioca•1h ago•5 comments

Phases of Fitness

https://medium.com/@prashantgupta24/phases-of-fitness-8984c1c06f37
2•prashantgupta24•1h ago•2 comments

Deconstructing Functional Programming (2013)

https://www.infoq.com/presentations/functional-pros-cons/
2•teleforce•1h ago•0 comments

Inside the Trump Administration's Assault on Higher Education

https://www.newyorker.com/magazine/2025/10/20/inside-the-trump-administrations-assault-on-higher-...
5•mitchbob•1h ago•1 comment

Ad-X2: When US Politicians Took on Science

https://www.historytoday.com/archive/history-matters/ad-x2-when-us-politicians-took-science
1•samclemens•1h ago•0 comments

TaxCalcBench: Evaluating Frontier Models on the Tax Calculation Task

https://arxiv.org/abs/2507.16126
6•handfuloflight•1h ago•0 comments

Did you get lucky or unlucky?

https://antithesis.com/blog/2025/findability/
2•wwilson•1h ago•0 comments

First Ever Continuously Operating Quantum Computer

https://www.thecrimson.com/article/2025/10/2/quantum-computing-breakthrough/
1•oldfuture•1h ago•0 comments

Show HN: VO3 – AI video generator powered by Google Veo 3.1

https://vo3-1ai.com
2•derek39576•1h ago•0 comments

Beijing's anger at 'malicious' US move on Chinese tech firms

https://www.cnn.com/2025/09/30/tech/us-export-curbs-expansion-beijing-anger-intl-hnk
4•rguiscard•1h ago•0 comments

Show HN: Open-source sound –> dmx party lighting

https://github.com/davidhughhenrymack/party-parrot
1•edmack•1h ago•0 comments

Free applicatives, the handle pattern, and remote systems

https://exploring-better-ways.bellroy.com/free-applicatives-the-handle-pattern-and-remote-systems...
10•_jackdk_•1h ago•1 comment