Fusepool – Publish-Process-Perform Platform for Linked Data

Development of a set of integrated software components for publishing and processing of linked data.

To make publishing and processing of linked data easy, the Fusepool project develops a set of integrated software components based on open-source Linked Data Platform best practices. The tightly integrated components support the multilingual data value chain from data exploration (e.g. identifying structured and unstructured data sources) through extraction (e.g. named entity recognition, RDF conversion) and enrichment (e.g. interlinking, crowdsourcing) to delivery (e.g. analytics, apps for desktop and mobile devices). These components run on an open-source data platform with various enterprise-grade storage solutions.
The vision is to make publishing and reuse of linked data as easy as possible for the end user thanks to a thriving market economy with data publishers, developers, and consumers along the value chain. Making data reusable and interoperable within and outside the organization requires a fundamentally different approach to ‘storing’ knowledge. “The best name is probably a Logical Data Warehouse … because it focuses on the logic of information … [for] giving integrated access to all forms of information assets.” Only with integrated access to the data is it possible to have apps on top of that data that scale across use cases and provide real added value.
Fusepool P3 (Linked Data Analytics Processing) derives its name from the idea of fusing and pooling linked data with analytical processing on top of it. Because linked data is multidimensional data, it lends itself to analytical processing such as consolidation (e.g. aggregation within a dimension), drill-down (e.g. navigating through the details), and slicing and dicing (e.g. viewing an aspect from different dimensions). However, a publishing and processing workflow with integrated user interfaces is still missing, and its absence makes it difficult and time-consuming for data publishers and consumers to engage with linked data.
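The analytical operations named above (consolidation, drill-down, slicing and dicing) can be illustrated with a minimal sketch over a handful of multidimensional observations. This is not Fusepool code: the dimensions (region, year, language), the measure (documents), the sample values, and the helper names aggregate, slice_by, and dice are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical observations: three dimensions (region, year, language)
# and one measure (documents). The values are invented for illustration.
OBSERVATIONS = [
    {"region": "EU", "year": 2013, "language": "de", "documents": 120},
    {"region": "EU", "year": 2013, "language": "it", "documents": 80},
    {"region": "EU", "year": 2014, "language": "de", "documents": 150},
    {"region": "US", "year": 2013, "language": "en", "documents": 200},
    {"region": "US", "year": 2014, "language": "en", "documents": 260},
]


def aggregate(observations, dimensions, measure="documents"):
    """Sum the measure for each combination of the given dimensions."""
    totals = defaultdict(int)
    for obs in observations:
        key = tuple(obs[d] for d in dimensions)
        totals[key] += obs[measure]
    return dict(totals)


def slice_by(observations, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [obs for obs in observations if obs[dimension] == value]


def dice(observations, allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [
        obs
        for obs in observations
        if all(obs[dim] in values for dim, values in allowed.items())
    ]


# Consolidation: aggregate along the region dimension only.
by_region = aggregate(OBSERVATIONS, ["region"])

# Drill-down: add the year dimension to see more detail per region.
by_region_year = aggregate(OBSERVATIONS, ["region", "year"])

# Slice on one year, and dice on region and language together.
only_2013 = slice_by(OBSERVATIONS, "year", 2013)
eu_german = dice(OBSERVATIONS, {"region": {"EU"}, "language": {"de"}})
```

In a linked data setting the observations would more plausibly come from an RDF store and the grouping would be expressed as a query; the in-memory list is used here only to keep the sketch self-contained.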