October 21, 2022 @ 8:00 am – September 30, 2025 @ 5:00 pm CEST
(PROJECT)
enRichMyData. “enRichMyData.” Accessed 13.08.2025. https://enrichmydata.eu.
enRichMyData develops a novel paradigm for building rich, high-quality and valuable datasets to feed Big Data Analytics and AI applications. It has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070284.
The enRichMyData project is coordinated by SINTEF (Norway), one of Europe’s largest independent research organisations. The project partners include companies such as Philips (The Netherlands) and Bosch (Germany), dedicated to engineering and manufacturing; Speed Network (Estonia), a provider of procurement data; JOT Internet Media (Spain), a digital marketing company; CS Group (Romania), a software service company; Expert AI (Italy), a technology company specializing in natural language understanding; and Ontotext (Bulgaria), a semantic technology company. They have the full support of the research partners, which, in addition to SINTEF, include the University of Milano Bicocca (Italy), Jozef Stefan Institute (Slovenia), University of Copenhagen (Denmark), GATE Institute (Bulgaria), and BGRIMM Technology Group (China).
enRichMyData delivers an open software toolbox – the enRichMyData toolbox – comprising practical, robust and scalable components to support organizations in enriching their data with reference data they may have limited knowledge of, as well as supporting data providers in making their data reusable and available in data enrichment processes.
The toolbox lowers the technological entry barriers by providing support for the definition of highly scalable and replicable data enrichment pipelines through a set of tools and infrastructure services related to capabilities needed during the lifecycle of enrichment pipelines. The toolbox makes the data enrichment process accessible to a broader set of stakeholders by reducing the required expertise and enhancing the tool support level.
Empower AI-driven business products and services
High-quality, rich and meaningful data are crucial to successfully implementing Artificial Intelligence (AI) and Big Data Analytics (BDA) solutions. Delivering the data required to feed AI and BDA models is costly and difficult, and is often constrained by the limited availability of data and skills. It is commonly reported that up to 80% of the effort spent in AI and BDA projects is dedicated to ensuring data is fit for purpose. Activities are required to discover, understand, select, clean, transform, and integrate data from a variety of sources so that the data can be fed into the modelling phase. Such activities result in enriched data, eventually improving the quality of downstream BDA and AI applications. The data enrichment process is implemented by specifying, deploying, and executing data enrichment pipelines over data that can be structured, semi-structured or unstructured, in large amounts, and from static or streaming sources. While techniques exist to cover different enrichment operations, such as data cleaning, linking, feature extraction, classification and semantic annotation, the lack of comprehensive approaches and established tools dedicated to data enrichment makes the definition, implementation, and operation of enrichment pipelines difficult for many organizations seeking to improve their BDA and AI applications.
The overall vision of the enRichMyData project is to create a novel paradigm for building rich, high-quality, valuable, and FAIR-compliant datasets to feed downstream BDA and AI applications in the context of data-sharing ecosystems, such as data spaces. The paradigm facilitates the specification and execution of data enrichment pipelines, focusing on supporting various data enrichment operations. enRichMyData makes this easily accessible to a wide range of large and small organizations that struggle to deliver suitable data to their BDA and AI solutions because they lack usable tools and expertise for the cost-effective management of data enrichment pipelines.