A new solution for classifying scholarly publications: Smart Topic Miner

The process of classifying scholarly outputs is crucial to ensure timely access to knowledge. This process is typically carried out manually by expert editors, leading to high costs and slow throughput. For these reasons, the Rexplore team, in collaboration with Springer Nature, created Smart Topic Miner (STM), a novel solution which uses semantic web technologies to classify scholarly publications on the basis of a very large automatically generated ontology of research areas.

STM was developed to support the Springer Nature Computer Science editorial team in classifying proceedings in the LNCS family, which comprises about 800 proceedings books each year. It analyses in real time a set of publications provided by an editor and produces a structured set of topics and a number of Springer Nature Classification tags that best characterise the proceedings book. Unlike other applications that characterise a text with topics, STM produces a full taxonomy of the relevant research areas rather than a flat list of keywords or categories. This helps editors and users understand the context of each topic and its relationships with other research areas.
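To illustrate the difference between a flat list of topics and a taxonomy, here is a minimal Python sketch. It is not STM's implementation: the `PARENT` map, its entries, and the `build_taxonomy` and `render` helpers are all hypothetical, and the sketch simply assumes that an ontology supplies a parent area for each topic.

```python
from collections import defaultdict

# Hypothetical fragment of a research-area ontology: child topic -> parent topic.
# These entries are illustrative only and are not taken from the ontology STM uses.
PARENT = {
    "semantic web": "computer science",
    "ontology engineering": "semantic web",
    "linked data": "semantic web",
    "machine learning": "computer science",
    "neural networks": "machine learning",
}


def build_taxonomy(topics, parent=PARENT):
    """Return a nested dict holding each topic together with all of its ancestors."""
    children = defaultdict(set)
    roots = set()
    for topic in topics:
        node = topic
        seen = set()  # guards against accidental cycles in the parent map
        while node in parent and node not in seen:
            seen.add(node)
            children[parent[node]].add(node)
            node = parent[node]
        roots.add(node)

    def subtree(node):
        return {child: subtree(child) for child in sorted(children[node])}

    return {root: subtree(root) for root in sorted(roots)}


def render(tree, depth=0):
    """Return the taxonomy as a list of indented lines, one per topic."""
    lines = []
    for name, sub in tree.items():
        lines.append("  " * depth + name)
        lines.extend(render(sub, depth + 1))
    return lines


flat_topics = ["linked data", "neural networks", "ontology engineering"]
taxonomy = build_taxonomy(flat_topics)
print("\n".join(render(taxonomy)))
```

The flat list only names three topics, while the rendered tree also shows that two of them sit under "semantic web" and that all three belong to "computer science", which is the kind of context a taxonomy gives an editor.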

You can try a public demo of STM at http://rexplore.kmi.open.ac.uk/STM_demo/

Relevant paper:

See you at ISWC 2015

I will be at ISWC 2015 (October 11-15) to present the paper “Klink-2: Integrating Multiple Web Sources to Generate Semantic Topic Networks”, which is about the automatic generation of large-scale ontologies of research topics. I will introduce Klink-2, a novel approach which analyses networks of research entities (including papers, authors, venues, and technologies) to infer three kinds of semantic relationships between topics. It also identifies ambiguous keywords (e.g., “ontology”) and separates them into the appropriate distinct topics – e.g., “ontology/philosophy” vs. “ontology/semantic web”. I am using this approach in Rexplore to support a number of research analytics.

I will also present a poster/demo about the RASH Framework, a set of specifications and writing/conversion/extraction tools for writing academic articles in RASH (Research Articles in Simplified HTML), an HTML-based format that allows RDFa annotations and Turtle statements to be embedded within a document. This format has already been adopted by a number of workshops this year at conferences such as ESWC, ISWC and WWW, and it is spreading quickly and attracting the interest of a number of editors and conference organisers.

See you soon in Bethlehem, Pennsylvania!