Type of site | Search engine |
---|---|
Created by | Allen Institute for Artificial Intelligence |
URL | semanticscholar |
Launched | November 2, 2015[1] |
Semantic Scholar is an artificial intelligence–powered research tool for scientific literature developed at the Allen Institute for AI and publicly released in November 2015. [2] It uses advances in natural language processing to provide summaries for scholarly papers. [3] The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human-computer interaction, and information retrieval. [4]
Semantic Scholar began as a database covering computer science, geoscience, and neuroscience. [5] In 2017, the system began including biomedical literature in its corpus. [5] As of September 2022, it includes over 200 million publications from all fields of science. [6]
Semantic Scholar provides a one-sentence summary of scientific literature. One of its aims was to address the challenge of reading numerous titles and lengthy abstracts on mobile devices. [7] It also seeks to ensure that the three million scientific papers published yearly reach readers, since it is estimated that only half of this literature is ever read. [8]
Artificial intelligence is used to capture the essence of a paper, generating the summary through an "abstractive" technique. [3] The project uses a combination of machine learning, natural language processing, and machine vision to add a layer of semantic analysis to the traditional methods of citation analysis, and to extract relevant figures, tables, entities, and venues from papers. [9] [10]
In contrast with Google Scholar and PubMed, Semantic Scholar is designed to highlight the most important and influential elements of a paper. [11] The AI technology is designed to identify hidden connections and links between research topics. [12] Like the previously cited search engines, Semantic Scholar also exploits graph structures, which include the Microsoft Academic Knowledge Graph, Springer Nature's SciGraph, and the Semantic Scholar Corpus. [13]
Each paper hosted by Semantic Scholar is assigned a unique identifier called the Semantic Scholar Corpus ID (abbreviated S2CID).
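To illustrate how such an identifier might be handled in software, the following is a minimal Python sketch. It rests on assumptions that the text above does not state: that an S2CID is a positive decimal integer, and that a paper can be looked up by appending the ID to the resolver prefix shown in `RESOLVER_PREFIX`. The function names `parse_s2cid` and `s2cid_url` are hypothetical, and the ID `12345` is an arbitrary placeholder rather than a real paper.

```python
import re

# Assumption: an S2CID is a positive decimal integer, optionally written
# with an "S2CID" label in front (for example "S2CID 12345").
_S2CID_PATTERN = re.compile(r"^(?:S2CID[:\s]*)?(\d+)$", re.IGNORECASE)

# Assumption: corpus IDs resolve under this URL prefix; this is not
# stated in the article text and should be checked before relying on it.
RESOLVER_PREFIX = "https://api.semanticscholar.org/CorpusID:"


def parse_s2cid(text: str) -> int:
    """Extract the numeric corpus ID from a string such as 'S2CID 12345'."""
    match = _S2CID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognisable S2CID: {text!r}")
    return int(match.group(1))


def s2cid_url(corpus_id: int) -> str:
    """Build a lookup URL for a corpus ID (hypothetical helper)."""
    if corpus_id <= 0:
        raise ValueError("corpus ID must be a positive integer")
    return f"{RESOLVER_PREFIX}{corpus_id}"


if __name__ == "__main__":
    # 12345 is a placeholder, not a reference to an actual paper.
    print(s2cid_url(parse_s2cid("S2CID 12345")))
```

Keeping the parsing and URL-building steps separate means the numeric ID can be stored or compared on its own, independently of any particular resolver.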
Semantic Scholar is free to use and, unlike similar search engines (e.g., Google Scholar), does not search for material that is behind a paywall. [14] [5]
One study evaluated the search abilities of Semantic Scholar using a systematic approach and found the search engine to be 98.88% accurate at retrieving the data sought. [14] The same study examined other Semantic Scholar functions, including tools to survey metadata as well as several citation tools. [14]
As of January 2018, following a 2017 project that added biomedical papers and topic summaries, the Semantic Scholar corpus included more than 40 million papers from computer science and biomedicine. [15] In March 2018, Doug Raymond, who developed machine learning initiatives for the Amazon Alexa platform, was hired to lead the Semantic Scholar project. [16] As of August 2019, the number of papers with included metadata (not the actual PDFs) had grown to more than 173 million [17] after the addition of the Microsoft Academic Graph records. [18] In 2020, a partnership between Semantic Scholar and the University of Chicago Press Journals made all articles published under the University of Chicago Press available in the Semantic Scholar corpus. [19] At the end of 2020, Semantic Scholar had indexed 190 million papers. [20]
In 2020, the number of Semantic Scholar users reached seven million a month. [7]