3.1.2. Search Engines
BioSumm can query three different biomedical databases, which gives it access to a large number of scientific articles. The results differ from one search engine to another, however, depending on the type of files each one returns and on whether full texts are available. This section describes these differences.
- PubMed: a large number of articles can be retrieved in a short time, but only the abstract is available, not the full text. It is useful when the user is looking for a large number of highly condensed texts, since an abstract is usually a kind of summary of the full text. After searching for articles via PubMed, the user can view an abstract by double-clicking the corresponding article in the list.
- PubMed Central: the user can download the full text of the retrieved articles (in a format called 'nxml', which is well suited to parsing in order to extract all the information about the text), and thus has access to more specific knowledge. Using this database takes more time, but it can give better results, partly because in some cases PubMed Central also provides keywords for the texts, which are useful for the summarization step.
- Google Scholar: it does not provide a format suited to parsing, only the articles as supplied by the author (usually in PDF or HTML format). Parsing is more difficult, and there is no way to know whether we are getting the full text of an article or just its abstract. Google Scholar results are less reliable, but it is the only way to get articles that are not available on PubMed Central.
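
To illustrate why the 'nxml' format returned by PubMed Central is convenient to parse, the following is a minimal sketch in Python and not BioSumm's actual parser. It assumes that the file follows the JATS-style tag names commonly found in PubMed Central articles (article-title, abstract, kwd, body, p). The sample document, the helper text_of and the function parse_nxml are hypothetical names introduced only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened stand-in for a PubMed Central 'nxml' file.
# The tag names are an assumption based on the JATS-style layout.
SAMPLE_NXML = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>A sample article</article-title>
      </title-group>
      <abstract><p>Short abstract text.</p></abstract>
      <kwd-group>
        <kwd>summarization</kwd>
        <kwd>text mining</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph.</p></sec>
    <sec><title>Methods</title><p>Second paragraph.</p></sec>
  </body>
</article>"""


def text_of(element):
    """Return all text under an element, with whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def parse_nxml(xml_string):
    """Extract the title, abstract, keywords and body paragraphs."""
    root = ET.fromstring(xml_string)
    title = root.find(".//article-title")
    abstract = root.find(".//abstract")
    body = root.find(".//body")
    return {
        "title": text_of(title) if title is not None else "",
        "abstract": text_of(abstract) if abstract is not None else "",
        # Keywords are optional: the list is empty when none are provided.
        "keywords": [text_of(kwd) for kwd in root.iter("kwd")],
        "paragraphs": [text_of(p) for p in body.iter("p")] if body is not None else [],
        # A missing body element suggests that only the abstract is present.
        "has_full_text": body is not None,
    }


article = parse_nxml(SAMPLE_NXML)
print(article["title"], article["keywords"], article["has_full_text"])
```

Because the structure is explicit, the presence of a full text and of keywords can be checked directly from the markup. No comparable check is available for the PDF or HTML files obtained through Google Scholar.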