Search engines and time
Three researchers have just published, in the excellent online journal First Monday, results that should worry us. They show that search engines offering results sorted by document date (Google, Altavista) mostly do not use the date the document was created, but the date it was last indexed. This is obviously a problem when citing documents, particularly in the humanities and social sciences.
Here are a few excerpts from their conclusion:
"We have shown that the search engines Alta Vista and Google systematically relocate the time stamp of Web documents in their databases from the more distant past into the present and the very recent past. Second, they also delete documents from the year they were initially assigned to. This leads to the loss of information in the historical record on the Web as represented in the search engine databases. Third, information also gets lost in the sense of loss of structure in the semantic networks.
This has major consequences for the use of search engines in social science research. In short, search engines are unreliable tools for data collection for research that aims to reconstruct the historical record or for research that aims to analyze the structure of information at a particular moment in history. Only those Web pages that contain the date of the publishing document in question (for example, in various Web archives and citation index databases), can be used for this purpose (Hellsten, 2003). This unreliability is not caused by sudden instabilities of search engines, but precisely by their operational stability in systematically updating the Internet. For many types of social science research, it is therefore necessary to build ‘tailor made’ archiving tools that are not based on the available commercial search engines."
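To make the distinction concrete, here is a minimal sketch of what a 'tailor made' archiving record might keep: the date the page declares for itself, stored separately from the date it was retrieved. This is only an illustration under stated assumptions, not the authors' method. The function `archive_record`, the class `DeclaredDateParser` and the list of meta tag names are hypothetical choices, and many pages declare no date at all.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser

# Assumed meta names that sometimes carry a publication date (not exhaustive).
DATE_META_NAMES = ("dc.date", "dcterms.created", "date", "article:published_time")


class DeclaredDateParser(HTMLParser):
    """Collects the content of <meta> tags whose name looks like a date field."""

    def __init__(self):
        super().__init__()
        self.declared = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        key = (attr.get("name") or attr.get("property") or "").lower()
        if key in DATE_META_NAMES and attr.get("content"):
            # Keep the first value seen for each meta name.
            self.declared.setdefault(key, attr["content"])


def archive_record(url, html, retrieved_at=None):
    """Build a record that keeps the declared date apart from the retrieval date."""
    parser = DeclaredDateParser()
    parser.feed(html)
    declared = None
    for name in DATE_META_NAMES:  # first matching name in priority order wins
        if name in parser.declared:
            declared = parser.declared[name]
            break
    retrieved = retrieved_at or datetime.now(timezone.utc)
    return {
        "url": url,
        "declared_date": declared,  # None when the page states no date
        "retrieved_at": retrieved.isoformat(),
    }
```

Keeping the two fields apart means a later re-crawl can update `retrieved_at` without overwriting the date the document gives for itself, which is exactly the information the quoted conclusion says gets lost.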