/research/announcements/2004-04-09.xml

originally: http://www.oclc.org/research/announcements/2004-04-09.htm



    OCLC Research will harvest DSpace metadata
    OCLC Research will periodically harvest OAI-compliant metadata from the institutional repositories of interested DSpace users.
    bolander@oclc.org
    
    
    
    http://www.oclc.org/research/announcements/
    http://www.oclc.org/research/announcements/oclc_research_news.rdf
    April 9, 2004
    OCLC Research will convert the harvested metadata into a format suitable for re-harvesting by non-OAI services.
    <P>Web harvesters miss much of the scholarly material on the Web, including the metadata held in the OAI-PMH repositories that DSpace uses. Google, for example, has several problems harvesting OAI repositories, which are structured differently from standard Web pages.</P>
<P>A standard DSpace installation identifies items with the Handle system (www.handle.net), which purposely masks the identity of the host and so makes harvesting difficult to schedule. The OAI protocol links successive pages of metadata with URLs that may not be persistent, which also interferes with standard methods of harvesting.</P>
<P>OCLC Research is working with Google and MIT to periodically harvest interested DSpace users' metadata and transform it into a harvest-friendly format, resolve the handles so that institutions can be identified, and make the resulting URLs harvestable by search services such as Google.</P>
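<P>As a rough illustration of the paging behavior described above, the following Python sketch builds OAI-PMH ListRecords requests and follows the resumptionToken that a repository returns with each partial list. It is not the project's code: the function names and the base URL http://example.org/oai are hypothetical, and the sketch only parses responses and builds URLs without contacting any server.</P>

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def first_request(base_url, metadata_prefix="oai_dc"):
    """Build the initial ListRecords request URL (hypothetical helper)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def next_request(base_url, response_xml):
    """Return the URL for the next page, or None when the list is complete.

    The token is only meaningful to the repository that issued it and may
    expire, so the resulting URL should not be treated as persistent.
    """
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None  # no token, or an empty one: this was the last page
    query = urlencode({"verb": "ListRecords", "resumptionToken": token.text.strip()})
    return base_url + "?" + query
```

<P>Because each follow-up URL depends on a token from the previous response, a crawler that only follows static links cannot reach later pages by itself, which is one reason converting the metadata into ordinary pages helps.</P>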
    <UL>
<LI>DSpace harvesting project<BR><A href="../projects/dspace/default.htm">http://www.oclc.org/research/projects/dspace/default.htm</A></LI>
<LI>DSpace<BR><A href="http://www.dspace.org/">http://www.dspace.org/</A></LI>
<LI>Thom Hickey<BR><A href="../staff/hickeyt.xml">http://www.oclc.org/research/staff/hickeyt.htm</A></LI>
<LI>Jeff Young<BR><A href="../staff/young.xml">http://www.oclc.org/research/staff/young.htm</A></LI></UL>
    
    Bob Bolander
    +1-614-761-5207
    
    
    
    
    
    
    
    Communications &amp; Programs Manager
    OCLC Research