Eprint website for the University of Tasmania

Professor Arthur Sale

2004 June 22, Draft Version 0.3

 

Executive Summary

This document proposes the urgent establishment of an eprint website for the University of Tasmania, and corresponding policies. A prototype website temporarily codenamed UTasER can be viewed at http://eprints.comp.utas.edu.au/ from within the University intranet.

 

 

What are eprints?

An eprint is an electronic version of a paper, article or thesis, preserved in an archive and searchable and retrievable globally. The word encompasses preprints (versions of a paper distributed before refereed publication) and postprints or reprints (copies of a published paper distributed by themselves). An eprint server is a server on which all or most of the research output of an institution is mounted, and which provides search and browse capability to find particular papers. Such a server is a useful addition to a university's profile, but not particularly valuable by itself. You have to know about it to search it.

To be really value-adding, an eprint server must comply with the standards of the Open Archives Initiative (OAI), and be registered with global OAI harvesters such as myOAI (http://www.myoai.com/) and OAIster (http://oaister.umdl.umich.edu/o/oaister/). These provide global search services for research publications for all registered institutions, currently 3.2M records from 301 universities and research organizations.
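As a rough illustration of what OAI compliance involves: a harvester collects records from a compliant server by sending plain HTTP requests defined by the OAI Protocol for Metadata Harvesting (OAI-PMH), in which a 'verb' parameter names the operation. The Python sketch below only builds two such request URLs, one asking the server to identify itself and one asking for its records in unqualified Dublin Core (the 'oai_dc' metadata format). The base URL and the helper name oai_request_url are assumptions made for illustration; the actual address depends on how a given server is configured.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for illustration only; a real server's OAI address
# depends on its configuration.
BASE_URL = "http://eprints.example.edu/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# 'Identify' asks the repository to describe itself.
identify_url = oai_request_url(BASE_URL, "Identify")

# 'ListRecords' asks for the records themselves; 'oai_dc' requests
# unqualified Dublin Core metadata.
list_records_url = oai_request_url(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc"
)

print(identify_url)
print(list_records_url)
```

A harvester would fetch these URLs and parse the XML that comes back; that step is omitted here because it requires a live server.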

A prototype eprint server has been established for the University of Tasmania. To experiment with the look-and-feel of an eprint server, view it at http://eprints.comp.utas.edu.au/. At the date of writing this prototype contains one journal paper, four conference papers, one newspaper article, and two technical reports. To get some experience, search for the key phrase 'spread spectrum' or search for the name 'Malhotra'. This prototype is not public (accessible only on-campus) and not registered with the harvesters. Try also viewing OAIster, and searching all 301 institutions for something or someone interesting to you; if your mind is blank try 'spam filter'.

You cannot reasonably comment on this proposal until you have some experience of what it might offer; to assist you, this document itself is uploaded to the prototype server. Download the HTML version and you will have a set of live hyperlinks that can take you directly to the places on the Web mentioned above and littered through this document.

Benefits

There are many benefits of an eprint website for the University of Tasmania. Perhaps the most significant to academics, RHD students and other researchers are the following, which align firmly with the EDGE agenda.

       Our research output is made publicly available, free, and at the time of creation.

       The self-loading of preprints on the server provides prima facie proof of priority of the research findings.

       Global searches through the OAI bring our research and researchers more easily to the attention of other researchers worldwide.

       Papers available online are suggested by some information science researchers to be cited on average 300% more frequently than papers available only in paper form! See 'Articles freely available online are more highly cited', Nature, 411-6837, p521, 2001, also at http://www.neci.nec.com/~lawrence/papers/online-nature01/.

       Theses have limited availability to anyone outside the originating institution. Placing all publicly accessible theses of RHD graduates on an eprint server provides global access and establishes research priority. MIT, for example, has done this, and its research is now accessed much more widely.

Besides these, there is a wide range of more peripheral or long-range benefits which are unlikely to motivate academic staff yet which may resonate with senior management. These include:

       The Group of Eight universities have a project to install open access (eprint) archives at all of their member institutions. At the time of writing only the servers at Melbourne (www.unimelb.edu.au, 260 records), Queensland (http://eprint.uq.edu.au/, 875 records), ANU (http://eprints.anu.edu.au/, 2000 records) and Monash University (http://eprint.monash.edu.au/, 33 records) are operational. The University of Tasmania regards itself as equal with these universities. QUT also has an operational server, as do ALIA and the National Library of Australia.

       No university has access to the entire world's research. The open access initiative is aimed at making access to research output freely available to all, and joining this initiative incidentally assists in combating the serials pricing crisis.

       Some disciplines, such as Physics and Computing, are already highly electronic in their dissemination practices. This trend can only be expected to continue, and an eprint archive will assist the University in maintaining its leading-edge reputation.

       The initiative is an operation highly driven by standards, where global interoperability is seen as vital.

Implementation Barriers

Direct Costs

The direct costs (cash) are minor. The prototype eprints server is mounted on the same server used by the School of Computing for many other purposes. A dedicated server would cost, say, $5000, with ample disk space for records for several years and a better response time. However, initially a fully operational server could be mounted on an existing University web server.

The software we propose to use (http://www.eprints.org/) is all free under a GNU open source licence, as are updates. Registration with OAI harvesters is also free. Searches performed on harvesters such as myOAI and OAIster are free apart from Internet traffic costs.

Indirect costs

Indirect costs are more significant and can be broken down into technical support costs, server supervision, and upload costs.

Technical support by ICT personnel

The initial implementation effort for the prototype has been supplied by the School of Computing. The implementation could be easily transported to another server with minimal staff time. There will however need to be some work put into customizing the site to suit the University's visual standards and desired user interface. Other university sites offer examples. This need not be a large task, indeed could be minimal and evolve with the site. Depending on the upload solution adopted, it would be desirable for IT Resources to write a module to interface to the University LDAP server so that all research staff have automatic upload registration on the server with their email username and password; this might require say a week's work. Ongoing technical support should be minor, and mainly concerned with updates and backups.

Server supervision by information specialists

The server will require supervision by someone with a research or information science speciality. Regular monitoring will be required to approve uploads, and monitor the quality of the service and the status of the server. Depending on the take-up of the facilities, this might be a relatively light load. An upper bound estimate of the effort can be made by assuming that the entire research and thesis output of the University is uploaded to the server annually.

Uploads

Creation of content is the province of academic staff and RHD candidates/graduates. However, there is the additional step of submitting the content (preprint files and in some cases postprints) to the server. Three models are possible:

1.     In one, the academic uploads the file and enters the bibliographic information. Experience suggests that the work involved may be 5-10 minutes with a small amount of experience in what is required. This is a tiny fraction of the work involved in producing the paper, and would seem negligible. However, in other institutions, it has been seen as a barrier because it simply does not get done. The quality of the metadata may also be variable.

2.     In a variation on this theme, one person in each school is responsible for the collection and uploading. This might be the person responsible for PES entry. Entry would be smoother, quicker and more reliable, at the expense of some extra liaison with the academic and workload for that person. This solution seems appropriate for RHD theses, regardless of the way individual papers are uploaded.

3.     The ultimate in centralization would be to have a single institutional person do the uploading, with papers simply emailed to that person. This offers the greatest consistency, but also requires a significant change to the duties of that person. Seeking additional information not initially supplied by the academic would constitute a significant part of that load.
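The 5-10 minute figure quoted in the first model gives a simple way to bound the total upload workload, whichever model is chosen. The Python sketch below multiplies an annual item count by the time per item. The figure of 1500 items per year is purely a placeholder assumption, not a measured count of the University's output, and the function name upload_hours is invented for the example; substitute the real annual count of papers and theses to obtain a usable estimate.

```python
def upload_hours(items_per_year, minutes_per_item):
    """Return the staff hours per year needed to upload the given items."""
    return items_per_year * minutes_per_item / 60


# Placeholder assumption, NOT a measured figure for the University.
ASSUMED_ITEMS_PER_YEAR = 1500

# The 5 and 10 minute bounds come from the per-paper estimate in model 1.
low = upload_hours(ASSUMED_ITEMS_PER_YEAR, 5)
high = upload_hours(ASSUMED_ITEMS_PER_YEAR, 10)

print(f"{low:.0f} to {high:.0f} hours per year")
```

Under that placeholder assumption the total comes to between 125 and 250 hours a year across the whole institution. The time spent chasing missing bibliographic details under the third model is not included.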

Participation

The implementation of an eprint server is easy; the hard part is getting near 100% participation by researchers and coverage of institutional output. This can be readily seen by the performance of Australian institutions with eprint servers (from 33 at Monash to a respectable 2000 records at ANU). By comparison, MIT has 8000 theses and 4000 papers; Duke's Historical Sheet Music Archive has 17 000 records. To save rewriting what others have already experienced, here is what the eprint FAQ says:

 

How can an institution facilitate the filling of its Eprint archives?

1. Install OAI-compliant Eprint Archives.

2. Adopt a university-wide policy that all faculty maintain and update a standardised online curriculum vitae (CV) for annual review.

3. Mandate that the full digital text of all refereed publications should be deposited in the University Eprint Archives and linked to their entry in the author's online CV. (Make it clear to all faculty how self-archiving is in the interest of their own research and standing, maximizing the visibility, accessibility and impact of their work.)

4. Offer trained digital librarian help in showing faculty how to self-archive their papers in their own university Eprint Archive (it is very easy).

5. Offer trained digital librarian help in doing "proxy" self-archiving, on behalf of any authors who feel that they are personally unable (too busy or technically incapable) to self-archive for themselves. They need only supply their digital full-texts in word-processor form: the digital archiving assistants can do the rest (usually only a few dozen keystrokes per paper).

6. A policy of mandated self-archiving for all refereed research output, together with a trained proxy self-archiving service, to ensure that lack of time or skill do not become grounds for non-compliance, are the most important ingredients in a successful self-archiving program. The proxy self-archiving will only be needed to set the first wave of self-archiving reliably in motion. The rewards of self-archiving -- in terms of visibility, accessibility and impact -- will maintain the momentum once the archive has reached critical mass. (And even students can do for faculty the few keystrokes needed for each new paper thereafter.)

7. Digital librarians, collaborating with web system staff, should be involved in ensuring the proper maintenance, backup, mirroring, upgrading, and migration that ensure the perpetual preservation of the university Eprint Archives. Mirroring and migration should be handled in collaboration with counterparts at all other institutions supporting OAI-compliant Eprint Archives.

Copyright

Wherever an eprint server is proposed, many respond 'But I can't do this, because the journal/conference I publish in won't let me.' This is largely nonsense, and there is an extensive literature on the reactions and the common objections. See the 'I worry about...' section at http://www.eprints.org/self-faq/.

In brief, the research and the paper belong to the academic and/or the employing institution prior to publication. At the preprint stage, the author (or the institution) is free to do whatever they want with it. Indeed in many disciplines there was a healthy trade in paper preprints of research articles until electronic archives took over. The first and most significant example is Physics, but there are many others in the sciences and technologies. In other disciplines, especially in the humanities, the paper preprint culture never took off. Regardless of the prior existence of a preprint culture, there is no copyright barrier to mounting preprints on an institutional server, right up to the point where the article is accepted and the publisher asks for a license to use the copyright.

At this stage, all publishers of journals or conference proceedings ask for some form of copyright license or, more rarely, assignment of copyright. In the majority of cases the exact form of this is more a matter of tradition than requirement, and the publishers are quite happy for preprints and postprints to be mounted on a personal website or institutional eprint server (for example Nature). Indeed in the computer sciences, some publishers will provide the postprint PDF file as printed in the paper journal or conference proceedings for the author to mount personally (for example the Journal of Research & Practice in Information Technology). The number of publishers that insist on sole rights is decreasing, and where possible researchers might consider not submitting to them. For an introduction to the extensive literature on this topic see http://www.eprints.org/self-faq/#publisher-forbids.


Recommendations

To implement a publicly accessible eprint server and get high participation as quickly as possible requires speedy implementation of some policies while others can take their time through the University system. The following recommendations provide a draft plan.

General endorsement

R1 Academic Senate endorses the general principle of an eprint server, and requests the cooperation of the corporate sections of the University and Information Technology Resources in particular in implementing this server as soon as possible.

Overall responsibility

Since the implications of this scheme span the Library, research and academics generally, a distributed responsibility is desirable.

R2 Responsibility for the implementation of an eprint server and the mounting of the University's research output on it be assigned jointly to the University Librarian and the Pro Vice-Chancellor (Research).

R3 The Librarian and the PVC(R) will be advised by a small steering committee appointed by the Academic Senate.

Time-frame

The sooner this scheme is operational the better, as the G8 universities started down this track a year ago. R4 sets out a desirable and achievable timeframe. Note however that the need for an eprint server is not built into the University's Plans, nor into the performance criteria of the individuals involved. There do arise occasions when the time delays inherent in these procedures need to be bypassed, and this is one of them.

R4 A University server should be operational by end 2004, and participation by academics in uploading research documents should reach 90% by end 2005.

R5 Following achievement of the 2005 target the steering committee will be disbanded and the ongoing responsibility for the service vested in the University Library.

Policies and discussion

Appropriate policies will need to be discussed and agreed, if a high level of academic participation is to be a reality. Organization of additional workload or implementation effort will also need to be considered.

R6 Academic Senate refers this paper and the attached draft policies to the Faculties, the Board of Graduate Studies, the Tasmania University Postgraduate Association, the RHD Unit, the Web Development Office, and the Research & Development Office for discussion. Comment is to be received in time for the Senate meeting on ** 2004.

 



Appendix 1

Draft Research Eprint Policy

1.     The University of Tasmania's policy is to maximise the visibility, usage and impact of its research output by maximising online access to the research for all would-be users and researchers worldwide.

2.     It is also the University's policy to keep to the minimum the effort that each researcher has to expend in order to provide open online access to his or her research output.

3.     The University has accordingly adopted the policy that all research output is to be self-archived by researchers in the University EPrint Server.

4.     This policy will be progressively implemented over the remainder of 2004 and 2005, so that by end 2005 all publicly accessible research output is uploaded to the server at the time of writing and publication. Responsibility for the implementation is assigned to the University Librarian and the Pro Vice-Chancellor (Research) jointly.

5.     Publicly accessible research output includes all refereed journal articles and conference papers/short papers/poster presentations; all unrefereed conference papers/short papers/poster presentations, newspaper articles, books and research monographs, book chapters, and theses of graduating RHD candidates. Optionally, researchers may include long versions of published papers, errata, and internal technical reports. Publications under a permanent or temporary embargo because of third party sponsorship are of course excluded as full-text entries, but should be included as abstracts, titles, etc.

6.     Thesis submission rules will be altered to require the provision of an electronic version of the thesis at an appropriate time (refer to Board of Graduate Studies).

7.     This archive will form a comprehensive record of the University's research publications, and will be referred to in the University's Annual Report and Research Report. Note that the archive goes beyond the information entered into PES, which will continue for Commonwealth purposes.

Advice regarding implementation

One of the key matters for discussion is Policy 3 above. The evidence from existing archives suggests that voluntary participation yields low participation rates, in which case the University will fail to meet its objective. Policy 3 therefore requires participation for all research output. However, this should be phased in over a year, with the Library conducting training sessions for each school, and establishing a proxy service (desirably in the school, but within the Library as a backup) to assist researchers who are unable or unwilling to load their own research.

1.     The University does not require the full text of books or research monographs to be uploaded. It is sufficient to archive a reference along with the usual metadata.

2.     PhD and research Master theses should be archived at the point that the candidate is approved for graduation. The uploading is assigned to the RHD Unit for implementation. Thesis submission guidelines will need to be revised so that candidates provide a complete (or near complete) electronic version of the thesis in an acceptable format. Restricted access theses will not be uploaded as full text, or will be uploaded when the reason for the restriction expires.

3.     Research papers submitted to journals and conferences should be uploaded as a preprint at the time of submission. Following revision, acceptance and publication, a revised record should be stored if the publisher's policies permit (see below).

4.     This policy is compatible with publishers' copyright agreements in the following ways:

       The copyright for an unrefereed preprint resides entirely with the author(s) and the University before it is submitted for peer-reviewed publication; hence it can be self-archived irrespective of the copyright policy of the journal to which it is eventually submitted.

       The copyright for the peer-reviewed postprint will depend on the wording of the copyright agreement which the author signs with the publisher, and self-archiving of the postprint will depend on this agreement.

       Many publishers allow the peer-reviewed postprint to be self-archived (e.g. the American Psychological Association). The copyright transfer agreement will either specify this right explicitly or the author can inquire about it directly. If you are uncertain about the terms of your agreement, a table of copyright policies is available at http://www.sherpa.ac.uk/romeo.php. Wherever possible, you are advised to modify your copyright agreement so that it does not disallow self-archiving.

       In case you have signed a very restrictive copyright transfer form in which you have agreed explicitly not to self-archive the peer-reviewed postprint, you are encouraged to self-archive, alongside your already-archived preprint, a 'corrigenda' file, listing the substantive changes the user would need to make in order to turn the unrefereed preprint into the refereed postprint.

       Copyright agreements may state that eprints can be archived on your personal homepage. As far as publishers are concerned, the eprint archive is a part of the University's infrastructure for your personal homepage.

       Some journals still maintain submission policies which state that a preprint will not be considered for publication if it has been previously 'publicised' by making it accessible online. Unlike copyright transfer agreements, such policies are not a matter of law but simple coercion by the publisher. If you have concerns about submitting an archived paper to a journal which maintains such a restrictive submission policy, please discuss the matter with the University's IP Adviser.