
Electronic Publishing in Science 2001


SECOND UNESCO/ICSU CONFERENCE ON ELECTRONIC PUBLISHING IN SCIENCE

UNESCO Headquarters, Paris, February 20-23, 2001


Wendy Warr, Wendy Warr & Associates, 6 Berwick Court, Holmes Chapel, Cheshire, CW4 7HZ, England.
Tel and Fax +44 1477 533837
Email: wendy@warr.com
WWW: http://www.warr.com

Introduction

There were approximately 200 attendees from 69 nations, including 33 registrants from the UK, 27 from the United States, and 16 from France.

The meeting opened with addresses from Hiroyuki Yoshikawa, President of the International Council for Science (ICSU), from representatives of the Director General of the United Nations Educational, Scientific and Cultural Organisation (UNESCO), and from the EU Commissioner for Research. Access to scientific information for all is an ICSU principle, but restrictive policies enforced since the first ICSU/UNESCO conference in 1996 have made access harder, especially for those in developing nations. The Declaration on Science and the Use of Scientific Knowledge and the Science Agenda documents, adopted by attendees of the World Conference on Science held in Budapest in 1999, covered various issues, but the sharing of knowledge ran through all their conclusions. There is a universal right to share in scientific knowledge.

Conference Chairman’s Introduction

http://associnst.ox.ac.uk/~icsuinfo/elliottppr.htm

Sir Roger Elliott, Chairman of the ICSU Press, emphasised the importance of the free flow of authenticated research information. Electronic publishing could break the vicious circle of increasing journal costs and cancelled subscriptions. The electronic journal is inevitable, but there is a need for quality control, control of multiple versions, peer review, authentication, availability (browsing) and archiving. The inclusion of new media might mean that the new medium is not cheaper, and it is possible that print will not disappear altogether. There is, as yet, no proven economic model for e-publishing.

An international STM/AAAS working group has drafted a paper on what constitutes "publication" in science in the electronic environment. Funding authorities have to pay, so they want more control. Authors have rights and they must be more involved. Copyright law needs modification to cover the possibilities of new media, and recent changes in legislation have impacted "fair use". The new European database law, covering content and compilation, may also provoke a backlash from authors insisting on their rights and restricting access to their data. A fair balance is needed between the rights demanded by copyright holders and the right of scientists to have access to data, unrestricted use for education and research, and freedom from contractual interference. The conference also needed to consider issues affecting developing nations: electronic access could answer some of their problems of underfunded libraries and the low impact factor of local journals.

Principles for a New System of Publishing for Science

http://associnst.ox.ac.uk/~icsuinfo/shulenbergppr.htm

David Shulenburger, University of Kansas, USA, outlined the scholarly communications crisis which he sees as publishers using the market system to limit access to scholarly literature to those increasingly few who can afford to buy it. He considered eleven proposals that have been made to alleviate the crisis:

  • Increasing library budgets. This will simply cause faster price increases.
  • Preprint services. These will not have adequate refereeing and may not reduce demand for journals. "Grey literature" such as that on preprint servers and universities’ own Web sites is not suitable for the preservation of scientific literature, and it is most likely to be accessed by an "in-group" or clique, not by the developing world.
  • Open Archives Initiative. It is questionable whether manuscripts at such sites are a substitute for journals.
  • Minimal refereeing servers. PubMed Central is too lightly refereed to please the medical community.
  • Scholarly Publishing and Academic Resources Coalition (SPARC). Can enough competition be created to lower prices, or will alternative journals merely add to the demands on library budgets?
  • Antitrust activity. Will governments act to keep publishers from acquiring market power?
  • De-coupling. Dissociating refereeing from publication might stifle creativity by creating a monopoly refereeing process.
  • Buying cooperatives. Monopsony is a proven way to reduce cost but the ultimate weapon is not available here: you cannot stop libraries from buying the top journals.
  • National Electronic Article Repository (NEAR). This is a proposal, by Shulenburger, for a publicly accessible repository into which all articles would be placed 90 days after journal publication. This would give publishers 90 days of exclusive ownership.
  • Author boycott. Over 3800 scientists from 77 countries have signed the Public Library of Science Petition.
  • Control of societies by their members. This might keep down the cost of society journals but it will not affect commercial journals.

In the Spring of 2000, a group of US academic leaders (the American Association of Provosts) developed a set of principles to deal with the following elements:

  • Cost containment.
  • Use of electronic access.
  • Archiving.
  • Evaluation of quality (refereeing is not essential but the reader must know whether the work is refereed).
  • Copyright and fair use.
  • No assigning copyright to, or publishing in, expensive journals where there is an alternative.
  • Acceleration of time to print.
  • Quality versus quantity.
  • Privacy rights.

The Contribution of Electronic Communication to Science: Has it Lived up to its Promise?

http://associnst.ox.ac.uk/~icsuinfo/cettoppr.htm

Ana Maria Cetto, Instituto de Fisica, UNAM, Mexico, said that many electronic journals have appeared but most libraries want to preserve the print copy while offering access to the electronic version. Libraries do not want to be forced to take extra unwanted journals or to use multiple front ends. Scientists often want to read only one article per issue of a journal. The novelty of the Internet is wearing off in the developed world, while only 1% of Internet usage comes from Africa and the Middle East. Just a few dominant companies control science publishing.

The goals of science publishing, according to Pullinger, 1998, are:

  • Communication of the latest research.
  • Archiving information and data.
  • Producing a record of scientific endeavour.
  • Establishing claims to precedence in discovery.
  • Career development.

The academic culture of publish or perish persists and more than 94% of scientific publication is in English.

The Changing Role and Form of Scientific Journals

Michael Keller, Stanford University, USA, spoke from six years of experience with HighWire Press. Internet editions of HighWire journals have new features such as:

  • Hyperlinks to MEDLINE, GenBank, ISI Web of Science and cited references in the HighWire domain.
  • Free back issues, most of them 12 months old (or older) but a few, such as the British Medical Journal, free from the current issue.
  • Cross-journal searching.
  • Citations and other personalised alerts.
  • Electronic-only articles (published faster than "print-plus-electronic" ones).
  • Publishing when ready.
  • "Electronic long, print short" printing of only a short summary speeds publication and reduces costs.
  • Supplemental data, such as moving pictures, more figures and colour.
  • Opening the refereeing process.
  • Communication between readers and authors.
  • Private "overnet", avoiding the costs and synchronisation of mirror sites.
  • Knowledge environments: virtual journals with new content, threaded conversations, biographical information, calendars, jobs etc.

Keller said that there is no single solution to the issue of digital archives. HighWire is beta-testing the LOCKSS open source software http://lockss.stanford.edu. The JSTOR science collection includes Science and the Proceedings of the National Academy of Sciences. Hyperlinking in PDF is becoming possible.

Before the Internet, scholarly communication was passive and slow, dependent on printing processes and postal delivery. Information retrieval was difficult and imprecise. The cost of entry to the market was high, which had advantages for the well-financed and well-established. The Internet has changed much of this, but reviewing is still not public enough.

Creating a Global Knowledge Network

http://arxiv.org/blurb/pg00bmc.html

Paul Ginsparg, Los Alamos National Laboratory, USA, described the LANL archive, which now covers close to 100% of articles in high energy physics. The number of archived articles in condensed matter physics might soon overtake the number in high energy physics. There are 17 international mirror sites. Usage of the archive roughly reflects the GDP of the users' countries: USA 32%, Germany 11.5%, France 4.9%, etc. The number of publications per capita correlates with GNP per capita.

Odlyzko has stated that publishers in mathematics and computer science bring in an average of $4000 in revenue per article. Ginsparg estimates that a typical non-profit professional society publisher brings in $2000 in revenue per article, while some commercial publishers bring in as much as $15,000 per article. Newer electronic start-up journals currently bring in only $500 per article. The arXiv disseminates without peer review for as little as $1-$5 per article, so the increment in cost attributable to peer review is clear.

Ginsparg next presented "A Future with a View": a way of disentangling and decoupling production and dissemination on the one hand from quality control and validation on the other (as was not possible in the paper realm), to create a more forward-looking research communications infrastructure. He depicted three electronic service layers with the interested reader/researcher being given the choice of the most auspicious access method for navigating the electronic literature. The three layers are the data, information, and knowledge networks, where "information" is taken to mean data plus metadata, and "knowledge" signifies information plus additional synthesising information.

At the data level are providers such as the Los Alamos e-print archive, university library systems such as the California Digital Library, and funding agencies such as the French Centre National de la Recherche Scientifique. Ginsparg chose these for his diagram to convey the likely importance of library and international components. Cooperative agreements already exist with each of these to coordinate via the "open archives" protocols and so facilitate aggregate distributed collections.

Representing the information level, Ginsparg picked a generic public search engine, Google, a generic commercial indexer, the Institute for Scientific Information, and a generic government resource, the PubScience initiative at the Department of Energy, suggesting a mixture of free, commercial, and publicly funded resources. For the biomedical audience, he could have included services like Chemical Abstracts and PubMed at this level.

At the knowledge layer, Ginsparg picked the physics publishers APS, JHEP and ATMP (Advances in Theoretical and Mathematical Physics). BioMedCentral could have been included at this level if the subject had not been physics. These third parties can overlay additional synthesising information on top of the information and data levels, partition the information into sectors according to subject area, overall importance, quality of research, degree of pedagogy, and interdisciplinarity, and can maintain other useful retrospective resources. The synthesising information in the knowledge layer is the glue that assembles the building blocks from the lower layers into a more accessible knowledge structure. The three layers are multiply interconnected. Journals of the future can exist in an "overlay" form, i.e., as a set of pointers to selected entries at the data level.
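
To make the overlay idea concrete, here is a minimal sketch (not from Ginsparg's talk; the identifier, URL and field names are illustrative assumptions) of a knowledge-layer journal that stores no full text, only validated pointers into the data layer:

    from dataclasses import dataclass

    @dataclass
    class DataLevelEntry:
        # An item held by a data-level provider such as an e-print archive.
        archive_id: str   # an arXiv-style identifier (hypothetical example below)
        url: str          # where the full text actually lives

    @dataclass
    class OverlayArticle:
        # Knowledge-level record: a pointer plus synthesising information.
        pointer: DataLevelEntry
        refereed: bool    # the quality-control stamp added by the overlay
        subject_area: str # partitioning by subject, importance, etc.
        commentary: str   # additional synthesising information

    # The overlay journal stores no full text: it is a curated, validated
    # list of pointers into the data level.
    overlay_journal = [
        OverlayArticle(
            pointer=DataLevelEntry("hep-th/0101001",
                                   "http://arxiv.org/abs/hep-th/0101001"),
            refereed=True,
            subject_area="high energy physics",
            commentary="Selected for overall importance.",
        ),
    ]

    for article in overlay_journal:
        print(article.pointer.url, "refereed:", article.refereed)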

After this presentation, a member of the audience pointed out that articles that do not get published cost more than articles that do. They go from one journal to a "lesser" one and perhaps even to a third in a process of market selection.

E-BioSci: a Europe-based Platform For e-Publishing and Information Integration in the Life Sciences

http://associnst.ox.ac.uk/~icsuinfo/grivellppr.htm

Les Grivell, European Molecular Biology Organisation, argued that paper is an inadequate medium for the flood of new genomics data, which requires both further manipulation and new methods of visualisation as an aid to interpretation. Thus, researchers have come to question established editorial, reviewing and publishing practices. Smoothing the path from data to knowledge requires software for mapping inter-database relationships, increasing the use of XML and CORBA, improving search, navigation and visualisation tools, and making wider use of the Digital Object Identifier (DOI). E-BioSci is a distributed network of information resources, with multiple entry points in different language formats, giving access to abstracts, full text, facts and multimedia. It will provide effective linkages between databases and literature, and act as a host and archive for peer-reviewed e-publications. It differs from PubMed Central in several respects.

The SciELO Model for Electronic Publishing and Measuring of Usage and Impact of Latin American and Caribbean Scientific Journals

http://associnst.ox.ac.uk/~icsuinfo/packerppr2.htm

Abel Packer of Brazil represented the Scientific Electronic Library Online (SciELO) project, which aims to increase the visibility, accessibility, credibility and impact of Latin American and Caribbean scientific journals. SciELO sites, developed at national level, are being networked through a simple portal which will be gradually improved over the next 2-3 years. The project was launched in 1998 and currently offers about 85 journals.

Electronic Publishing of Scientific Journals - the (New) Economics

Pieter Bolman, Academic Press, first calculated the "old economy" costs of producing 10 issues of a journal (1000 pages in all) for 900 subscribers or for 5700 subscribers. In the first case, fixed costs are $45,000, variable costs are $4500 ($5 a copy) and the unit cost is $49,500/900, or $55. This gives a notional list price of 5 x $55, or $275, with the multiplier allowing for overheads. If there are 5700 subscribers, the unit cost falls to $12.90 and the notional list price to $89.50. (Postage and handling adds $25 in both cases.)
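
As a check on the arithmetic, the short sketch below (mine, not Bolman's; it simply re-runs his reported figures) reproduces the unit costs:

    def unit_cost(subscribers, fixed=45_000, variable_per_copy=5):
        # Unit cost per subscriber of one 10-issue, 1000-page volume.
        return (fixed + variable_per_copy * subscribers) / subscribers

    for subs in (900, 5700):
        unit = unit_cost(subs)
        # The multiplier of 5 allows for overheads; postage adds $25 on top.
        print(f"{subs} subscribers: unit cost ${unit:.2f}, "
              f"notional list price ${5 * unit:.2f} plus $25 postage")

    # 900 subscribers give a unit cost of $55 and a list price of $275;
    # 5700 give a unit cost of about $12.89 (reported as $12.90). The $89.50
    # reported above appears to be 5 x $12.90 plus the $25 postage and handling.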

Society publishers can make some savings, e.g. 10% because of tax-free status and 5% for not having to pay an agent. Grants for investment may lead to a further discount. Also, marketing costs are lower because societies have direct access to members. Thus low-circulation journals have a high unit cost, and it is largely commercial publishers who publish low-circulation journals. The circulation effect on price is at least as large as other factors.

The SPARC initiative will only partly plug the gap. A SPARC journal would have to have a very large circulation. Andrew Odlyzko, in Competition and Cooperation, 1999, claims that even if journals were free, the growth in the literature would bring us back to the crisis situation.

In the electronic economic model, fixed costs are the same as in the print model plus electronic preparation costs, giving a total fixed cost of $50,000. In addition, there are system costs for hosting, hardware depreciation and maintenance, software licensing, article processing and linking, and customer service. The "pre-run" costs are thus higher but there is a saving on "plate and make" if the journal is electronic-only. Electronic costs are independent of the number of users and usage; electronic publishing is a fundamentally different business model.

Bolman listed three challenges: giving as many people access as possible; managing the transition from a print to an electronic model; and running the two models side by side, with the problem of a low print run. Carl Shapiro and Hal R. Varian, in Information Rules (HBSP, 1999), state that as information is costly to produce and cheap to reproduce, it should be priced according to value, not cost; that versioning has to be considered; and that standards lead to competition within a market.

Bolman postulated a future in which all information is accessible on the Web, properly indexed, searchable and linked. Through standards such as CrossRef, systems will be interoperable. Electronic archiving will be carried out by publishers and trusted third parties, not by librarians.

A simple approach is to give an electronic subscription free with a print subscription. The Academic Press (AP) approach is to give electronic access to all AP journals for a premium over the cost of paper holdings, via a consortium. A plan for low-income countries has recently been announced. AP found that librarians preferred a "pick and choose" method but end-users vindicate the AP approach: more than 50% of journals used by patrons were journals to which the library had not previously subscribed.

APS uses multi-tiered pricing depending on the Carnegie classification. A (cheaper) electronic-only option is available. The online archive (PROLA) is included. This model is reaching for the future. One advantage is that it is to some extent usage-based. The disadvantages are discontinuity, the fact that some institutions pay significantly more or less, and the lack of a Carnegie classification for institutions outside the United States.

Steve Harnad’s approach is to encourage authors to put articles on e-print archives, while journals continue as a stamp of quality. Harnad wants to abolish the Ingelfinger rule. http://www.nejm.org/content/1991/0325/0019/1371.asp [Under the editorship of Franz Ingelfinger, the New England Journal of Medicine adopted a policy of declining to referee or publish research that had been previously published or publicised elsewhere. Other journals have since adopted this "Ingelfinger rule".] The e-archive idea is untried at scale and questionable. The journal is still an economic asset. If everything were free we would be inventing a perpetual motion machine.

Next, Bolman discussed PubMed Central. Why centralise? Coverage is incomplete (only journals and GenBank). The archive is valuable but publishers sacrifice a great deal. The model is US-centric. The print-plus-electronic model entails unavoidable costs. The transition from print to electronic is difficult, but usage statistics will eventually help decisions to be made. Electronic-only assumes an electronic archive in all cases. Libraries save a lot of money. Carol Tenopir and Donald W. King have reviewed the situation in Towards Electronic Journals (SLA Publishing, 2000).

In the discussion that followed, Richard Smith (BMJ) said that we should get out of the information business and into the knowledge business: a 20,000 circulation with the electronic model seems to be the way ahead. Bolman questioned why publishers should give away information free. He gave an example of how costs might shift: a physics department ending up paying more because the library pays less. The APS has changed its copyright requirements; Bolman prefers the APS approach to the ACS approach. Bob Campbell, Blackwell Science, pointed out that while hard-copy circulation is decreasing, electronic circulation is increasing. Sir Roger Elliott said he resents the "vanity publishing" of low-circulation journals; Bolman replied that libraries can easily stop taking such journals.

Science Publishing in Asia

http://associnst.ox.ac.uk/~icsuinfo/phuappr.htm

A paper from Phua Kok Khoo, Chairman, World Scientific Publishing, Singapore, was read by a colleague from Imperial College, London. Science and education now have top priority in China. There are more than 80,000 Chinese scientific professionals in North America. Many new journals in the Chinese language have appeared over the last 20 years. The author made various suggestions, such as licences to reprint journals (in English) in China, and Chinese government subsidies for journals. Chinese scientists gain greater recognition by publishing in international journals, so the Chinese should improve the quality of their own journals and provide incentives for Chinese scientists to publish in local journals.

The Indian subcontinent can import only a small amount of material because of shortage of funds. The gap between developed and developing countries is getting larger. The Indian subcontinent needs licences to reproduce materials, or very low online costs. Cheaper textbooks in schools are also essential: the lack of textbooks in primary and secondary schools in Pakistan and Bangladesh is a particular problem.

Collections of important papers are useful in Asia but Western publishers make high charges for copyrighted material. Authors should have copyright, not publishers. China, India and Russia have so many scientists that it is unwise to exclude them from the communication chain.

Financial Considerations for Scientific Publishing in Developing Countries: the Case of the People’s Republic of China

http://associnst.ox.ac.uk/~icsuinfo/lukppr.htm

Steven K. Luk, The Chinese University Press, started by outlining the unique characteristics of the publishing industry in the People’s Republic. The entire industry is under state ownership and management control. China’s development is defined by its Four Modernisations Programme, whereby scientific education and R&D are given top priority. Until 1996, computers were seldom used for word processing; Beida Fangzheng, the corporate giant of electronic publishing software, was established in that year, and digitisation of Chinese characters was by then carried out on a large scale. English is the lingua franca of the scientific community in Hong Kong and Taiwan, but the People’s Republic wants an independent scientific community. Electronic publishing materials in Chinese do have a market outside China.

The largest scientific online publishing project is the government-funded China Journal Net, administered by the Chinese Academic Journal Electronic Journal Publishing House. The full-text databases of the Chinese Academic Journal (CAJ-CD) cover more than 3500 scholarly journals. Most literature in the database will have bibliographies in both Chinese and English, and all academic journals will carry abstracts in both languages. Administration is by the database, information design and publishing unit of Tsinghua University (the "MIT of China"). The literature is selected, evaluated and edited by the Academy of Science and the Academy of Social Sciences. Distribution of 70 specialised databases is through 80 "mirror substations" and on CD-ROM.

The success of electronic publishing in China owes as much to government funding as to the monopolistic nature of the business. Initial capital investment came from the government. Royalties of 11% are paid to journal publishers and content providers, and the China Journal Net will also provide a standard platform to help the journals to digitise their contents. Salaries are probably paid by organisations such as the Academy of Science and Tsinghua University; it is not uncommon in China for a person to hold several concurrent positions. China Journal Net adopts a two-tier pricing system whereby international clients, including those from Taiwan and Hong Kong, are charged much more than their counterparts on the mainland.

While the electronic publishing of general titles in China is plagued by widespread piracy, the electronic publishing of scientific journals has, by and large, attained the level of fairness, legality and efficiency defined by Odlyzko [The Bumpy Road of Electronic Commerce, in WebNet 96: World Conference of the Web Society Proceedings, ed. H. Maurer, AACE, 1996, pp. 378-389]. Unfortunately, scientific publications in China, as in many other countries, are of marginal monetary value unless they are published in English.

The Role of Non-profit Organisations, such as Learned Societies, in Japan

http://associnst.ox.ac.uk/~icsuinfo/tadappr.html

Kunio Tada, Yokohama National University, Japan, took his examples from the field of physics. Many Japanese scientists submit papers in English to international journals. The Institute of Pure and Applied Physics (IPAP), which was founded last year and links the Physical Society of Japan (PSJ), the Japan Society of Applied Physics (JSAP) and other societies, publishes four journals in English. If other organisations join, IPAP could be a major force in physics publishing.

Learned society publishers are facing major challenges, such as the increase in the number of pages published, increasing costs, and a decreasing number of subscriptions. The Ministry of Education has offered help in the form of grants and by taking charge of developing a publishing system with hyperlinking for several organisations in Japan. The Japanese Journal of Applied Physics (JJAP) is the pilot journal for IPAP online. Abstracts are in HTML, full text is in HTML and PDF, and there are hyperlinks to almost all major journals (AIP etc.). This journal will be online in May; the other three will follow.

The Impact of the Electronic Environment on Public Involvement in Scientific Issues

Sir John Maddox, former Editor of Nature, UK, said that Internet technology provides benefits to ordinary people worldwide. The Net is also a source that journalists, for example, find invaluable. The public gets information from the Net: could the scientific community be more proactive in supplying it?

Maddox gave two examples of "anti-scientific" activity on the Net: the campaign against Huntingdon Life Sciences and Fred Singer’s "evidence" opposing global warming theory. Cloning is another contentious issue. Orthodox, established organisations could deliver measured opinions rather than polemics. The scientific community could set up Web sites, but the money for them must not come from "tainted sources" or from government.

Changing the subject, Maddox said that the scientific record must be maintained but publishers may not truly appreciate the need for long-term archival security. The APS has, however, made a deal with the Library of Congress. Maddox concluded with some comments about peer review. The object of Nature was to produce an interesting, value-added journal, not just a set of papers. Reviewers do not work to the same set of rules, so standard peer review was not useful for Nature.

New Legislation on Copyright and Databases and its Impact on Science

http://associnst.ox.ac.uk/~icsuinfo/dreierppr.htm

Thomas Dreier, Professor of Intellectual Property, Karlsruhe Technical University, Germany, expanded on two treaties of the World Intellectual Property Organisation (WIPO) in Geneva, adopted at the end of 1996. Protection has been afforded to electronic databases. Some see this as the answer to the problem of the Internet; others see it as inhibiting communication in science. The three-step test governs how far each country is free to craft exceptions to these rights: exceptions must (1) be confined to certain special cases, (2) not conflict with the normal exploitation of a work, and (3) not unduly prejudice the rights of the author. These exceptions have been left deliberately vague.

Dreier turned to some hot topics. How far should intermediaries be liable? Circumventing devices, and acts of circumvention, can be used legally and illegally; technology cannot distinguish between fair and unfair use. The legal protection of databases is contentious: "everyone will turn into a database". Copyright does not protect the content: it is the selection and arrangement which are protected. Europe enacted a database directive in 1996 but the issue is still unresolved in the United States. In Europe, the 15 years' protection is extended for another 15 years if a substantial change is made to the database. Phone directories, concert programmes, and a collection of 250 links have been affected in Germany. Scanning other people’s articles for a service is illegal. There are 20 exceptions in the database law and each member state decides on its own implementation, e.g. what counts as a substantial investment. The existing exception for science is too restrictive: it inhibits teaching.

Ethical and Privacy Issues, Particularly in the Biomedical Sciences

Richard Smith, Editor, British Medical Journal (BMJ), first talked about informed consent. It is very difficult to guarantee anonymity: the familiar bands across people’s eyes in the media, for example, are a waste of time. Doctors and scientists have different views on privacy from the public’s. Public fear could lead to a backlash: people are frightened that the Inland Revenue, hackers, and others could get access to their records. We will all be the losers in epidemiological studies if we do not get our views over to the public.

The Internet is not curing the divide between the haves and the have-nots in access to information because so many people in the developing world do not have access. Sharing information is not like sharing bananas.

In medical ethics you consider justice, autonomy, beneficence and non-maleficence. If you do not communicate, you can be blamed for bias, incompetence and corruption. Unsigned editorials in medical journals were once common; nowadays peer review should be open. "In God we trust: all others bring data". The BMJ has carried out randomised, controlled trials of peer review. Is it expensive? Is it valuable? The BMJ showed that the two-reviewer method is only slightly better than random. They tried out a paper with 8 errors on 600 people. Peer review is slow, expensive, biased, easily abused and poor at detecting fraud. A lone editor can do just as well. What peer review does achieve is to improve a paper.

Smith had a few comments about research misconduct. There is more of it about than is generally acknowledged, but much of it is minor misconduct, which raises issues similar to those of moving from soft to hard drugs. We must tackle misconduct or forfeit public trust, and public trust is especially important in medical research. We should not torture the data until it confesses. The electronic environment makes it easier to torture data, but it should also make it easier to detect copying. Open review on the Web in the BMJ should help. We should also move to publication of protocols: The Lancet does this. We should publish the software and the data set. Thus the Web could help us to do a better job. "The Internet expects transparency".

Impact on the Publication and Use of Large Data Sets

http://associnst.ox.ac.uk/~icsuinfo/rumbleppr.htm

According to John Rumble, President, ICSU Committee on Data for Science and Technology (CODATA), data collections differ from full text databases in their dimensions and the tremendous amount of metadata needed. They come from every area of science. Traditionally, large data sets have been published as an overview with the complete record in the grey literature. Printed records have not been totally computerised. There has been some direct deposition, often managed by specialised data centres. Active users are aware of these but users outside the field may not be. Today’s multi-disciplinary research thus suffers. The information revolution allows new possibilities:

  • No technological barriers.
  • The cost of data generation remains the same. (In 1970, it was estimated that the research cost of each paper was $50,000.)
  • Data investments are small and support for data centres is precarious.
  • The costs of archiving large data sets are not understood, which is especially unfortunate for observational data which cannot be re-measured.
  • Data published electronically grows in value, whereas interest in full text articles falls dramatically soon after publication.
  • Large scientific data sets are valuable.
  • Large scientific data sets require special management.
  • Large scientific data sets present new discovery opportunities.
  • The Intellectual Property regime is uncertain. (Note, for example, the EU Database Directive).
  • Standards are needed.
  • Rumble predicts that adjustments will be needed, for example, for linking.

Best Practices for Electronic Publishing, Including Standardisation, Peer Review and Integrity

http://associnst.ox.ac.uk/~icsuinfo/kirczppr.htm

This paper, by Joost Kircz, KRA Publishing Research, Amsterdam, the Netherlands, is available on the conference Web site. Some of Kircz’ research on information objects was also reported at Chemistry and the Internet in 1998 and is available elsewhere on warr.com. http://www.warr.com/cheminternet98.html#modularity

The Role of Abstracting Services

http://associnst.ox.ac.uk/~icsuinfo/ingerppr

Simon Inger, CatchWord, UK, feels that the term "aggregator" has been misused. He defines three classes of aggregator: the content host, the gateway and the full-text aggregator. Content hosts such as HighWire, CatchWord and Ingenta provide a service to publishers and are generally (though not in the case of HighWire) non-selective. They started out as subscription agents and now provide a collection of links to full text. Sometimes (e.g., OCLC FirstSearch) they have knowledge of access rights, and sometimes they allow libraries to select resources.

CSA, ISI, Silver Platter and TheScientificWorld are examples of gateways. Since no provider indexes everything, a library can act as a gateway by setting up links on a Web page of its own. "Library portals" are part of Ingenta’s business model. A gateway is a first port of call only: the next click is likely to be to the text or the publisher, so the library loses its visitor. SFX can help the library retain its reader.

ADONIS, Ebsco and UMI (now Bell and Howell) are full-text aggregators. They license full text and sell it as an aggregated selection of content. Do they thwart the growth of primary publishing? The publishers may have to choose between royalty payments and flexible journal pricing. Academic Press were clear leaders in publishers’ consortia.

All groups can benefit from aggregators. They help organise access for libraries. They help small publishers to compete. They offer cost-effective outsourcing to large publishers. Finally, scholarship benefits from diversity of publication.

Inger advised small and medium publishers to go online soon, to maximise visibility through a gateway, to keep careful control over content, to experiment with publishers’ consortia and not to sell too cheaply. Big publishers should outsource to cut costs but licensing terms may compete with their own consortia. Not-for-profit publishers need cost-effective content hosting and careful brand management.

Neuroinformatics: a New Enabling Field of Neuroscience Research in the IT Era

Stephen Koslow, National Institute of Mental Health, USA, gave a high-tech presentation on the working of the human brain. The brain has 100 billion nerve cells, with a million billion connections, akin to 2 million miles of wires, yet it weighs only 3.3 pounds and occupies only 1.5 litres. Neuroscientists use many types of information: numeric, symbolic, literature, images and temporal. They use new technologies such as lab-on-a-chip and 3D brain mapping techniques (PET, SPECT, MRI, MRS, CT, functional MRI, etc.). Using some of these techniques, electrophysiological activity and functionality can be linked. Koslow showed moving images of electrical activity in the brain over 30 seconds. Evidently, communicating such information requires special technologies. Koslow outlined some of the tools and research being funded by the Human Brain Project.

Referencing and Retrieval of Scientific Articles

Eric Swanson, Senior Vice President, John Wiley & Sons, New York, stated that the primary literature can be seen as a group of logically associated articles. CrossRef is a digital switchboard which gives end users access to logically related articles in one or two clicks. It is run by a not-for-profit organisation, the Publishers International Linking Association (PILA), whose members are publishers of scholarly information. Organisations offering secondary services etc. can be affiliates. Before CrossRef, many publishers had bilateral relationships; now about 5000 bilateral agreements have been replaced with just 68 CrossRef agreements. The Board has members from AAAS, Academic Press, AIP, ACM, Blackwell, Elsevier Science, IEEE, Kluwer, Nature, OUP, Springer-Verlag and John Wiley & Sons.

The purposes of CrossRef are to enable persistent links, to maximise links to full-text articles from all information resources, and to enable links between all types of scholarly content. The infrastructure uses persistent identifiers (DOIs), standardised metadata (XML and a DTD) and a resolution system, the DOI/Handle system, to get from identifiers to content. The DOI is a NISO standard; CrossRef adds a look-up service. It does not demand that you deposit an abstract, but other metadata is needed.
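
As an illustration of the resolution step, the sketch below assumes only the public behaviour of the DOI system, in which an HTTP request for a DOI is redirected to the publisher's copy of the content; the DOI shown is a hypothetical placeholder, not a real CrossRef deposit:

    import urllib.request

    def doi_resolution_url(doi):
        # The public proxy through which a DOI is resolved to content.
        return f"https://doi.org/{doi}"

    def resolve(doi):
        # Follow the redirect chain and report where the DOI currently points.
        with urllib.request.urlopen(doi_resolution_url(doi)) as response:
            return response.geturl()

    # Usage, with a placeholder DOI (not a real deposit):
    # print(resolve("10.1234/example.5678"))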

There are currently 68 member publishers, 60% of whom are not-for-profit. Metadata for 2.7 million articles from 3600 journals is available and reference links are live for 2000 journals. By the end of 2001 there will be 3 million articles and 0.5-1.0 million more will be added each year. Publishers have to provide a full bibliographic citation; most provide free abstracts. Information is supplied on acquiring articles and the business model is neutral. There are no fees for readers: publishers and affiliates pay.

Issues for the future are multiple resolution, the appropriate copy issue, access to non-subscribed content and archive repositories such as JSTOR and ADS. Distributed searching will be a new frontier. ACM, Wiley and Elsevier Science are working on a common shop front for ordering articles in one process. Citation indexing would be enormously complex and CrossRef may not do it.

Metadata for Referencing and Archival Use

http://associnst.ox.ac.uk/~icsuinfo/hakalappr.htm

Sinikka Koskiala of Helsinki University of Technology, Finland, gave this paper in place of Juha Hakala. Preserving a printed book for decades or even centuries has been relatively easy. Electronic resources differ in a fundamental way from printed resources: every electronic resource has to be interpreted by an application before it can be understood by readers. If the information technology we use were stable, preservation would be easy, but our technological infrastructure is changing with ever increasing speed. The media on which electronic resources are stored may become unreadable, and file formats and compression schemes are also constantly changing.

Common preservation methods include refreshing, migration, and emulation. Refreshing means periodic copying of the resources to new storage media; migration is conversion of the resource to a new software and hardware platform; and emulation is based on the development of applications that mimic old hardware and/or software in new hardware/software environments.

Refreshing fails if used as the sole preservation strategy: without an application with which the resource can be used, a copy of it is useless. Therefore we need to use other, more efficient preservation techniques as well. Sometimes migration will be easy; at other times it may be very challenging. A badly written and tested converter may destroy a whole collection by inadvertently removing vitally important features from the documents. In skilled hands, migration may yield good results, provided that it can be applied at all.

No large-scale or long-range tests of emulation for digital preservation have been carried out. A digital archive based solely on emulation would not be user friendly, and special viewer applications would have to be developed. Using emulation for long-term preservation will require seamless co-operation of a large number of emulators: since it is not possible to emulate every old platform in the new one, it must be possible to stack the emulators on top of one another. Although all preservation strategies have some shortcomings, they can be used to complement one another. Any institution investing in the archiving of electronic resources should test all preservation methods in order to become familiar with them.
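
As an illustration of the simplest of the three strategies, the sketch below (mine, with hypothetical archive paths) implements refreshing as checksum-verified copying. Note that it keeps the bitstream intact but does nothing to keep the resource interpretable, which is why migration or emulation is still needed:

    import hashlib
    import shutil
    from pathlib import Path

    def checksum(path):
        # Fingerprint of the bitstream, used to detect corruption in transit.
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def refresh(source_dir, new_medium):
        # Copy every file to the new medium and verify it arrived bit-perfect.
        new_medium.mkdir(parents=True, exist_ok=True)
        for item in source_dir.iterdir():
            if item.is_file():
                target = new_medium / item.name
                shutil.copy2(item, target)
                if checksum(item) != checksum(target):
                    raise IOError(f"corrupt copy: {target}")

    # Hypothetical media paths:
    # refresh(Path("/archive/1996-cdrom"), Path("/archive/2001-raid"))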

Metadata is data associated with objects that relieves their potential users of having to have full advance knowledge of the objects' existence or characteristics. It supports a variety of options. There have been at least four serious attempts to develop a metadata element set for long-term preservation of electronic resources: RLG (for electronic still images), CEDARS, NLA, and NEDLIB.

The CEDARS specification is based on the Open Archival Information System (OAIS). Many projects have used OAIS as a reference model: since it is very generic it can be applied in all branches of knowledge, but this generality is also a weakness, in that a domain-specific model has to be developed. The NEDLIB project has developed such a model for digital libraries.

The purpose of the NLA element set is to support both emulation and migration strategies. The preservation metadata elements NLA has defined are compliant with the OAIS model, although the element set is not explicitly OAIS-compliant. The NEDLIB metadata element set is based on the OAIS model: there are eight elements for representation information and ten for preservation and description information. The NEDLIB element set has a strong technical bias, and it has been built with the needs of both migration and emulation in mind.

Integrated library systems are based on the MARC (MAchine-Readable Cataloguing) format, but many of the elements proposed in the element sets described above have not yet been included in it. Once the library community has agreed on the preservation metadata elements needed, modification of the MARC 21 format is a relatively simple process, and once MARC 21 has been extended, library system vendors will incorporate the new data elements into their applications. Dublin Core is a simple resource description format. It is possible to incorporate all of MARC 21 and more into Dublin Core, so adding the preservation metadata elements to Dublin Core will be technically easy.
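
For concreteness, the sketch below builds a minimal Dublin Core record with one added preservation element. The dc elements and namespace are the real Dublin Core 1.1 set; the preservation element (and the record's values) are hypothetical illustrations of the kind of extension the element sets above propose:

    import xml.etree.ElementTree as ET

    DC = "http://purl.org/dc/elements/1.1/"   # the real Dublin Core namespace
    ET.register_namespace("dc", DC)

    record = ET.Element("record")
    for element, value in [
        ("title", "An Example Electronic Resource"),
        ("creator", "Example, Author"),
        ("date", "2001"),
        ("format", "text/html"),      # the application-dependent format to preserve
        ("identifier", "http://www.example.org/resource"),
    ]:
        ET.SubElement(record, f"{{{DC}}}{element}").text = value

    # A hypothetical preservation extension, not part of Dublin Core 1.1:
    ET.SubElement(record, "preservation-strategy").text = "migration"

    print(ET.tostring(record, encoding="unicode"))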

Preservation of electronic resources is a complex technical, organisational and legal problem. Libraries will be among the central players in this area. Metadata will be one of the most important tools but libraries alone cannot preserve electronic publications; broad co-operation is needed. Publishers will be among the libraries’ closest allies.

The Changing Role of Learned Societies

Anthony Durniak, Institute of Electrical and Electronics Engineers (IEEE), USA, discussed the challenges of the Internet for learned societies. Electronic articles need consistent protocols, online submission procedures, and so on; IEEE’s print revenues do not pay for this.

The Internet creates new competition such as VerticalNet (advertising "content, commentary, commerce"), BioMedNet, ChemWeb.com and Engineering Village. Library customers expect low prices combined with competitive functionality, while community activists lump societies together with "publishers". Authors expect electronic publishing to be easier, and society members expect special privileges. Societies can lead by serving the entire community, serving all users and being a trusted, independent forum.

Electronic publishing is more than journals: it also includes conference proceedings, technical standards, magazines and preprints. Services such as job advertisements and alerting services can be added to invite regular use.

IEEE serves a growing community around the world with a variety of publication types. It has 366,000 members in 150 countries. Thirty per cent of members are outside the United States.

Societies can lead by collaborating. Collaboration between competitors is a hallmark of Internet business, as, for example, in the Scientific Font project and CrossRef. IEEE and the IEE built the IEL electronic library, and IEEE works with IHS to sell it. A joint book imprint with Wiley has just been announced. Societies can lead by being cost-effective: this does not mean "cheap" but emphasises value for money, and IEL provides outstanding value. Societies can also lead by communicating.

The Changing Role of the Librarian: a Virtual Library and a Real Archive

http://associnst.ox.ac.uk/~icsuinfo/klugkistppr.htm

Alex Klugkist, Chief Librarian, University of Groningen, The Netherlands, first described how the function of libraries is changing. In addition to the greater flow of printed material, a great increase in electronic sources of information has taken place. Information and communication technology has changed the face of our society, and scientific education and research have undergone considerable change. Libraries will need to continue to carry out their current key tasks in the provision of scholarly information, but developments in information and communication technology change both the organisation of the library and the library as an organisation. Attention has to be focused on the new tasks associated with digital library systems and the digitisation of scholarly information, without neglecting the library’s traditional tasks. The terms "dual library system" and "dual library" are appropriate descriptions of this situation.

The dual library will be based on four pillars. It will be a collection centre for printed material, provide access and act as a gateway to digital information, be a centre of expertise, and be a centre which can provide study facilities incorporating the new technologies. Klugkist addressed each of these "pillars" in some detail. In turn, the librarian’s functions are undergoing various changes: he will have to act as an information mediator, information expert and information manager.

He has to know what printed information is available for acquisition and what digital information has to be licensed and made accessible on the network. He will have to be an information mediator able to link the supply of information to the demand for it, and must be able to assist the user in his search for information in various ways, such as by carrying out searches on behalf of users. In addition, he will need good teaching skills, since many universities are creating electronic learning environments in which information plays a major role in the teaching and acquisition of knowledge.

As an information expert, the librarian must be well-informed about all aspects of the information chain and must be able to interact with information technicians such as programmers and web designers in the development of information systems. He must be able to make digital library facilities available and should also be active in developing personalised "my library" systems. As a manager, the librarian will have to be well informed about business management practices, will have to know how to organise information services, and will have to be able to negotiate licences with the suppliers of information and handle budgets.

Information work is teamwork, carried out by a team with broad-based skills and specialists. It will be unusual to find the diverse qualities needed in any one person, and the librarian of the future will not be a person who can be profiled in any one way.

Is Electronic Publishing Being Used in the Best Interests of Science? The Publisher’s View

Derk Haank, Chief Executive Officer, Elsevier Science, declared that electronic publishing is the best thing since sliced bread, even though it is not yet being used to its fullest extent. There is room for improvement in the research process; there is much reinvention of the wheel. If literature were at the desktop 24 hours a day, 7 days a week, it would play an enhanced role in the research process.

Commercial publishers have been presumed guilty. The issue is not a yes or no as regards their profits, but a question of what they offer the researcher. Haank is not ashamed of making a profit, but he would be ashamed of delivering a bad service to the community. In the past there was indeed a "disconnection", but the high price of a journal is a symptom, not the actual problem. Cancelled subscriptions left a few people to pay a lot of money. All journals start small, then they grow. A publisher first loses the students, then faculty, then marginally interested libraries, and so on. A journal's visibility falls if only a few people pay for it: a subscription to Brain Research, for example, costs $15,000. Inefficient inter-library loan (ILL) and document delivery are being used instead.

How do you correct this? It is impossible in the paper world, but electronic publishing will help us to solve these problems and to do more. Electronic publishing does not lower the infrastructure costs of the system (reviewing, production, etc.); these costs increase. However, electronic publishing reduces the marginal cost: the extra cost of serving 601 people instead of 600 is nil.

Differential pricing has been discussed, and Elsevier Science is doing this. In the paper world, a big University X pays the same as a small University Y: Harvard pays the same as a small university in Zimbabwe, so the small universities cannot subscribe. In consortia, small libraries get the same access at a lower price. Hundreds and thousands of libraries paying only $1000 for Brain Research is the way of the future. You pay by what is useful to you. Consortia allow access to all. The price thus depends on usefulness, but everyone must pay something; there is no such thing as a free lunch. It may seem free to the end user but someone has to pay. Commercial publishers must provide a service if they want to stay in business.

Where is the product going? Two years ago, Haank returned to Elsevier Science after a ten-year gap. The company was in disarray: there was fear and excitement; now there is more excitement and less fear. Elsevier Science employs thousands of scientists who feel themselves part of the community. There is now more of a role for commercial publishers than there was ten years ago. Compare $30 million spent on electronic publishing last year with $30 million between 1950 and 1980. This is not a time for amateurs; publishers need to be well-funded and in the business for the long term. It is tedious, hard work and also intellectually challenging. The scientist who says "let’s do it ourselves" underestimates the work, funds and perseverance needed. Once a commercial publisher does it well, scientists will not need to.

Outsourcing is the solution: do not insource it; leave it to the experts. Haank showed Science Direct figures illustrating that "the more you have, the more you want". Now people want back runs. In five years Elsevier Science will offer a backfile: 1200 journals going back to Volume 1, Issue 1 will be available electronically, seamlessly linked with other material. This will be "the academic workbench", and scientists will use the literature before they write their articles. Preprints and journals will merge into one, and there will also be links to non-refereed material. Elsevier Science has developed Scirus for this latter facility. There will be access to everything, but it will be clear what is refereed and what is not. When you switch on your computer, you will be alerted to the fact that someone is citing one of your articles.

Working with the literature is in the interests of all of us. Competition keeps Elsevier Science on its toes: it does not mind competition. Elsevier Science can play a role in an open technology environment; it is not trying to corner the market. Elsevier Science is more liberal than some society publishers, for example the ACS, whose editors have decided that they will not publish articles that have appeared on the Web.

An average university spends between 3 and 4% of its total budget on library costs in the widest sense. Of these costs, only 25% is spent on the collection (journals and books, databases, etc.); the rest is spent on infrastructure (buildings) and overheads (staff). This means that the university spends less than 1% of its budget on all literature (25% of 3-4% is roughly 0.75-1%), so the absolute amount spent cannot ultimately be the problem. Publishers have to prove that this money represents value for money, and with usage increasing by 400%, as it did last year, Elsevier Science is well on course. It also wants to demonstrate that literature plays an increasingly important role in the efficiency of the research process so that the debate about costs can be put in the right perspective.

The British Library saw a decline in document delivery last year because people have more access to published material. There were 500,000 Elsevier Science subscriptions two years ago and subscriptions kept going down. Now the company has more than 700,000 subscriptions. No longer is there attrition. Elsevier is now aiming for 7 million. This is the way to go. The company is in a growth business, with more research every year, but less paper publishing.

After this paper, someone in the audience said that libraries must negotiate better. Haank said that consortia are already doing so. Ginsparg said that learned society publishers persist but commercial publishers do not, so what will happen to the archive for use 70 years from now? Haank declared that Elsevier Science intends to persist and has 12 big customers who act as its backup for the archive.

Is Electronic Publishing Being Used in the Best Interests of Science? The Scientist’s View

http://associnst.ox.ac.uk/~icsuinfo/berryppr.htm

Steve Berry, Department of Chemistry, University of Chicago, distinguished between publicly and privately funded science. Scientists in for-profit organisations use traditional publication for only a selected part of their work and they are less motivated by publication. Why should public funds be used to support scientific research? Because the results of that research yield public goods. Scientific public goods are unique: their value is enhanced with their use. How are these goods captured? Only by the distribution and use of the results of the research. The implication is that the wider the distribution the greater the value is likely to be. For this reason, Berry opposes current European database law.

What are the best interests of science? To sustain the scientific enterprise, justifying its support externally and fostering the internal characteristics of the best science, the results of research must be available to other researchers, and the more widely they are distributed, the better. If a private publisher can perform this distribution, adding value without inhibiting the distribution of the information, so be it. What motivates publishers? We should distinguish between commercial publishers and learned societies. The latter should have as their primary motivation the furtherance of their members’ science, with profit as a secondary motive. Commercial publishers have these motivations reversed, quite properly.

A new factor is electronic publishing, which is also a way of adapting to the now chronic pinch on funds. Electronic publishing offers a low-cost alternative to paper publishing, especially if we give up conventional reviewing and editing. So what do we do? We keep the principal journals in whatever combination of print and electronic forms best satisfies each particular audience. If the commercial publishers can continue to produce journals competitive with the professional societies’ publications, then there is no reason why they should not do so. However, we must keep in mind the two motivations: professional societies must keep the journals functioning to keep their science alive; commercial publishers want to keep the science alive to keep the journals functioning, but if the profits vanish, commercial publishers have no further stake.

Small, specialised journals have little justification for existence in their present form. These journals have to evolve into something else, for example, the "virtual journal" of the American Physical Society (APS). What about mainstream journals? There are different styles for different communities. There will be a period of adjustment, experiment and adaptation. We must try all the possibilities that look promising, and let them compete.

All legislation concerning the communication of science must be permissive and not prohibitory; the former is in the best interests of science while the latter is very much in opposition to them. As for fair use for e-print journals, let those permitting fair use compete with those that prohibit it and see who survives! We should experiment with open refereeing, wider use of e-print archives and various ways of meeting publication costs. We must try to understand the differences among scientific disciplines and we must evaluate the results of the experiments.

Who pays? The government carries the ultimate responsibility but payment may take many forms, e.g. page charges, and an overhead to support library subscriptions. Author charges cause problems. Why should a German author help support a journal published in the United States? Moreover, some nations prohibit payment of page or submission charges.

Berry concluded with five rules:

  • Stay flexible.
  • Stay adaptable.
  • Experiment - push the boundaries out.
  • Do not let others lock us in.
  • Do as thou wilt.

This page last updated 24 April 2001