Martin Paul Eve

Professor of Literature, Technology and Publishing at Birkbeck, University of London

There is no single cause of the problems with the economics of scholarly communications. The expectation that we can publish more and more research on the same, or smaller, budgets is one factor. The rise of profiteering commercial publishers is another. There are also several smaller factors, though, one of which I will discuss here.

It may sound overblown, but a crucial stumbling block in reconfiguring the economics of scholarly communications for the digital age is Microsoft Word. Specifically, the fact that users are wedded to this format presents typesetting and conversion costs that are completely out of proportion to the needs of the system.

Most users are writing in basic rich text: italics, bold, underline, some images, perhaps some equations and tables. The features of a full-blown word processor are not required. Yet few scholars understand what the process of XML typesetting entails. This means that they see a need to present their manuscripts beautifully, despite the fact that, usually unbeknown to them, those manuscripts will be ripped apart and re-encoded in XML.

Microsoft Word is not a pretty format under the hood. Its complex XML format, packaged across multiple files, is unreadable to all but the most technically minded of individuals. The specification documents are miserable to read and understand. Hence, even the most advanced open source projects that have Word support, such as LibreOffice and OpenOffice, do not always correctly display Word documents.
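To make "packaged across multiple files" concrete: a .docx file is a ZIP archive containing a set of XML parts. The following is a minimal sketch, in Python, of listing those parts. The helper names `list_docx_parts` and `make_toy_package` are hypothetical, and the toy package holds only placeholder content under a few typical part names, so it is not a valid Word document.

```python
import io
import zipfile


def list_docx_parts(source):
    """Return the sorted names of the parts inside a .docx package.

    `source` may be a file path or a file-like object, since a .docx
    file is an ordinary ZIP archive.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())


def make_toy_package():
    """Build an in-memory stand-in for a .docx file.

    The part names are typical of a Word document, but the contents
    are placeholders for illustration only.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
        package.writestr("word/styles.xml", "<styles/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_docx_parts(make_toy_package()):
        print(name)
```

Pointing `list_docx_parts` at a real .docx file will usually show many more parts than this, covering things such as settings, fonts, themes and embedded media.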

The fact that users continue to write using a complex format that is well beyond their needs means that we have to pay for the labour of converting Word documents to an XML interchange format called JATS (the Journal Article Tag Suite). Automatic conversion is difficult, so we fall back on the brute-force solution of using tools that help, underwritten by sheer labour power.
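For a sense of how modest the target is when the input really is basic rich text, here is a rough sketch of turning a list of styled text runs into a JATS paragraph. The `<p>`, `<italic>`, `<bold>` and `<underline>` elements are part of JATS. The run structure, the style flags and the function name `runs_to_jats_paragraph` are assumptions made for illustration; the hard part of real conversion, which is recovering clean runs from a Word file, is not attempted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical style flags mapped to the JATS inline element of the same name.
JATS_INLINE = {"italic": "italic", "bold": "bold", "underline": "underline"}


def runs_to_jats_paragraph(runs):
    """Serialise (text, styles) runs as a single JATS <p> element.

    `runs` is a list of tuples; `styles` is a list of flags drawn from
    JATS_INLINE. Unknown flags are ignored.
    """
    paragraph = ET.Element("p")
    last_child = None  # most recent inline element added to <p>
    for text, styles in runs:
        names = [JATS_INLINE[s] for s in styles if s in JATS_INLINE]
        if not names:
            # Plain text goes in <p>'s text or the previous child's tail.
            if last_child is None:
                paragraph.text = (paragraph.text or "") + text
            else:
                last_child.tail = (last_child.tail or "") + text
            continue
        # Styled text: nest the inline elements in the order given.
        outer = ET.SubElement(paragraph, names[0])
        inner = outer
        for name in names[1:]:
            inner = ET.SubElement(inner, name)
        inner.text = text
        last_child = outer
    return ET.tostring(paragraph, encoding="unicode")


if __name__ == "__main__":
    example = [("See ", []), ("Ulysses", ["italic"]), (" for details.", [])]
    print(runs_to_jats_paragraph(example))
```

The mapping itself is trivial. The cost described above comes from everything that has to happen before a document can be reduced to runs this clean.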

This is not an efficient use of institutional resources. We pay for licences to Microsoft Word at institutions so that authors can use a superfluous format, which then entails paying for additional software and labour to convert the document to a format that is actually useful.

The costs of typesetting per article are not necessarily huge. But they are a cost. Behavioural changes outside the publishing process could eradicate such inefficiencies, but they are unlikely to happen. Non-technical users want what they are used to: Word. Without institutional drives away from such software, that will not change.

The other approach, favoured by more optimistic types, is to build better tools. Word is abysmal at collaboration, for example. Could we build online tools that handle this better than Word and that produce friendly output? Of course, Microsoft will also be aware of Word's functional limitations and will work to address them, even if it is unlikely to improve the underlying output format for scholarly communication purposes.

There are no totally easy answers here. Author awareness will only arise with price sensitivity. At the moment, most authors are unaware of this process and do not care about it. On the other hand, if it becomes truly possible to out-feature the giants, then author behaviour might also change. In the meantime, the grail quest for automatic conversion remains.