129
votes

Some colleagues and I were comparing languages we had programmed in and got to talking about our experience with VBScript and its odd features, such as 1-based indexes instead of the 0-based indexes that almost every other language has. The reasoning was that it was a language for users (e.g. Excel VBA) rather than a language for developers.

Then someone said, "XPath also has 1-based indexes", which I couldn't believe until I found this article, in which many reasons are given in favor of the 0-based approach, including some from Michael Kay himself:

  • "...zero-based indexing tends to make the index formulae simpler when accessing a multi-dimensional array with a one-dimensional array access expression"
  • "when handling tables, or subscripting into strings, zero-based addressing would often be much more convenient"
  • "...hardware addressing is not the only benefit of 0-based addressing ... it also makes computations easier..."

but then Michael Kay is quoted as concluding:

...1-based logic was the right choice for XPath and XSLT...because the language was designed for users, not for programmers, and users still have this old-fashioned habit of referring to the first chapter in a book as Chapter One...

Can someone explain that to me? (1) How is XPath designed for users? I can't imagine anyone who is not a developer wrangling with the syntactic rigidity of XPath or the declarative, functional-programming aspects of XSLT. (2) Why did the creators of XPath really go against the norm of modern programming languages by choosing a 1-based index?

In the same article Michael is also quoted with the following words: "I can't tell you what the actual history of the decision was; I can only post-rationalize it". If even he doesn't know then there is probably no satisfying answer. – Dirk Vollmar
I have voted to CLOSE this question as subjective and argumentative. 0-based indexing is in no way better than 1-based indexing, and the reverse is also true. Both have pluses and minuses. 1-based indexing is more natural for non-programmers. It also lets you specify the upper boundary of a range as n, not the unnatural and error-prone n - 1. For anyone whose logic has been warped by "modern programming", starting to use 1-based indexing would be an enjoyable and refreshing experience :) – Dimitre Novatchev
The answers to this Stack Overflow question show that 0-based indexes are preferred for many reasons: stackoverflow.com/questions/393462/defend-zero-based-arrays – Edward Tanguay
My question is a real question actually, as I teach programming and want to have an answer to this question regarding XPath indexes in case it comes up. I think the best answer is that a 1-based index maps to position(), which is used heavily in XPath. – Edward Tanguay
I think this is a legit question and should not have been closed. It asks for a historical fact that is not a matter of opinion, and the answer would be enlightening. – Ben Flynn

2 Answers

35
votes

Array and other collection indexes represent memory offsets, so logically enough they begin at zero. XML and XPath indexes represent positions and counts, so logically enough they begin at one (and zero is therefore representative of "empty").
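To make the offset-versus-position distinction concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which implements only a limited subset of XPath. The sample document and all variable names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
BOOK = (
    "<book>"
    "<chapter>One</chapter>"
    "<chapter>Two</chapter>"
    "<chapter>Three</chapter>"
    "</book>"
)

root = ET.fromstring(BOOK)

# A Python list index is an offset from the start, so the first item is at 0.
chapters = list(root)
first_by_offset = chapters[0].text

# An XPath positional predicate is a position, so the first item is at 1.
first_by_position = root.find("chapter[1]").text

# With positions, the last item is at n (the count); last() spells that out.
chapter_count = len(root.findall("chapter"))
nth_by_position = root.find(f"chapter[{chapter_count}]").text
last_by_position = root.find("chapter[last()]").text

# One past the count matches nothing.
beyond_end = root.find(f"chapter[{chapter_count + 1}]")
```

In this sketch chapters[0] and chapter[1] select the same element, and the count of chapters doubles as the position of the last one, which is the sense in which 1-based indexes line up with positions and counts.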

10
votes

To answer this question, we must examine the history of some technologies.

RSS, XML, XSLT and XPath History

Version 0.9 of RSS was originally released as RDF Site Summary in 1999 by a couple of guys at Netscape for Netscape’s my.netscape.com portal. Later that year, it was renamed to RSS (Rich Site Summary) with the v0.91 update. Development of the project changed hands several times, but RSS version 1.0 was released by December of 2000. With the v1.0 update, RSS included support for XML.

In September 2002, v2.0 was released as RSS (Really Simple Syndication) and began to evolve into a major internet technology. In its early history, RSS feeds (and the XML data they contained) were read by humans in the raw format. Blogs and other news sources used RSS feeds and XML to output continuously updated information. Since XML was being read by mere mortals* (non-programmers), XPath and XSLT also needed to be easily understandable, so that these mere mortals would not be overwhelmed by complexity when interacting with them. That is why XPath mimics the style of URIs, which is something that end users were already familiar with. One of the concessions made for the sake of readability was to use old-fashioned numbering, i.e. 1-based indexes instead of 0-based indexes. That is the same concession that you mentioned with VBScript, and it was made for similar reasons.

Although RSS feeds and XML were made to be readable for most people, RSS readers were developed to provide a more pleasant interface for humans to read RSS feeds. Now, raw RSS and XML data are read almost exclusively with some sort of reader or graphical interface. XML is still in frequent (perhaps permanent) use across the web, but it is masked by fancy graphical user interfaces to provide a better experience for end users.

*The term "mere mortals" refers to humans who are not programmers.