351
votes

Maybe I am not from this planet, but it would seem to me that the following should be a syntax error:

int a[] = {1,2,}; //extra comma in the end

But it's not. I was surprised when this code compiled on Visual Studio, but I have learnt not to trust MSVC compiler as far as C++ rules are concerned, so I checked the standard and it is allowed by the standard as well. You can see 8.5.1 for the grammar rules if you don't believe me.

enter image description here

Why is this allowed? This may be a stupid useless question but I want you to understand why I am asking. If it were a sub-case of a general grammar rule, I would understand - they decided not to make the general grammar any more difficult just to disallow a redundant comma at the end of an initializer list. But no, the additional comma is explicitly allowed. For example, it isn't allowed to have a redundant comma in the end of a function-call argument list (when the function takes ...), which is normal.

So, again, is there any particular reason this redundant comma is explicitly allowed?

20
Every one seems to be agreeing to 'ease of adding new line' - but are people defining language specifications really bother about such things? If they are really that understanding then why don't they ignore a missing ; when it's clear next token is actually a next statement.YetAnotherUser
@YetAnotherUser: Yes, language designers consider such things. Allowing you to drop semicolons would have a much larger impact and would be highly ambiguous in many parts of the language (remember, whitespace is not semantic in C). An extra comma is this case is not ambiguous. An extra semicolon is almost never ambiguous, and so is allowed as well. In the case where it is ambiguous (after a for() for instance), adding it throws a compiler warning.Rob Napier
@Tomalak: It is ambiguous to a human reader, and is often a mistake. That is why it throws a warning. Similarly if (x = 1) is not ambiguous in the grammar, but it is very ambiguous to humans, and thus throws a warning.Rob Napier
@Rob: Your if example is not ambiguous either. I don't think "ambiguous" means what you think it means!Lightness Races in Orbit
As long as we agree that it is something useful for the compiler to protect us from, while a trailing comma in an array declaration is not something useful for the compiler to protect us from.Rob Napier

20 Answers

447
votes

It makes it easier to generate source code, and also to write code which can be easily extended at a later date. Consider what's required to add an extra entry to:

int a[] = {
   1,
   2,
   3
};

... you have to add the comma to the existing line and add a new line. Compare that with the case where the three already has a comma after it, where you just have to add a line. Likewise if you want to remove a line you can do so without worrying about whether it's the last line or not, and you can reorder lines without fiddling about with commas. Basically it means there's a uniformity in how you treat the lines.

Now think about generating code. Something like (pseudo-code):

output("int a[] = {");
for (int i = 0; i < items.length; i++) {
    output("%s, ", items[i]);
}
output("};");

No need to worry about whether the current item you're writing out is the first or the last. Much simpler.

128
votes

It's useful if you do something like this:

int a[] = {
  1,
  2,
  3, //You can delete this line and it's still valid
};
37
votes

Ease of use for the developer, I would think.

int a[] = {
            1,
            2,
            2,
            2,
            2,
            2, /*line I could comment out easily without having to remove the previous comma*/
          }

Additionally, if for whatever reason you had a tool that generated code for you; the tool doesn't have to care about whether it's the last item in the initialize or not.

31
votes

I've always assumed it makes it easier to append extra elements:

int a[] = {
            5,
            6,
          };

simply becomes:

int a[] = { 
            5,
            6,
            7,
          };

at a later date.

21
votes

Everything everyone is saying about the ease of adding/removing/generating lines is correct, but the real place this syntax shines is when merging source files together. Imagine you've got this array:

int ints[] = {
    3,
    9
};

And assume you've checked this code into a repository.

Then your buddy edits it, adding to the end:

int ints[] = {
    3,
    9,
    12
};

And you simultaneously edit it, adding to the beginning:

int ints[] = {
    1,
    3,
    9
};

Semantically these sorts of operations (adding to the beginning, adding to the end) should be entirely merge safe and your versioning software (hopefully git) should be able to automerge. Sadly, this isn't the case because your version has no comma after the 9 and your buddy's does. Whereas, if the original version had the trailing 9, they would have automerged.

So, my rule of thumb is: use the trailing comma if the list spans multiple lines, don't use it if the list is on a single line.

16
votes

I am surprised after all this time no one has quoted the Annotated C++ Reference Manual(ARM), it says the following about [dcl.init] with emphasis mine:

There are clearly too many notations for initializations, but each seems to serve a particular style of use well. The ={initializer_list,opt} notation was inherited from C and serves well for the initialization of data structures and arrays. [...]

although the grammar has evolved since ARM was written the origin remains.

and we can go to the C99 rationale to see why this was allowed in C and it says:

K&R allows a trailing comma in an initializer at the end of an initializer-list. The Standard has retained this syntax, since it provides flexibility in adding or deleting members from an initializer list, and simplifies machine generation of such lists.

15
votes

Trailing comma I believe is allowed for backward compatibility reasons. There is a lot of existing code, primarily auto-generated, which puts a trailing comma. It makes it easier to write a loop without special condition at the end. e.g.

for_each(my_inits.begin(), my_inits.end(),
[](const std::string& value) { std::cout << value << ",\n"; });

There isn't really any advantage for the programmer.

P.S. Though it is easier to autogenerate the code this way, I actually always took care not to put the trailing comma, the efforts are minimal, readability is improved, and that's more important. You write code once, you read it many times.

13
votes

One of the reasons this is allowed as far as I know is that it should be simple to automatically generate code; you don't need any special handling for the last element.

13
votes

I see one use case that was not mentioned in other answers, our favorite Macros:

int a [] = {
#ifdef A
    1, //this can be last if B and C is undefined
#endif
#ifdef B
    2,
#endif
#ifdef C
    3,
#endif
};

Adding macros to handle last , would be big pain. With this small change in syntax this is trivial to manage. And this is more important than machine generated code because is usually lot of easier to do it in Turing complete langue than very limited preprocesor.

12
votes

It makes code generators that spit out arrays or enumerations easier.

Imagine:

std::cout << "enum Items {\n";
for(Items::iterator i(items.begin()), j(items.end); i != j; ++i)
    std::cout << *i << ",\n";
std::cout << "};\n";

I.e., no need to do special handling of the first or last item to avoid spitting the trailing comma.

If the code generator is written in Python, for example, it is easy to avoid spitting the trailing comma by using str.join() function:

print("enum Items {")
print(",\n".join(items))
print("}")
6
votes

The reason is trivial: ease of adding/removing lines.

Imagine the following code:

int a[] = {
   1,
   2,
   //3, // - not needed any more
};

Now, you can easily add/remove items to the list without having to add/remove the trailing comma sometimes.

In contrast to other answers, I don't really think that ease of generating the list is a valid reason: after all, it's trivial for the code to special-case the last (or first) line. Code-generators are written once and used many times.

6
votes

It allows every line to follow the same form. Firstly this makes it easier to add new rows and have a version control system track the change meaningfully and it also allows you to analyze the code more easily. I can't think of a technical reason.

6
votes

The only language where it's - in practice* - not allowed is Javascript, and it causes an innumerable amount of problems. For example if you copy & paste a line from the middle of the array, paste it at the end, and forgot to remove the comma then your site will be totally broken for your IE visitors.

*In theory it is allowed but Internet Explorer doesn't follow the standard and treats it as an error

6
votes

It's easier for machines, i.e. parsing and generation of code. It's also easier for humans, i.e. modification, commenting-out, and visual-elegance via consistency.

Assuming C, would you write the following?

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    puts("Line 1");
    puts("Line 2");
    puts("Line 3");

    return EXIT_SUCCESS
}

No. Not only because the final statement is an error, but also because it's inconsistent. So why do the same to collections? Even in languages that allow you to omit last semicolons and commas, the community usually doesn't like it. The Perl community, for example, doesn't seem to like omitting semicolons, bar one-liners. They apply that to commas too.

Don't omit commas in multiline collections for the same reason you don't ommit semicolons for multiline blocks of code. I mean, you wouldn't do it even if the language allowed it, right? Right?

5
votes

This is allowed to protect from mistakes caused by moving elements around in a long list.

For example, let's assume we have a code looking like this.

#include <iostream>
#include <string>
#include <cstddef>
#define ARRAY_SIZE(array) (sizeof(array) / sizeof *(array))
int main() {
    std::string messages[] = {
        "Stack Overflow",
        "Super User",
        "Server Fault"
    };
    size_t i;
    for (i = 0; i < ARRAY_SIZE(messages); i++) {
        std::cout << messages[i] << std::endl;
    }
}

And it's great, as it shows the original trilogy of Stack Exchange sites.

Stack Overflow
Super User
Server Fault

But there is one problem with it. You see, the footer on this website shows Server Fault before Super User. Better fix that before anyone notices.

#include <iostream>
#include <string>
#include <cstddef>
#define ARRAY_SIZE(array) (sizeof(array) / sizeof *(array))
int main() {
    std::string messages[] = {
        "Stack Overflow",
        "Server Fault"
        "Super User",
    };
    size_t i;
    for (i = 0; i < ARRAY_SIZE(messages); i++) {
        std::cout << messages[i] << std::endl;
    }
}

After all, moving lines around couldn't be that hard, could it be?

Stack Overflow
Server FaultSuper User

I know, there is no website called "Server FaultSuper User", but our compiler claims it exists. Now, the issue is that C has a string concatenation feature, which allows you to write two double quoted strings and concatenate them using nothing (similar issue can also happen with integers, as - sign has multiple meanings).

Now what if the original array had an useless comma at end? Well, the lines would be moved around, but such bug wouldn't have happened. It's easy to miss something as small as a comma. If you remember to put a comma after every array element, such bug just cannot happen. You wouldn't want to waste four hours debugging something, until you would find the comma is the cause of your problems.

4
votes

Like many things, the trailing comma in an array initializer is one of the things C++ inherited from C (and will have to support for ever). A view totally different from those placed here is mentioned in the book "Deep C secrets".

Therein after an example with more than one "comma paradoxes" :

char *available_resources[] = {
"color monitor"           ,
"big disk"                ,
"Cray"                      /* whoa! no comma! */
"on-line drawing routines",
"mouse"                   ,
"keyboard"                ,
"power cables"            , /* and what's this extra comma? */
};

we read :

...that trailing comma after the final initializer is not a typo, but a blip in the syntax carried over from aboriginal C. Its presence or absence is allowed but has no significance. The justification claimed in the ANSI C rationale is that it makes automated generation of C easier. The claim would be more credible if trailing commas were permitted in every comma-sepa-rated list, such as in enum declarations, or multiple variable declarators in a single declaration. They are not.

... to me this makes more sense

2
votes

In addition to code generation and editing ease, if you want to implement a parser, this type of grammar is simpler and easier to implement. C# follows this rule in several places that there's a list of comma-separated items, like items in an enum definition.

1
votes

It makes generating code easier as you only need to add one line and don't need to treat adding the last entry as if it's a special case. This is especially true when using macros to generate code. There's a push to try to eliminate the need for macros from the language, but a lot of the language did evolve hand in hand with macros being available. The extra comma allows macros such as the following to be defined and used:

#define LIST_BEGIN int a[] = {
#define LIST_ENTRY(x) x,
#define LIST_END };

Usage:

LIST_BEGIN
   LIST_ENTRY(1)
   LIST_ENTRY(2)
LIST_END

That's a very simplified example, but often this pattern is used by macros for defining things such as dispatch, message, event or translation maps and tables. If a comma wasn't allowed at the end, we'd need a special:

#define LIST_LAST_ENTRY(x) x

and that would be very awkward to use.

0
votes

So that when two people add a new item in a list on separate branches, Git can properly merge the changes, because Git works on a line basis.

-4
votes

If you use an array without specified length,VC++6.0 can automaticly identify its length,so if you use "int a[]={1,2,};"the length of a is 3,but the last one hasn't been initialized,you can use "cout<