10
votes

I came upon a strange behavior that has left me curious and without a satisfactory explanation as yet.

For simplicity, I've reduced the symptoms I've noticed to the following code:

import java.text.SimpleDateFormat;
import java.util.GregorianCalendar;

public class CalendarTest {
    public static void main(String[] args) {
        System.out.println(new SimpleDateFormat().getCalendar());
        System.out.println(new GregorianCalendar());
    }
}

When I run this code, I get something very similar to the following output:

java.util.GregorianCalendar[time=-1274641455755,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=1929,MONTH=7,WEEK_OF_YEAR=32,WEEK_OF_MONTH=2,DAY_OF_MONTH=10,DAY_OF_YEAR=222,DAY_OF_WEEK=7,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=55,SECOND=44,MILLISECOND=245,ZONE_OFFSET=-28800000,DST_OFFSET=0]
java.util.GregorianCalendar[time=1249962944248,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2009,MONTH=7,WEEK_OF_YEAR=33,WEEK_OF_MONTH=3,DAY_OF_MONTH=10,DAY_OF_YEAR=222,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=55,SECOND=44,MILLISECOND=248,ZONE_OFFSET=-28800000,DST_OFFSET=3600000]

(The same thing happens if I provide a valid format string like "yyyy-MM-dd" to SimpleDateFormat.)

Forgive the horrendous non-wrapping lines, but it's the easiest way to compare the two. If you scroll to about 2/3rds of the way over, you'll see that the calendars have YEAR values of 1929 and 2009, respectively. (There are a few other differences, such as week of year, day of week, and DST offset.) Both are obviously instances of GregorianCalendar, but the reason why they differ is puzzling.

From what I can tell, the formatter produces accurate results when formatting Date objects passed to it. Obviously, correct functionality is more important than the correct reference year, but the discrepancy is disconcerting nonetheless. I wouldn't think that I'd have to set the calendar on a brand-new date formatter just to get the current year...

I've tested this on Macs with Java 5 (OS X 10.4, PowerPC) and Java 6 (OS X 10.6, Intel) with the same results. Since this is a Java library API, I assume it behaves the same on all platforms. Any insight on what's afoot here?

(Note: This SO question is somewhat related, but not the same.)


Edit:

The answers below all helped explain this behavior. It turns out that the Javadocs for SimpleDateFormat actually document this to some degree:

"For parsing with the abbreviated year pattern ("y" or "yy"), SimpleDateFormat must interpret the abbreviated year relative to some century. It does this by adjusting dates to be within 80 years before and 20 years after the time the SimpleDateFormat instance is created."

So, instead of getting fancy with the year of the date being parsed, they just set the internal calendar back 80 years by default. That part isn't documented per se, but when you know about it, the pieces all fit together.
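To make the window in that Javadoc quote concrete, here is a minimal sketch that pins the start of the century with set2DigitYearStart() instead of relying on "80 years before now", so the result doesn't depend on when you run it. The class name TwoDigitYearWindow, the helper resolvedYear, and the "MM/dd/yy" pattern are just names I picked for illustration; the 1929 start date mirrors the value in the output above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class TwoDigitYearWindow {
    // Hypothetical helper: parses a "MM/dd/yy" string and returns the
    // four-digit year that SimpleDateFormat resolved it to.
    static int resolvedYear(String text, Date windowStart) {
        SimpleDateFormat sdf = new SimpleDateFormat("MM/dd/yy", Locale.US);
        // Two-digit years are placed in [windowStart, windowStart + 100 years).
        sdf.set2DigitYearStart(windowStart);
        try {
            Calendar cal = new GregorianCalendar();
            cal.setTime(sdf.parse(text));
            return cal.get(Calendar.YEAR);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Same century start as a formatter created in August 2009 would use.
        Date start = new GregorianCalendar(1929, Calendar.AUGUST, 10).getTime();
        System.out.println(resolvedYear("12/31/99", start)); // 1999
        System.out.println(resolvedYear("01/01/10", start)); // 2010
        System.out.println(resolvedYear("01/01/28", start)); // 2028
    }
}
```

Anything from 29 up lands in 19xx and anything below 29 lands in 20xx, which is the "80 years before, 20 years after" behavior for a formatter created in 2009.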

5 Answers

6
votes

I'm not sure why Tom says "it's something to do with serialization", but he has the right line:

private void initializeDefaultCentury() {
    calendar.setTime( new Date() );
    calendar.add( Calendar.YEAR, -80 );
    parseAmbiguousDatesAsAfter(calendar.getTime());
}

It's line 813 in SimpleDateFormat.java, which is very late in the process. Up to that point, the year is correct (as is the rest of the date part), then it's decremented by 80.

Aha!

The call to parseAmbiguousDatesAsAfter() is the same private function that set2DigitYearStart() calls:

/* Define one-century window into which to disambiguate dates using
 * two-digit years.
 */
private void parseAmbiguousDatesAsAfter(Date startDate) {
    defaultCenturyStart = startDate;
    calendar.setTime(startDate);
    defaultCenturyStartYear = calendar.get(Calendar.YEAR);
}

/**
 * Sets the 100-year period 2-digit years will be interpreted as being in
 * to begin on the date the user specifies.
 *
 * @param startDate During parsing, two digit years will be placed in the range
 * <code>startDate</code> to <code>startDate + 100 years</code>.
 * @see #get2DigitYearStart
 * @since 1.2
 */
public void set2DigitYearStart(Date startDate) {
    parseAmbiguousDatesAsAfter(startDate);
}

Now I see what's going on. Peter, in his comment about "apples and oranges", was right! The year in SimpleDateFormat is the first year of the "default century", the 100-year range into which a two-digit year string (e.g., "1/12/14") is placed when it is parsed. See http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html#get2DigitYearStart%28%29

So in a triumph of "efficiency" over clarity, the year in the SimpleDateFormat is used to store "the start of the 100-year period into which two digit years are parsed", not the current year!
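If you want to check this without installing the JDK source, here is a small sketch that compares the formatter's internal calendar against the public get2DigitYearStart() accessor. It leans on the private code quoted above, so treat it as an observation about the Sun/OpenJDK implementation rather than documented API behavior. The class and method names are made up for this example.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

public class DefaultCenturyCheck {
    // How many years a fresh formatter's internal calendar lags the current year.
    static int yearOffset() {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        int currentYear = new GregorianCalendar().get(Calendar.YEAR);
        return currentYear - sdf.getCalendar().get(Calendar.YEAR);
    }

    // Whether the internal calendar holds exactly the default century start.
    static boolean calendarMatchesCenturyStart() {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        return sdf.getCalendar().getTime().equals(sdf.get2DigitYearStart());
    }

    public static void main(String[] args) {
        System.out.println("year offset: " + yearOffset());
        System.out.println("calendar == 2-digit year start: " + calendarMatchesCenturyStart());
    }
}
```

With the initializeDefaultCentury() shown above, I'd expect an offset of 80 and a match, but since this pokes at internal state another implementation would be free to behave differently.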

Thanks, this was fun -- and finally got me to install the jdk source (I only have 4GB total space on my / partition.)

2
votes

You are investigating internal behaviour. If this goes outside the published API then you are seeing undefined stuff, and you should not care about it.

Other than that, I believe that the year 1929 is the cutoff used to decide whether a two-digit year is interpreted as 19xx or as 20xx.

2
votes

SimpleDateFormat has mutable internal state. This is why I avoid it like the plague (I recommend Joda Time). This internal calendar is probably used during the process of parsing a date, but there's no reason it would be initialized to anything in particular before it has parsed a date.

Here's some code to illustrate:

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;

public class DateTest {
    public static void main(String[] args) {
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat();
        System.out.println("sdf cal: " + simpleDateFormat.getCalendar());
        System.out.println("new cal: " + new GregorianCalendar());
        System.out.println("new date: " + simpleDateFormat.format(new Date()));
        System.out.println("sdf cal: " + simpleDateFormat.getCalendar());
    }
}

1
votes

Looking through SimpleDateFormat it seems like it's something to do with serialization:

/* Initialize the fields we use to disambiguate ambiguous years. Separate
 * so we can call it from readObject().
 */
private void initializeDefaultCentury() {
    calendar.setTime( new Date() );
    calendar.add( Calendar.YEAR, -80 );
    parseAmbiguousDatesAsAfter(calendar.getTime());
}

0
votes

System.out.println(new SimpleDateFormat().getCalendar());
System.out.println(new GregorianCalendar());

Comparing the two lines above is comparing apples and pears.

The first gives you a tool to parse Strings into Dates and vice versa. The second is a date utility that lets you manipulate Dates.

There is really no reason why they should produce similar output.

Compare it with the following:

System.out.println(new String() );
System.out.println(new Date().toString() );

Both lines will output a String, but logically you wouldn't expect the same result.