## Leap Years are Stupid

Actually, leap years are *absurd*, and consequently *extremely* stupid. From the day some long-dead forbear of astronomers^{1} and astrologers^{2} peered up at both the Sun and Moon in the same sky, put two and two together, and realized that they were in the same place as the last time it was *this bloody cold*^{3}, we’ve been *stuck* with them, whether we realized it or not. In a sad quest to pretend that an *ellipse*^{4} is, nevertheless, a *circle*, we doomed ourselves to forever having to *maybe* offset our mental clocks when looking sufficiently far forward or back in time, lest *history* and *reality* drift dangerously out of sync.

In the modern West the leap year is usually attributed to Julius Caesar, who so succesfully promoted the concept that it *only* took the Romans fifty years after his death^{5} to reliably incorporate it into what history now tries to forget as the Julian calendar. Unfortunately Caesar and his scholars – being pretty poor programmers – had made a *wee* bit of a rounding error, and so in 1582 Pope Gregory XIII was forced to “correct” the situation^{6} by shifting the average year from 365.25 to 365.2425 days. And so was born the Gregorian calendar we all know and use today^{7}. This approximation was *good enough* for the Church, and so history has been positively *littered* with February 29th every so often ever since.

I feel *contempt* for the leap year. So why am I writing about it?

### It’s an Ideal Programming Exercise

It turns out that determining if a particular Gregorian calendar year^{8} is, or is not, a leap year is a problem with some *unusual* properties that make it an *excellent* test of a programmer’s mastery of some of the most fundamental tools they have available to them. That, in turn, means that it’s not only a very good challenge for the budding programmer, but it could – and I’d argue *should* – also be used as an impressively diagnostic question to ask a candidate in a technical interview.

In every programming language that I’ve yet encountered there exists a short, simple, and performance-optimal solution for this problem. Indeed in most cases the optimum implementation is a single line long^{9}, but most who attempt it will end up with something quite far from elegant on the first try. And, unlike most problems with such an ideal answer, if solved inelegantly it’s also almost certainly solved inefficiently.

The beauty of the exercise is that the pathway to the ideal solution can easily be discovered just by employing the sort of straightforward reasoning that is the bread and butter of professional programming. It encourages you to *think* through your solution by first *understanding* what you’re trying to solve.

#### Why does all this matter?

From the masochist to the sadist: "Hurt me."

Clive Barker, for one

From the sadist to the masochist: "No."

Well maybe you’ve just joined Exercism. Or perhaps you’re just looking for a good code kata. Or you’ve been assigned to a team working with calendars. Maybe you’re simply a masochist. Or worse, perhaps your interviewer is a sadist.

One way or another, let’s assume you’ve just been assigned the task of implementing a function `is_leap`

that takes an *integer* argument **year** and returns a *Boolean* result.

### Luckily There’s an Algorithm for That

Every year that is exactly divisible by 4 is a leap year, except for years that are exactly divisible by 100, but these centurial years are leap years if they are exactly divisible by 400.

United States Naval Observatory

Unusually for a programming problem the basic algorithm will most likely be presented to you up front, and you need merely implement it. The algorithm is very old, and has been stated in more or less precise ways over the centuries, so it helps to transcribe it into pseudocode:

```
IF the year is a multiple of 4 THEN
IF the year is a multiple of 100 THEN
IF the year is a multiple of 400 THEN
RETURN True
RETURN False
RETURN True
RETURN False
```

Notice the chevron-like **>** shape that results from the multiple layers of nesting? It turns out that’s fairly typical of a first-pass mental model of the algorithm, at least in a language where indentation is meaningful. It’s not the most elegant form – we’ll get to that – but it conveys some important properties about our algorithm:

**It’s**. You have one argument, you perform (up to) three tests on it using a basic math operation, and it’s the*simple**same*operation for all three tests. The most complicated thing you*need*to know to solve it is*how*your language performs the modulo operation. Once you’ve got that, comparison with zero, simple conditions, and how to return a Boolean-equivalent value in your language of choice you’re*golden*. After five minutes of well-focussed instruction a novice to*any*high-level language should be able to bash out a general solution that mimics the above pseudocode.**It’s**. After each successive test you either know the answer or you know what to ask next. Most algorithms – indeed most non-trivial problems – aren’t nearly so cut and dried. There is no need for complicated branching, or your language’s version of*progressive*`else_if`

, you need merely proceed forward until the moment you are*certain*you know the answer.**It’s**. There are no bugs, no weird or exceptional edge cases; every possible (integer) input maps directly to a single, unwavering output. Additionally there are no side effects, no state to mutate… you don’t even need to carry forward the results of each modulo test. It’s as*robust**pure*as a function can get, and so constrained that you can know from glancing at it if it will always return the correct answer.

### Know Thine Enemy and Thine Self

Understanding the algorithm is great, but understanding the problem it’s addressing is *better*, and the insights gained above tell us nothing about the *domain* we’re working with. We’ve treated the various modulo tests as black boxes, but really they’re specialized filters that apply to specific, progressively smaller, subsets of the overall domain of Gregorian years.

So what can we glean from looking a little more closely at those tests?

**The answer is**: From the very first test we know that 3 out of every 4 years*usually*no*won’t*be a leap year. By looking at the rest we can see that an additional 3 out of every 4*centuries*won’t be either… returning False by default will correctly answer >75% of all possible questions without any additional effort.**They**: Just 1 in every 4 years is even a*rapidly*cull the herd*potential*leap year, just 1 year in every 100 is a century, and just 1 year in 400 is a century that’s also a leap year… we learn a*lot*fast about our year.**They have a**: Each test is a special case of the previous one, and from the above we know that each applies to a progressively smaller number of cases. Sure, if we test division by 400 first we solve the problem very quickly in an*natural*order*extremely*small number of cases, but if it*doesn’t*give us the answer it also doesn’t give us any useful knowledge about the question being asked. Proceed in the natural order, however, and each step refines the last.

Taking the first of these to heart we can rewrite our pseudocode so that we need only ever consider the paths to a True answer:

```
IF the year is a multiple of 4 THEN
IF the year is NOT a multiple of 100 THEN
RETURN True
IF the year is a multiple of 400 THEN
RETURN True
RETURN False
```

But this is interesting. Glancing at the above we can now see that there are *exactly* two criteria that result in a year being a leap year, and that those criteria are exclusive. So for any given year the answer is True if *either* of the following is True:

- the year is a multiple of 4, but not of 100
- the year is a multiple of 4, 100, and 400

### The Path of the Righteous

From either of the above the path to the ideal form can pretty neatly unroll in front of you. Especially if we introduce some abstraction and call our three tests **a**, **b**, and **c**.

For instance, if we progressively introduce *Boolean logic* operations to our pseudocode:

```
IF a THEN
IF NOT b OR c THEN
RETURN True
RETURN False
```

… becomes…

```
IF a AND (IF NOT b OR c) THEN
RETURN True
RETURN False
```

… which, in a language that allows you to return an expressions becomes:

`RETURN a AND (NOT b OR c)`

Oooh, that’s pretty concise.

### Or We Could Go The Other Way

And if we’d started from the criteria we derived alone?

```
IF a AND NOT b THEN
RETURN True
IF a AND b AND c THEN
RETURN True
RETURN False
```

… can be combined with another Boolean logic operator as:

```
IF (a AND NOT b) OR (a AND b AND c) THEN
RETURN True
RETURN False
```

… which allows us to extract the common operation:

```
IF a AND ((NOT b) OR (b AND c)) THEN
RETURN True
RETURN False
```

… but there’s no reason to perform the same Boolean test twice, so:

```
IF a AND (NOT b OR c) THEN
RETURN True
RETURN False
```

… and since we’re returning expressions brings us right back to:

`RETURN a AND (NOT b OR c)`

### All Roads Lead Here

And here lies the ultimate, optimal form. In a language that has *short-circuiting* Boolean operations this is also the most performant form, because neither side of the `or`

will be evaluated for the vast majority of cases. Thankfully most languages either do short-circuit Boolean operations, or they tend to provide an alternate short-circuiting operator like `and_also`

and `or_else`

, so if you’re not sure yours does you’ll need to do a bit of research.

Speaking of research, you *might* be tempted to remove those parentheses, which certainly does look nicer. But therein lies the final lesson `is_leap`

can teach you; your ability to elide those parentheses depends *entirely* on the evaluation strategy and operator precedence rules for your language. If your language evaluates left to right and binds `or`

at the same or lower precedence as `and`

, then no, you cannot remove those *significant* parentheses. At least not without paying the price of checking division by 400 for a year that doesn’t even divide by 4.

Which, like leap years, would be *stupid*.

- Those who look up at the night sky and find it full of
*stuff*. ↑ - Those who look up at the night sky and find it full of
*nonsense*. ↑ - They’ve never
*actually*occupied the same absolute positions twice, of course, and never will, but neither angular momentum or orbital mechanics were yet available to our prototypical*yutz*. ↑ - Actually something closer to an ellipse being frantically sketched over and over on an imaginary two dimensional plane being pulled away from the artist in four dimensional spacetime. ↑
- Caesar’s assassination, which famously happened on March 15th, 44 BCE, rather threw off the adoption of leap years. Things got a bit wonky until 8 CE and, since no one in this period could be relied on to write anything down consistently, we know exactly when he died, just not when
*in our calendar*. ↑ - Gregory XIII may have been liturgically infallible, but the fact that we’ve since had to introduce
*leap seconds*tends to imply he wasn’t mathematically infallible as well. ↑ - Unless you use the Hindu, Islamic, or Buddhist calendars, are Thai or Chinese, subscribe to the Juche ideology of North Korea, or measure everything in Unix time. ↑
- Specifically, for the
*truly*pedantic, the*proleptic*Gregorian calendar with*astronomical year numbering*and a year zero. Which is*fine*for the purposes of an exercise. To accurately reflect leap years actually*observed*by humans you’d need at minimum a quite convoluted lookup table before the 9th century CE, and probably a time machine. ↑ - Your mileage may vary; in Perl it may even be less, but no one will know because of the braces. If you’re writing in 8086 Assembly Language then, honestly,
*why*are you even reading this? ↑