On transcending Excel, building a better world, and achieving inner peace on social media

Historians should not rely upon Excel for temporal data. In the first place, its assumptions are not always entirely logical; type in 04-05-30 and it will assume you mean 1930, whereas entering 04-05-25 will signal 2025—and this will likely change in the next iteration of the programme, as we creep ever closer towards mid-century, leaving those upgrading to remember an ever-advancing turning point. Of course, simply remembering to include the 19 or 20—precision is always to be applauded—should solve this minor quibble; alas, it does not.

Despite its ubiquity, the programme was simply not designed for (and seems to chafe at the idea of) analysing dates before 1900. For those of us working on earlier centuries, we are left with two options, neither ideal. The first is to write our dates as string literals, that is, as words rather than calculable dates. They will appear correctly but will not allow for algorithmic manipulation, becoming simple tables of text—any calculations about days between or days since must be done manually. The other is to live with the fiction that when you see 01-01-54 you are working with 1854 or 1754 rather than 1954. Cross the century mark, however, and such fictions become more difficult as 2000 was a leap year, whereas 1700, 1800 and 1900 were not.

Other solutions, of course, exist. Many website offer the ability to calculate the distance between two dates and Google can now directly respond to queries such as “How many days are between 2 July 1776 and the 3 September 1783?” Yet, despite the seeming omnipresence of Wi-Fi, there will always be moments when Google is indisposed. Having trained as a historian across the late 1990s and early 2000s, I have seen the introduction of many off- and on-line spreadsheet programmes but, interestingly, they all consider 1 January 1900 the gold standard of starting dates. Works, Excel, Calc, and Google Sheets all set their dates to either 1 January 1900 or 30 of December 1899, using a single integer (whole number) to represent each day since, thus allowing for basic calculations between dates; Sheets, moreover, prides itself on the ability to move negatively in time, though the 0-mark remains at the turn of the 20th century, likely to maintain interoperability with Grandfather Excel.

My current research revolves around time, and the passage of it as news and its agents move from one location to another. Thus, I often need to determine the precise distance between two dates, both inclusive and exclusive of the beginning and end of a journey. Moreover, I must make these calculations at scale, across hundreds of thousands of data points, with slight but crucial variations, in order to understand trends in the physical transit and the freshness (or preservation) of news as it reappeared across different networks. Because of this, neither the convenient fictions of Excel nor the on-line computations (and frequent time-outs) of Sheets are suitable: I need accurate and consistent calculations across my entire existing dataset, now approaching five gigabytes, that I can reshape and reconfigure quickly and responsively to developments in my methodology and, as of writing, the number of discrete differences being calculated has consistently outstripped the ability of all of these programmes to provide timely results. It, thus, became apparent that the most sensible course of action was to develop my own tool for date calculation; the benefit value of writing my own date-distance calculator was threefold.

1. A sense of control in a world full of chaos

The difficulty of relying upon third-party software, tools that you had no part in building, is that they are rarely exactly what you need. They will often do what you need them to do, but just as often require tweaking, hacking or workarounds to get where you need to go. This is chaos, or at least very poor practice. The problem with hacky solutions is that however cleverly they may have been devised, or however efficient they are in terms of the time spent building them, they are always imperfect solutions. I rely on several imperfect solutions in my research; the plagiarism detection software Copyfind, upon which I have built my data set, has been grotesquely misused in the furtherance of detecting scissors-and-paste journalism and my entire work-flow is based upon stringing together a combination of bespoke and third-party tools. Yet, that does not mean that I do not see the value in differentiating between long and short-term efficiencies.

I could have built or downloaded a macro for Excel to allow pre-1900 dates. I could have written a macro (or, more likely, several macros) to load, process and save my many data files while I prepared and dined upon a Sunday roast. Yet, in the long-term, this was navigating (or hiding from) the chaos rather than conquering it. By taking a moment to really define what I needed from my date calculations, I was able to devise a very simple C++ function to achieve it—one that, in the long term, will take far less time to implement than any Excel workaround. Moreover, because the programme is mine, and because I intimately understand its logic, I can redeploy it, again and again, in situations that I have not yet even imagined, tweaking the variables and combining it with other small tools to undertake quite monumental tasks. And on a base, emotional level, it absolutely gives me a sense of control, of empowerment, in a world that could otherwise feel frustrating or even hostile.

2. A deeper understanding of half-remembered rules

There were historical lessons to be learned from this exercise as well. Until last month, I was abstractly aware of the rules surrounding leap years and the shift from the Julian to the Gregorian calendar, but these were facts filed under “I can look these up when and if I need them” rather than for immediate recall. Having to programme a set of instructions that would provide accurate date information that would take into account not only the addition of February 29th when and as appropriate but also the missing dates of 1752 (1 January—24 March and 3—13 September) burned this information into my long-term memory. As a historian of the British Empire, I could make the assumption of a single transition to the Gregorian calendar, but had to make this clear in my code, lest I take it for granted at some future date. All this exercised long-atrophied corners of my mind. Moreover, the programme itself had to be compact, take simple inputs (day, month and year) and provide simple outputs, challenging me to think carefully about the steps it should take. Here is how it works:

Take a given date and determine its relationship to 14 September 1752 (after which dates were relatively stable) and 2 September 1752 (before which dates followed different rules). Also, take into account that 1751 was a “short year” in the British calendar, running only from 25 March to 31st of December. Using Excel and Sheets as a guide, I made 14 September 1752 my zero-date and the 2nd my -1.

  • If it is on or after 14 September 1752
    • If inclusive, add 1, then
    • If in September 1752
      • Take the day and subtract 14 to get the ‘date integer’
    • Otherwise, if after September 1752 but before 1 January 1753,
      • Add 16 (September)
      • If the date is in October
        • Add the “day integer” of the date
      • Otherwise, if the date is in November
        • Add 31 (for October) and the day integer
      • Otherwise, if the date is in December
        • Add 61 (October and November) and the day integer
    • Otherwise, if after 1752
      • Add 109 (for the remainder of 1752)
      • Determine how many days have occurred between the zero-date and date year
        • Subtract 1753 from the date year
        • Multiply the difference by 365
        • Add the product (for example, 2017 – 1753 = 264; 264 X 365 = 96,360, so add 96360)
      • Determine how many leap days have occurred between the zero-date and date year
        • Divide the date year, minus 1753, by 4
        • Add the quotient (for example, 2017 – 1753 = 264; 264 / 4 = 66; so add 66)
        • If the date is after 2000
          • Subtract 1
        • Otherwise, if the date is between 1753-1999
          • Subtract 2
      • Determine if the date year is a leap year (divisible by 4, but not 100, unless also by 100)
        • Find the number of “days in” up to the month of the date. This is a look-up table for leap and non-leap years
      • Add the day integer of the date

That part was relatively easy. It was the earlier (Old Style) dates that made me a bit nervous, not having worked with them to any great extent. After a bit of reference work, which is always good for the soul, I determined the following rules:

  • If in 1752 before 14 September
    • If inclusive, subtract 1
    • If 1 September 1752
      • Subtract 2
    • Otherwise if 2 September 1752
      • Subtract 1
    • Otherwise, if 1 January—31 August 1752
      • Subtract 2 for September 1-2
      • Subtract the relevant month value for the months following the date month, up through August: 31 for July, 62 for June, 92 for May, or 123 for April and so on
      • Subtract the day integer from the month value for the date month (for example, 14 April would be 30 – 14) and then subtract the difference (16)
  • If in 1751 (the short year)
    • Subtract the days from 2 September – 1 January 1752
    • Determine how many days have occurred between the zero-date and date year
      • Subtract the date year from 1751, multiply the difference by 365 and subtract the product
    • Determine how many leap days have occurred between the zero-date and date year
      • Subtract the date year from 1751 and divide the difference by 4
      • Subtract the quotient
    • Determine if the date year is a leap year (Is it divisible by 4?)
      • Subtract the number of days to the month of the date. This is a look-up table for leap and non-leap years
      • Subtract the day integer of the date, minus the value for that month (for example, 14 April would be 30-14, or 16)
    • Stop at 25 March 1751
  • If before 1751 (that is 24 March 1750 and before)
    • Subtract the days from 2 September – 25 March 1751
    • Determine how many days have occurred between the zero-date and date year
      • Subtract the date year from 1751, multiply the difference by 365 and subtract the product
    • Determine how many leap days have occurred between the zero-date and date year
      • Subtract the date year from 1751 and divide the difference by 4
      • Subtract the quotient
    • Determine if the date year is a leap year (Is it divisible by 4?)
      • Subtract the number of days to the month of the date. This is a look-up table for leap and non-leap years
      • Subtract the day integer of the date, minus the value for that month (for example, 14 April would be 30-14, or 16)

You can see the fully implemented code here: https://github.com/mhbeals/datedistance

Thus, entering a day, month and year will return a ‘day number’ relative to 14 September 1752. With two of these numbers, I can now compare any two dates—at least for Britain after 1155.

3. A reinforcing my own logic circuits

The more I learn to programme, the more I understand the truth in the phrase “software engineer”. Every function, every small bit of my programme is a tiny machine—I pour in data, turn the crank and answers (or more interesting questions) stream forth along precisely chiselled flumes. Over the past fifty years, particularly with the rise of postmodernism and the cultural turn, the discipline of history has seemingly strayed ever further from its mid-century home in the social sciences towards a more abstract, more intangible, collection of overlapping interpretations and perspectives. While the expansion of source materials and the development of new interpretive frameworks has undoubtedly enriched the discipline, I cannot help but feel that our collective dedication to logical deduction has in some ways diminished, if for no other reason than we are no longer speaking a common methodological language and are relying ever more dangerously upon apophenia to connect our disparate conclusions.

Collaborating with my computer provides a much needed opportunity to truly test the strength of my premises with no benefit of the doubt from kindly colleagues. The computer is both always right and infuriating obtuse. It takes me at my word, my precise word, and nothing else. While a sampling and sounding of evidence can create within my mind a general impression of a trend, a trend that I can explain and appears logical to my human compatriots, this is simply insufficient when working with a computer. Even for my simple date-distance calculator, I was forced to really think about inclusive and exclusive date ranges in a way that I had simply not been doing consistently and to think about every single step I was taking to reach my conclusions. While most were straightforward, it provided an opportunity to stop and think, to practice deliberate logic in a way that I had not been allowing myself to do owing to the (poor excuse) of insufficient time.

Bonus Benefit: Beautiful Builders

The final, and in some ways most important, benefit to creating this small tool was the ability to share it with my fellow historians (download it here for Windows) and to encourage them to be builders as well. A few weeks ago, I declared my undying love and affection for those who build—those who can imagine something they wish to exist in the world and are willing to learn the skills and expend the effort to make it a reality. Whether a widget or a monument, the artistry—the hard work and passion—that goes into building is something that should always be encouraged. Whether around the water-cooler or on social media, there seems to be an unending stream of complaints about the fundamental unjustness of the universe and our collective inability to do much about it. But I poked the world, I said “I want this to exist”, and now it does. Imagine if all of us took one little gripe, one Twitter zinger, and instead of releasing it into the social media ether, actually did something about it?

Studies have shown that even the most wonky flat-pack furniture is often prized above factory-produced equivalents because of the value we attach to our ability to effect the world, the space we inhabit, and to transform even the most minor chaos into splendid order. So go ahead and make your world a little nicer. You’ll be amazed out unattractive yelling into the ether becomes.

Leave a Reply