Cobol Data Windowing and the Y2K Millennium Bug
When application systems were first developed in the 1960s and onwards, hardware resources such as memory were hundreds of times more expensive than they are today. It was imperative to save as much memory as possible, both internal working memory and external disk storage.
One accepted solution to achieve this was applied to variables representing dates.
If, in the context of a specific system, it could be assumed that all dates belonged to the same century, then a shortcut could be taken: the first two digits of the year (representing the century) were simply never stored, either in memory or in data structures and files. Instead of a date field being defined as 8 numeric digits (CCYYMMDD), it would be defined as 6 (YYMMDD). The century was dropped.
For example, the year “1970” could be defined in working memory and data structures as “70”. The 2 bytes saved for each of the multiple date fields in a system, within millions of records in hundreds of data files, added up to countless megabytes of memory saved, and seemed to justify the solution – in 1970, for example, one megabyte of memory cost approximately $3,000,000.
What were the proposed solutions for Y2K?
A solution was needed for those cases where, in the context of a specific business system or process, a date could plausibly belong to more than one century.
A simple case would be a system processing payment transactions for a company that began operating in 1980 – the system could assume that these transactions had occurred in the 20th century. However, as the end of the 20th century approached, and it became apparent that the company and the system would continue to be active into the 21st century, there arose a need for a sustainable solution.
Most organizations faced the following options:
1. Replace – An existing system would be replaced with a new system that would support the 4-digit year. Besides replacing software, this would involve converting the database as well. In many cases the organization might have to migrate to a new hardware or software infrastructure, thus requiring new skillsets and resources.
2. Date expansion – Considered a “technical” solution that could be carried out primarily by external technical resources, who would adjust the existing software to support 4-digit years: expanding all date field definitions in both software and data files, and updating date fields in data files to include the correct century.
3. Data Windowing – This solution was based on the surgical insertion of code in very specific points within the system, without having to resort to any physical or technical expansion of date fields. Based on the internal business logic of the organization, the code would make assumptions and deduce the century to which a date value “belonged”.
The three options differed in the level of risk, cost, complexity, resources, time constraints and so on.
In many cases, there simply was not enough time or resources to implement the first option – testing Y2K readiness had to begin in 1999 or earlier. While the second option was more of a “technical” conversion of the software and data to support a 4-digit year, it had to be done across all systems in the organization at the same time, and posed a higher level of risk as well.
The third option of Data Windowing appeared to pose the least amount of risk, was less costly, less invasive, took less time, and allowed for more focused testing. It did however require resources who were familiar with the embedded business logic in the system, people who could pinpoint exactly where the windowing needed to be placed, and where it did not. We will take a deeper look into this solution in the next section.
The Y2K Cobol Data Windowing solution
In the case of the company mentioned earlier that began operations in 1980, the Data Windowing solution could be implemented as follows:
All transactions dated with a year value from “80” (1980) through “99” (1999) would be assumed to have occurred in the 20th century, while all other years (from “00” through “79”) would be considered as having occurred in the 21st century.
In this case, the year “80” was the cutoff year: Any date whose year was equal to or greater than 80 would be considered a date from the 20th century; any date whose year was below the cutoff year was considered to be a date belonging to the 21st century.
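This cutoff logic can be sketched as follows (a Python rendering for illustration only; production systems implemented it in COBOL, and the function name is mine):

```python
def expand_year(yy, cutoff=80):
    """Expand a 2-digit year using a fixed cutoff (Data Windowing).

    Years >= cutoff are assumed to belong to the 20th century (19xx);
    years below the cutoff are assumed to belong to the 21st (20xx).
    """
    if not 0 <= yy <= 99:
        raise ValueError("yy must be a 2-digit year")
    century = 19 if yy >= cutoff else 20
    return century * 100 + yy

# With the cutoff at 80: "85" maps to 1985, "05" maps to 2005.
```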
For many organizations, things were not so simple. Some had historical dates on record going back to the start of the 20th century. In other cases, such as dates of birth, they occurred in the 19th century as well. Before the end of the 20th century many financial calculations were processing dates that went beyond the year 1999.
There was no single standard regarding the pivot year – some organizations used 20 (meaning 1920) as their pivot year, others used 30, 40, 50 or 60.
Here’s an example taken from IBM documentation relating to a pivot year of 40, which was widely adopted:
- If the 2-digit year is greater than or equal to 40, the century used is 1900. In other words, 19 becomes the first 2 digits of the 4-digit year.
- If the 2-digit year is less than 40, the century used is 2000. In other words, 20 becomes the first 2 digits of the 4-digit year
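Applied to a full 6-digit YYMMDD field, the rule above might look like this (again a Python sketch; the function name is my own):

```python
def expand_yymmdd(yymmdd, pivot=40):
    """Expand a YYMMDD date string to CCYYMMDD using a pivot year.

    Per the rule: 2-digit years >= pivot get century prefix "19",
    years below the pivot get century prefix "20".
    """
    yy = int(yymmdd[:2])
    century = "19" if yy >= pivot else "20"
    return century + yymmdd

# expand_yymmdd("401231") -> "19401231"
# expand_yymmdd("391231") -> "20391231"
```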
What is the problem with the Y2K Data Windowing solution?
The Data Windowing solution could only distinguish between two centuries, for example to assign a date either to the 20th century (where the century value is 19) or the 21st century (where the century value is 20). As time progressed, if the need arose to identify a third century, such as the 22nd century (with 21 as the century value), then Data Windowing could no longer be a sustainable solution.
Looking back at the IBM example above – if the pivot year was 40, then a year with a value from 40 to 99 would be associated with the 20th century – meaning the century value would be 19. For the most part that has worked well since the year 2000. But over 20 years have passed since then, and the year 2040 is just around the corner. Not to mention cases where the cutoff year is 30, or even 20.
Programs that perform future projection calculations for interest, savings, loans and so on are already reaching the pivot year that served the Y2K windowing solution. A program performing a 30-year mortgage calculation in 2020 will cross the pivot year 40, and may treat the years 40, 41 and 42 as 1940, 1941 and 1942 instead of 2040, 2041 and 2042.
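The failure mode can be demonstrated by running such a projection through the windowing logic (a contrived Python sketch; the pivot of 40 follows the IBM example above):

```python
def expand_year(yy, pivot=40):
    """Window a 2-digit year: >= pivot -> 19xx, below pivot -> 20xx."""
    return (1900 if yy >= pivot else 2000) + yy

# A 30-year schedule starting in 2020, stored as 2-digit years only.
schedule_2digit = [(2020 + n) % 100 for n in range(31)]
expanded = [expand_year(yy) for yy in schedule_2digit]

# The schedule runs 2020..2039 correctly, then regresses a century:
# year "40" is windowed back to 1940 instead of 2040.
print(expanded[19], expanded[20])  # 2039 1940
```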
I checked with a number of colleagues who manage large Cobol systems that still run today.
One told me that he had already encountered a batch job crashing because the “old” Data Windowing algorithm was used in a calculation where the date went beyond 2040. He found that although his company had implemented Date Expansion in many places, and had commented out statements implementing Data Windowing as part of its Y2K conversion, programmers after the year 2000 had continued to use this logic in newly developed code, in both existing and new programs. It seemed they were unaware of the impact.
How can the Data Windowing solution be fixed?
It is safe to assume that there are places in your organization’s core legacy (Cobol) systems that are still implementing the Data Windowing method, and now would be a good time to start scanning the relevant system sources to get a feel for the magnitude and effort that may be needed to mitigate the issue. Data Windowing should be considered a new risk factor that needs to be addressed as soon as possible.
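A first pass at gauging the exposure can be as simple as scanning COBOL sources for comparisons of two-digit year fields against a literal pivot. The sketch below is a rough heuristic only; the regular expression, the source directory and the file extension are assumptions that would need tuning to your codebase's naming conventions:

```python
import re
from pathlib import Path

# Heuristic: flag comparisons of a field containing "YY" or "YEAR"
# against a 2-digit literal, e.g. "IF WS-YY >= 40" or "IF TRANS-YEAR < 40".
PIVOT_PATTERN = re.compile(
    r"\bIF\s+\S*(YY|YEAR)\S*\s*(>=|<=|>|<)\s*\d{2}\b", re.IGNORECASE
)

def scan_sources(root):
    """Yield (path, line number, text) for source lines matching the heuristic."""
    for path in Path(root).rglob("*.cbl"):
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            if PIVOT_PATTERN.search(line):
                yield path, lineno, line.strip()

for hit in scan_sources("src"):
    print(*hit)
```

A scan like this will produce false positives, but it gives a quick feel for the magnitude of the effort before a proper code review.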
Feel free to contact us to see how we can help!