In today’s column, I examine the emerging trend of developing generative AI and large language models (LLMs) based on data exclusively from the past. Yes, you read that correctly, the trend involves crafting AI via out-of-date data. Whereas most prevailing LLMs are devised by scanning up-to-date data across the Internet, including patterning on the latest posted information, these specialized LLMs are exclusively given data that has a cutoff date of some years ago.

It seems an odd thing to do. Well, as you’ll see in a moment, there is method to the madness. A recently touted instance is an LLM based on data from before 1931. The developers collected data that had been published through the end of 1930. They fed this as the training data into a fresh LLM. Thus, the only information patterned on was no more recent than 1930.

What do you think such an LLM would say, and is this merely folly or does it have a useful purpose?

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

When developing an LLM, you usually want to feed it as much data as you can find. The wider and deeper the data, the more capable the LLM will likely be. The LLM can then be responsive to questions of all kinds. The data is usually found by scanning the Internet. This includes scanning the latest postings at the time of the scanning process.

We all know that the amazing power of current LLMs is their fluency. You can converse with generative AI in a natural, flowing way. Part of this fluency is due to the wideness of the scanning during initial data training. The AI is patterning on a diverse range of books, stories, novels, narratives, poems, and just about any form of human writing. This provides the basis for algorithmically being able to mimic human writing and languages.

Suppose that instead of scanning everything, you decided to selectively scan data of a certain kind. If you opted to scan data that was entirely and only in French, the resultant LLM would be unlikely to converse in English. The possibility of restricting what data is used to train an LLM has spurred some interesting efforts at seeing what happens when doing so.

One restriction would be to use a time-based cutoff. You might decide that you will only use data that was published before a particular date. This means that the resultant LLM won’t have any context about what happens after that cutoff date. The LLM will be stuck in the past, as it were. Some refer to these as vintage LLMs. Personally, I don’t favor that phrasing since “vintage” might also seem to suggest LLMs that are part of the heritage of the LLM era, such as GPT-1, GPT-2, and other initial LLMs.
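To make the idea of a time-based cutoff concrete, here is a minimal Python sketch of filtering a collection of documents by their stated publication date. This is my own illustration and not the pipeline used by any particular developer. The `Document` class, the `filter_by_cutoff` function, and the sample records are all hypothetical, and the sketch assumes each document carries a trustworthy publication date, which is not always the case.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    """Hypothetical record for one scanned work."""
    title: str
    published: date  # stated publication date (assumed accurate here)
    text: str


def filter_by_cutoff(docs, cutoff):
    """Keep only documents whose stated date is on or before the cutoff."""
    return [d for d in docs if d.published <= cutoff]


# Illustrative cutoff: the end of 1930.
CUTOFF = date(1930, 12, 31)

# Made-up sample corpus for demonstration purposes only.
corpus = [
    Document("Newspaper editorial", date(1925, 3, 14), "sample text"),
    Document("Patent filing", date(1930, 12, 31), "sample text"),
    Document("Magazine feature", date(1962, 7, 1), "sample text"),
]

kept = filter_by_cutoff(corpus, CUTOFF)
print([d.title for d in kept])  # ['Newspaper editorial', 'Patent filing']
```

The comparison is inclusive, so a work dated on the cutoff day itself is kept. Everything hinges on the stated date being right, and a mislabeled work would pass straight through this kind of filter.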

I typically refer to these specialized LLMs as AI time machines. The LLM lets you go back in time to explore what the world was like in those prior days, without seemingly being tarnished or tainted by what subsequently occurred. That being said, I don’t want to overplay the naming since the LLM isn’t an actual time machine. You might say it is a date-based simulation of sorts.

Exploring The AI Time Machine LLMs

There is a newly released LLM that is based on data published before 1931, and it is freely available for usage if you’d like to see what it does. The LLM is described in a paper entitled “Introducing talkie: a 13B vintage language model from 1930” by Nick Levine, David Duvenaud, and Alec Radford, April 2026, available at the link here, which includes these salient points (excerpts):

  • “We introduce talkie-1930-13b-base, a 13B language model trained on 260B tokens of historical pre-1931 English text.”
  • “We have collected hundreds of billions of pre-1931 English-language tokens. These include books, newspapers, periodicals, scientific journals, patents, and case law.”
  • “We chose the end of 1930 as the cutoff date because that is when works enter the public domain in the United States.”
  • “For this version of the model, we also limited ourselves to primarily English-language texts, because validating the data pipeline requires deep familiarity with source documents, and we are native English speakers.”
  • “While we have tried to post-train talkie free from modern influence, reinforcement learning with AI feedback inevitably shapes talkie’s behavior anachronistically.”

As noted above, the AI model has 13 billion parameters. This is generally considered a small-sized AI, often referred to as an SLM (small language model). The rule of thumb is that an SLM is around 4B to 40B in size. A medium-sized version is typically in the 40B to 150B range. The customary LLMs that you use are beyond 150B parameters in size. Though this specially trained AI model is an SLM, I’ll continue to refer to it as an LLM for ease of reference (most people are comfortable with “LLM” and aren’t as familiar with the acronym “SLM”).

Another aspect of the AI model is that the developers used 260B tokens to train the AI. You can roughly think of tokens as words (for my detailed explanation, see the link here), in the sense that roughly 260 billion words were scanned during the pattern-making process. Modern-era LLMs such as ChatGPT, GPT-5, Claude, Copilot, Gemini, and others are typically trained on a much larger base of tokens, perhaps 10T to 15T tokens.
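For a sense of scale, the figures above can be put through some quick arithmetic. The short Python sketch below is my own back-of-the-envelope calculation and not something taken from the paper. It uses only the numbers already mentioned (13B parameters, 260B tokens, and the rough 10T to 15T estimate for modern LLMs), and the variable names are my own.

```python
# Figures quoted above; the modern-era range is a rough estimate.
TALKIE_TOKENS = 260e9        # 260 billion training tokens
TALKIE_PARAMS = 13e9         # 13 billion parameters
MODERN_TOKENS_LOW = 10e12    # 10 trillion tokens
MODERN_TOKENS_HIGH = 15e12   # 15 trillion tokens

# How many training tokens were used per model parameter.
tokens_per_param = TALKIE_TOKENS / TALKIE_PARAMS

# How many times larger a modern training set is, at each end of the range.
ratio_low = MODERN_TOKENS_LOW / TALKIE_TOKENS
ratio_high = MODERN_TOKENS_HIGH / TALKIE_TOKENS

print(f"Tokens per parameter: {tokens_per_param:.0f}")
print(f"Modern training data is roughly {ratio_low:.1f}x to {ratio_high:.1f}x larger")
```

By this arithmetic, the pre-1931 model saw about 20 tokens per parameter, and a modern-era LLM is plausibly trained on something like 38 to 58 times as many tokens. Treat the second figure as only as reliable as the 10T to 15T estimate it rests on.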

Not Fully Pure Due To Leakages

One important caution is that these AI time machine LLMs are not necessarily a pure indicator of the past. There are various impurities or other considerations that can impact the LLM.

First, the scanning approach usually consists of relying on copyright dates that suggest when a written item was published. Though this is probably effective overall, there are likely instances of published items that might have a falsely stated date. The date might have been intended to say 1962, but the date got flipped to indicate 1926 instead.

Second, documents from the past might have been updated or revised. Imagine a published paper in 1925 that was later updated in 1950. The original published date still says 1925. Unfortunately, the data in that paper is no longer purely at a cutoff of 1925 since it contains added remarks or changes that took place in 1950.

Third, during the making of the LLM, it is possible that some of the tuning actions can steer the AI toward modern times. Suppose that while tuning the LLM, the hired people doing the tuning tell the AI that particular words are no longer considered acceptable in contemporary society. The LLM might mathematically and computationally suppress those words, even though they were commonly used before the 1930s.

Fourth, the collective set of dated data could be skewed at the get-go. For example, it might be difficult to find works published before 1931 that are digitized and available online. Those that are available online are perhaps a fraction of the true volume of such works. In that sense, the LLM is skewed toward whatever happened to be digitized.

All told, be cautious in assuming that an AI time machine LLM is an accurate portrayal of the world before the stipulated cutoff date.
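One simple way to hunt for the kind of leakage described above is to flag documents whose text mentions years later than the cutoff. The Python sketch below is a crude heuristic of my own devising, not a method described by the developers. The function name `years_after_cutoff`, the regular expression, and the sample sentences are all assumptions made for illustration.

```python
import re

# Matches four-digit years from 1500 through 2099 as standalone tokens.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def years_after_cutoff(text, cutoff_year=1930):
    """Return the sorted years mentioned in the text that fall after the cutoff."""
    found = {int(match) for match in YEAR_PATTERN.findall(text)}
    return sorted(year for year in found if year > cutoff_year)


# Made-up examples: a revised work and an untouched one.
revised = "First printed in 1925. Revised edition, 1950, with new notes."
untouched = "Published 1926."

print(years_after_cutoff(revised))    # [1950]
print(years_after_cutoff(untouched))  # []
```

This heuristic has obvious limits. It would catch a 1925 paper that openly mentions its 1950 revision, but it cannot catch a work whose date was flipped from 1962 to 1926 unless the text itself mentions a later year. It can also raise false alarms, since a genuine pre-1931 author might speculate about some future year. Flagged items would need a human look.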

Ways To Leverage AI Time Machine Chatbots

In the instance of a 1930 cutoff date, the expectation is that such an LLM would not have any information about World War II, the atom bomb, smartwatches, cell phones, and all manner of modern-day capabilities. I played with the LLM and tried to gauge whether any of those post-cutoff events or inventions were in the LLM. Fortunately, they didn’t seem to have wormed their way into the LLM.

I say that this is fortunate because one use of such an LLM is to not only grasp what the past was like, but also to see if the LLM can make viable predictions about the future. Can an LLM whose knowledge ends in 1930 create an accurate prediction about the emergence of World War II or the many inventions and technologies that we have available today?

I tried this. The predictions were mainly vague. With a generous reading, they could be interpreted as anticipating that another widespread war would arise and as describing sci-fi-like inventions. They weren’t specific enough that you could make any solid bets on the future.

Discovery Of Novel Inventions

The allied question is whether this tells us anything about using contemporary LLMs to predict the future. In other words, if the AI time machine LLM were to be any good at making predictions of future events and outcomes, perhaps an LLM trained on data up through 2026 could do likewise, predicting what will happen or arise in the 2030s, 2040s, 2050s, and so on.

Another twist would be to see if the AI time machine LLM could invent from scratch a machine or invention that wasn’t designed and built until after 1930. Again, if this is feasible, you might assume that a modern-era LLM could do likewise and invent a device that humans otherwise wouldn’t invent for another ten or twenty years.

Sorry to say that the AI time machine LLM, in this case, didn’t especially showcase a capability for doing so. The concocted inventions were more akin to broad ideas than to the specifics of what it would take to actually produce the item. I’m not suggesting that I exhaustively tried this, so please just take this as a cursory inspection.

As the old saying goes, your mileage might vary.

The bottom line is that devising LLMs that have specific cutoff dates is an intriguing proposition and can potentially provide insights across a wide range of considerations. To some extent, restricting the dates of the source content is an interesting challenge of its own, along with seeing whether the fluency of the AI will be on par with modern-era LLMs. This can tell us something about the building of LLMs and how the volume and nature of the source data shape what the LLM can do.

I sincerely hope that anyone using any of these date-based LLMs will realize that they aren’t entering into a time portal that is a full portrayal of the past. My points earlier about the leakages and impurities are a cautionary note to be mindful of how you use such LLMs. If students are told to use a date-based LLM, aiming to look into the past, please inform them not to overstate what they find.

Benjamin Franklin famously made this remark: “Lost time is never found again.” That is certainly a profound insight. Maybe via the use of AI, we can find lost time, at least to the degree that we can learn lessons from the past and aim to secure a superbly devised future.