I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation.
I've worked on language models since 2018; even then it was obvious why language was a useful and transferable task. I do not at all feel the same way about general univariate time series that could have any underlying process.
whimsicalism
12 days ago
Time series data are inherently context sensitive, unlike natural languages which follow predictable grammar patterns. The patterns in time series data vary based on context. For example, flight data often show seasonal trends, while electric signals depend on the type of sensor used. There's also data that appear random, like stock data, though firms like Rentech manage to consistently find underlying alphas. Training on multivariate time series data would be challenging, but I don't see why not for specific applications.
wuj
12 days ago
Is Rentech the only group that genuinely manages to predict stock price? Seems like the very observation that it’s still possible would be enough motivation for other groups to catch up over such a long period.
Also, the first realistic approximation of Solomonoff induction we achieve is going to be interesting because it will destroy the stock market.
Xcelerate
11 days ago
Rentech does not seem to be able to predict the stock market for their customers...
"Jim Simons' Renaissance Technologies suffers $11 billion of client withdrawals in 7 months" - https://markets.businessinsider.com/news/stocks/jim-simons-r...
belter
11 days ago
I’m referring to RenTech’s well known Medallion fund, which I believe is now only available internally to longtime employees. Even in the article you linked, it says this fund has “continued to shine”.
Xcelerate
10 days ago
If you think about it a little bit...And you read "Fooled By Randomness", there are 20 other tricks they could be playing here...Instead of "predicting" the market.
belter
10 days ago
Maybe that would be a good thing. I wouldn't mourn the destruction of the stock market as it's just a giant wealth-gap increasing casino. Trading has nothing to do with underlying value.
amelius
11 days ago
I fully agree. The stock market is just a giant machine that pulls money out of systems.
pvorb
11 days ago
>The stock market is just a giant machine that pulls money out of systems.
So you think the multi-trillion dollar stock market, consisting of thousands of global companies, has no use beyond "pulling money out of systems"? Weird.
itsoktocry
10 days ago
https://en.wikipedia.org/wiki/Tulip_mania
Just to say, weirdness happens.
amelius
10 days ago
Agreed, if stock prices were predictable by some technical means, they would be quickly driven to unpredictability by people trading on those technical indicators.
icapybara
11 days ago
This is that old finance chestnut. Two finance professors are walking down the hall and one of them spots a twenty dollar bill. He goes to pick it up but the other professor stops him and says "no don't bother. If there was twenty dollars there someone would have already picked it up"
Yes, people arbitrage away these anomalies, and make billions doing it.
frankc
11 days ago
My point was more that the arbitrage is not systematically predictable, not that the arbitrage doesn’t exist.
icapybara
10 days ago
And that would make the stock markets accessible to fewer people, further widening the wealth gap.
amelius
10 days ago
Why do you think language is so special?
There's an extensive body of literature across numerous domains that demonstrates the benefits of Multi-Task Learning (MTL). Actually I have a whole folder of research papers on this topic, here's one of the earliest references on hand that I feel captures the idea succinctly in the context of modern ML:
“MTL improves generalization by leveraging the domain-specific information contained in the training signals of related tasks" [Caruana, 1998]
I see repetition and structure everywhere in life. To me it's not far fetched that a model trained on daily or yearly trends could leverage that information in the context of e.g. biological signals which are influenced by circadian rhythm etc.
Disclaimer: my background is in ML & bio-signals, I work with time series too much.
refibrillator
11 days ago
For those who haven't read it, Rich Caruana's thesis on multi-task learning is beautifully written (the cited 1998 paper here). It's amazing to see how far the field has come, and, at the same time, how advanced the thinking was in the 90s too.
owl_brawl
11 days ago
The things that we are typically interested in have very clear patterns. In a way, if we find that there are no patterns, we don't even try to do any forecasting.
"The Unreasonable Effectiveness of Mathematics in the Natural Sciences" [1] hints that there might be some value here.
[1] https://en.m.wikipedia.org/wiki/The_Unreasonable_Effectivene...
smokel
12 days ago
Exactly. For example, I think this model is useful in cases where you expect user count to follow some pattern around timing, and want to be alerted if it has a spike.
But you wouldn't want this model for file upload storage usage which only increases, where you would put alerts based on max values and not patterns/periodic values.
yonixw
12 days ago
Why not? There are plenty of time series that have underlying patterns which means you can do better than a total guess even without any knowledge of what you are predicting.
Think about something like traffic patterns. You probably won't predict higher traffic on game days, but predicting rush hour is going to be pretty trivial.
IshKebab
12 days ago
There is potential for integrating ML with time series data in industrial applications (things like smelters, reactors etc.), where you have a continuous stream of time series measurements from things like gauges and thermocouples. If you can detect (and respond to) changing circumstances faster than humans in a control room reacting to trends or alarms, then there are potentially big efficiency gains...
Operator guidance is often based on heuristics: when metric A exceeds X value for Y seconds, take action Z. Or on rates of change: if the signal is changing at a rate of more than x, etc.
So in these areas there exists potential for an ML solution, especially if it's capable of learning (i.e. last response overshot by X so trim next response appropriately).
bigger_cheese
11 days ago
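Editor's sketch of the operator heuristic described above ("when metric A exceeds X value for Y seconds take action Z"). This is only an illustration under assumptions: the function name `first_sustained_exceedance`, the sample layout, and the numbers are hypothetical and do not come from any real control system.

```python
from typing import Iterable, Optional, Tuple


def first_sustained_exceedance(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
    hold_seconds: float,
) -> Optional[float]:
    """Return the timestamp at which the rule fires, or None if it never does.

    samples: (timestamp_seconds, value) pairs in increasing time order.
    The rule fires once the value has stayed above `threshold` for at
    least `hold_seconds` without dropping back.
    """
    started_at: Optional[float] = None  # when the current exceedance began
    for t, value in samples:
        if value > threshold:
            if started_at is None:
                started_at = t
            if t - started_at >= hold_seconds:
                return t
        else:
            started_at = None  # any dip below the threshold resets the timer
    return None


# Hypothetical gauge readings: (seconds, temperature)
readings = [(0, 90.0), (1, 101.0), (2, 102.0), (3, 99.0),
            (4, 103.0), (5, 104.0), (6, 105.0)]
fired_at = first_sustained_exceedance(readings, threshold=100.0, hold_seconds=2.0)
print(fired_at)
```

The short excursion at t=1..2 is ignored because the reading dips at t=3. A learned model would have to beat rules of roughly this shape to be worth deploying.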
Every time I've actually tried something like this it has not outperformed statistical process control.
It's not just that control charts are great signal detectors, but also managing processes like that takes a certain statistical literacy one gets from applying SPC faithfully for a while, and does not get from tossing ML onto it and crossing fingers.
kqr
11 days ago
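Editor's sketch of the kind of control chart mentioned above, for readers who have not used SPC. It assumes an individuals (XmR) chart, where sigma is estimated from the average moving range divided by the standard constant 1.128; the function names and the sample data are hypothetical, and the comment above does not say which chart type was actually used.

```python
from statistics import mean


def xmr_limits(baseline):
    """Return (lower, center, upper) limits for an individuals chart.

    Sigma is estimated as the mean moving range divided by 1.128
    (the d2 constant for subgroups of size 2).
    """
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma_hat = mean(moving_ranges) / 1.128
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control(values, lower, upper):
    """Return the indices of points that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical in-control history used to set the limits
baseline = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.9]
lower, center, upper = xmr_limits(baseline)

# New measurements to check against those limits
new_values = [10.1, 9.7, 11.2, 10.0]
flagged = out_of_control(new_values, lower, upper)
print(flagged)
```

Only the simplest rule (a point beyond the three-sigma limits) is shown; run rules and subgrouped charts are left out.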
> Every time I've actually tried something like this it has not outperformed statistical process control.
There are clear counterexamples to your experience, most notably in maintaining plasma stability in tokamak reactors: https://www.nature.com/articles/s41586-021-04301-9
chaos_emergent
11 days ago
Interesting. Could you point me to where it is compared against SPC? I didn't find it from a cursory read.
kqr
11 days ago
task specific model
whimsicalism
11 days ago
Fundamentally, the pre-trained model would need to learn a "world model" to predict well in distinct domains. This should be possible, setting aside compute requirements and the exact architecture.
After all, the physical world (down to the subatomic level) is governed by physical laws. Ilya Sutskever from OpenAI stated that next-token prediction might be enough to learn a world model (see [1]). That would imply that a model learns a "world model" indirectly, which is even more unrealistic than learning the world model directly through pre-training on time-series data.
[1] https://www.youtube.com/watch?v=YEUclZdj_Sc
shaism
12 days ago
But the data generating process could be literally anything. We are not constrained by physics in any real sense if we are predicting financial markets or occurrences of a certain build error or termite behavior.
whimsicalism
12 days ago
Sure, there are limits. Not everything is predictable, not even physics. But that is also not the point of such a model. The goal is to forecast across a broad range of use cases that do have underlying laws. Similar to LLM, they could also be fine-tuned.
shaism
11 days ago
"predicting the next token well means that you understand the underlying reality that led to the creation of that token"
People on the AI-hype side of things tend to believe this, but I really fundamentally don't.
It's become a philosophical debate at this point (what does it mean to "understand" something, etc.)
wavemode
11 days ago
> I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation.
There's a huge industry around time series forecasting used for all kinds of things like engineering, finance, climate science, etc. and many of the modern ones incorporate some kind of machine learning because they deal with very high dimensional data. Given the very surprising success of LLMs in non-language fields, it seems reasonable that people would work on this.
zeroxfe
12 days ago
Task specific time series models, not time series “foundation models” - we are discussing different things.
whimsicalism
12 days ago
I don't think we are. The premise of this is that the foundation model can learn some kind of baseline ability to reason about forecasting that is generalizable across different domains (each of which needs fine tuning). I don't know if it will find anything, but LLMs totally surprised us, and this kind of thing seems totally worthy of investigation.
zeroxfe
11 days ago
Foundational time series models have been around since 2019 and show competitive levels of performance with task specific models.
They made one of the best time series models and it later became one of the best language models too (Mamba).
cma
11 days ago
I have already watched that talk and know Albert Gu. His work is not about a “foundational” time series model but rather a task specific one.
whimsicalism
11 days ago
OK right, same architecture, but different trained models.
cma
10 days ago
well... if you look at a language in a certain way, it is just a way to put bits in a certain order. if you forget about the 'language' part, it kinda makes sense to try because why shouldn't it work?
baq
12 days ago
I think there are some generalizable notions of multiscale periodicity that could get embedded into some kind of latent space.
klysm
11 days ago
as you say, without knowing anything about the underlying process, we can't predict generally. Some other comments point to contexts in which we do know something about the underlying. For instance, I don't think finance is something where you can apply this kind of stuff.
fedeb95
11 days ago
There was a paper written a while back that proved mathematically how you can correlate any time series with any other time series, thus vaporizing any perception of value gained by correlating time series (at least for those people that read the paper.) just wanted to share
itronitron
11 days ago
What does that mean "you can correlate"? That phrase is meaningless.
bdjsiqoocwk
11 days ago
I would like to read more. Feels sort of like an expression of certain “universal truths” like the 80/20 rule or golden ratio
notnaut
11 days ago
The only other timeseries paper I am aware of is TimeGPT
Not really. It's true it would usually need more context than a single series dataset but you can predict broadly accurate-ish bandwidth usage trends just using simple statistical extrapolation, we've been doing that since the early 90s. If you give a model your subscriber numbers and usage data as time series it should be able to tell you quite accurately how much electricity|bandwidth|gas|road traffic levels| metro passenger levels at station Z... you'll be using at 4pm on January 4th 2026.
matt-p
11 days ago
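Editor's sketch of the "simple statistical extrapolation" mentioned above, assuming it means fitting a straight line to past usage and extending it. The function names and the usage figures are hypothetical; real capacity planning would also account for seasonality and subscriber growth.

```python
def fit_line(ys):
    """Ordinary least squares fit of y = intercept + slope * x for x = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def extrapolate(ys, steps_ahead):
    """Extend the fitted trend `steps_ahead` periods past the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


# Hypothetical monthly peak bandwidth in Gbit/s
usage = [100.0, 110.0, 120.0, 130.0]
forecast = extrapolate(usage, steps_ahead=2)
print(forecast)
```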
On a related note, Amazon also had a model for time series forecasting called Chronos.
And like all deep learning forecasting models thus far, it makes for a nice paper but is not worth anyone using for a real problem. Much slower than the classical methods it fails to beat.
claytonjy
12 days ago
That’s what people said about CV models in 2011.
p1esk
12 days ago
That's fair, but they stopped saying it about CV models in 2012. We've been saying this about foundational forecasting models since...2019 at least, probably earlier. But it is a harder problem!
Something I've had issues with in time series work is having to use relatively custom models.
It's difficult to use off the shelf tools when starting with math models.
toasted-subs
12 days ago
Seems like a pretty small (low latency) model. Would be interesting to hook up to mouse input (x and y) and see how well it predicts where I’m gonna move the mouse (maybe with and without seeing the predicted path)
nwoli
12 days ago
Curious George here: why are you trying to predict where the mouse is going? :)
throwtappedmac
12 days ago
Game developers are constantly trying to minimize lag. I have no idea if computers are so fast these days that it is a "solved" problem, but I knew a game developer ages ago who used a predictive mouse model to reduce the apparent lag by guessing where the mouse would be at the time the frame was displayed (considering it took 30 ms or whatever to render the frame).
tasty_freeze
12 days ago
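Editor's sketch of the predictive mouse model described above, assuming the simplest possible approach: estimate velocity from the last two samples and project it forward by the render latency. The function name, the sample layout, and the numbers are hypothetical; the comment does not say how the original model worked.

```python
def predict_position(prev, last, lead_time_ms):
    """Linearly extrapolate the pointer position `lead_time_ms` into the future.

    prev, last: (time_ms, x, y) samples, with `last` being the newer one.
    Falls back to the last known position if the timestamps do not advance.
    """
    t0, x0, y0 = prev
    t1, x1, y1 = last
    dt = t1 - t0
    if dt <= 0:
        return (x1, y1)
    vx = (x1 - x0) / dt  # pixels per millisecond
    vy = (y1 - y0) / dt
    return (x1 + vx * lead_time_ms, y1 + vy * lead_time_ms)


# Two hypothetical samples 10 ms apart, predicting 30 ms ahead
predicted = predict_position((0, 100.0, 200.0), (10, 104.0, 198.0), lead_time_ms=30)
print(predicted)
```

Constant-velocity extrapolation overshoots whenever the hand decelerates, which is one reason a learned model might be tried instead.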
Quake internet play only became acceptable when client side prediction was implemented, I'm sure it would be better to have real prediction instead of simple interpolation.
What an amazing look into one of the greatest minds in programming!
Thank you for this treasure.
The relevant bits:
> I am now allowing the client to guess at the results of the users movement until the authoritative response from the server comes through. This is a biiiig architectural change. The client now needs to know about solidity of objects, friction, gravity, etc. I am sad to see the elegent client-as-terminal setup go away, but I am practical above idealistic.
> The server is still the final word, so the client is allways repredicting it's movement based off of the last known good message from the server.
ukuina
11 days ago
Competitive online games commonly predict the player's movement. Network latencies have improved and are now usually <16ms (useful milestone since at 60fps you render a frame every 16.6ms), but players expect to still be able to smoothly play when joining from the other side of the continent to play with their friends. You usually want every client to agree where everyone is, and predicting movement leads to less disagreement than what you would get from using "outdated" state because of speed-of-light delays.
If you want to predict not just position but also orientation in a shooter game, that's basically predicting the mouse movements.
wongarsu
12 days ago
The only thing worse than lag is uneven lag, which is what you're going to end up with. Constant lag can be dealt with by players, jitter can't.
orbital-decay
12 days ago
Just to see how good the model is (maybe it’s creepily good in a fun way)
nwoli
12 days ago
There's a fun game idea in there! Imagine having to outmaneuver a constantly learning model. Not to mention the possibilities of using this in genres like bullet hell...
Timon3
12 days ago
Catching cheaters in games might seem like a good use.
brigadier132
12 days ago
Think of the sweet sweet ad revenue!
teaearlgraycold
12 days ago
Haha as if advertisers don't know me better than I know me
throwtappedmac
12 days ago
What is the latency?
jarmitage
12 days ago
"Time series" is such an over-subscribed term. What sorts of time series is this actually useful for?
For instance, will it be able to predict dynamics for a machine with thousands of sensors?
uoaei
12 days ago
Specifically, it's referring to univariate, contiguous point forecasts. Honestly, I'm a little puzzled by the benchmarks.
techwizrd
12 days ago
Even if it were for multivariate time series, the model would first need to infer what machine we are talking about, then its working conditions, and only then make a reasonable forecast based on a hypothesis of its dynamics. I don’t know, seems pretty hard.
sarusso
12 days ago
Indeed. An issue I ran into over and over while doing research for semiconductor manufacturing.
My complaint was more illustrative than earnest.
uoaei
12 days ago
"Why would you even try to predict the weather if you know it's going to be wrong?"
- most OCs on this thread
chaos_emergent
12 days ago
I have a few qualms with this app:
1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
2. It doesn't actually replace a USB drive. Most people I know e-mail files to themselves or host them somewhere online to be able to perform presentations, but they still carry a USB drive in case there are connectivity problems. This does not solve the connectivity issue.
3. It does not seem very "viral" or income-generating. I know this is premature at this point, but without charging users for the service, is it reasonable to expect to make money off of this?
Dear googler or meta-er or timeseries transformer startup something-er: Please make a ChatGPT/chat.lmsys.org style interface for one of these that I can throw data at and see what happens.
This one looks pretty easy to setup, in fairness, but some other models I've looked at have been surprisingly fiddly / locked behind an API.
Perhaps such a thing already exists somewhere?
mhh__
12 days ago
It seems to me that predicting something based on time is rarely accurate and meaningful.
Suppose you want to buy stocks. Would you look at a time-based graph and buy according to that? Or would you rather look at financial data, see earnings, profits? Wouldn't a graph that has financial performance on the x-axis be more meaningful than one that has time?
What if you research real estate in a particular area? Wouldn't square footage be a better measure than time?
DeathArrow
11 days ago
> Would you look at a time-based graph and buy according to that? Or would you rather look at financial data, see earnings, profits?
Things affecting financials happen through time.
Terretta
11 days ago
All things happen through time, but my argument is that time might not be the best parameter to model relations.
DeathArrow
11 days ago
Is anyone using neural networks for anomaly detection in observability? If so, which model and how many metrics are you supporting per core?
How data hungry is it, or what is the minimum volume of data needed before its worth investigating?
morkalork
12 days ago
[deleted]
11 days ago
The more complex the data is, the more you need. If your values are always 5, then you need only one data point.
viraptor
11 days ago
If your values were always 5, you wouldn't use an LSTM to model it either. So presumably there's a threshold for when LSTM becomes practical and useful, no?
morkalork
11 days ago
Sure, that was an extreme example. The point was that the minimum of data is 1 point, maximum is "all of it". It entirely depends on your use case.
viraptor
10 days ago
What do you mean by “observability”?
sarusso
12 days ago
Telemetry. Dashboards. The application is knowing when a signal is anomalous.
Oh, yes I am working on that. Usually LSTM, exploring encoder-decoders and generative models, but also some simpler models based on periodic averages (which are surprisingly useful in some use cases). But I don’t have per-core metrics.
sarusso
12 days ago
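Editor's sketch of the "simpler models based on periodic averages" mentioned above, assuming it means comparing each new point against the historical mean for the same position in the cycle. The function names, the period, and the tolerance are hypothetical choices for illustration.

```python
from collections import defaultdict
from statistics import mean


def phase_means(history, period):
    """Average the history separately for each position within the period."""
    buckets = defaultdict(list)
    for i, value in enumerate(history):
        buckets[i % period].append(value)
    return {phase: mean(values) for phase, values in buckets.items()}


def is_anomalous(index, value, profile, period, tolerance):
    """Flag a point that deviates from its phase average by more than `tolerance`."""
    return abs(value - profile[index % period]) > tolerance


# Hypothetical metric alternating between a quiet and a busy phase
history = [10, 50, 10, 52, 10, 48]
profile = phase_means(history, period=2)

# The same value is normal in the busy phase and anomalous in the quiet one
print(is_anomalous(6, 45, profile, period=2, tolerance=10))
print(is_anomalous(7, 45, profile, period=2, tolerance=10))
```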
Depending on how stable your signal is, I've had good experience with seasonal ARIMA and LOESS (but it's not neural networks)
tiagod
12 days ago
how good is it on stocks?
polskibus
12 days ago
The next index fund should use AI. What could possibly go wrong?
svaha1728
12 days ago
I promise you your market-making counterparties already are.
whimsicalism
12 days ago
What kind of things are they doing with AI?
hackerlight
11 days ago
Predicting price movements, finding good hedges, etc.
whimsicalism
11 days ago
It doesn't apply. Check out the Incerto by Nassim Nicholas Taleb.
fedeb95
11 days ago
if I knew it was good, why would I tell you that?
claytonjy
12 days ago
I'm not sure I understand two things here. Could someone clarify: 1. This is a foundation model, so you're expected to fine tune for your use case, right? (But readme doesn't mention tuning) 2. When submitting two series, do they impact each other in predictions?
viraptor
11 days ago
How can a time series model be pre-trained? I think I’m missing something.
iamgopal
12 days ago
If you have a univariate series, just single values following each other -
[5, 3, 3, 2, 2, 2, 1, …]
What is the next number? Well let’s start with the search space - what is the possible range of the next number? Assuming unsigned 32bit integers (for explanation simplicity) it’s 0-(2^32-1)
So are all of those possible outputs equally likely? The next number could be 1, or it could be 345,654,543 … are those outputs equally likely?
Even though we know nothing about this sequence, most time series don’t make enormous random jumps, so no, they are not equally likely, 1 is the more likely of the two we discussed.
Ok, so some patterns are more likely than others, let’s analyse lots and lots of time series data and see if we can build a generalised model that can be fine tuned or used as a feature extractor.
Many time series datasets have repeating patterns, momentum, symmetries, all of these can be learned. Is it perfect? No, but what model is? And things don’t have to be perfect to be useful.
There you go - that’s a pre-trained time series model in a nutshell
malux85
12 days ago
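Editor's sketch of the argument above that the two candidate next values are not equally likely. It is not pre-training; it only pools step sizes from a few series and asks how often a jump of a given size has been seen. The function names and the small corpus are hypothetical.

```python
def step_sizes(series_collection):
    """Collect the absolute change between consecutive values across all series."""
    steps = []
    for series in series_collection:
        steps.extend(abs(b - a) for a, b in zip(series, series[1:]))
    return steps


def fraction_of_steps_at_least(steps, size):
    """Empirical share of observed steps that were at least `size`."""
    return sum(1 for s in steps if s >= size) / len(steps)


# A tiny hypothetical corpus standing in for "lots and lots of time series"
corpus = [
    [5, 3, 3, 2, 2, 2, 1],
    [10, 11, 13, 12, 12],
    [100, 98, 97, 97, 99],
]
steps = step_sizes(corpus)

last_value = corpus[0][-1]
for candidate in (1, 345_654_543):
    jump = abs(candidate - last_value)
    print(candidate, fraction_of_steps_at_least(steps, jump))
```

Staying at 1 requires a step of 0, which every observed step satisfies, while a jump to 345,654,543 has never been seen in this corpus. A pre-trained model learns a far richer version of this prior.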
Third paragraph of the introduction of the mentioned paper[1] in the first paragraph of the repo.
I guess they pre-trained the model to exploit common patterns found in any time-series (e.g., seasonalities, trends, etc.)... What would be interesting, though, is to see if it spots patterns that are domain-specific (e.g., the ventricular systole dip in an electrocardiogram), and possibly transfer those (that would be obviously useless in this specific example, but maybe there are interesting domain transfers out there)
jurgenaut23
12 days ago
My understanding is that, while your eye can naturally spot a dependency over time in time series data, machines can’t. So as we did for imaging, where we pre-trained models to let machines easily identify objects in pictures, now we are doing the same to let machines “see” dependencies over time. Then, how these dependencies work, this is another story.
sarusso
12 days ago
[deleted]
11 days ago
Anyone have insights working with Ikigai’s “Large Graphical Model” and how well it does on time-series? It’s proprietary, but I’m curious how well it performs.
hm-nah
11 days ago
Would this be useful in predicting lat/long coordinates along a path? To mitigate issues with GPS drift.
If not, what would be a useful model?
aantix
12 days ago
Map matching to a road network might be helpful here. For example, a Hidden Markov Model gives good results. See for instance this paper:
"Hidden Markov map matching through noise and sparseness" (2009)
I imagine they're both worse than good old exponential smoothing or SARIMAX.
VHRanger
12 days ago
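Editor's sketch of the "good old exponential smoothing" baseline mentioned above, assuming the simple (level-only) variant with no trend or seasonal terms. The function name, the smoothing factor, and the data are hypothetical.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the one-step-ahead forecast from simple exponential smoothing.

    The level starts at the first observation and is updated as
    level = alpha * observation + (1 - alpha) * level.
    """
    if not values:
        raise ValueError("need at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]
    for value in values[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical observations
observations = [10.0, 12.0, 11.0, 13.0]
forecast = simple_exponential_smoothing(observations, alpha=0.5)
print(forecast)
```

A larger `alpha` tracks recent values more closely, while a smaller one averages over a longer history.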
Depends on use case. Hybrid approaches have been dominating the M-Competitions, but there are generally small percentage differences in variance of statistical models vs machine learning models.
And exponentially higher cost for ML models.
Pseudocrat
12 days ago
At the end of the day, if training or doing inference on the ML model is massively more costly in time or compute, you'll iterate much less with it.
I also think it's a dead end to try to have foundation models for "time series" - it's a class of data! Like when people tried to have foundation models for any general graph type.
You could make foundation models for data within that type - eg. meteorological time series, or social network graphs. But for the abstract class type it seems like a dead end.
VHRanger
12 days ago
These models may be helpful if they speed up convergence when fine tuned on business-specific time series.
rockinghigh
12 days ago
so this TimesFM is also in the same category as TimeGPT from nixtlaverse?
dangerclose
9 days ago
is there a ranking of the methods that actually work on benchmark datasets? Hybrid, "ML" or old stats? I remember eamonnkeogh doing this on r/ML a few years ago.
SpaceManNabs
12 days ago
Prophet was pretty bad so yes, but it doesn't seem much better than ARIMA
efrank3
11 days ago
what about NeuralProphet, which came after Prophet? some companies like Mixpanel mention in their documentation that they are using Prophet for forecasting/anomaly detection
dangerclose
9 days ago
If I give this model the first 100 prime numbers, does it give me back the rest of it? If so what is the circuit?
celltalk
11 days ago
how is the series of the first 100 prime numbers a time series?
fedeb95
11 days ago
everything can be time series if you want
celltalk
2 days ago
Prophet 2.0
htrp
11 days ago
When it comes to time series forecasting, if the method actually works, it sure as hell isn't being publicly released.
optimalsolver
12 days ago
Some time series are more predictable than others. Being good at predicting the predictable ones is useful.
For example you can easily predict the weather with decent accuracy. Tomorrow is going to be about the same as today. From there you can work on better models.
Or predicting a failure in a factory because a vibration pattern on an industrial machine always ended up in a massive failure after a few days.
But I agree that if a model is good at predicting the stock market, it’s not going to be released.
speedgoose
12 days ago
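Editor's sketch of the "tomorrow is going to be about the same as today" baseline described above, usually called a persistence or naive forecast, scored with mean absolute error. The function names and the temperatures are hypothetical.

```python
def persistence_forecasts(values):
    """Forecast each next value as the current one; returns len(values) - 1 forecasts."""
    return list(values[:-1])


def mean_absolute_error(actuals, forecasts):
    """Average absolute difference between actual values and forecasts."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


# Hypothetical daily high temperatures in degrees Celsius
temperatures = [20.0, 21.0, 19.0, 22.0]
forecasts = persistence_forecasts(temperatures)
error = mean_absolute_error(temperatures[1:], forecasts)
print(error)
```

Any more elaborate model is worth its cost only if it beats this error on held-out data.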
and yet we have those huge llamas publicly available. these are computers that talk, dammit
I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation.
I've worked on language models since 2018, even then it was obvious why language was a useful and transferable task. I do not at all feel the same way about general univariate time series that could have any underlying process.
whimsicalism
12 days ago
Time series data are inherently context sensitive, unlike natural languages which follow predictable grammar patterns. The patterns in time series data vary based on context. For example, flight data often show seasonal trends, while electric signals depend on the type of sensor used. There's also data that appear random, like stock data, though firms like Rentech manage to consistently find unlerlying alphas. Training a multivariate time series data would be challenging, but I don't see why not for specific applications.
wuj
12 days ago
Is Rentech the only group that genuinely manages to predict stock price? Seems like the very observation that it’s still possible would be enough motivation for other groups to catch up over such a long period.
Also, the first realistic approximation of Solomonoff induction we achieve is going to be interesting because it will destroy the stock market.
Xcelerate
11 days ago
Rentech does not seem to be able to predict the stock market for their customers...
"Jim Simons' Renaissance Technologies suffers $11 billion of client withdrawals in 7 months" - https://markets.businessinsider.com/news/stocks/jim-simons-r...
belter
11 days ago
I’m referring to RenTech’s well known Medallion fund, which I believe is now only available internally to longtime employees. Even in the article you linked, it says this fund has “continued to shine”.
Xcelerate
10 days ago
If you think about it a little bit...And you read "Fooled By Randomness", there are 20 other tricks they could be playing here...Instead of "predicting" the market.
belter
10 days ago
Maybe that would be a good thing. I wouldn't mourn the destruction of the stock market as it's just a giant wealth-gap increasing casino. Trading has nothing to do with underlying value.
amelius
11 days ago
I fully agree. The stock market is just a giant machine that pulls money out of systems.
pvorb
11 days ago
>The stock market is just a giant machine that pulls money out of systems.
So you think the multi-trillion dollar stock market, consisting of thousands of global companies, has no use beyond "pulling money out of systems"? Weird.
itsoktocry
10 days ago
https://en.wikipedia.org/wiki/Tulip_mania
Just to say, weirdness happens.
amelius
10 days ago
Agreed, if stock prices were predictable by some technical means, they would be quickly driven to unpredictability by people trading on those technical indicators.
icapybara
11 days ago
This is that old finance chestnut. Two finance professors are walking down the hall and one of them spots a twenty dollar bill. He goes to pick it up but the other professor stops him and says "no don't bother. If there was twenty dollars there someone would have already picked it up"
Yes, people arbitrage away these anomalies, and make billions doing it.
frankc
11 days ago
My point was more that the arbitrage is not systematically predictable, not that the arbitrage doesn’t exist.
icapybara
10 days ago
And that would make the stock markets accessible to fewer people, further widening the wealth gap.
amelius
10 days ago
Why do you think language is so special?
There's an extensive body of literature across numerous domains that demonstrates the benefits of Multi-Task Learning (MTL). Actually I have a whole folder of research papers on this topic, here's one of the earliest references on hand that I feel captures the idea succinctly in the context of modern ML:
“MTL improves generalization by leveraging the domain-specific information contained in the training signals of related tasks" [Caruana, 1998]
I see repetition and structure everywhere in life. To me it's not far fetched that a model trained on daily or yearly trends could leverage that information in the context of e.g. biological signals which are influenced by circadian rhythm etc.
Disclaimer: my background is in ML & bio-signals, I work with time series too much.
refibrillator
11 days ago
For those who haven't read it, Rich Caruana's thesis on multi-task learning is beautifully written (the cited 1998 paper here). It's amazing to see how far the field has come, and, at the same time, how advanced the thinking was in the 90s too.
owl_brawl
11 days ago
The things that we are typically interested in have very clear patterns. In a way, if we find that there are no patterns, we don't even try to do any forecasting.
"The Unreasonable Effectiveness of Mathematics in the Natural Sciences" [1] hints that there might be some value here.
[1] https://en.m.wikipedia.org/wiki/The_Unreasonable_Effectivene...
smokel
12 days ago
Exactly, so for example, I think the use of this model is in cases where you want user count to have some pattern around timing. And be alerted if it has spike.
But you wouldn't want this model for file upload storage usage which only increases, where you would put alerts based on max values and not patterns/periodic values.
yonixw
12 days ago
Why not? There are plenty of time series that have underlying patterns which means you can do better than a total guess even without any knowledge of what you are predicting.
Think about something like traffic patterns. You probably won't predict higher traffic on game days, but predicting rush hour is going to be pretty trivial.
IshKebab
12 days ago
There is potential for integrating ML with time series data in industrial applications (things like smelters, reactors etc.), where you have continuous stream of time series measurements from things like gauges and thermocouples. If you can detect (and respond) to changing circumstances faster then a humans in control room reacting to trends or alarms then potential big efficiency gains...
Operator guidance is often based on heuristics - when metric A exceeds X value for Y seconds take action Z. Or rates of change if the signal is changing at a rate of more than x etc.
So in these areas there exists potential for ML solution, especially if it's capable of learning (i.e. last response overshot by X so trim next response appropriately).
bigger_cheese
11 days ago
Every time i've actually tried something like this it has not outperformed statistical process control.
It's not just that control charts are great signal detectors, but also managing processes like that takes a certain statistical literacy one gets from applying SPC faithfully for a while, and does not get from tossing ML onto it and crossing fingers.
kqr
11 days ago
> Every time i've actually tried something like this it has not outperformed statistical process control.
There are clear counterexamples to your experience, most notably in maintaining plasma stability in tokamak reactors: https://www.nature.com/articles/s41586-021-04301-9
chaos_emergent
11 days ago
Interesting. Could you point me to where it is compared against SPC? I didn't find it from a cursory read.
kqr
11 days ago
task specific model
whimsicalism
11 days ago
Fundamentally, the pre-trained model would need to learn a "world model" to predict well in distinct domains. This should be possible in principle, setting aside compute requirements and the exact architecture.
After all, the physical world (down to the subatomic level) is governed by physical laws. Ilya Sutskever from OpenAI stated that next-token prediction might be enough to learn a world model (see [1]). That would imply that a model learns a "world model" indirectly, which is even more unrealistic than learning the world model directly through pre-training on time-series data.
[1] https://www.youtube.com/watch?v=YEUclZdj_Sc
shaism
12 days ago
But the data generating process could be literally anything. We are not constrained by physics in any real sense if we are predicting financial markets or occurrences of a certain build error or termite behavior.
whimsicalism
12 days ago
Sure, there are limits. Not everything is predictable, not even physics. But that is also not the point of such a model. The goal is to forecast across a broad range of use cases that do have underlying laws. Similar to LLMs, they could also be fine-tuned.
shaism
11 days ago
"predicting the next token well means that you understand the underlying reality that led to the creation of that token"
People on the AI-hype side of things tend to believe this, but I really fundamentally don't.
It's become a philosophical debate at this point (what does it mean to "understand" something, etc.)
wavemode
11 days ago
> I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation.
There's a huge industry around time series forecasting used for all kinds of things like engineering, finance, climate science, etc. and many of the modern ones incorporate some kind of machine learning because they deal with very high dimensional data. Given the very surprising success of LLMs in non-language fields, it seems reasonable that people would work on this.
zeroxfe
12 days ago
Task specific time series models, not time series “foundation models” - we are discussing different things.
whimsicalism
12 days ago
I don't think we are. The premise of this is that the foundation model can learn some kind of baseline ability to reason about forecasting that is generalizable across different domains (each of which needs fine-tuning). I don't know if it will find anything, but LLMs totally surprised us, and this kind of thing seems totally worthy of investigation.
zeroxfe
11 days ago
Foundational time series models have been around since 2019 and show competitive levels of performance with task specific models.
https://arxiv.org/abs/1905.10437
cscurmudgeon
11 days ago
+1 for “any underlying process”. It would be interesting to know what use case they had in mind.
sarusso
12 days ago
Watch this talk from Albert Gu:
Efficiently Modeling Long Sequences with Structured State Spaces
https://www.youtube.com/watch?v=luCBXCErkCs
They made one of the best time series models and it later became one of the best language models too (Mamba).
cma
11 days ago
I have already watched that talk and know Albert Gu. His work is not about a “foundational” time series model but rather a task specific one.
whimsicalism
11 days ago
OK right, same architecture, but different trained models.
cma
10 days ago
well... if you look at a language in a certain way, it is just a way to put bits in a certain order. if you forget about the 'language' part, it kinda makes sense to try because why shouldn't it work?
baq
12 days ago
I think there are some generalizable notions of multiscale periodicity that could get embedded into some kind of latent space.
klysm
11 days ago
as you say, without knowing anything about the underlying process, we can't predict generally. Some other comments point to contexts in which we do know something about the underlying. For instance, I don't think finance is something where you can apply this kind of stuff.
fedeb95
11 days ago
There was a paper written a while back that proved mathematically how you can correlate any time series with any other time series, thus vaporizing any perception of value gained by correlating time series (at least for those people that read the paper.) just wanted to share
itronitron
11 days ago
What does that mean "you can correlate"? That phrase is meaningless.
bdjsiqoocwk
11 days ago
I would like to read more. Feels sort of like an expression of certain “universal truths” like the 80/20 rule or golden ratio
notnaut
11 days ago
The only other timeseries paper I am aware of is TimeGPT
https://news.ycombinator.com/item?id=37874891
jimmySixDOF
11 days ago
Do you have a link to the paper?
nextaccountic
11 days ago
Not really. It's true it would usually need more context than a single series dataset but you can predict broadly accurate-ish bandwidth usage trends just using simple statistical extrapolation, we've been doing that since the early 90s. If you give a model your subscriber numbers and usage data as time series it should be able to tell you quite accurately how much electricity|bandwidth|gas|road traffic levels| metro passenger levels at station Z... you'll be using at 4pm on January 4th 2026.
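As a rough illustration of the "simple statistical extrapolation" mentioned above, this sketch fits a least-squares line to evenly spaced observations and extends it forward. It ignores seasonality, and the function names are made up for the example.

```python
def fit_line(ys):
    """Least-squares slope and intercept for ys observed at x = 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def extrapolate(ys, steps_ahead):
    """Predict the value `steps_ahead` steps after the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)
```

With monthly usage of `[100, 110, 120, 130]`, `extrapolate(ys, 2)` continues the fitted trend two months past the last point.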
matt-p
11 days ago
On a related note, Amazon also had a model for time series forecasting called Chronos.
https://github.com/amazon-science/chronos-forecasting
wuj
12 days ago
And like all deep learning forecasting models thus far, it makes for a nice paper but is not worth anyone using for a real problem. Much slower than the classical methods it fails to beat.
claytonjy
12 days ago
That’s what people said about CV models in 2011.
p1esk
12 days ago
That's fair, but they stopped saying it about CV models in 2012. We've been saying this about foundational forecasting models since...2019 at least, probably earlier. But it is a harder problem!
claytonjy
12 days ago
They also have Amazon Forecast with different algos - https://aws.amazon.com/forecast/
belter
11 days ago
Something I've had issues with in time series is having to use relatively custom models.
It's difficult to use off the shelf tools when starting with math models.
toasted-subs
12 days ago
Seems like a pretty small (low latency) model. Would be interesting to hook up to mouse input (x and y) and see how well it predicts where I’m gonna move the mouse (maybe with and without seeing the predicted path)
nwoli
12 days ago
Curious George here: why are you trying to predict where the mouse is going? :)
throwtappedmac
12 days ago
Game developers are constantly trying to minimize lag. I have no idea if computers are so fast these days that it is a "solved" problem, but I knew a game developer ages ago who used a predictive mouse model to reduce the apparent lag by guessing where the mouse would be at the time the frame was displayed (considering it took 30 ms or whatever to render the frame).
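A minimal sketch of the kind of predictive mouse model described above, assuming plain linear extrapolation from the last two samples. The actual model that developer used is not described, so treat this as a stand-in with hypothetical names.

```python
def predict_position(samples, lookahead):
    """Extrapolate the pointer position `lookahead` seconds past the last sample.

    samples: list of (t_seconds, x, y) tuples in increasing time order,
             with at least two entries.
    """
    (t0, x0, y0), (t1, x1, y1) = samples[-2], samples[-1]
    dt = t1 - t0
    if dt <= 0:
        # No usable velocity estimate: fall back to the last known position.
        return (x1, y1)
    vx = (x1 - x0) / dt
    vy = (y1 - y0) / dt
    return (x1 + vx * lookahead, y1 + vy * lookahead)
```

For a 30 ms render delay you would call `predict_position(samples, 0.030)` and draw the cursor at the returned point.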
tasty_freeze
12 days ago
Quake internet play only became acceptable when client side prediction was implemented. I'm sure it would be better to have real prediction instead of simple interpolation.
https://raw.githubusercontent.com/ESWAT/john-carmack-plan-ar...
aeyes
12 days ago
What an amazing look into one of the greatest minds in programming!
Thank you for this treasure.
The relevant bits:
> I am now allowing the client to guess at the results of the users movement until the authoritative response from the server comes through. This is a biiiig architectural change. The client now needs to know about solidity of objects, friction, gravity, etc. I am sad to see the elegent client-as-terminal setup go away, but I am practical above idealistic.
> The server is still the final word, so the client is allways repredicting it's movement based off of the last known good message from the server.
ukuina
11 days ago
Competitive online games commonly predict the player's movement. Network latencies have improved and are now usually <16ms (useful milestone since at 60fps you render a frame every 16.6ms), but players expect to still be able to smoothly play when joining from the other side of the continent to play with their friends. You usually want every client to agree where everyone is, and predicting movement leads to less disagreement than what you would get from using "outdated" state because of speed-of-light delays.
If you want to predict not just position but also orientation in a shooter game, that's basically predicting the mouse movements.
wongarsu
12 days ago
The only thing worse than lag is uneven lag, which is what you're going to end up with. Constant lag can be dealt with by players, jitter can't.
orbital-decay
12 days ago
Just to see how good the model is (maybe it’s creepily good in a fun way)
nwoli
12 days ago
There's a fun game idea in there! Imagine having to outmaneuver a constantly learning model. Not to mention the possibilities of using this in genres like bullet hell...
Timon3
12 days ago
Catching cheaters in games might seem like a good use.
brigadier132
12 days ago
Think of the sweet sweet ad revenue!
teaearlgraycold
12 days ago
Haha as if advertisers don't know me better than I know me
throwtappedmac
12 days ago
What is the latency?
jarmitage
12 days ago
"Time series" is such an over-subscribed term. What sorts of time series is this actually useful for?
For instance, will it be able to predict dynamics for a machine with thousands of sensors?
uoaei
12 days ago
Specifically, it's referring to univariate, contiguous point forecasts. Honestly, I'm a little puzzled by the benchmarks.
techwizrd
12 days ago
Even if it was for multivariate time series, the model would first need to infer which machine we are talking about, then its working conditions, and only then make a reasonable forecast based on a hypothesis of its dynamics. I don’t know, seems pretty hard.
sarusso
12 days ago
Indeed. An issue I ran into over and over while doing research for semiconductor manufacturing.
My complaint was more illustrative than earnest.
uoaei
12 days ago
"Why would you even try to predict the weather if you know it's going to be wrong?"
- most OCs on this thread
chaos_emergent
12 days ago
I have a few qualms with this app:
1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
2. It doesn't actually replace a USB drive. Most people I know e-mail files to themselves or host them somewhere online to be able to perform presentations, but they still carry a USB drive in case there are connectivity problems. This does not solve the connectivity issue.
3. It does not seem very "viral" or income-generating. I know this is premature at this point, but without charging users for the service, is it reasonable to expect to make money off of this?
david_shi
11 days ago
Blog link (Feb 2024): https://research.google/blog/a-decoder-only-foundation-model...
Previous discussion: https://news.ycombinator.com/item?id=39235983
l2dy
12 days ago
Dear googler or meta-er or timeseries transformer startup something-er: Please make a ChatGPT/chat.lmsys.org style interface for one of these that I can throw data at and see what happens.
This one looks pretty easy to set up, in fairness, but some other models I've looked at have been surprisingly fiddly / locked behind an API.
Perhaps such a thing already exists somewhere?
mhh__
12 days ago
It seems to me that predicting something based on time is rarely accurate and meaningful.
Suppose you want to buy stocks? Would you look on a time based graph and buy according to that? Or you rather look at financial data, see earnings, profits? Wouldn't a graph that has financial performance on the x-axis be more meaningful than one that has time?
What if you research real estate in a particular area? Wouldn't square footage be a better measure than time?
DeathArrow
11 days ago
> Would you look on a time based graph and buy according to that? Or you rather look at financial data, see earnings, profits?
Things affecting financials happen through time.
Terretta
11 days ago
All things happen through time, but my argument is that time might not be the best parameter to model relations.
DeathArrow
11 days ago
Is anyone using neural networks for anomaly detection in observability? If so, which model and how many metrics are you supporting per core?
esafak
12 days ago
LSTM is common for this.
also https://facebook.github.io/prophet/
leeoniya
12 days ago
How data hungry is it, or what is the minimum volume of data needed before it's worth investigating?
morkalork
12 days ago
The more complex the data is, the more you need. If your values are always 5, then you need only one data point.
viraptor
11 days ago
If your values were always 5, you wouldn't use an LSTM to model it either. So presumably there's a threshold for when LSTM becomes practical and useful, no?
morkalork
11 days ago
Sure, that was an extreme example. The point was that the minimum of data is 1 point, maximum is "all of it". It entirely depends on your use case.
viraptor
10 days ago
What do you mean by “observability”?
sarusso
12 days ago
Telemetry. Dashboards. The application is knowing when a signal is anomalous.
https://en.wikipedia.org/wiki/Observability_(software)
esafak
12 days ago
Oh, yes I am working on that. Usually LSTM, exploring encoder-decoders and generative models, but also some simpler models based on periodic averages (which are surprisingly useful in some use cases). But I don’t have per-core metrics.
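To make the "periodic averages" idea concrete, here is a rough sketch: compute a mean and spread per slot of the period (for example, per hour of the day) and flag values that fall far from their slot's average. The names and the 3-sigma default are assumptions for illustration, not the setup used in the comment above.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fit_periodic_profile(values, period):
    """Map each slot (index modulo period) to its (mean, std) over the history."""
    buckets = defaultdict(list)
    for i, v in enumerate(values):
        buckets[i % period].append(v)
    return {slot: (mean(vs), pstdev(vs)) for slot, vs in buckets.items()}


def is_anomalous(profile, period, index, value, n_sigmas=3.0, min_spread=1e-9):
    """True if `value` at position `index` is far from its slot's average."""
    avg, spread = profile[index % period]
    # min_spread guards against a zero standard deviation in a slot.
    return abs(value - avg) > n_sigmas * max(spread, min_spread)
```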
sarusso
12 days ago
Depending on how stable your signal is, I've had good experience with seasonal ARIMA and LOESS (but it's not neural networks)
tiagod
12 days ago
how good is it on stocks?
polskibus
12 days ago
The next index fund should use AI. What could possibly go wrong?
svaha1728
12 days ago
I promise you your market-making counterparties already are.
whimsicalism
12 days ago
What kind of things are they doing with AI?
hackerlight
11 days ago
Predicting price movements, finding good hedges, etc.
whimsicalism
11 days ago
it doesn't apply. Check out the Incerto by Nassim Nicholas Taleb.
fedeb95
11 days ago
if I knew it was good, why would I tell you that?
claytonjy
12 days ago
I'm not sure I understand two things here. Could someone clarify:
1. This is a foundation model, so you're expected to fine tune for your use case, right? (But readme doesn't mention tuning)
2. When submitting two series, do they impact each other in predictions?
viraptor
11 days ago
How can a time series model be pre-trained? I think I’m missing something.
iamgopal
12 days ago
If you have a univariate series, just single values following each other -
[5, 3, 3, 2, 2, 2, 1, …]
What is the next number? Well let’s start with the search space - what is the possible range of the next number? Assuming unsigned 32bit integers (for explanation simplicity) it’s 0-(2^32-1)
So are all of those possible outputs equally likely? The next number could be 1, or it could be 345,654,543 … are those outputs equally likely?
Even though we know nothing about this sequence, most time series don’t make enormous random jumps, so no, they are not equally likely, 1 is the more likely of the two we discussed.
Ok, so some patterns are more likely than others, let’s analyse lots and lots of time series data and see if we can build a generalised model that can be fine tuned or used as a feature extractor.
Many time series datasets have repeating patterns, momentum, symmetries, all of these can be learned. Is it perfect? No, but what model is? And things don’t have to be perfect to be useful.
There you go - that’s a pre-trained time series model in a nutshell
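A toy sketch of the intuition above, under a loud assumption: this is not how TimesFM or any real foundation model works. It just counts how often each step size occurs across a corpus of series, then uses those frequencies to score candidate next values. All names are hypothetical.

```python
from collections import Counter


def step_distribution(corpus):
    """Empirical distribution of step sizes (next - previous) across many series."""
    counts = Counter()
    for series in corpus:
        for prev, nxt in zip(series, series[1:]):
            counts[nxt - prev] += 1
    total = sum(counts.values())
    return {step: n / total for step, n in counts.items()}


def next_value_probability(dist, last_value, candidate):
    """Probability assigned to `candidate` following `last_value`."""
    return dist.get(candidate - last_value, 0.0)
```

Trained on a corpus where small steps dominate, this assigns far more probability to a next value of 1 than to 345,654,543 after a last value of 1, which is the point being made: some continuations are much more likely than others even without knowing the domain.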
malux85
12 days ago
Third paragraph of the introduction of the mentioned paper[1] in the first paragraph of the repo.
[1] https://arxiv.org/abs/2310.10688
melenaboija
12 days ago
I guess they pre-trained the model to exploit common patterns found in any time-series (e.g., seasonalities, trends, etc.)... What would be interesting, though, is to see if it spots patterns that are domain-specific (e.g., the ventricular systole dip in an electrocardiogram), and possibly transfer those (that would be obviously useless in this specific example, but maybe there are interesting domain transfers out there)
jurgenaut23
12 days ago
My understanding is that, while your eye can naturally spot a dependency over time in time series data, machines can’t. So as we did for imaging, where we pre-trained models to let machines easily identify objects in pictures, now we are doing the same to let machines “see” dependencies over time. Then, how these dependencies work, this is another story.
sarusso
12 days ago
Anyone have insights working with Ikigai’s “Large Graphical Model” and how well it does on time-series? It’s proprietary, but I’m curious how well it performs.
hm-nah
11 days ago
Would this be useful in predicting lat/long coordinates along a path? To mitigate issues with GPS drift.
If not, what would be a useful model?
aantix
12 days ago
Map matching to a road network might be helpful here. For example, a Hidden Markov Model gives good results. See for instance this paper:
"Hidden Markov map matching through noise and sparseness" (2009)
https://www.microsoft.com/en-us/research/wp-content/uploads/...
smokel
12 days ago
Kalman filter
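A minimal one-dimensional Kalman filter sketch, assuming a random-walk state model and hand-picked noise variances. For GPS you would typically track latitude and longitude (and often velocity) together; this scalar version only shows the predict/update cycle, and the function name is made up.

```python
def kalman_smooth(measurements, process_var, measurement_var):
    """Filter noisy scalar measurements with a random-walk Kalman filter."""
    estimate = measurements[0]   # initial state: first measurement
    error_var = measurement_var  # initial uncertainty of that state
    filtered = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1 - gain
        filtered.append(estimate)
    return filtered
```

A larger `measurement_var` relative to `process_var` makes the filter trust new readings less, which smooths out drift at the cost of lagging behind real movement.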
bbstats
11 days ago
is it better than prophet from meta?
dangerclose
12 days ago
I imagine they're both worse than good old exponential smoothing or SARIMAX.
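For reference, simple exponential smoothing fits in a few lines. This sketch uses the standard recurrence level = alpha * value + (1 - alpha) * level, with the first observation as the initial level; the function name is made up for the example.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the one-step-ahead forecast after smoothing `values`.

    alpha in (0, 1]: higher values weight recent observations more.
    """
    level = values[0]  # initialize the level with the first observation
    for v in values[1:]:
        level = alpha * v + (1 - alpha) * level
    return level
```

Any deep model proposed for forecasting should at least be compared against a baseline this cheap.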
VHRanger
12 days ago
Depends on use case. Hybrid approaches have been dominating the M-Competitions, but there are generally small percentage differences in variance of statistical models vs machine learning models.
And exponentially higher cost for ML models.
Pseudocrat
12 days ago
At the end of the day, if training or doing inference on the ML model is massively more costly in time or compute, you'll iterate much less with it.
I also think it's a dead end to try to have foundation models for "time series" - it's a class of data! Like when people tried to have foundation models for any general graph type.
You could make foundation models for data within that type - eg. meteorological time series, or social network graphs. But for the abstract class type it seems like a dead end.
VHRanger
12 days ago
These models may be helpful if they speed up convergence when fine tuned on business-specific time series.
rockinghigh
12 days ago
so this TimesFM is also in the same category as TimeGPT from nixtlaverse?
dangerclose
9 days ago
is there a ranking of the methods that actually work on benchmark datasets? Hybrid, "ML" or old stats? I remember eamonnkeogh doing this on r/ML a few years ago.
SpaceManNabs
12 days ago
Prophet was pretty bad so yes, but it doesn't seem much better than ARIMA
efrank3
11 days ago
what about NeuralProphet, which came after Prophet? some companies like Mixpanel mention in their documentation that they are using Prophet for forecasting/anomaly detection
dangerclose
9 days ago
If I give this model the first 100 prime numbers, does it give me back the rest of it? If so what is the circuit?
celltalk
11 days ago
how is the series of the first 100 prime numbers a time series?
fedeb95
11 days ago
everything can be time series if you want
celltalk
2 days ago
Prophet 2.0
htrp
11 days ago
When it comes to time series forecasting, if the method actually works, it sure as hell isn't being publicly released.
optimalsolver
12 days ago
Some time series are more predictable than others. Being good at predicting the predictable ones is useful.
For example you can easily predict the weather with decent accuracy. Tomorrow is going to be about the same as today. From there you can work on better models.
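The "same as today" rule is usually called a persistence (or naive) forecast. A short sketch of it, plus a mean absolute error helper for scoring, is shown here; the function names are made up for illustration.

```python
def persistence_forecast(values):
    """Forecast each next step as the previous observation.

    Returns forecasts aligned with values[1:].
    """
    return values[:-1]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For daily temperatures `temps`, `mean_absolute_error(temps[1:], persistence_forecast(temps))` gives the score that a better model needs to improve on.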
Or predicting a failure in a factory because a vibration pattern on an industrial machine always ended up in a massive failure after a few days.
But I agree that if a model is good at predicting the stock market, it’s not going to be released.
speedgoose
12 days ago
and yet we have those huge llamas publicly available. these are computers that talk, dammit
baq
12 days ago