So this talk is about Time Series Databases, AKA Temporal Databases. You might see them referred to as TSDB or TSDS.
Now if your job is very boring (or perhaps, on the contrary, extremely exciting!) you may have met those as "Operational Historians" or "Enterprise Historians".
This very short talk is a public service of sorts from the BuzzProphets.
Those of you who know me know I have my ear firmly stuck to the ground.
Oh, probably the vast majority of you don't know me, so...
Hello, my name is Ori Pekelman. I am OriPekelman everywhere (Twitter/GitHub/LinkedIn).
My blog is on http://blog.constellationmatrix.com
My company, Constellation Matrix, does software architecture and I made myself a nice logo.
Back to our business of holding one's ear to the ground in a very uncomfortable position. And to TSDSs.
So a Time Series Database is a speciality thing, a domain-specific piece of software. These have been around for a while.
You can see them in control and industrial systems, where they record every last damn thing that is happening.
In financial systems, in energy. Companies like GE and Honeywell produce this kind of software.
There are even guys in this domain who get to have a two-letter .com domain: http://kx.com
This is a very short talk so I can't really delve into this. But consider that the only operation we know how to do, reliably, in a distributed system is insert.
We can basically always succeed in writing somewhere: your operations or binary log, your write-side cache, your append-only file... whatever.
User A has Expressed Intent B on Data Piece C
But this information is basically useless if we don't put it in a timeframe.
User A has Expressed Intent B on Data Piece C at moment D
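To make that concrete, here is a minimal sketch in Python of the "insert only, but with a moment D" idea. Everything in it (the `Event` and `AppendOnlyLog` names, the JSON-lines-in-memory storage) is an assumption made up for illustration; it is not how any of the products in this talk work.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """User A has Expressed Intent B on Data Piece C at moment D."""
    user: str     # A
    intent: str   # B
    target: str   # C
    at: float     # D, seconds since the epoch


class AppendOnlyLog:
    """Hypothetical in-memory log: the only write operation is append."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def append(self, user: str, intent: str, target: str,
               at: Optional[float] = None) -> Event:
        # Stamp the event at write time unless the caller supplies a moment.
        event = Event(user, intent, target, time.time() if at is None else at)
        self._lines.append(json.dumps(asdict(event)))
        return event

    def between(self, start: float, end: float) -> List[Event]:
        # Time is the one query axis we always have: start <= at < end.
        events = (Event(**json.loads(line)) for line in self._lines)
        return [e for e in events if start <= e.at < end]
```

Nothing is ever updated or deleted here; the timestamp is the only thing that lets you ask a question of the pile afterwards, which is the whole point.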
SOO....
Yet I think that everybody is kind of going to wake up in the coming months slapping their foreheads and saying: we are all implementing, at a stupidly low level, the same patterns over the same types of data. Can't we get something off the shelf for this?
I got interested in this while thinking about the following: on an e-commerce site, how do I identify, in real time, customers who are hesitating, considering they probably have multiple tabs open?
You can see this emerge already with the very interesting way ElasticSearch is developing (look at Kibana3 if you haven't).
Very recently druid.io came out.
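As a rough sketch of what that question looks like once the events carry a timestamp: the rule below ("viewed the same product at least three times in the last ten minutes and bought nothing") is purely my assumption of what "hesitating" could mean, and the event shape and the `hesitating_users` name are hypothetical. Keying on the user rather than on the browser tab is what takes care of the multiple tabs.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (user, intent, product, timestamp in seconds); a hypothetical event shape.
Event = Tuple[str, str, str, float]


def hesitating_users(events: Iterable[Event], now: float,
                     window: float = 600.0, min_views: int = 3) -> Set[str]:
    """Users who keep viewing one product inside the window but never buy."""
    views: Dict[Tuple[str, str], int] = defaultdict(int)
    bought: Set[str] = set()
    for user, intent, product, at in events:
        if not (now - window <= at <= now):
            continue  # outside the time window, ignore
        if intent == "purchase":
            bought.add(user)
        elif intent == "view":
            # Counted per user and product, whichever tab it came from.
            views[(user, product)] += 1
    return {user for (user, _product), count in views.items()
            if count >= min_views and user not in bought}
```

Calling `hesitating_users(events, now=time.time())` over the recent slice of the log gives the set of users to nudge. This is exactly the kind of low-level pattern everybody ends up rewriting by hand.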
Look at what counts as a first-class object with them.
This is the canonical example on their homepage
For more of a purist take, look at https://tempo-db.com
You can see this also in very interesting projects like http://skydb.io/ and this will probably be an even stronger trend: Time Series Databases for behavioural data.
Because the whole big data thing... well, it's around that, right?
Keen.io is another service geared specifically to analytics.
You might also want to have a look at Cube, the two-year-old MongoDB-based solution from Square: http://square.github.io/cube/
There are of course OpenTSDB and KairosDB (for HBase and Cassandra respectively). We even see the "as a service" crowd emerging on this segment (have a look at TempoDB, Keen).
You know it's cool when even the Zindoz guyz have one: http://openhistorian.codeplex.com/
I am OriPekelman everywhere (Twitter/GitHub/LinkedIn).