A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.
How do I model daily stock price data to support the following use case?
I want to be able to find the high or low for a given n-day period. For example, the 30-day period ending on June 19, 2006. The problem is that not every day is a trading day, so you cannot just subtract thirty days to get the start date.
My idea was to use a sequential artifical id:
So, to get my high/low:
a = select id from price where date='06-19-2006'
b = a - 30
select max(close) from price where id >=b and id <=a
The sequential id's would be generated when the data is imported.
This is feasible, but it seems unduly limiting and klunky. Am I missing some obvious solution? Should I store a calendar of business days in my database to enable calculation of date ranges? Does SQL already support what I need to do - with cursors, maybe? Is it possible to say "select the previous 50 rows ordered by date" in SQL (MySql specifically)?
Relational theory and by proxy, SQL, is extremely unsuited for dealing with the concept of "order". Extensions to SQL that make such tasks more reasonable have been proposed, and I believe one is more or less accepted as a standard (SQL extensions for OLAP), but it is not practically available, and the way things are going it's doubtful it ever will.
<http://www.cs.nyu.edu/~lerner/aqueryseminar.ppt> is a good reference for the problems and a possible solution.
Oracle has a syntax for recursive lookups that will give you an answer, if you have for each record a pointer to the prev/next trading day ("CONNNET BY"). Other vendors probably have similar extensions, but by-and-large, the problem you are trying to solve is not properly addressed within the RDBMS model.
Kx's database kdb+ solves this very elegantly, but with a significantly different language called 'q' (successor to their older 'ksql')
Saturday, June 17, 2006
Oh, and -- all that said, for such a specific query you can: "select * from ... where date<=$wanted order by desc date limit 50" (or something like that, my MySQL is rusty).
What I wrote above mostly relates to the things you're going to meet down the line, such as moving averages, computing "x week high/lo" and stuff.
Saturday, June 17, 2006
I'm a big fan of a time/date dimension table, which is basically an ordered list of dates with various attributes, such as workday?, trading day?, day of the week, month, year, etc. It's a very flexible, non-partisan solution that works well in the data warehousing environments.
This topic is archived. No further replies will be accepted.Other recent topics
Powered by FogBugz