The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

data model for stock prices

How do I model daily stock price data to support the following use case?

I want to be able to find the high or low for a given n-day period.  For example, the 30-day period ending on June 19, 2006.  The problem is that not every day is a trading day, so you cannot just subtract thirty days to get the start date.

My idea was to use a sequential artifical id:

price
-----
id
date
open
high
low
close

So, to get my high/low:

a = select id from price where date='06-19-2006'
b = a - 30
select max(close) from price where id >=b and id <=a
 
The sequential id's would be generated when the data is imported.

This is feasible, but it seems unduly limiting and klunky.  Am I missing some obvious solution?  Should I store a calendar of business days in my database to enable calculation of date ranges?  Does SQL already support what I need to do - with cursors, maybe?  Is it possible to say "select the previous 50 rows ordered by date" in SQL (MySql specifically)?

Thanks!
NPR Send private email
Saturday, June 17, 2006
 
 
Relational theory and by proxy, SQL, is extremely unsuited for dealing with the concept of "order". Extensions to SQL that make such tasks more reasonable have been proposed, and I believe one is more or less accepted as a standard (SQL extensions for OLAP), but it is not practically available, and the way things are going it's doubtful it ever will.

<http://www.cs.nyu.edu/~lerner/aqueryseminar.ppt> is a good reference for the problems and a possible solution.

Oracle has a syntax for recursive lookups that will give you an answer, if you have for each record a pointer to the prev/next trading day ("CONNNET BY"). Other vendors probably have similar extensions, but by-and-large, the problem you are trying to solve is not properly addressed within the RDBMS model.

Kx's database kdb+ solves this very elegantly, but with a significantly different language called 'q' (successor to their older 'ksql')
Ori Berger
Saturday, June 17, 2006
 
 
Oh, and -- all that said, for such a specific query you can: "select * from ... where date<=$wanted order by desc date limit 50"  (or something like that, my MySQL is rusty).

What I wrote above mostly relates to the things you're going to meet down the line, such as moving averages, computing "x week high/lo" and stuff.
Ori Berger
Saturday, June 17, 2006
 
 
Thank you, Ori.  That solves my current problem.  For other functions, I have considered stored procedures, although at this point I want to avoid having my code tied to a single database platform.
NPR Send private email
Saturday, June 17, 2006
 
 
I'm a big fan of a time/date dimension table, which is basically an ordered list of dates with various attributes, such as workday?, trading day?, day of the week, month, year, etc. It's a very flexible, non-partisan solution that works well in the data warehousing environments.
Steve Hirsch Send private email
Tuesday, June 20, 2006
 
 
Perhaps you might like to read this out-of-print book:
http://www.cs.arizona.edu/people/rts/tdbbook.pdf
hosted at the author's site:
http://www.cs.arizona.edu/people/rts/
(can also download zip of the cd-rom that went with the book)>
Peter
Tuesday, June 20, 2006
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz