The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Report across databases

I have two different web applications each with their own database. I now have a requirement to join data from both databases in one report in one of the applications. Both apps have a good separation of the data layer, business layer, etc and I wouldn’t like to break everything by joining tables from both databases.

My other option is to expose the column information in App2 through, for example, a web service and call it repeatedly from App1 for each record returned for the report. The problem is that the user can configure the report so, potentially, it can return a few thousand records. In this scenario the report can take a long time to run (did not test yet, though).

Any other way I can go about it in a loosely way?
BD Send private email
Tuesday, November 29, 2005
How exactly would the join break things?

Why not simply create a view in app A that joins the tables from A with the tables from B? Now it looks as if you're reporting from a table in A.
George Jansen Send private email
Tuesday, November 29, 2005
Well, the idea is to keep both apps separated as much as possible. If I create the view then if I change the database name then I would have to update the view. Of course this would be easy to do but it creates a "hard" link between the two applications that I don't want to create. Also, the databases may be in different servers.
BD Send private email
Tuesday, November 29, 2005
Some DBs allow remote access, Oracle has dblink.

Would it be possible to run the report separately on each DB than merge results in your app?

Honu Send private email
Tuesday, November 29, 2005
You might want to tell us which database you are using, as this sort of thing can be db specific.

Interbase, MS Access, SQL, Oracle all have built in features to allow this sort of thing, with varying levels of dammit involved. (ex: MS access would be very chatty over the network, but would provide you the quickest point and click development time.).

I'm not sure about MySQL. (I use it frequently, but haven't had a need for remote joining)
Michael Johnson Send private email
Tuesday, November 29, 2005
The textbook reply is to consolidate in a third database and then report from that.
In your case feels like shooting a fly with a Patriot.

Some reporting tools can syncronize data from different queries (Business Objects for example). May be is possible to find one suited to your app.
Sevenoaks Send private email
Tuesday, November 29, 2005
Thanks for the feedback.

I am using SQL Server and I know I can link servers and so on. But the point here is: Is that a best practice for large systems from an architectural point of view?

Although I simplified the example, in fact I have several applications that support a manufacturing company. At the moment everything reports and works across databases and across linked servers. As you can expect this has led to spaghetti SQL making migrations a long and painfull exercise.
BD Send private email
Tuesday, November 29, 2005
SQL Server Analysis Services ?
Justin Send private email
Tuesday, November 29, 2005
As a best practice, when I get a bunch of reports that need to span systems, I'll look into a data warehouse.

You don't need to go psycho on it at the start, but it can save you a lot of trouble if you have everything in one place.
Lou Send private email
Tuesday, November 29, 2005
If each application has its own data layer, what's to prevent you from allowing application A to have two data layers?

Essentially open a connection object to database A, read the records, close the connection, open another object to database B, read those records, close the connection, and then do the merge at the business logic layer.

I will be the first to claim yes, this is horribly unefficient BUT this may be one of those situations where you just need "good enough".  For example, if you're generating these reports once a month, and only two or three people would actually build the reports, it doesn't make much sense to engineer a complex solution.

I would only try creating shared tables, views, whatchamacallits IF you needed to generate a lot of reports, you need relatively quick response time, and you will have to repeat the calculations throughout the day.

Like Sevenoaks says, you don't need to use a Patriot missile to kill a fly if a flyswatter will do.
Anonymous Coward
Tuesday, November 29, 2005
I've done something like this with SQL server when the company just wouldn't accept building an OLAP server box. One of the queries was this page and a half select statement, peppered with coalesce and joining 2 separate database servers. One table was growing about a million rows per month. "Which customers (of category_1) have not sent a message over Network_A this month?" was the pretty much the query.

select blah from server1.databaseA.dbo.table_1 left join server2.databaseB.dbo.table_2


select blah from server2.databaseB.dbo.table_2 left join server1.databaseA.dbo.table_1

Gave wildly different query times. Like 15 min one way and 4+ hours the other way.

Do what works. Don't play artist with making it pretty. As much as I like pretty, pretty doesn't pay the bills.
Wednesday, November 30, 2005
I have to step back a bit.
You mentioned "serveral applications" and "spaghetti SQL" for a company. This is the case when thinking to a datawarehouse becomes viable.

The point is the fault level and the maintenance costs you have. You should evaluate carefully wether costs and efforts to set up a dw may be lower then the costs of keeping your current architechture in the medium-long run.

I you want to mail me personally with some details maybe I can help you.
Sevenoaks Send private email
Wednesday, November 30, 2005
As per the posts so far I seem to have two options:

- Data warehouse
- Two data layers for one of the apps

But I still have a problem, all the databases are OLTP in structure and some of the reporting must be done in real time (let's call them online reports).

So, I can have an ETL process that updates a reporting database(s) daily and in addition I can create a small secondary data layer for the online reports. The length of the online reports will have to be controlled (paging, I guess) to avoid problems.

Thanks again for the feedback!
BD Send private email
Wednesday, November 30, 2005
data warehouse is a heavy solution. If all you want is to create reports that can join from multiple db or other data sources, many highend reporting tools support that directly. Just saw a demo from style report and they actually have a spreadsheet like interface so you can join/merge/union data from any data source you want. Think bo  supports that too but not sure.
Sunday, December 04, 2005

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
Powered by FogBugz