The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Database structure obfuscation techniques?

I'm designing a shrink-wrapped application and I need to somehow "hide" the database structure (a SQL Server database) from competitors who may attempt casual reverse engineering.

Most of the application's uniqueness and business logic can be understood just by looking at the db structure.

So what I want to do is obfuscate the field and table names so that they are meaningless at first sight.
E.g., instead of CustomerID I would have CXIYZ, and instead of Invoices I would have PPLQW. Or something like that.

(Of course, using advanced hacking and debugging techniques, one could still trace API calls, COM calls, etc. and figure out what goes where in the end, but I just want to discourage the casual hacker from getting at the business logic too easily.)

Does anybody know whether a product or component that does something like this exists? Any other ideas on how to achieve this?
Stefan
Monday, February 21, 2005
Take all the energy, time and commitment you plan on putting into this and put it into creating application features (stuff the users experience) that your users will value and that will lock them in to you.

And keep on innovating.

There are very few golden bullet data models that haven't already been designed.

In fact, none.
Simon Lucy
Monday, February 21, 2005
I appreciate the suggestion and I certainly don't intend to spend more time and effort than it's worth to protect my intellectual property.

That's why I'm asking the question here: hopefully such a tool already exists, or people who have had the same problem can share how they solved it, so I can avoid reinventing the wheel.
Stefan
Monday, February 21, 2005
If you really think you need to do this, then just do it. It shouldn't be that hard. Just generate random legal strings and use whatever language mechanism is appropriate. Like:
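
A minimal sketch of that idea, assuming Python as the scripting language. The function names and the two schema names (taken from the examples in the question) are hypothetical, and the generated names are not checked against SQL reserved words, which a real script would need to do.

```python
import secrets
import string


def random_identifier(length=5):
    """Return a random uppercase, letters-only name such as 'PPLQW'."""
    return "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))


def build_rename_map(names, length=5):
    """Map each real name to a unique meaningless one.

    Assumption: reserved words (e.g. WHERE, TABLE) are not filtered out here.
    """
    mapping = {}
    used = set()
    for name in names:
        candidate = random_identifier(length)
        while candidate in used:  # retry until the new name is unique
            candidate = random_identifier(length)
        used.add(candidate)
        mapping[name] = candidate
    return mapping


# Hypothetical schema names, borrowed from the question above.
rename_map = build_rename_map(["Invoices", "CustomerID"])
```

The map would have to be saved somewhere private, since both the schema scripts and the application code need the same names on every build.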

I've got to go with Simon though.  The db schema is rarely hard to reverse engineer even from just knowing what the application does.  And if I can access the tables, there have to be a lot of them before it becomes hard to rename tables and columns.  (you aren't going to sacrifice your foreign key constraints, are you?).  You can encrypt your stored procedures (if you have any), which is actually worth something (AFAIK).

Don't waste your time.
Brian
Monday, February 21, 2005
Rather than use random names etc., just swap them about.

The stock table becomes out_of_stock, the order_lines becomes invoices etc.

Much more confusing.
i like i
Tuesday, February 22, 2005
The only trouble is that it's not much work to take a look inside the table, and for the most part the contents will then give you the meaning. Unless you plan on encrypting the contents of the tables in some way, obfuscating the names is a waste of time and effort...
Mat Hall
Tuesday, February 22, 2005
Having an open database structure is a plus point for me when I am evaluating software. Keep it open and you will gain more customers than you lose through your competitors reverse engineering your product.
Tony Edgecombe
Tuesday, February 22, 2005
This all sounds a bit big-head vs. little-head (Rosh-katan/gadol?) to me. It's very easy for techies to get bogged down in stuff like this because paranoia seems to be an occupational hazard. You have to try to think with a different mindset from this to run a successful business.
Ross
Tuesday, February 22, 2005
If you obfuscate the database no one is ever going to write custom reports or integrate other smaller systems into your database.  You've just become an island.

By contrast I've worked on relatively open systems with great layouts.  Those are the ones that we can adapt to meet our needs as they change rather than changing out the system.
Lou
Tuesday, February 22, 2005
The only database schema that is really difficult to reverse engineer is one that is not very good.

If this is very important to you I recommend denormalizing to such a great extent that it is undecipherable. How about only one really big table with a lot of columns?

Or better yet, have one table with only two columns, an id and a value. The id can be a foreign key to itself (recursive) and you can maintain everything in your code!! Doesn't that sound great? Or maybe two tables just like this and some data is (randomly) stored in one and some in the other.

This way you can spend all your time writing a really big interface layer instead of improving your product. And when you have performance problems, you can then spend all your time fixing them and eventually you will have a completely custom database! Maybe you should make that your product instead of whatever it is you wrote.

Obfuscation is the weakest form of security. If you are really paranoid, use an embedded database and be done.
Steven A Bristol
Tuesday, February 22, 2005
If the database is clear enough that the business logic just "pops out" as the only thing logical, shouldn't all the competing systems eventually approach this solution?

As others have said, keep your database open and use that as one of your selling points.  Demonstrate how easy it is to write reports, add modules, etc.

Instead of just "selling your app", start calling it "one module of a growing framework."
KC
Tuesday, February 22, 2005
Joel just used an open mdb file for CityDesk didn't he?

The way I see it, if the main value in your software is in its data structure then you should be concentrating on creating value elsewhere that can't be replicated trivially.
Colm O'Connor
Tuesday, February 22, 2005
One more thing - creating an open database structure that the customer can read easily ensures interoperability across different platforms. This adds value for the customer, who can use it to create reports or port the data elsewhere.

So really, obfuscation *destroys* value in your software.
Colm O'Connor
Tuesday, February 22, 2005
Take the effort you would have spent making it obfuscated and spend it in making it easy to use.  You can bet that your competition will do that, and it doesn't matter to the end user which has the harder to decipher back end, just which one is easier to use.

Your time is finite.  Spend it wrong and you lose to the competition.

Really - how does this benefit the end user?  Slowing down the competition like this is dumb.  If they *really* want to get at it, they'll just hire a person whose sole purpose in life is to tear apart your database structure, and if that doesn't work, they'll hire another, and so on.  If your value is in the structure of your database, you've got much bigger problems than this to contend with.

Make your software dirt simple to use.
Aaron F Stanton
Tuesday, February 22, 2005
==>The only database schema that is really difficult to reverse engineer is one that is not very good.

A "good" schema flows naturally from the application domain. Hide the actual schema, and all a user's gotta do is play with the front end and the schema "falls out" of the app. Not too difficult to do with any application. You can pretty much guess the entities, attributes, and their relations just by spending some time playing around in the front end.

Obfuscating the schema is absolutely *insane* in this day and age of application integration. You want apps to be able to smoothly integrate, passing data back and forth as if they were not separate islands of data -- you want as much flexibility in passing data in and out as you can get. Obfuscating the schema simply hinders this process and makes it difficult to do anything with the app that you, the developer, have not explicitly exposed to the user.

I would go nowhere near an app that hides its schema. I would run away screaming from such a beast.
Tuesday, February 22, 2005
My question is rather a technical one than a business one.
I probably should have explained that from the beginning.

My database is not a data store. And it only contains numbers (IDs, dates, flag bits, etc). It is a transfer system linking 2 other popular applications.
The niche for my application is very narrow.
There are already 2 other products on the market which solve the same business problem as mine does, and they have been around for a few years. However, mine has a few key features which allow solving the business problem in a much better way.
My competitors never bothered to improve their products, whether through ignorance, through a lack of combined technical and business knowledge, or who knows why.

If I enter the market with the app unobfuscated and wide open, it will take them a minimum of 6 to 9 months to react and add the same features I have. Since they already have a customer base it will be much harder for me to capture market share from them in that situation. But if the db is cryptic they will need more time to reverse engineer it. Which can gain me some few more months before they come with the exact same features.

That's all I need: to gain some time.
Stefan
Tuesday, February 22, 2005
No, what you need is a better app, if that's what you have you'll get people moving over (given all the other factors of price, availability and suitability).
Simon Lucy
Tuesday, February 22, 2005
Obfuscate the db format?  It seems to happen naturally if you let most developers define the tables.
Art
Tuesday, February 22, 2005
"Which can gain me some few more months before they come with the exact same features.

That's all I need: to gain some time. "

You are overestimating the benefit of a couple of months. How many more sales do you expect to rack up in that extra couple of months? How many will you lose because of having a completely obfuscated database? You must have really high expectations of your ability to not only finish this thing quickly but to effectively market it as well. As soon as your competitors see you start gaining any market share at all they could reverse engineer your product faster than you could make your next sales call. But honestly, they probably won't have to bother.

Also, do you really need a database? You say that your database is not a data store. So why use a database then? Maybe you should consider just using binary flat files. Or some sort of binary hashing file system. Much harder for the competition to figure out.

However, I have to agree with the rest of the group. You are making your product less sellable and increasing your own maintenance costs for what will turn out to be zero benefit in the long run.

As for your technical question, I don't know of such a tool (that should be an indication that what you are doing is probably not worth it). The big issue is that if a tool like that did exist, it would not only need to manage obfuscation of database artifacts, but would also have to maintain mappings in source code to each object/table/column. This is simply too much to ask since there are an infinite number of programming languages and methodologies out there.
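
For a single code base the mapping can at least be kept in one place: the application writes its SQL against readable names and translates them just before execution. A rough sketch, assuming Python and a `{Name}` placeholder convention; the mapping, the helper `to_physical` and the obfuscated names are all hypothetical, not an existing tool.

```python
import re

# Hypothetical mapping from readable names to obfuscated physical names.
PHYSICAL_NAMES = {"Invoices": "PPLQW", "CustomerID": "CXIYZ"}

_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def to_physical(sql_template, mapping=PHYSICAL_NAMES):
    """Replace each {LogicalName} placeholder with its physical name.

    An unmapped name raises KeyError, so a typo fails loudly
    instead of reaching the database.
    """
    return _PLACEHOLDER.sub(lambda m: mapping[m.group(1)], sql_template)


query = to_physical("SELECT {CustomerID} FROM {Invoices} WHERE {CustomerID} = ?")
print(query)  # SELECT CXIYZ FROM PPLQW WHERE CXIYZ = ?
```

This only covers SQL built inside one application; reports, stored procedures and third-party integrations would each need the same mapping, which is the maintenance cost described above.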
Tuesday, February 22, 2005
Component Source used to do what you are describing. They shipped CD-ROMs with lots of product on it (back in the days when dial-up for a multi floppy download was an exercise in frustration). Most was full product (very few demos). Figure out the real names of the subdirectories, files and install away. Some had unlock passwords and keys in the database.

They used a password protected mdb (which was renamed) and plenty of obfuscated table/column names. "Security by obscurity" is the name of what they did. With a hex editor, you can remove the password from any mdb. With a little knowledge of Access, one can figure out quite quickly what is going on.

It would take part of Saturday afternoon to unpack and decode each month's/quarter's CDs, and to decide if there was anything there that I was interested in. If I was able to get plenty of freebies from the CDs, any skilled/competent cracker would be able to get far more.

Personally, I think you will spend/waste more time hiding stuff than you will save in getting a working product to the market. In the end, it is *your* decision as to what goes on with your product.
Tuesday, February 22, 2005
"Obfuscate the db format?  It seems to happen naturally if you let most developers define the tables. "

Amen!  :)
Tuesday, February 22, 2005
Stefan, it sounds like you've already made your decision.

That said, I'm not the type to hand a razor blade to someone who tells me they want to slash their wrists.  Figure it out yourself.
Aaron F Stanton
Tuesday, February 22, 2005
Script the db objects. Do a global replace of the text you want to rename. Do the same global replace on your code. Execute your scripts.
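
The global replace could be sketched like this, assuming Python and a hypothetical rename map. It matches whole words only and is case-sensitive, so it would miss names written in a different case and would also rewrite matching words inside comments or string literals.

```python
import re


def rename_identifiers(text, rename_map):
    """Replace whole-word occurrences of each key in rename_map with its value."""
    if not rename_map:
        return text
    # Longest names first, so a longer name is tried before a shorter one.
    names = sorted(rename_map, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    # A single pass: replaced text is never scanned again.
    return pattern.sub(lambda m: rename_map[m.group(1)], text)


# Hypothetical names, taken from the examples in the original question.
rename_map = {"Invoices": "PPLQW", "CustomerID": "CXIYZ"}
ddl = "CREATE TABLE Invoices (InvoiceID INT, CustomerID INT)"
print(rename_identifiers(ddl, rename_map))
# CREATE TABLE PPLQW (InvoiceID INT, CXIYZ INT)
```

Because the text is rewritten in one pass, the same function also handles the "swap the names about" suggestion made earlier in the thread, where two names trade places.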
Tuesday, February 22, 2005
""Obfuscate the db format?  It seems to happen naturally if you let most developers define the tables. "

Amen!  :)"

Does the earth actually rotate round the sun in 'dba universe'? Methinks not.

Hey Stefan, you want obfuscation, hand your schema to a dba. Tell them to 'standardize' the naming conventions and normalize everything in sight. That's usually enough to add an hour more to every production debugging session (trying to juggle confusing/obscure naming abbreviations in your head) and a month or more to your development schedule (handling all the new joins and creating sub-sub-queries so you can actually get to the data you used to). And your competitors, they'll look at the schema and think you designed it in the 80's, and won't want to reverse engineer it...  ;-)
DBA Phobic
Friday, February 25, 2005
One other technique might be to "hide the needle in the haystack".  If you have core tables that you can significantly obfuscate, then simply add 6000 more, of every type and size, containing both regular and encrypted data.  You could just take every table from a SAP installation (thousands!), and dump that in with your application (well, assuming that that wouldn't make SAP's copyright lawyers upset!).  Then just have your app open a file handle to every table on startup, and read and write random CRAP from the dummy tables at the same time as your app reads and writes regular data from your core tables.

Will it work?  Probably not! :-)  But if nothing else, you can't say that you do not have an "opportunity" for creative thinking on your hands, can you?

Good luck,

Peter Sherman
Monday, February 28, 2005

This topic is archived. No further replies will be accepted.
