The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Algo to strip <table>s ?

I have written an HTML renderer (in C++) for my software. Now I want to browse the internet using it. Most web pages wrap the entire page in <table> tags to create a left or right navigation bar. They also use width=600 or some such stupid value to restrict the display area of the page (even this forum uses that value). So even though the screen is huge, the page occupies a narrow column, leaving lots of space on both sides.

Now I want to **turn the tables** on these web pages. The idea is to turn a left/right nav bar into a top/bottom nav bar by emitting </table> at the end of the left nav bar. That way the rest of the page can use the entire screen width instead of the screen width minus the width of the nav bar.

The question is how to differentiate between tables used to format the page and genuine tables that contain tabular data or those that format forms?

Any suggestions?
1. Length of HTML inside a <td>
2. Tables inside a <td>
3. Tables at the start of the body
4. ???
Donald Duck
Saturday, May 19, 2007
 
 
You could simply ignore explicit sizes in top-level table tags. Similarly, you could convert explicit sizes into relative proportions only for horizontal dimensions (again, possibly only for top-level table tags). I'd make these behaviors configurable by a public API.
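
A rough C++ sketch of what that could look like is below. TableNode, LayoutOptions and both function names are made up for illustration and are not taken from any real renderer; the option struct stands in for the "configurable" part.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical, simplified stand-in for a renderer's table node.
struct TableNode {
    int depth = 0;              // 0 = top-level table, >0 = nested inside another table
    int explicit_width_px = 0;  // value of width=..., 0 if the attribute is absent
};

// Hypothetical option block so the behavior can be switched off.
struct LayoutOptions {
    bool ignore_top_level_widths = true;
};

// Width the table should be laid out with: top-level tables get the whole
// viewport when the option is on, everything else keeps its explicit width.
int effective_table_width(const TableNode& t, int viewport_px,
                          const LayoutOptions& opt) {
    if (opt.ignore_top_level_widths && t.depth == 0) return viewport_px;
    return t.explicit_width_px > 0 ? t.explicit_width_px : viewport_px;
}

// Turn explicit column widths into the same proportions of the available
// horizontal space (integer division, so a pixel or two may be left over).
std::vector<int> proportional_columns(const std::vector<int>& widths_px,
                                      int available_px) {
    assert(available_px >= 0);
    const long long total =
        std::accumulate(widths_px.begin(), widths_px.end(), 0LL);
    if (total <= 0) return widths_px;  // nothing usable to scale from
    std::vector<int> out;
    out.reserve(widths_px.size());
    for (int w : widths_px) {
        out.push_back(static_cast<int>(
            static_cast<long long>(w) * available_px / total));
    }
    return out;
}
```

With columns of 120 and 600 pixels and 1440 pixels available, this keeps the 1:5 proportion and yields 240 and 1200.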
Jeff Dutky
Saturday, May 19, 2007
 
 
Navbars often have a class or ID containing something like "nav" or "sidebar", and content elements often have one containing something like "content".

If you have two TD elements side by side and the left one is 120 pixels wide and the right one is 600, it's likely the left is nav and the right is content.

The amount of text versus the number of links within a given element would be a good hint. Maybe some sort of text-to-links ratio: above a certain value, it's probably content.

You could combine all of these into some sort of weighted sum to compute a navbar vs. content likelihood score.
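
Something along these lines, as a sketch only. CellFeatures and navbar_score are hypothetical names, and every weight and threshold below is an arbitrary starting guess that you would tune against the sites you actually visit.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical per-cell features a renderer could collect while parsing.
struct CellFeatures {
    std::string class_or_id;   // class and/or id attribute text, may be empty
    int width_px = 0;          // explicit width of this cell, 0 if unknown
    int sibling_width_px = 0;  // explicit width of the neighboring cell
    int text_chars = 0;        // characters of text that are not inside links
    int link_count = 0;        // number of <a> elements in the cell
};

// Case-insensitive substring test; needle is expected in lower case.
bool contains_ci(std::string haystack, const std::string& needle) {
    std::transform(haystack.begin(), haystack.end(), haystack.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return haystack.find(needle) != std::string::npos;
}

// Positive score: probably a navbar. Negative score: probably content.
double navbar_score(const CellFeatures& c) {
    assert(c.text_chars >= 0 && c.link_count >= 0);
    double score = 0.0;

    // Hint 1: naming conventions in class/id.
    if (contains_ci(c.class_or_id, "nav") ||
        contains_ci(c.class_or_id, "sidebar")) score += 2.0;
    if (contains_ci(c.class_or_id, "content")) score -= 2.0;

    // Hint 2: much narrower than its neighbor (e.g. 120 next to 600).
    if (c.width_px > 0 && c.sibling_width_px > 0 &&
        c.width_px * 3 <= c.sibling_width_px) score += 1.0;

    // Hint 3: text-to-links ratio (+1 avoids dividing by zero).
    const double chars_per_link =
        static_cast<double>(c.text_chars) / (c.link_count + 1);
    if (c.link_count > 0 && chars_per_link < 40.0) score += 1.5;
    else if (chars_per_link > 200.0) score -= 1.5;

    return score;
}
```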

Expect anything you do to fail miserably on a large number of sites since this is all just guesswork.
SomeBody
Sunday, May 20, 2007
 
 
Oh, this viewer is only for my private use and I can tweak it when necessary, so the failing in the wild doesn't bother me.
Donald Duck
Sunday, May 20, 2007
 
 
A real table has many rows. Layout tables have just one or two.

A layout table contains all, or almost all, of the text on the page.

I'd try it this way:
- if you see more than X words outside the table, it's definitely a data table
- if the table contains more than 5 rows, it's probably a data table
- else it's a layout table
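
As a minimal sketch, the three rules translate almost directly into code. TableStats and classify_table are hypothetical names, and X is deliberately left as a parameter (max_outside_words) because no value is given for it above.

```cpp
#include <cassert>

enum class TableKind { Data, Layout };

// Hypothetical counters gathered while parsing the page.
struct TableStats {
    int row_count = 0;            // number of <tr> rows in this table
    int words_outside_table = 0;  // words on the page that are not in this table
};

// Applies the rules in order: lots of text outside the table, or many rows,
// means a data table; anything else is treated as a layout table.
TableKind classify_table(const TableStats& s, int max_outside_words,
                         int max_layout_rows = 5) {
    assert(s.row_count >= 0 && s.words_outside_table >= 0);
    if (s.words_outside_table > max_outside_words) return TableKind::Data;
    if (s.row_count > max_layout_rows) return TableKind::Data;
    return TableKind::Layout;
}
```

The first rule is checked first because it is the "definitely" case; the row count only decides when most of the page's text sits inside the table.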
yury at xtransform
Sunday, May 20, 2007
 
 
> The question is how to differentiate between tables used to format the page and genuine tables that contain tabular data or those that format forms?

Maybe you can do it based on end-user input: if you see a table that you want to transform, then <Alt>-<Click> on it.
Christopher Wells
Friday, May 25, 2007
 
 
I would address this in a more general way: the problem is not the tables, but that the resulting page is not adjusted to the actual screen size (the use of tables merely leads to this problem). So I would address the page-sizing problem directly.

First, I would add an option to the API to adjust page size to screen size (in practice this will mostly apply to tables). You can expose this to the user as an "adjust page" button (or a Ctrl-something shortcut key).

Now, implementing this will be laborious. First, I would modify the renderer to generate the page view based on a certain scale. This scale would be calculated from the size of the biggest element on the page. In your example (table with width=600), a scale of 1.0 corresponds to 600.

When the page is adjusted, this scale changes and all elements are redisplayed at the proper size.

The advantage of this approach is that you have a more general solution that will also help with things like zooming pages in and out (I like this). Also, the design is cleaner, as no particular cases are considered. The problem is that you must change a lot of things.

I ALWAYS prefer general solutions because they:
1. Are easier to implement in the end (no special cases)
2. Generally have a lot of beneficial side effects (like allowing additional functionality)
3. Tend to be more resilient to changes
4. Are more reusable


Pablo
Pablo Chacin
Wednesday, May 30, 2007
 
 
But wouldn't implementing scaling introduce horizontal scrollbars?  I abhor horizontal scroll bars more than pagination tables.
Donald Duck
Wednesday, May 30, 2007
 
 
"First, I would modify the renderer to generate the page view based on a certain scale. This scale will be calculated as the biggest element size in the page. In your example (table with width=600) the scale will be 1.0 = 600."

This wouldn't introduce horizontal scrollbars. You would use the scale along with the size of your window. So if your window was 1200 wide, you would increase the table width to 1200 (a scale of 2). If it was 900, the scale would be 1.5.
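
In code, the arithmetic is roughly the following. fit_scale is a hypothetical helper name, and it assumes you already have the laid-out widths of the page's elements at scale 1.0.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Scale factor that makes the widest element exactly as wide as the window.
// Because nothing ends up wider than the window, no horizontal scrollbar
// is needed.
double fit_scale(const std::vector<int>& element_widths_px,
                 int window_width_px) {
    assert(window_width_px > 0);
    if (element_widths_px.empty()) return 1.0;
    const int widest = *std::max_element(element_widths_px.begin(),
                                         element_widths_px.end());
    if (widest <= 0) return 1.0;  // no usable width, leave the page as-is
    return static_cast<double>(window_width_px) / widest;
}
```

With a 600-wide table as the widest element, a 1200-wide window gives 2.0 and a 900-wide window gives 1.5.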

That would allow you to use more of your available window to show the content.  The downside is that the menu area would scale along with it.

You may want to hack around on it some.  Maybe assume that the larger area is content, and only scale that portion.  Then if you run into a site that has 50% width menus, you can adapt your code at that point.  Since this is for your personal use, you can work with it.
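
A quick sketch of the "only scale the larger area" variant, assuming the widest column is the content. widen_content_column is a hypothetical name, and the tie-breaking on equal widths is exactly the 50% case you would have to adapt later.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Keeps every column at its original width except the widest one, which is
// stretched to absorb whatever window width is left over.
std::vector<int> widen_content_column(std::vector<int> column_widths_px,
                                      int window_width_px) {
    assert(window_width_px > 0);
    if (column_widths_px.empty()) return column_widths_px;

    // On a tie (e.g. two 50% columns) this picks the first one.
    auto widest = std::max_element(column_widths_px.begin(),
                                   column_widths_px.end());
    const long long total = std::accumulate(column_widths_px.begin(),
                                            column_widths_px.end(), 0LL);
    const long long remaining = window_width_px - (total - *widest);

    // Only ever grow the content column; never shrink it.
    if (remaining > *widest) *widest = static_cast<int>(remaining);
    return column_widths_px;
}
```

For a 120-pixel menu next to 600 pixels of content in a 1200-wide window, the menu stays at 120 and the content grows to 1080.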
Michael Nebinger
Thursday, May 31, 2007
 
 

This topic is archived. No further replies will be accepted.
