A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.
Does anyone know how web sites (web-based email clients such as Hotmail, Gmail, etc.) render HTML emails within the web page? I suspect they do some sanitizing or something, but it seems very non-standard, and I couldn't find much about it.
Certainly, they wouldn't just throw another page embedded in theirs, would they?
Any links or advice is appreciated.
"I suspect they do some sanitizing or something, but it seems very non-standard, and I couldn't find much about it."
"Certainly, they wouldn't just throw another page embedded in theirs, would they?"
Nope, that would allow for cross-site scripting attacks, because any script code would run in the context of the host site.
So, basically, what you're saying is that it's a very imperfect science which is based almost solely on trial-and-error?
i.e., strip tags, see if the page still looks okay, strip more tags, see if the page still looks okay, repeat...
Hmm, kind of lame, but then again, I don't see any other way of really doing it.
Thanks for the advice,
"So, basically, what you're saying is that it's a very imperfect science which is based almost solely on trial-and-error?"
Not really. Strip the dangerous tags/attributes and leave everything else. It has nothing to do with whether or not the page looks ok. If you strip dangerous tags/attributes and the page doesn't render correctly, well, that's just tough. You can't just leave the dangerous tags in there.
"Hmm, kind of lame, but then again, I don't see any other way of really doing it."
I'd create a quick and dirty parser of HTML tags -- nothing fancy, just something that can break down each individual tag and all its attributes. This is a good task for regular expressions. There is no need to maintain the structure of the document -- just process it linearly.
Then you have a whitelist of tags and a whitelist of attributes. Any tag not in the whitelist is removed, and any attribute not in the whitelist is removed from the individual tags. That should be good enough.
Go through the w3schools site and look at all the tags and attributes to create your whitelists.
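For what it's worth, here is a minimal sketch of the whitelist idea in Python. It swaps the regular expressions for the standard library's `html.parser`, which does the same linear, tag-by-tag pass. The tag and attribute lists, the `SAFE_SCHEMES` check on `href`, and the names `WhitelistSanitizer` and `sanitize` are all assumptions made up for illustration, not a vetted policy.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelists for illustration; a real one would be much longer.
ALLOWED_TAGS = {"a", "b", "br", "i", "p", "em", "strong", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
# Assumption: only these URL schemes are acceptable in links.
SAFE_SCHEMES = ("http://", "https://", "mailto:")
# Tags whose text content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    """Processes the document linearly and re-emits only whitelisted markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # e.g. onclick, style
            value = value or ""
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # e.g. javascript: URLs
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit only the start tag.
        if tag not in DROP_CONTENT:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it can never turn back into markup.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Calling `sanitize('<p onclick="evil()">Hi <b>there</b><script>alert(1)</script></p>')` drops the `onclick` attribute and the whole script element and keeps the rest. Anything not explicitly listed is removed, so an unknown tag fails safe rather than slipping through.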
I agree. I conceptually understand what you're getting at. And I totally concur with the fact that security is paramount.
But, at the end of the day, users are going to *use* my application. And while security should most definitely come first, I want my users to actually enjoy using my program. So, I'm going to make sure it looks good.
I think it's quite missing the point of building something for people to use if you totally disregard how they're going to use it. :-/
Besides security, remember privacy. Links to offsite files such as images, CSS, JS, etc. can be used by spammers to confirm that an address is valid and likely being read by a real human.
On another note, it may well be that HTML email is actually *easier* to read if you strip much of the formatting, as with Thunderbird's "Simple HTML" display mode. Mail will appear in the same font, size, and colours, making your app look more consistent and likely prettier.
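To make the privacy point concrete, here is a rough Python sketch that lists the offsite files a message would make the browser fetch on display, so a client could block them until the user opts in. The `FETCHING_ATTRS` table and the names `RemoteResourceFinder` and `find_remote_resources` are hypothetical, and the table is certainly not a complete list of attributes that trigger a fetch.

```python
from html.parser import HTMLParser

# (tag, attribute) pairs that make the browser fetch a URL on display.
# Illustrative only, not a complete list.
FETCHING_ATTRS = {
    ("img", "src"),
    ("link", "href"),
    ("script", "src"),
    ("iframe", "src"),
}


class RemoteResourceFinder(HTMLParser):
    """Collects offsite URLs that would be requested when the mail is shown."""

    def __init__(self):
        super().__init__()
        self.remote_urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in FETCHING_ATTRS and value:
                url = value.strip()
                # Absolute or protocol-relative URLs point off the local message.
                if url.lower().startswith(("http://", "https://", "//")):
                    self.remote_urls.append(url)


def find_remote_resources(html_text):
    finder = RemoteResourceFinder()
    finder.feed(html_text)
    finder.close()
    return finder.remote_urls
```

A message containing `<img src="http://tracker.example/pixel.gif?id=42">` would be flagged, while an inline `cid:` image would not, since it needs no request to a remote server.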
Wednesday, May 11, 2005
This topic is archived. No further replies will be accepted.