The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

automation of IE or Firefox or ?

I need to automate opening web pages, selecting screen criteria, and downloading the resulting data from a temp file the server creates. Since the file name and contents are dynamically created in response to user interface clicks, I think I need a 3rd party scripting tool to do it. The site does not have any sort of open API (XML-RPC, SOAP, whatever) and I'm not able to ask questions of the site's designers. Can anyone recommend a tool to do this? Licensing might be an issue as the script will need to be deployed to client sites.

TIA,
Don
Don Dickinson
Monday, June 09, 2008
 
 
What you want to do is called screen scraping or web scraping.

You can do this in Firefox using the Greasemonkey extension.  Python has some decent libraries, and I've done a lot of screen scraping in .NET as well.
Joel Coehoorn
Monday, June 09, 2008
 
 
Oh, and if you're building a product that will depend on the availability and consistency of a completely separate web site with which you are not at all affiliated, you're asking for trouble.
Joel Coehoorn
Monday, June 09, 2008
 
 
You don't need anything third-party at all.  You don't even need anything that isn't part of a standard Windows installation!  Just use the Windows Script Host to control Internet Explorer.

Though you might _prefer_ to use .NET or some other language with a more sophisticated screen scraping library.  VBScript isn't exactly the most expressive language in the world.
Iago
Monday, June 09, 2008
 
 
Take a look at Watir (Web Application Testing in Ruby) or WatiN (Web Application Testing in .NET). They're libraries that provide an easy automation interface over IE.

They're more for testing (hence the name) but can be put to use for other web automation tasks as well.

Your other option would be to do screen scraping: download the HTML, parse it yourself looking for the bits of interest, etc.
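
As a rough illustration of the "download the HTML, parse it yourself" approach, here is a minimal Python sketch that uses only the standard library. It assumes the results page exposes the generated file as an ordinary link; the sample page, the example.com URLs, the .csv suffix, and the names LinkCollector and find_download_links are all made up for the example and are not taken from any real site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen in the page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_download_links(html, base_url, suffix=".csv"):
    """Return absolute URLs of links whose target ends with `suffix`."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [
        urljoin(base_url, href)
        for href in parser.links
        if href.lower().endswith(suffix)
    ]


# Hypothetical results page; a real one would come from an HTTP response.
SAMPLE_PAGE = """
<html><body>
  <p>Your report is ready.
  <a href="/help">Help</a>
  <a href="/tmp/report_20080609_8841.csv">Download</a>
</body></html>
"""

print(find_download_links(SAMPLE_PAGE, "http://example.com/reports"))
```

The parser tolerates sloppy markup such as the unclosed <p> above, which is often the case on real sites. It will not help if the link is produced by script after the page loads; that is where a browser automation tool earns its keep.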
Chris Tavares
Monday, June 09, 2008
 
 
We sell software (written in C#) that works with 30+ hotel booking websites. Internally it uses iMacros to automate IE and Firefox. So when a website changes, we just have to update the macros. No need to change our code. That works well for us. iMacros is a commercial component: http://www.iopus.com/imacros/web-scraping.htm

The runtime license can be distributed royalty free.
Matt
Monday, June 09, 2008
 
 
Thanks for the advice. I'm aware of the troubles associated with automatically grabbing data from a site beyond my control. Unfortunately it's the only option I have. I can do it with Greasemonkey or with my own code (I do have WinHTTP wrappers to do this), but was hoping for some sort of scripting language. I'll check out WatiN, thanks!
don
Don Dickinson
Monday, June 09, 2008
 
 
You could try the Python Automated Module For I.E. http://pamie.sourceforge.net/

I've used it and it works very well.
Bruce Pearson
Monday, June 09, 2008
 
 
XPath is awesome if you're working with an XHTML site with relatively valid syntax.
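
A small sketch of what that looks like in Python, using the limited XPath subset in the standard library's xml.etree.ElementTree. The sample document, the table id "results", and the helper names table_rows and is_well_formed are invented for illustration. The second helper shows the catch: an XML parser rejects markup that is not well formed, so this only works when the site really does serve clean XHTML.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so queries must be namespace-aware.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical well-formed XHTML page.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table id="results">
      <tr><td>Alpha</td><td>10</td></tr>
      <tr><td>Beta</td><td>20</td></tr>
    </table>
  </body>
</html>"""


def table_rows(xhtml, table_id):
    """Return the cell text of each row in the table with the given id."""
    root = ET.fromstring(xhtml)
    rows = root.findall(f".//x:table[@id='{table_id}']/x:tr", XHTML_NS)
    return [[cell.text for cell in row.findall("x:td", XHTML_NS)] for row in rows]


def is_well_formed(markup):
    """Report whether the XML parser accepts the markup at all."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(table_rows(SAMPLE_XHTML, "results"))
print(is_well_formed("<p>unclosed<br></p>"))
```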

Monday, June 09, 2008
 
 
XPath is awesome *if* it works. IMHO it cannot be used with the majority of websites. The same applies to Perl scripting (WWW::Mechanize etc.). This approach works well with plain sites like google.com, but becomes tricky or impossible with most other websites. We looked at XPath, WatiN, Mechanize, iMacros, FicStare, Lencon and even Kapow. The only tool that was able to handle all "our" websites was iMacros, mainly because it supports AJAX/Flash well.
Jim
JB
Tuesday, June 10, 2008
 
 
Thanks for all the feedback. iMacros looks excellent. I'll check out the Python thing. The site is not XHTML, so that one's out. Thanks to all, I have a good starting point.

-don
Don Dickinson
Tuesday, June 10, 2008
 
 
The iMacros thing looks great. For $699 (the 1 developer enterprise license) I can get the ability to create macros and a license to give the run-time to as many customers as I need. Now I just have to get their trial to execute as I need :) Seems easy enough.

thanks
don
Don Dickinson
Tuesday, June 10, 2008
 
 

Friday, June 13, 2008
 
 
Don, would you mind clarifying what iMacros can do that e.g. WatiN or others cannot? Much appreciated.
watin-oriented
Tuesday, June 17, 2008
 
 
Selenium is nice for local sites, but it has problems with cross-domain sites because it uses JavaScript. I don’t know whether they managed to resolve it elegantly. WatiN, on the other hand, uses automation in IE and controls it just as a user would.
watin_oriented
Tuesday, June 17, 2008
 
 
Some of the reasons why we decided to use iMacros and not WatiN:

1. Works well with AJAX and Flash (Watir does not support Flash at all!)

2. Same macros work in IE6/7/8 and Firefox 1/2/3

3. Visual recording (saves time, and allows our users to adapt macros even without programming knowledge).

Basically using iMacros instead of a home-grown solution saved us a lot of time. And since iMacros can be distributed royalty free, there are no further costs in addition to the initial license fee.
Mike B.
Tuesday, June 24, 2008
 
 
Selenium is working well for me.
curdDeveloper
Tuesday, June 24, 2008
 
 
You should use Selenium for automating the clicking and filling and all that stuff that you do in the browser. You even have bindings for Python, Ruby, Perl, etc.
There might be some tiny issues because it works with JavaScript, but all in all it works absolutely great.

cya!

--
Nicolás Miyasato
http://nmiyasato.blogspot.com
Nicolás Miyasato
Sunday, June 29, 2008
 
 
Hi. I'm working on the same kind of thing. We have to test Outlook Web Access 2007 by automating clicks and checking responses. Watir cannot find the ids associated with some of the OWA elements. Will iMacros be able to fix this? Also, does iMacros record the responses to an HTTP request?
Aditya
Tuesday, July 01, 2008
 
 

This topic is archived. No further replies will be accepted.
