You probably know about www.octopus.com. The idea behind it is that the
server fetches many sites (news, weather, stocks, search engine results)
and lets a user build a custom portal from these information blocks.
I'm going to build something similar. In short, I need to parse various
web pages (news, weather, etc.) using (sometimes complicated) regular
expressions and present the results in heavily processed form. I plan to
use XML as the intermediate format between the web pages and the user
interface: I'll store the parsing results in a simple XML database, then
apply XSL to produce the final output.
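To make the parsing step concrete, here is a minimal sketch in Python,
assuming a made-up page layout. The sample HTML, the regular expression, and
the element names (block, item) are all hypothetical placeholders, not NewsML
or any other standard schema. The XSL step is left out because the Python
standard library has no XSLT processor.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample of a fetched news page; real pages need their own patterns.
SAMPLE_HTML = """
<div class="story"><a href="http://example.com/a">First headline</a></div>
<div class="story"><a href="http://example.com/b">Second headline</a></div>
"""

# Hypothetical pattern: one link and one title per story block.
STORY_RE = re.compile(
    r'<div class="story"><a href="(?P<url>[^"]+)">(?P<title>[^<]+)</a></div>'
)


def parse_stories(html):
    """Return a list of (url, title) tuples matched by the regular expression."""
    return [
        (m.group("url"), m.group("title").strip())
        for m in STORY_RE.finditer(html)
    ]


def to_xml(source, stories):
    """Wrap parsed stories in a simple, made-up intermediate XML structure."""
    root = ET.Element("block", {"source": source, "type": "news"})
    for url, title in stories:
        item = ET.SubElement(root, "item", {"href": url})
        item.text = title
    return root


stories = parse_stories(SAMPLE_HTML)
block = to_xml("example.com", stories)

# Serialize the intermediate XML; this is what would go into the XML store.
xml_text = ET.tostring(block, encoding="unicode")
print(xml_text)
```

Swapping the made-up block/item elements for a standard schema would only
change to_xml; the regular-expression step stays the same.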
I believe it's better to use an existing XML schema as the internal
format, to stay closer to worldwide standards and avoid future problems
with non-standard XML. The question is, which one to choose? For now I
think NewsML would work well for storing the parsed information, but there
is probably a better or more common option.
Can anybody give me advice on a suitable XML schema for storing
parsed bits of web sites?
Author of Chameleon Clock - a tray clock replacement
with support of Winamp skins, alarms, and atomic time.