Welcome to my HTTP filter & Mobipocket Web Companion Support Page!

 

Download the latest (1.7.8) build

 

Version history (really worth checking out for warnings/announcements!)

 

Download the Word manuscript of this article (it may be a bit newer than the HTML if I forget to update the latter)

 

 

 

Please note that this software suite is still evolving. Use it AT YOUR OWN RISK. I've provided the sources so that you can check the code (and even recompile it). As I'm still adding new features and fixing bugs, the suite is NOT guaranteed to work.

 

You should also note that, except for some particular areas (page merging, POST-based logins etc.), the functionality of my HTTPProxy is not as mature as that of iSiloX or, even better, Sitescooper. However, you will definitely find some utilities in this suite that help make the Mobipocket Reader in particular the best and most versatile e-News reader.

 

Just a sneak preview and some new info, before I completely rewrite the reader comparison section:

 

Some of my mails to the MP mailing lists about the latest version of MPR (4.7 build 408):

 

"I have another idea: what about implementing a text highlighting scheme similar to that in uBook (http://www.gowerpoint.com/)? On a lot of PDAs, it's pretty hard to select a large amount of text on a given page. Newer models (I'm able to select a lot of text on both the h2210 and the Palm Zire 71, but not on any of the iPAQ 36xx models) don't suffer from the relative insensitivity of the touch screen, but older models do.

 

On older models, it sometimes takes 3 or 4 tries to select some text because you have to drag the stylus all around the screen. It is very tiring (not to mention that it isn't very good for the touch screen either - think of the scratches!). This is why some innovative text selection capability would be great."

 

"what about the full screen toggling in the PPC version? The current solution is VERY bad IMHO for PPCs. PPCs are not like Smartphones - PPCs do have touch screens, which makes the current (as of 4.7), Smartphone-optimized cursor movement scheme superfluous on them. That scheme is OK on Smartphones, but not on PPCs, where users can simply tap the screen if they want to follow a hyperlink.

 

What about releasing an updated 4.7 PPC build with the old Full screen-toggle? I think the vast majority of PPC users would prefer that one to the new, Smartphone-biased button scheme."

 

 

 

Contents

Welcome to my HTTP filter & Mobipocket Web Companion Support Page!

Introduction to e-News

iSiloX, MWC, SiteScooper and my HTTP proxy: capabilities

Other readers without an official Web extractor utility

HTML-capable readers

uBook (http://www.gowerpoint.com/)

Team One's Reader v3.0 (http://www.teamonesoft.com/en/Products.htm)

Non-HTML-capable readers

Microsoft Reader 2.0

Haali Reader (http://haali.cs.msu.ru/pocketpc/)

TomeRaider (http://www.tomeraider.com/)

Palm Reader (http://www.palmdigitalmedia.com/S=2f7c86ffed7244ce36a1cf2ac89a995cPrkrDcCoAAsAACOeM0c3046011/product/reader/browse/free)

Why did I choose Mobipocket? Why do I recommend it?

Mobipocket's e-News/e-Book format

Problems with Mobipocket Web Companion

How does MWC work?

.enews files

action="download-site"

.XSL files

Offline MB utilities

.enews generators

.in generators for forum softwares

Picture utilities

Resizing and no high/true-color support

Animated GIF's

BMP support

Proxy-based, fully automatic solution?

PRC reconstruction

MBHelper

Installation, running

Setting up your MWC/browser to use the proxy

MBHelper.conf

The target URL

Available actions

substitute

mergeallpagesfollownextlink

changeURL

killURL

onlyAllowURL

returnonlyhtmlbetweenbeginend

prekiller

uniquepictures

addURL

POSTLogin

Using HTTPSnoopProxy to get login URL's

When it won't work (JavaScript, POST-only)

forceUTF8Conversion

addServerNameToAllURLsWhenNecessary

converttables

cacheSimulation

enableBloggerConversion

URLDecode

Some tips and tricks to using MBHelper

Future Plans

DISCLAIMER

 

Introduction to e-News

 

One of the most compelling advantages of PDAs (whether Palm-, H/PC- or Pocket PC-based) and of Symbian / PPC Phone Edition-based mobile phones is the ability to read electronic documents.

 

As useful as a lightweight PDA is for reading 'traditional' e-books, its ability to store and present the daily news or forum archives is even more valuable. It's one of the greatest capabilities of a PDA.

 

There are three major players in the e-News field. The first is AvantGo (http://avantgo.com/frontdoor/index.html; for reviews, see e.g. http://www.epinions.com/cmsw-PalmSoftware-All-Avantgo_3_1/display_~reviews), probably the best-known service, because Pocket Internet Explorer (PIE), the built-in browser in Pocket PC (2002), has a link to it on its main page.

 

The second is Mazingo (http://www.mazingo.net/) and the third is Mobipocket (http://www.mobipocket.com/).

 

The first two companies only offer PDA products for news reading (but not for book reading). Mobipocket's reader product, Mobipocket Reader, is a bit different: it's more of a full-fledged e-book reader than just a plain tool for synchronizing e-News and turning Pocket Internet Explorer into an offline news reader.

 

Another e-News-capable player is iSilo, with its extremely capable HTTP downloader, iSiloX.

 

There are other site extractors too; for example, Sitescooper, which is written in Perl (also see this and this article). It supports tons of native output formats: HTML, Palm DOC, iSilo etc. (the latter two with external conversion tools - just like XDOCGenerator).

 

Sitescooper is a very good extractor; except for its table handling, it's much superior to both MWC (even with my toolkit) and iSiloX. However, because it handles table-based pages much worse than iSiloX and doesn't support page merging as well as my toolkit does, there may be cases when it shouldn't be used.

 

Sitescooper, of course, has its own shortcomings (not just the inferior table handling when compared to iSiloX). For example, while MWC can handle page structures of any depth, Sitescooper can only handle site structures up to 3 levels deep, so it can't fully extract sites that are deeper than that.
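To illustrate what such a depth limit means in practice, here is a minimal Python sketch of a depth-limited crawl. It is not Sitescooper's code: the site graph, the page names and the crawl function are all made up for this example, and I simply assume that the start page counts as level 1.

```python
from collections import deque

# Hypothetical site graph (page -> pages it links to); not a real site.
SITE = {
    "index": ["section"],
    "section": ["thread-list"],
    "thread-list": ["thread"],
    "thread": ["post"],
    "post": [],
}


def crawl(start, links, max_depth=None):
    """Breadth-first crawl. Returns the pages reached, in visiting order.

    The start page is assumed to be level 1. With max_depth=None there
    is no depth limit at all.
    """
    reached = []
    seen = {start}
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        reached.append(page)
        # Do not follow links from pages that already sit on the last level.
        if max_depth is not None and depth >= max_depth:
            continue
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return reached


limited = crawl("index", SITE, max_depth=3)
unlimited = crawl("index", SITE)
print("3 levels:", limited)
print("no limit:", unlimited)
print("missed:  ", [p for p in unlimited if p not in limited])
```

With a 3-level limit, the "thread" and "post" pages of this imaginary forum are never fetched, which is exactly the kind of content a deep forum archive keeps on its lower levels.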

 

Actually, mainly because e-News was (or may have been) added to Mobipocket's battery of applications as an afterthought, e-book support in Mobipocket's and iSilo's products is far superior to their e-News synchronization support. Had Mobipocket's e-News support been as good as their e-book support, I wouldn't have developed a complete toolkit to fix the shortcomings and bugs in it.

 

My toolkit has two kinds of tools. On the one hand, I've written an HTTP proxy that has advanced filtering capabilities (e.g. page merging), along with Palm DOC/Mobipocket XDOC creators/decompressors.

 

Being an HTTP proxy also means that my HTTP filter can be used with iSiloX. The already-excellent iSiloX thus gets even better, because the proxy it connects to adds further features.

 

On the other hand, I've written several tools for Mobipocket's products because I find their Pocket PC-based reader superior to all the competing products. It's just their Web extractor and PRC creator tool that is very immature and almost useless for serious work. One of my aims was to enhance this tool, Mobipocket Web Companion (MWC), to be even better than iSiloX.
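To give a rough idea of what the page merging mentioned above means, here is a minimal Python sketch. It is only an illustration of the idea, not the code of my proxy: the PAGES dictionary stands in for real HTTP requests, and the names merge_pages, NEXT_LINK and BODY, as well as the assumption that the link to the following page is literally labelled "Next", are made up for this example.

```python
import re

# Hypothetical multi-page article (URL -> HTML); a real proxy would
# fetch these pages over HTTP instead.
PAGES = {
    "/story?page=1": '<html><body><p>Part one.</p><a href="/story?page=2">Next</a></body></html>',
    "/story?page=2": '<html><body><p>Part two.</p><a href="/story?page=3">Next</a></body></html>',
    "/story?page=3": "<html><body><p>Part three.</p></body></html>",
}

# Assumption: the link to the following page is an <a> element labelled "Next".
NEXT_LINK = re.compile(r'<a\s+href="([^"]+)"[^>]*>\s*Next\s*</a>', re.IGNORECASE)
BODY = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def merge_pages(start_url, fetch, max_pages=50):
    """Follow the 'Next' links from start_url and return one merged page."""
    parts = []
    seen = set()
    url = start_url
    # Stop at the last page, on a link loop, or after max_pages pages.
    while url is not None and url not in seen and len(parts) < max_pages:
        seen.add(url)
        html = fetch(url)
        match = BODY.search(html)
        body = match.group(1) if match else html
        next_match = NEXT_LINK.search(body)
        # Remove the 'Next' link itself so it doesn't clutter the merged text.
        parts.append(NEXT_LINK.sub("", body))
        url = next_match.group(1) if next_match else None
    return "<html><body>" + "".join(parts) + "</body></html>"


merged = merge_pages("/story?page=1", PAGES.__getitem__)
print(merged)
```

Because the merging happens before the client sees the page, any extractor that talks to such a proxy receives the whole article as a single page and doesn't need to know about the "Next" links at all.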

 

iSiloX, MWC, SiteScooper and my HTTP proxy: capabilities

 

Here is a table of what iSiloX, Sitescooper and MWC can do, with and without my HTTP proxy, in terms of HTML extraction.

 

 

 

 

MWC 4.5

MWC 4.5 + toolkit

iSiloX 3.35b4

iSiloX 3.35b4 + toolkit

SiteScooper 3.1.2

SiteScooper 3.1.2 + toolkit

Tables

-

Limited (given table header; ASCII)