<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: YASU 0.0.2.15</title>
	<atom:link href="http://bclary.com/blog/2008/03/24/yasu-00215/feed/" rel="self" type="application/rss+xml" />
	<link>http://bclary.com/blog/2008/03/24/yasu-00215/</link>
	<description>Bob Clary&#039;s ramblings about Mozilla, everything and nothing</description>
	<lastBuildDate>Sun, 27 Sep 2009 00:09:35 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<item>
		<title>By: Mohamed</title>
		<link>http://bclary.com/blog/2008/03/24/yasu-00215/comment-page-1/#comment-92</link>
		<dc:creator>Mohamed</dc:creator>
		<pubDate>Tue, 16 Dec 2008 19:02:49 +0000</pubDate>
		<guid isPermaLink="false">http://bclary.com/blog/2008/03/24/yasu-00215/#comment-92</guid>
		<description>Hello,

It seems the document is missing some frames because not all of the content has loaded by the time userOnAfterPage() is called.

This can be solved by increasing the delay in doGrab() or by adding a delay in userOnAfterPage() before retrieving the document through gSpider.mDocument.

Thank you for your help.</description>
		<content:encoded><![CDATA[<p>Hello,</p>
<p>It seems the document is missing some frames because not all of the content has loaded by the time userOnAfterPage() is called.</p>
<p>This can be solved by increasing the delay in doGrab() or by adding a delay in userOnAfterPage() before retrieving the document through gSpider.mDocument.</p>
<p>Thank you for your help.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mohamed</title>
		<link>http://bclary.com/blog/2008/03/24/yasu-00215/comment-page-1/#comment-91</link>
		<dc:creator>Mohamed</dc:creator>
		<pubDate>Mon, 15 Dec 2008 19:20:48 +0000</pubDate>
		<guid isPermaLink="false">http://bclary.com/blog/2008/03/24/yasu-00215/#comment-91</guid>
		<description>Hello,

I added the following line in userOnAfterPage() to see what happens.

window.open(&quot;chrome://browser/content/pageinfo/pageInfo.xul&quot;);

Interestingly, the pageInfo window that popped up did not have the iframe-embedded image that I could see in the parent spider window! So what we get is not what we see.

It happens on many sites including www.aol.com.

Maybe it is happening for elements inside DIV elements?</description>
		<content:encoded><![CDATA[<p>Hello,</p>
<p>I added the following line in userOnAfterPage() to see what happens.</p>
<p>window.open("chrome://browser/content/pageinfo/pageInfo.xul");</p>
<p>Interestingly, the pageInfo window that popped up did not have the iframe-embedded image that I could see in the parent spider window! So what we get is not what we see.</p>
<p>It happens on many sites including <a href="http://www.aol.com" rel="nofollow">http://www.aol.com</a>.</p>
<p>Maybe it is happening for elements inside DIV elements?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: bc</title>
		<link>http://bclary.com/blog/2008/03/24/yasu-00215/comment-page-1/#comment-90</link>
		<dc:creator>bc</dc:creator>
		<pubDate>Mon, 15 Dec 2008 12:36:16 +0000</pubDate>
		<guid isPermaLink="false">http://bclary.com/blog/2008/03/24/yasu-00215/#comment-90</guid>
		<description>Can you send your userhook script to me at feedback @ this domain?</description>
		<content:encoded><![CDATA[<p>Can you send your userhook script to me at feedback @ this domain?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mohamed</title>
		<link>http://bclary.com/blog/2008/03/24/yasu-00215/comment-page-1/#comment-89</link>
		<dc:creator>Mohamed</dc:creator>
		<pubDate>Sun, 14 Dec 2008 23:26:57 +0000</pubDate>
		<guid isPermaLink="false">http://bclary.com/blog/2008/03/24/yasu-00215/#comment-89</guid>
		<description>Hello,

Thanks for your response. Yes, you are right. It is a wrappedJSObject. But it appears I still do not get all the frames.

What my hook is trying to do is download all the images in each page the spider visits. Since this is what Firefox's pageInfo does, I am trying to follow the flow of pageInfo.js (I cannot think of a way to use pageInfo.js directly).

I can see that my hook misses some embedded Flash images that pageInfo is able to capture. I did some debugging, and the difference I noticed is that pageInfo.js gets more frames out of the window. pageInfo obtains the window object this way:
====
gWindow = window.opener.gBrowser.contentWindow;
gDocument = gWindow.document;
====
The way my hook obtains them is
====
var aDocument = gSpider.mDocument;
var aWindow = gSpider.mDocument.defaultView;
if (aWindow.wrappedJSObject)
{
    aWindow = aWindow.wrappedJSObject;
}
====

The processes after that are identical: go through all the frames and use tree-walker.

To cite an example, if I use spider to visit http://home.live.com after signing in to Hotmail, the hook says there are two frames and it misses the embedded Flash image that appears in the bottom right corner. If I use pageInfo, it says there are four frames and it captures that Flash image.</description>
		<content:encoded><![CDATA[<p>Hello,</p>
<p>Thanks for your response. Yes, you are right. It is a wrappedJSObject. But it appears I still do not get all the frames.</p>
<p>What my hook is trying to do is download all the images in each page the spider visits. Since this is what Firefox's pageInfo does, I am trying to follow the flow of pageInfo.js (I cannot think of a way to use pageInfo.js directly).</p>
<p>I can see that my hook misses some embedded Flash images that pageInfo is able to capture. I did some debugging, and the difference I noticed is that pageInfo.js gets more frames out of the window. pageInfo obtains the window object this way:<br />
====<br />
gWindow = window.opener.gBrowser.contentWindow;<br />
gDocument = gWindow.document;<br />
====<br />
The way my hook obtains them is<br />
====<br />
var aDocument = gSpider.mDocument;<br />
var aWindow = gSpider.mDocument.defaultView;<br />
if (aWindow.wrappedJSObject)<br />
{<br />
    aWindow = aWindow.wrappedJSObject;<br />
}<br />
====</p>
<p>The processes after that are identical: go through all the frames and use tree-walker.</p>
<p>To cite an example, if I use spider to visit <a href="http://home.live.com" rel="nofollow">http://home.live.com</a> after signing in to Hotmail, the hook says there are two frames and it misses the embedded Flash image that appears in the bottom right corner. If I use pageInfo, it says there are four frames and it captures that Flash image.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: bc</title>
		<link>http://bclary.com/blog/2008/03/24/yasu-00215/comment-page-1/#comment-88</link>
		<dc:creator>bc</dc:creator>
		<pubDate>Sun, 14 Dec 2008 17:17:23 +0000</pubDate>
		<guid isPermaLink="false">http://bclary.com/blog/2008/03/24/yasu-00215/#comment-88</guid>
		<description>You may be getting an &lt;a href=&quot;https://developer.mozilla.org/en/XPConnect_wrappers&quot; rel=&quot;nofollow&quot;&gt;XPConnect wrapper&lt;/a&gt;. You can get the underlying wrapped object using &lt;a href=&quot;https://developer.mozilla.org/en/wrappedJSObject&quot; rel=&quot;nofollow&quot;&gt;wrappedJSObject&lt;/a&gt;.

Try this:

if (aWindow.wrappedJSObject)
{
    aWindow = aWindow.wrappedJSObject;
}</description>
		<content:encoded><![CDATA[<p>You may be getting an <a href="https://developer.mozilla.org/en/XPConnect_wrappers" rel="nofollow">XPConnect wrapper</a>. You can get the underlying wrapped object using <a href="https://developer.mozilla.org/en/wrappedJSObject" rel="nofollow">wrappedJSObject</a>.</p>
<p>Try this:</p>
<p>if (aWindow.wrappedJSObject)<br />
{<br />
    aWindow = aWindow.wrappedJSObject;<br />
}</p>
]]></content:encoded>
	</item>
</channel>
</rss>

