Friday, November 17, 2006

Height of the Page, no questions asked

When creating a popover for a web app I was writing, I needed to get the internal height of the page so I could set the popover to the page's height. There are other techniques to achieve this, such as setting

html {
overflow: hidden;
}

body {
overflow: auto;
}

and then absolutely positioning the div, but this can cause rendering bugs in IE whenever you relatively position an element (usually the element doesn't move when you scroll the page).
So the simple solution is to use JavaScript to get the page's height when the popover is supposed to be shown. For the popover I was working on, I did not want to change the underlying page's doctype (or lack thereof). The problem was that document.documentElement.clientHeight only reports a useful value when the page has a doctype. To make a long story short, I came up with this line of code, which should return the page's complete height no matter what.

Math.max((window.innerHeight || 0), document.body.clientHeight, document.documentElement.clientHeight, document.body.scrollHeight)

Simple. Parts of it might be redundant, but it works.
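
If you want to reuse the expression, it can be wrapped in a small function that also tolerates a missing body or documentElement. This is only a sketch: the name pageHeight and the win/doc parameters are my own additions (not part of any library), passed in explicitly so the function can be tried outside a browser with stand-in objects.

```javascript
// Hypothetical helper: the name and parameters are illustrative only.
// Returns the largest of the candidate height measurements, treating any
// missing or non-numeric value as 0.
function pageHeight(win, doc) {
  var body = doc.body || {};
  var root = doc.documentElement || {};
  var candidates = [
    win.innerHeight,     // may be undefined in older browsers
    body.clientHeight,
    root.clientHeight,
    body.scrollHeight    // full height of the content, including overflow
  ];
  var max = 0;
  for (var i = 0; i < candidates.length; i++) {
    var n = Number(candidates[i]);
    if (n > max) max = n; // NaN > max is false, so undefined values are skipped
  }
  return max;
}
```

In a page you would call it as pageHeight(window, document) at the moment the popover is shown.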

Sunday, November 12, 2006

WWW::Mechanize get current page

So I have always done a lot of screen scraping, and typically, whatever language I was working with, I would build a framework to get the job done. I built one for Java, which was a nightmare. Next I created one in PHP. It was a lot simpler, but just took too much time to really do right. When I moved to Ruby I was surprised to find the WWW::Mechanize library. It did everything I had been building into these other frameworks. The nice thing about Mechanize is that it takes care of following redirects and parsing the HTML into an easy-to-follow structure. Something I would always build into my frameworks was the ability to pseudo-submit forms on the page, typically in this form (PHP example):

$cForm = $page->forms[2];
$cForm->login = 'bob';
$cForm->password = 'testpass';

You can do very similar things in Mechanize, but the thing that stumped me for too long was how to get the current URL of the page. It turns out it isn't that hard, but it is poorly documented.

(ruby example)

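require 'mechanize'

agent = WWW::Mechanize.new
agent.user_agent = 'Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.1)'
# browse is the page returned by an earlier agent.get call (not shown)
form = browse.forms[1]
form.fields.find {|f| f.name == 'location'}.value = 'MT'
page1 = agent.submit(form, form.buttons.first)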

Note the last line. The page returned by submit should always be the page that the agent is "at" in the browser paradigm, even after multiple redirects, so calling uri on it gives you the current URL.