How to insert ads?



  • A buddy of mine is interested in providing free web hosting (which is practically a WTF in itself). He asked me to come up with a way to insert a Google ad on every page.

    Since the pages can come from one of many sources, the easiest solution seems to be to abuse Apache itself. mod_ext_filter (http://httpd.apache.org/docs/2.0/mod/mod_ext_filter.html) seems to be the way to do it: write a C/C++ program that reads the page on stdin, inserts the ad after the opening <body> tag, and writes it back to stdout.

    So I pose the question: is this a good way to insert ads on a web page, or am I setting myself up for a huge WTF here?
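
    For illustration, here is a minimal sketch of that stdin-to-stdout filter, written in Python rather than C/C++ to keep it short. The names AD_HTML, insert_ad and filter_stream are made up for this example, and the regex is deliberately naive: it takes the first thing that looks like an opening <body> tag, so it can be fooled by comments or by a ">" inside an attribute value.

```python
import io
import re
import sys

# Hypothetical ad markup; the real snippet would come from the ad provider.
AD_HTML = '<div class="ad">ad goes here</div>'

# Naive match for an opening <body> tag, with or without attributes.
BODY_OPEN = re.compile(r"<body\b[^>]*>", re.IGNORECASE)


def insert_ad(page, ad=AD_HTML):
    """Return page with ad inserted right after the first opening <body> tag."""
    match = BODY_OPEN.search(page)
    if match is None:
        # No <body> tag found: this sketch leaves the page untouched.
        return page
    return page[:match.end()] + ad + page[match.end():]


def filter_stream(src, dst):
    """Read a whole page from src, write the modified page to dst."""
    dst.write(insert_ad(src.read()))
```

    Run as an external filter, the script would end with filter_stream(sys.stdin, sys.stdout). Reading the whole page before writing is the simplest approach, but it means nothing is sent until the full response has been buffered.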



  • Here's your first challenge: what's a page?



  • <html>

    <head><title>Whee</title></head>

    <!-- <body>silly ads</body> -->

    <body>

    Free content!

    </body>

    </html>

    Of course, you could pull the same trick with script/div and quite a few other tags, I think.
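
    One way around the commented-out <body> trick is to let a real tokenizer decide what counts as a tag. The sketch below uses Python's standard html.parser, which reports comments separately from start tags, so a <body> inside a comment is never seen as one. RealBodyFinder and insert_after_real_body are hypothetical names, and this only covers the comment and script cases, not every kind of broken markup.

```python
from html.parser import HTMLParser


class RealBodyFinder(HTMLParser):
    """Find the offset just past the first genuine <body ...> start tag."""

    def __init__(self, page):
        super().__init__()
        self.insert_at = None
        # getpos() reports (line, column), so remember where each line starts.
        self._line_starts = [0] + [i + 1 for i, ch in enumerate(page) if ch == "\n"]
        self.feed(page)
        self.close()

    def handle_starttag(self, tag, attrs):
        # Comments go to handle_comment, so a <body> inside one never gets here.
        if tag == "body" and self.insert_at is None:
            line, col = self.getpos()
            start = self._line_starts[line - 1] + col
            self.insert_at = start + len(self.get_starttag_text())


def insert_after_real_body(page, ad):
    """Insert ad after the real <body> tag; leave the page alone if none exists."""
    at = RealBodyFinder(page).insert_at
    if at is None:
        return page
    return page[:at] + ad + page[at:]
```

    With the page above, the ad lands after the second <body>, the one the browser actually honours, and the comment is left untouched.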



  • Here's a question -- are you forcing every page to have ads even if it doesn't specify a place? If so, how are you planning to place them such that it doesn't royally screw over the page layout?



  • Set up a proxy server to serve the domains and let it fetch the pages from the real webserver.

    Add filtering that adds JavaScript code to each page that creates big, blinking, annoying DIVs that cover the whole screen with advertising.

    Add filtering that writes text ads onto every image served.

    Mission accomplished, and I'm just getting started. 



  • Here's your first challenge: what's a page?

    Anything identified as text/html will probably constitute a page.

    Here's a question -- are you forcing every page to have ads even if it doesn't specify a place?

    Yes, unfortunately.

    If so, how are you planning to place them such that it doesn't royally screw over the page layout?

    Right now we just insert them after the opening <body> tag or before the closing </body> tag. I think I'll add tags users can put where they want ads to fit with their layout - if the ad script doesn't find the tags, then do it the generic way.
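
    A rough sketch of that marker-with-fallback idea follows. The marker string <!--AD--> is only an assumption for this example (no marker syntax has been decided), and place_ad is a made-up name. If the marker is present the ad goes there; otherwise it falls back to the generic spot after <body> or before </body>.

```python
import re

# Hypothetical placeholder a user could put where the ad should appear.
AD_MARKER = "<!--AD-->"

BODY_OPEN = re.compile(r"<body\b[^>]*>", re.IGNORECASE)
BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def place_ad(page, ad, at_bottom=False):
    """Put ad at the user's marker if present, else at the generic position."""
    if AD_MARKER in page:
        # Only the first marker is honoured in this sketch.
        return page.replace(AD_MARKER, ad, 1)
    if at_bottom:
        closes = list(BODY_CLOSE.finditer(page))
        if closes:
            pos = closes[-1].start()  # just before the last </body>
            return page[:pos] + ad + page[pos:]
    else:
        opened = BODY_OPEN.search(page)
        if opened:
            pos = opened.end()  # just after the first <body ...>
            return page[:pos] + ad + page[pos:]
    # Neither marker nor body tags: left unchanged here; a real system
    # would need a policy for this case.
    return page
```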


  • @niteice said:

    Right now we just insert them after the opening <body> tag or before the closing </body> tag. I think I'll add tags users can put where they want ads to fit with their layout - if the ad script doesn't find the tags, then do it the generic way.
    'right now'... does that imply you have something in place already?

    Care for some testers? I guess there are quite a few people around here who know how to break stuff ;-)
     



  • Please oh please, if you have your banners at the top of everything, wrap them in iframes. It's so annoying watching the page freeze in its tracks, waiting for some banner at the very top to load its JavaScript from a lagged-out ad server.
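
    As a sketch of what the inserted markup could look like: instead of inlining the ad's script, the filter would emit an iframe pointing at the ad, so a slow ad server should hold up the frame rather than the parsing of the host page. The function name iframe_ad, the example URL and the 728x90 default size are all assumptions for illustration.

```python
from html import escape


def iframe_ad(ad_url, width=728, height=90):
    """Return iframe markup that loads the ad in its own frame."""
    # Escape the URL so characters like & and " cannot break the attribute.
    src = escape(ad_url, quote=True)
    return (
        f'<iframe src="{src}" width="{width}" height="{height}" '
        'frameborder="0" scrolling="no"></iframe>'
    )
```

    For example, iframe_ad("http://ads.example.com/banner?a=1&b=2") yields an iframe whose src has the ampersand escaped as &amp;.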



  • 'right now'... does that imply you have something in place already?

    Care for some testers? I guess there are quite a few people around here who know how to break stuff ;-)

    Yep. Python one-liner. It's only installed on my home server though, which I don't feel like exposing to the Internet at large. :)
    Please oh please, if you have your banners at the top of everything, wrap them in iframes. It's so annoying watching the page freeze in its tracks, waiting for some banner at the very top to load its JavaScript from a lagged-out ad server.

    Which is why they can optionally be put on the bottom as well.


  • @niteice said:

    Right now we just insert them after the opening <body> tag or before the closing </body> tag. I think I'll add tags users can put where they want ads to fit with their layout - if the ad script doesn't find the tags, then do it the generic way.

    What happens if the page source doesn't have <body> tags? Because you know that someone is going to figure out that the ads are placed like that, and will then abuse the way browsers handle broken HTML.
     



  • @niteice said:

    If so, how are you planning to place them such that it doesn't royally screw over the page layout?

    Right now we just insert them after the opening <body> tag or before the closing </body> tag. I think I'll add tags users can put where they want ads to fit with their layout - if the ad script doesn't find the tags, then do it the generic way.


    You realize... the <body> tags are optional in the HTML spec...



  • Indeed they are.

    The sadistic side of me wants to force the ad links in at the beginning of the page if the <body> tags are missing. Basically, users can do it our way (compliant HTML or even designing their pages to work with our system) or screw up the entire page.



  • You missed his point.  This is a 100% valid HTML page:

       <title>My Page</title>
       <p>So I was out walking my dog today...</p>

    It even has a body element (which is required), which begins between </title> and <p>.  It doesn't have a body tag (which is not required), that's all.  If you want to correctly position ads as the first child of the document's body element, you have to be prepared for this (I repeat, 100% valid) case.

    (Sorry to leave you hanging on what happened with the dog.)
     



  • Your case would become:

    <a href="http://www.some-ad-site.com">ad text</a>
    <title>My Page</title>
    <p>So I was out walking my dog today...</p>

    So either way, the would-be freeloader can't win.



  • I'm going to assume the use of JavaScript is not restricted. If it isn't, the freeloaders will win.

    But I wonder what makes your friend think there is even a market for free hosting anymore. I would think GeoCities, Tripod, Angelfire, and every decent ISP that offers web space already took 99% of the pie at least 5 years ago.



  • Then your code can take a validating HTML document and transform it into a non-validating pseudo-HTML document.  That's a bug, in my world.  It should insert the ad *after* the title, not before it, and it should be placed inside a DIV or P or other block-level element that BODY is allowed to contain.  Naked A elements aren't allowed inside BODY, as far as I know.

    For reference, in HTML 4.01 Strict, BODY may directly contain any of: P, H1..H6, UL, OL, DL, PRE, DIV, NOSCRIPT, BLOCKQUOTE, FORM, HR, TABLE, FIELDSET, and ADDRESS, plus SCRIPT, INS, and DEL.  And nothing else.



  • I'm waiting for the next thread titled: "How to disable ad blockers?"

