0

Geocoding in Sharepoint Lists

Today I was thinking it would be nice to do some Geocoding in Sharepoint. Specifically, I wanted to make it so list items could have longitude and latitude fields that could be populated with a button click from the EditItem page. Geocoding is pretty easy to do with both Google and Yahoo. For my situation, Yahoo seemed more appropriate.

My first intuition was that this should be easy to do with Javascript. Just call the Yahoo Maps API with the right data and parse the response. Except.. this usage violates the same site origin policy for Javascript. Drat. Well, there are several things that can be done. If you’re handy with C# and M$ technologies, you can just create a proxy on the same server. Unfortunately, I don’t know the first thing about the Microsoft stack and I’m too lazy to learn. As an alternative, I opted to create a PHP proxy on another server and force it to return JSON, which skirts around the same site origin problem.

First, the code for the PHP proxy which lives somewhere besides the Sharepoint server. This proxy forwards requests to Yahoo, parses the response, and emits JSON back to the caller.

<?php
//get params from request
$appid = 'YOUR_YAHOO_APPID';
$street = $_GET['street'];
$city = $_GET['city'];
$state = $_GET['state'];
 
//build new request
$req = 'http://local.yahooapis.com/MapsService/V1/geocode?';
$req .= 'appid=' . $appid;
$req .= '&amp;street=' . urlencode($street);
$req .= '&amp;city=' . urlencode($city);
$req .= '&amp;state=' . urlencode($state);
 
//fetch XML using cURL
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $req);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = trim(curl_exec($ch));
curl_close($ch);
 
//parse XML
$xml = simplexml_load_string($result);
$lat = $xml-&gt;Result[0]-&gt;Latitude;
$lng = $xml-&gt;Result[0]-&gt;Longitude;
 
//return JSON
echo $_GET['jsoncallback'] . "({lat: \"$lat\", lng: \"$lng\"})";
?>

Now, the Javascript part of this uses jQuery to make life much much easier. It basically injects a new Button onto the page. When clicked, it builds a request from the location data on the page, sends the request to the proxy, parses the response, and puts the latitude and longitude into the form fields. Then the user can click ‘Save’ as usual. This code needs to be added to the EditItem.aspx page for that particular list. You also need to have jQuery on the page. In my case, I just included them both as external scripts to keep my changes to .aspx pages to a minimum.

$(document).ready(function() {
	//create a new Button, match Sharepoint styles
	var geoButton = $('&lt;input type="button"&gt;').attr({'class':'ms-ButtonHeightWidth', 'value':'GeoCode'});
 
	//add our Button after the default 'Cancel' Button
	$('.ms-formtoolbar .ms-toolbar:last').after($('&lt;td&gt;&amp;nbsp;&lt;/td&gt;').attr({'class':'ms-separator'}));
	$('.ms-formtoolbar .ms-separator:last').after(geoButton);
 
	//wrap Button in a Table to match Sharepoint's style
	geoButton.wrap($('&lt;td&gt;&lt;/td&gt;').attr({'class': 'ms-toolbar', 'no-wrap':'true'}))
		 .wrap($('&lt;table&gt;&lt;/table&gt;').attr({'cellspacing':'0','cellpadding':'0','width':'100%'}))
		 .wrap($('&lt;tbody&gt;&lt;/tbody&gt;'))
	         .wrap($('&lt;tr&gt;&lt;/tr&gt;'))
                 .wrap($('&lt;td&gt;&lt;/td&gt;').attr({'nowrap':'','align':'right','width':'100%'}));
 
	//onClick, perform geoCode and put Long/Lat into form fields
	geoButton.click(function() {
 
	//get data from form fields
	var street = $('input[title=Street]').val();
	var city =  $('input[title=City]').val();
	var state =  $('input[title=State]').val();
 
	//fail early if some data is absent, since we wouldn't get a good geocode
	if(street == '' || city == '' || state == '')
  	    return;
 
	//wrap data into a URL so we can do an HTTP GET
	var address = '&amp;street='+street+'&amp;city='+city+'&amp;state='+state;
 
	//use jQuery.getJSON to avoid that pesky cross domain security restriction
	//?jsoncallback=? is a peculiarity required by jQuery, the server must echo this back
	$.getJSON("http://path_to/yahoo-geocoder.php?jsoncallback=?"+address,
		function(json) {
    		    //async callback, unpack the data
 		    var lat = json.lat;
		    var lng = json.lng;
       		    //simple error detection
		    if(lat == '' || lng == '') {
			//show '!!!' after form fields to indicate that geocoding failed
			$('input[title=Latitude]')
			.after($('&lt;span&gt;!!!&lt;/span&gt;').attr({'id':'error-lat'}).css('color','red'));
			$('input[title=Longitude]')
			.after($('&lt;span&gt;!!!&lt;/span&gt;').attr({'id':'error-lng'}).css('color','red'));
		    } else {
			//success, place results into form fields
			$('input[title=Latitude]').val(lat);
			$('input[title=Longitude]').val(lng);
			//remove any previous error indicators
			$('#error-lat').remove();
			$('#error-lng').remove();
		    }
		});
	});
});
0

Python, PIL and Pretty Polaroids

I suspect that by now everyone and their grandmother has written a script to convert photos so they look like Polaroids. Yesterday I spent a slow morning at work replacing all our slideshows (which used a super-ugly Flash control) with these:

puppies-img_0253

puppies-img_0254

There’s plenty of other neat effects that could be done.. maybe add a bit of aging or apply some filters. But I think it looks pretty good. The following script uses Python, PIL (Python Imaging Library), and a pre-drawn “polaroid” frame.

import PIL, time, glob, random, os, sys
from PIL import Image, ImageOps, ImageEnhance, ImageDraw, ImageFont
 
# Generate Polaroid-looking images
def make_polaroid(infile, outfile, text=''):
    base = (300,320)    #size of polaroid background
    polaroid = Image.open('polaroid-0.png')
    polaroid = ImageOps.fit(polaroid, base, Image.ANTIALIAS, 0, (0.5,0.5))
 
    target = (272,248); # size of empty target area on polaroid background
    img = Image.open(infile)
    img = ImageOps.fit(img, target, Image.ANTIALIAS, 0, (0.5,0.5))
 
    #enhance the image a bit
    img = ImageOps.autocontrast(img, cutoff=2)
    img = ImageEnhance.Sharpness(img).enhance(2.0)
 
    #draw the text, if any
    font = ImageFont.truetype("arial.ttf", 16)
    text_size = ImageDraw.Draw(polaroid).textsize(text, font=font)
    fontxy = (base[0]/2 - text_size[0]/2, 278)
    ImageDraw.Draw(polaroid).text(fontxy, text, font=font, fill=(40,40,40))
 
    #copy the image onto the polaroid background
    imgcorner = (14,20) #paste image onto polaroid
    polaroid.paste(img, imgcorner)
 
    #copy the whole thing onto a larger background and rotate randomly
    angle = random.randint(-10,10)
    blank = Image.new(polaroid.mode, (400,400))
    blank.paste(polaroid, (blank.size[0]/2-polaroid.size[0]/2, blank.size[1]/2-polaroid.size[1]/2))
    blank = blank.rotate(angle, Image.BICUBIC)
 
    blank.save(outfile)
 
if __name__ == "__main__":
    # Takes 1 required argument -- the desired prefix for the output filename
    if len(sys.argv) &lt; 2:
        print "Missing required positional argument 'prefix'"
        exit()
 
    # Text to appear on image, use "" if none 
    text = "Some Text, or leave blank"
 
    # Erase everything in Output folder
    for f in glob.glob('output/*'):
        os.remove(f)
 
    # Create Polaroids of each JPG in Input folder
    files = [f[6:] for f in glob.glob('input/*.jpg')]
    for f in files:
        make_polaroid('input/'+f,'output/'+sys.argv[1]+'-'+f[:-4]+'.jpg',text)
 
    # Write index.html so Output folder can be copied/renamed elsewhere
    files = [f[7:] for f in glob.glob('output/*')]
    outhtml = open('output/index.html','w')
    outhtml.write("<html><head></head><body style='background-color: #000;'><div align='center'><p>")
 
    for i in range(len(files)):
        outhtml.write("<img src='%s' />" % (files[i]))
        if (i+1) % 2 == 0:
            outhtml.write("</p>")
    outhtml.write("</div></body></html>")
    outhtml.close()

The script is a bit over-specialized to my purpose .. converting a bunch of individual folders one at a time. So you may need to hack on it a bit to suit your needs. You can download the script here: polaroid.zip. Place files you want to convert into the “input” folder. Run the script with a single argument for the output filename prefix. It will take a few seconds or minutes to run, depending on how many photos you’re converting. When it finishes, copy the “output” folder elsewhere. The file “index.html” is pre-generated to contain all the photos in the folder.

5

Fetching Android Market Stats with Python, MozRepl, and BeautifulSoup

A few weeks ago I was quite keen on the idea of gathering stats and creating charts to track the popularity of my Android apps. Alas, despite digging around in various packages and experimenting with cURL, I could never seem to get logged in programmatically to the Android Marketplace Developer Console. So I gave up to continue working on my next app. Now I’ve come up with another reason to do some screen-scraping, so I thought I should give this another try.

Half the magic here belongs to a very cool Firefox plugin called MozRepl which lets you open a telnet connection to Firefox and interact with it via Javascript. Awesome, no?

All you have to do is ask MozRepl to go to the Developer Console, download the HTML, and run it through BeautifulSoup (the rest of the magic) to extract the data.

It turns out to be just slightly trickier because MozRepl needs to talk to Python via Telnet. I suppose this script could be setup in cron to grabs stats a couple of times each day. I think I’m just gonna run it manually every once in awhile.

import BeautifulSoup, re, time
import os, telnetlib
# Install MozRepl Plugin
# http://wiki.github.com/bard/mozrepl
# Setup MozRepl to start automatically with FF, check that port number is 4242
# Login to Developer Console once manually so login credentials get saved
 
# Create a new profile and set this accordingly
# http://support.mozilla.com/en-US/kb/Managing+profiles
profile = 'my_firefox_profile'
 
# go to Developer Console using new profile
url = 'http://market.android.com/publish/Home'
os.system("firefox -no-remote -P %s %s &" % (profile, url))
time.sleep(5) #wait a sec for FF to start
 
#connect to MozRepl and fetch HTML
t = telnetlib.Telnet("localhost", 4242)
t.read_until("repl>")
t.write("content.document.body.innerHTML")
body = t.read_until("repl>")
t.close()
 
#is there a better way to do this?
os.system("killall -9 firefox")
 
#yank stats out of HTML
now = time.strftime("%Y-%m-%d %H:%M:%S")
soup = BeautifulSoup.BeautifulSoup(body)
table = soup.find("div", { "class" : "listingTable" })
for row in table.findAll('div', {'class':'listingRow'}):
  app = row.find("div", { "class" : "listingApp" })
  rating = row.find("div", { "class" : "listingRating" })
  stats = row.find("div", { "class" : "listingStats" })
  if app and rating and stats:
    name = app.next.next.string
    total = stats.next.string.split()[0]
    active = stats.next.nextSibling.string.split()[0]
    nratings = rating.next.string[1:-1]
    stars = len(rating.findAll(attrs={'style':re.compile("scroll -78px")}))
    print now, name, total, "total", active, "active", nratings, "ratings", stars, "stars"
#that's it, now maybe save these to a CSV or a log file..

I debated whether to show my actual numbers. Here you go, enjoy:

2009-04-03 17:45:15 Measure Stuff 4 total 1 active 2 ratings 1 stars
2009-04-03 17:45:15 Measure Stuff Lite 3006 total 995 active 28 ratings 2 stars
2009-04-03 17:45:15 RGB Probe 4 total 2 active 2 ratings 1 stars
2009-04-03 17:45:15 Thumb Maze 112 total 39 active 8 ratings 3 stars
2009-04-03 17:45:15 Thumb Maze Lite 16313 total 8813 active 172 ratings 3 stars

Uh oh, those numbers are not very good at all! So far my plan to live off Android looks doomed, but maybe things will pick up in the future. Two of the apps appear twice because there is a paid version and a free one. Can you tell which is which? =). Also, I think there is something wrong with RGB Probe. I’ve gotten a couple of e-mails saying the download failed.

So I hope folks will find this script useful. Obviously, use of this code is completely at your own risk. Screen scrapers are an arguably questionable enterprise, so don’t blame me if you hose your Firefox profile or Google gets mad at you.

Also, if anyone knows the cURL incantation that will do the same thing sans Firefox, I’d love to hear it. I kept getting a 302 response and never quite figured it out. I’ve taken several suggestions based on other Google services that ‘should work’, but for some reason don’t.

There are certainly pros and cons to screen scraping through the browser; I’ll only point out two advantages: First, you get ‘real’ Javascript executed right in Firefox. With many of the big data sites being Ajax-heavy, simply fetching the HTML without executing the JS only gets you halfway there. Second, it is possible to detect and block screen scrapers by looking for unusual or suspicous request patterns. I don’t know if any sites actually do this, but it could be done. For example, a simple fetch via wget looks different to a server than a fetch with Firefox and it goes beyond User-Agents. The css, images, javascript, and such will also be fetched in a particular way and a server can look for anything unusual in the order or timing with which resources are requested. Sound crazy? You’re right! It probably is and I’m not sure anybody actually does this. In fact, it very possibly wouldn’t work well at all in practice. For one, it could screw up text-only browsers. But I think it is still within the realm of possibility..

Now for balance, two downsides: First, the browser needs a window to run in. This means it is kinda slow, hijacks your computer for a few seconds, and doesn’t really lend itself to parallelization. Second, tools like cURL and wget and many language-specific libraries are practically standard.