<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title> &#187; python</title>
	<atom:link href="http://www.craiget.com/tags/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.craiget.com</link>
	<description>In which I write mostly about programming and that sort of thing</description>
	<lastBuildDate>Fri, 20 Aug 2010 17:32:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Another interesting python snippet</title>
		<link>http://www.craiget.com/2009/07/another-interesting-python-snippet/</link>
		<comments>http://www.craiget.com/2009/07/another-interesting-python-snippet/#comments</comments>
		<pubDate>Thu, 02 Jul 2009 22:22:42 +0000</pubDate>
		<dc:creator>craiget</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.craigethomas.com/blog/?p=131</guid>
		<description><![CDATA[Well, I think it&#8217;s interesting anyway.. So today I was trying to express the idea &#8220;do something to all these files, unless the filename matches the list of things we don&#8217;t care about&#8221;. I have to do a BUNCH of find-and-replace kinda stuff in the next week on a couple thousand webpages, so my plan [...]]]></description>
			<content:encoded><![CDATA[<p>Well, I think it&#8217;s interesting anyway..</p>
<p>So today I was trying to express the idea &#8220;do something to all these files, unless the filename matches the list of things we don&#8217;t care about&#8221;. I have to do a BUNCH of find-and-replace kinda stuff in the next week on a couple thousand webpages, so my plan is to write a little script to make sure I don&#8217;t miss anything. Sometimes the script will turn up a false positive that I know I want to ignore.. </p>
<p><strong>EDIT</strong> &#8211; The following snippet doesn&#8217;t do quite what I thought it did.. more later..</p>
<p>There are at least three distinct and reasonable ways to do this in Python:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;"># skip if name matches any of these</span>
ignore = <span style="color: black;">&#91;</span><span style="color: #483d8b;">&quot;thing1&quot;</span>, <span style="color: #483d8b;">&quot;thing2&quot;</span><span style="color: black;">&#93;</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Iterate over the strings, see if a substring matches</span>
<span style="color: #808080; font-style: italic;"># This is reasonably clear, but it seems long and requires a flag</span>
<span style="color: #ff7700;font-weight:bold;">for</span> filepath <span style="color: #ff7700;font-weight:bold;">in</span> dirwalk<span style="color: black;">&#40;</span>path<span style="color: black;">&#41;</span>:
    found = <span style="color: #008000;">False</span>
    <span style="color: #ff7700;font-weight:bold;">for</span> item <span style="color: #ff7700;font-weight:bold;">in</span> ignore:
        <span style="color: #ff7700;font-weight:bold;">if</span> filepath.<span style="color: black;">find</span><span style="color: black;">&#40;</span>item<span style="color: black;">&#41;</span> <span style="color: #66cc66;">!</span>= -<span style="color: #ff4500;">1</span>:
            found = <span style="color: #008000;">True</span>
            <span style="color: #ff7700;font-weight:bold;">break</span>
    <span style="color: #ff7700;font-weight:bold;">if</span> found:
        <span style="color: #ff7700;font-weight:bold;">continue</span>
    <span style="color: #808080; font-style: italic;"># do stuff..</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># The next two ways build a list of matches</span>
<span style="color: #808080; font-style: italic;"># If there are any matches then skip this file</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># This is a more functional style</span>
<span style="color: #ff7700;font-weight:bold;">for</span> filepath <span style="color: #ff7700;font-weight:bold;">in</span> dirwalk<span style="color: black;">&#40;</span>path<span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">filter</span><span style="color: black;">&#40;</span><span style="color: #ff7700;font-weight:bold;">lambda</span> item: filepath.<span style="color: black;">find</span><span style="color: black;">&#40;</span>item<span style="color: black;">&#41;</span> <span style="color: #66cc66;">!</span>= -<span style="color: #ff4500;">1</span>, ignore<span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">continue</span>
    <span style="color: #808080; font-style: italic;"># do stuff..</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Another way to do the same thing</span>
<span style="color: #ff7700;font-weight:bold;">for</span> filepath <span style="color: #ff7700;font-weight:bold;">in</span> dirwalk<span style="color: black;">&#40;</span>path<span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#91;</span>item <span style="color: #ff7700;font-weight:bold;">for</span> item <span style="color: #ff7700;font-weight:bold;">in</span> ignore <span style="color: #ff7700;font-weight:bold;">if</span> filepath.<span style="color: black;">find</span><span style="color: black;">&#40;</span>item<span style="color: black;">&#41;</span> <span style="color: #66cc66;">!</span>= -<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>:
        <span style="color: #ff7700;font-weight:bold;">continue</span>
    <span style="color: #808080; font-style: italic;"># do stuff..</span></pre></div></div>

<p>Interestingly, the last two use the same number of characters. I&#8217;m not sure which I prefer. While I suspect the final one would be considered the most Pythonic, I do have a soft spot for lambda. Eh, maybe someday when I&#8217;m not lazy I will see which is the fastest.. though that doesn&#8217;t really matter for my purposes.</p>
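<p>For comparison, here is a minimal sketch of the same "skip if any ignore string appears in the path" idea using the built-in <code>any()</code> and the <code>in</code> operator. It is written for Python 3, where <code>filter()</code> returns an iterator that is always truthy, so the <code>filter</code> version above would skip every file. The names <code>should_skip</code> and <code>wanted_files</code> are made up for illustration, and <code>os.walk</code> is only an assumed stand-in for <code>dirwalk</code>.</p>

```python
import os

# Placeholder ignore list, mirroring the "thing1"/"thing2" strings above.
IGNORE = ["thing1", "thing2"]


def should_skip(filepath, ignore=IGNORE):
    """Return True if any ignore string occurs anywhere in filepath."""
    # any() stops at the first hit, so no flag variable or break is needed.
    return any(item in filepath for item in ignore)


def wanted_files(root, ignore=IGNORE):
    """Yield each file path under root that matches nothing in ignore.

    os.walk is an assumption here; the original dirwalk helper is not shown.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            filepath = os.path.join(dirpath, name)
            if not should_skip(filepath, ignore):
                yield filepath
```

<p>Unlike a <code>find()</code> result compared against zero, <code>item in filepath</code> also catches a match at the very start of the path.</p>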
]]></content:encoded>
			<wfw:commentRss>http://www.craiget.com/2009/07/another-interesting-python-snippet/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>More Reasons I Love Python..</title>
		<link>http://www.craiget.com/2009/05/more-reasons-i-love-python/</link>
		<comments>http://www.craiget.com/2009/05/more-reasons-i-love-python/#comments</comments>
		<pubDate>Wed, 20 May 2009 02:44:40 +0000</pubDate>
		<dc:creator>craiget</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.craigethomas.com/blog/?p=121</guid>
		<description><![CDATA[I was cleaning up some folders the other day at work where the files had been named using one of several naming schemes (or a few with no particular scheme at all). After brief consideration, I decided to do the legwork of renaming all the files with a naming scheme that actually makes sense: Category_YYYY-MM-DD [...]]]></description>
			<content:encoded><![CDATA[<p>I was cleaning up some folders the other day at work where the files had been named using one of several naming schemes (or a few with no particular scheme at all). After brief consideration, I decided to do the legwork of renaming all the files with a naming scheme that actually makes sense:</p>
<pre>Category_YYYY-MM-DD</pre>
<p>That way, the files will stay grouped together if they get copied around to other folders, and they sort alphabetically by date. Then there&#8217;s the task of regenerating all the HTML for these baddies. Happily, Python was up to the task:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span>
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">datetime</span> <span style="color: #ff7700;font-weight:bold;">import</span> date
files = <span style="color: #dc143c;">os</span>.<span style="color: black;">listdir</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;Path<span style="color: #000099; font-weight: bold;">\\</span>To<span style="color: #000099; font-weight: bold;">\\</span>File&quot;</span><span style="color: black;">&#41;</span>
files.<span style="color: black;">sort</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
files.<span style="color: black;">reverse</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">for</span> <span style="color: #008000;">file</span> <span style="color: #ff7700;font-weight:bold;">in</span> files:
    <span style="color: #808080; font-style: italic;"># chop the prefix, chop the 4-character extension, split into (year, month, day), convert to int</span>
    x = <span style="color: black;">&#91;</span><span style="color: #008000;">int</span><span style="color: black;">&#40;</span>x<span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">for</span> x <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">file</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;_&quot;</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span>-<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>
    <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;&lt;li&gt;&lt;a href=<span style="color: #000099; font-weight: bold;">\&quot;</span>/path/to/%s<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;%s&lt;/a&gt;&lt;/li&gt;&quot;</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span><span style="color: #008000;">file</span>, date<span style="color: black;">&#40;</span>x<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>, x<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>, x<span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>.<span style="color: black;">strftime</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'%B %d, %Y'</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>Well, it&#8217;s nothing like the real pros can do. But you gotta love a few lines of code that save your fingers from a repetitive and typo-prone task like manually editing hundreds of links.</p>
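<p>As a rough sketch of the date handling only, the slicing and splitting can also be done with <code>os.path.splitext</code> and <code>datetime.strptime</code>, which avoids assuming a four-character extension. This is Python 3, the function names <code>date_label</code> and <code>newest_first</code> are invented for the example, and it assumes every filename really does follow the <code>Category_YYYY-MM-DD</code> scheme.</p>

```python
import os
from datetime import datetime


def date_label(filename):
    """Turn 'Category_YYYY-MM-DD.ext' into a label like 'May 20, 2009'."""
    stem = os.path.splitext(filename)[0]   # drop the extension, whatever its length
    stamp = stem.rsplit("_", 1)[-1]        # keep only the YYYY-MM-DD part
    when = datetime.strptime(stamp, "%Y-%m-%d")  # raises ValueError on a bad name
    return when.strftime("%B %d, %Y")


def newest_first(filenames):
    """Reverse alphabetical order, which is newest first within one category."""
    return sorted(filenames, reverse=True)
```

<p>A name that does not match the scheme raises <code>ValueError</code> instead of silently producing a wrong date, which is handy when the folder still has a few stragglers.</p>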
]]></content:encoded>
			<wfw:commentRss>http://www.craiget.com/2009/05/more-reasons-i-love-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python, PIL and Pretty Polaroids</title>
		<link>http://www.craiget.com/2009/04/python-pil-and-pretty-polaroids/</link>
		<comments>http://www.craiget.com/2009/04/python-pil-and-pretty-polaroids/#comments</comments>
		<pubDate>Sat, 04 Apr 2009 13:20:30 +0000</pubDate>
		<dc:creator>craiget</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[pil]]></category>
		<category><![CDATA[puppies]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.craigethomas.com/blog/?p=78</guid>
		<description><![CDATA[I suspect that by now everyone and their grandmother has written a script to convert photos so they look like Polaroids. Yesterday I spent a slow morning at work replacing all our slideshows (which used a super-ugly Flash control) with these: There&#8217;s plenty of other neat effects that could be done.. maybe add a bit [...]]]></description>
			<content:encoded><![CDATA[<p>I suspect that by now everyone and their grandmother has written a script to convert photos so they look like Polaroids. Yesterday I spent a slow morning at work replacing all our slideshows (which used a super-ugly Flash control) with these:
</p>
<p>
<a href="http://www.craigethomas.com/blog/wp-content/uploads/2009/04/puppies-img_0253.jpg"><img class="alignleft size-thumbnail wp-image-83" title="puppies-img_0253" src="http://www.craigethomas.com/blog/wp-content/uploads/2009/04/puppies-img_0253-150x150.jpg" alt="puppies-img_0253" width="150" height="150" /></a></p>
<p><a href="http://www.craigethomas.com/blog/wp-content/uploads/2009/04/puppies-img_0254.jpg"><img class="size-thumbnail wp-image-84 alignright" title="puppies-img_0254" src="http://www.craigethomas.com/blog/wp-content/uploads/2009/04/puppies-img_0254-150x150.jpg" alt="puppies-img_0254" width="150" height="150" /></a></p>
<p style="clear: both;">
There are plenty of other neat effects that could be done.. maybe add a bit of aging or apply some filters. But I think it looks pretty good. The following script uses Python, PIL (Python Imaging Library), and a pre-drawn &#8220;polaroid&#8221; frame.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> PIL, <span style="color: #dc143c;">time</span>, <span style="color: #dc143c;">glob</span>, <span style="color: #dc143c;">random</span>, <span style="color: #dc143c;">os</span>, <span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">from</span> PIL <span style="color: #ff7700;font-weight:bold;">import</span> Image, ImageOps, ImageEnhance, ImageDraw, ImageFont
&nbsp;
<span style="color: #808080; font-style: italic;"># Generate Polaroid-looking images</span>
<span style="color: #ff7700;font-weight:bold;">def</span> make_polaroid<span style="color: black;">&#40;</span>infile, outfile, text=<span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span>:
    base = <span style="color: black;">&#40;</span><span style="color: #ff4500;">300</span>,<span style="color: #ff4500;">320</span><span style="color: black;">&#41;</span>    <span style="color: #808080; font-style: italic;">#size of polaroid background</span>
    polaroid = Image.<span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'polaroid-0.png'</span><span style="color: black;">&#41;</span>
    polaroid = ImageOps.<span style="color: black;">fit</span><span style="color: black;">&#40;</span>polaroid, base, Image.<span style="color: black;">ANTIALIAS</span>, <span style="color: #ff4500;">0</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">0.5</span>,<span style="color: #ff4500;">0.5</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
&nbsp;
    target = <span style="color: black;">&#40;</span><span style="color: #ff4500;">272</span>,<span style="color: #ff4500;">248</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># size of empty target area on polaroid background</span>
    img = Image.<span style="color: #008000;">open</span><span style="color: black;">&#40;</span>infile<span style="color: black;">&#41;</span>
    img = ImageOps.<span style="color: black;">fit</span><span style="color: black;">&#40;</span>img, target, Image.<span style="color: black;">ANTIALIAS</span>, <span style="color: #ff4500;">0</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">0.5</span>,<span style="color: #ff4500;">0.5</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;">#enhance the image a bit</span>
    img = ImageOps.<span style="color: black;">autocontrast</span><span style="color: black;">&#40;</span>img, cutoff=<span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span>
    img = ImageEnhance.<span style="color: black;">Sharpness</span><span style="color: black;">&#40;</span>img<span style="color: black;">&#41;</span>.<span style="color: black;">enhance</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">2.0</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;">#draw the text, if any</span>
    font = ImageFont.<span style="color: black;">truetype</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;arial.ttf&quot;</span>, <span style="color: #ff4500;">16</span><span style="color: black;">&#41;</span>
    text_size = ImageDraw.<span style="color: black;">Draw</span><span style="color: black;">&#40;</span>polaroid<span style="color: black;">&#41;</span>.<span style="color: black;">textsize</span><span style="color: black;">&#40;</span>text, font=font<span style="color: black;">&#41;</span>
    fontxy = <span style="color: black;">&#40;</span>base<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>/<span style="color: #ff4500;">2</span> - text_size<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>/<span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">278</span><span style="color: black;">&#41;</span>
    ImageDraw.<span style="color: black;">Draw</span><span style="color: black;">&#40;</span>polaroid<span style="color: black;">&#41;</span>.<span style="color: black;">text</span><span style="color: black;">&#40;</span>fontxy, text, font=font, fill=<span style="color: black;">&#40;</span><span style="color: #ff4500;">40</span>,<span style="color: #ff4500;">40</span>,<span style="color: #ff4500;">40</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;">#copy the image onto the polaroid background</span>
    imgcorner = <span style="color: black;">&#40;</span><span style="color: #ff4500;">14</span>,<span style="color: #ff4500;">20</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#top-left corner of the target area on the polaroid</span>
    polaroid.<span style="color: black;">paste</span><span style="color: black;">&#40;</span>img, imgcorner<span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;">#copy the whole thing onto a larger background and rotate randomly</span>
    angle = <span style="color: #dc143c;">random</span>.<span style="color: black;">randint</span><span style="color: black;">&#40;</span>-<span style="color: #ff4500;">10</span>,<span style="color: #ff4500;">10</span><span style="color: black;">&#41;</span>
    blank = Image.<span style="color: #dc143c;">new</span><span style="color: black;">&#40;</span>polaroid.<span style="color: black;">mode</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">400</span>,<span style="color: #ff4500;">400</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
    blank.<span style="color: black;">paste</span><span style="color: black;">&#40;</span>polaroid, <span style="color: black;">&#40;</span>blank.<span style="color: black;">size</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>/<span style="color: #ff4500;">2</span>-polaroid.<span style="color: black;">size</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>/<span style="color: #ff4500;">2</span>, blank.<span style="color: black;">size</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>/<span style="color: #ff4500;">2</span>-polaroid.<span style="color: black;">size</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>/<span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
    blank = blank.<span style="color: black;">rotate</span><span style="color: black;">&#40;</span>angle, Image.<span style="color: black;">BICUBIC</span><span style="color: black;">&#41;</span>
&nbsp;
    blank.<span style="color: black;">save</span><span style="color: black;">&#40;</span>outfile<span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">&quot;__main__&quot;</span>:
    <span style="color: #808080; font-style: italic;"># Takes 1 required argument -- the desired prefix for the output filename</span>
    <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span><span style="color: black;">&#41;</span> <span style="color: #66cc66;">&lt;</span> <span style="color: #ff4500;">2</span>:
        <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;Missing required positional argument 'prefix'&quot;</span>
        exit<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;"># Text to appear on image, use &quot;&quot; if none </span>
    text = <span style="color: #483d8b;">&quot;Some Text, or leave blank&quot;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;"># Erase everything in Output folder</span>
    <span style="color: #ff7700;font-weight:bold;">for</span> f <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">glob</span>.<span style="color: #dc143c;">glob</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'output/*'</span><span style="color: black;">&#41;</span>:
        <span style="color: #dc143c;">os</span>.<span style="color: black;">remove</span><span style="color: black;">&#40;</span>f<span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;"># Create Polaroids of each JPG in Input folder</span>
    files = <span style="color: black;">&#91;</span>f<span style="color: black;">&#91;</span><span style="color: #ff4500;">6</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">for</span> f <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">glob</span>.<span style="color: #dc143c;">glob</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'input/*.jpg'</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>
    <span style="color: #ff7700;font-weight:bold;">for</span> f <span style="color: #ff7700;font-weight:bold;">in</span> files:
        make_polaroid<span style="color: black;">&#40;</span><span style="color: #483d8b;">'input/'</span>+f,<span style="color: #483d8b;">'output/'</span>+<span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>+<span style="color: #483d8b;">'-'</span>+f<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span>+<span style="color: #483d8b;">'.jpg'</span>,text<span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;"># Write index.html so Output folder can be copied/renamed elsewhere</span>
    files = <span style="color: black;">&#91;</span>f<span style="color: black;">&#91;</span><span style="color: #ff4500;">7</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">for</span> f <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">glob</span>.<span style="color: #dc143c;">glob</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'output/*'</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>
    outhtml = <span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'output/index.html'</span>,<span style="color: #483d8b;">'w'</span><span style="color: black;">&#41;</span>
    outhtml.<span style="color: black;">write</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;html&gt;&lt;head&gt;&lt;/head&gt;&lt;body style='background-color: #000;'&gt;&lt;div align='center'&gt;&lt;p&gt;&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #008000;">len</span><span style="color: black;">&#40;</span>files<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>:
        outhtml.<span style="color: black;">write</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;img src='%s' /&gt;&quot;</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span>files<span style="color: black;">&#91;</span>i<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span>i+<span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span> <span style="color: #66cc66;">%</span> <span style="color: #ff4500;">2</span> == <span style="color: #ff4500;">0</span>:
            outhtml.<span style="color: black;">write</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;/p&gt;&lt;p&gt;&quot;</span><span style="color: black;">&#41;</span>
    outhtml.<span style="color: black;">write</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;/p&gt;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;&quot;</span><span style="color: black;">&#41;</span>
    outhtml.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>The script is a bit over-specialized to my purpose .. converting a bunch of individual folders one at a time. So you may need to hack on it a bit to suit your needs. You can download the script here: <a href="http://www.craigethomas.com/blog/wp-content/uploads/2009/04/polaroid.zip">polaroid.zip</a>. Place files you want to convert into the &#8220;input&#8221; folder. Run the script with a single argument for the output filename prefix. It will take a few seconds or minutes to run, depending on how many photos you&#8217;re converting. When it finishes, copy the &#8220;output&#8221; folder elsewhere. The file &#8220;index.html&#8221; is pre-generated to contain all the photos in the folder.</p>
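<p>If you do hack on it, two small pieces are easy to pull out. The sketch below is Python 3 and uses only the standard library, so it leaves the PIL calls alone. The helper names <code>centered_offset</code> and <code>output_path</code> are hypothetical, not part of the downloadable script. The first uses floor division because <code>/</code> yields floats in Python 3, which <code>paste()</code> will not accept as coordinates. The second replaces the fixed-length slices like <code>f[6:]</code> with <code>os.path</code> calls.</p>

```python
import os


def centered_offset(outer, inner):
    """Top-left corner that centers a box of size inner inside a box of size outer."""
    # Floor division keeps the coordinates as ints under Python 3.
    return ((outer[0] - inner[0]) // 2, (outer[1] - inner[1]) // 2)


def output_path(infile, prefix, outdir="output"):
    """Map input/name.jpg to output/prefix-name.jpg without slicing by length."""
    stem = os.path.splitext(os.path.basename(infile))[0]
    return os.path.join(outdir, "{}-{}.jpg".format(prefix, stem))
```

<p>With the sizes used in the script, <code>centered_offset((400, 400), (300, 320))</code> gives <code>(50, 40)</code>, the same corner the original expression computes under Python 2.</p>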
]]></content:encoded>
			<wfw:commentRss>http://www.craiget.com/2009/04/python-pil-and-pretty-polaroids/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fetching Android Market Stats with Python, MozRepl, and BeautifulSoup</title>
		<link>http://www.craiget.com/2009/04/get-android-market-stats-with-python-mozrepl-and-beautifulsoup/</link>
		<comments>http://www.craiget.com/2009/04/get-android-market-stats-with-python-mozrepl-and-beautifulsoup/#comments</comments>
		<pubDate>Fri, 03 Apr 2009 04:24:00 +0000</pubDate>
		<dc:creator>craiget</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[android]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.craigethomas.com/blog/?p=65</guid>
		<description><![CDATA[A few weeks ago I was quite keen on the idea of gathering stats and creating charts to track the popularity of my Android apps. Alas, despite digging around in various packages and experimenting with cURL, I could never seem to get logged in programmatically to the Android Marketplace Developer Console. So I gave up [...]]]></description>
			<content:encoded><![CDATA[<p>A few weeks ago I was quite keen on the idea of gathering stats and creating charts to track the popularity of my Android apps. Alas, despite digging around in various packages and experimenting with cURL, I could never seem to get logged in programmatically to the Android Marketplace Developer Console. So I gave up and went back to working on my next app. Now I&#8217;ve come up with another reason to do some screen-scraping, so I thought I should give this another try.</p>
<p>Half the magic here belongs to a very cool Firefox plugin called <a href="http://wiki.github.com/bard/mozrepl">MozRepl</a> which lets you open a telnet connection to Firefox and interact with it via JavaScript. Awesome, no?</p>
<p>All you have to do is ask MozRepl to go to the Developer Console, download the HTML, and run it through <a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a> (the rest of the magic) to extract the data.</p>
<p>It turns out to be just slightly trickier because Python has to talk to MozRepl over Telnet. I suppose this script could be set up in cron to grab stats a couple of times each day. I think I&#8217;m just gonna run it manually every once in a while.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> BeautifulSoup, <span style="color: #dc143c;">re</span>, <span style="color: #dc143c;">time</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span>, <span style="color: #dc143c;">telnetlib</span>
<span style="color: #808080; font-style: italic;"># Install MozRepl Plugin</span>
<span style="color: #808080; font-style: italic;"># http://wiki.github.com/bard/mozrepl</span>
<span style="color: #808080; font-style: italic;"># Set up MozRepl to start automatically with FF, check that port number is 4242</span>
<span style="color: #808080; font-style: italic;"># Log in to Developer Console once manually so login credentials get saved</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Create a new profile and set this accordingly</span>
<span style="color: #808080; font-style: italic;"># http://support.mozilla.com/en-US/kb/Managing+profiles</span>
<span style="color: #dc143c;">profile</span> = <span style="color: #483d8b;">'my_firefox_profile'</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># go to Developer Console using new profile</span>
url = <span style="color: #483d8b;">'http://market.android.com/publish/Home'</span>
<span style="color: #dc143c;">os</span>.<span style="color: black;">system</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;firefox -no-remote -P %s %s &amp;&quot;</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">profile</span>, url<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #dc143c;">time</span>.<span style="color: black;">sleep</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#wait a sec for FF to start</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#connect to MozRepl and fetch HTML</span>
t = <span style="color: #dc143c;">telnetlib</span>.<span style="color: black;">Telnet</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;localhost&quot;</span>, <span style="color: #ff4500;">4242</span><span style="color: black;">&#41;</span>
t.<span style="color: black;">read_until</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;repl&gt;&quot;</span><span style="color: black;">&#41;</span>
t.<span style="color: black;">write</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;content.document.body.innerHTML\n&quot;</span><span style="color: black;">&#41;</span>
body = t.<span style="color: black;">read_until</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;repl&gt;&quot;</span><span style="color: black;">&#41;</span>
t.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#is there a better way to do this?</span>
<span style="color: #dc143c;">os</span>.<span style="color: black;">system</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;killall -9 firefox&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#yank stats out of HTML</span>
now = <span style="color: #dc143c;">time</span>.<span style="color: black;">strftime</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;%Y-%m-%d %H:%M:%S&quot;</span><span style="color: black;">&#41;</span>
soup = BeautifulSoup.<span style="color: black;">BeautifulSoup</span><span style="color: black;">&#40;</span>body<span style="color: black;">&#41;</span>
table = soup.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;div&quot;</span>, <span style="color: black;">&#123;</span> <span style="color: #483d8b;">&quot;class&quot;</span> : <span style="color: #483d8b;">&quot;listingTable&quot;</span> <span style="color: black;">&#125;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">for</span> row <span style="color: #ff7700;font-weight:bold;">in</span> table.<span style="color: black;">findAll</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'div'</span>, <span style="color: black;">&#123;</span><span style="color: #483d8b;">'class'</span>:<span style="color: #483d8b;">'listingRow'</span><span style="color: black;">&#125;</span><span style="color: black;">&#41;</span>:
  app = row.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;div&quot;</span>, <span style="color: black;">&#123;</span> <span style="color: #483d8b;">&quot;class&quot;</span> : <span style="color: #483d8b;">&quot;listingApp&quot;</span> <span style="color: black;">&#125;</span><span style="color: black;">&#41;</span>
  rating = row.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;div&quot;</span>, <span style="color: black;">&#123;</span> <span style="color: #483d8b;">&quot;class&quot;</span> : <span style="color: #483d8b;">&quot;listingRating&quot;</span> <span style="color: black;">&#125;</span><span style="color: black;">&#41;</span>
  stats = row.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;div&quot;</span>, <span style="color: black;">&#123;</span> <span style="color: #483d8b;">&quot;class&quot;</span> : <span style="color: #483d8b;">&quot;listingStats&quot;</span> <span style="color: black;">&#125;</span><span style="color: black;">&#41;</span>
  <span style="color: #ff7700;font-weight:bold;">if</span> app <span style="color: #ff7700;font-weight:bold;">and</span> rating <span style="color: #ff7700;font-weight:bold;">and</span> stats:
    name = app.<span style="color: black;">next</span>.<span style="color: black;">next</span>.<span style="color: #dc143c;">string</span>
    total = stats.<span style="color: black;">next</span>.<span style="color: #dc143c;">string</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>
    active = stats.<span style="color: black;">next</span>.<span style="color: black;">nextSibling</span>.<span style="color: #dc143c;">string</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>
    nratings = rating.<span style="color: black;">next</span>.<span style="color: #dc143c;">string</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>:-<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>
    stars = <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>rating.<span style="color: black;">findAll</span><span style="color: black;">&#40;</span>attrs=<span style="color: black;">&#123;</span><span style="color: #483d8b;">'style'</span>:<span style="color: #dc143c;">re</span>.<span style="color: #008000;">compile</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;scroll -78px&quot;</span><span style="color: black;">&#41;</span><span style="color: black;">&#125;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
    <span style="color: #ff7700;font-weight:bold;">print</span> now, name, total, <span style="color: #483d8b;">&quot;total&quot;</span>, active, <span style="color: #483d8b;">&quot;active&quot;</span>, nratings, <span style="color: #483d8b;">&quot;ratings&quot;</span>, stars, <span style="color: #483d8b;">&quot;stars&quot;</span>
<span style="color: #808080; font-style: italic;">#that's it, now maybe save these to a CSV or a log file..</span></pre></div></div>
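
<p>For that last step, here is a minimal sketch of appending rows to a CSV file. It is written for Python 3 (unlike the script above), and the <code>append_stats</code> helper, the column names, and the file path are all made up for illustration rather than part of the original script:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">import csv
import os


def append_stats(path, rows):
    """Append stat rows to a CSV file, writing a header row if the file is new."""
    # Hypothetical column names, mirroring the values printed by the script
    header = ["timestamp", "name", "total", "active", "nratings", "stars"]
    is_new = not os.path.exists(path)
    # newline="" lets the csv module control line endings itself
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(header)
        writer.writerows(rows)
</pre></div></div>

<p>The idea would be to collect one tuple per app inside the loop, something like <code>(now, name, total, active, nratings, stars)</code>, and pass the whole list to <code>append_stats</code> at the end. Appending instead of overwriting means each run adds to the history.</p>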

<p>I debated whether to show my actual numbers. Here you go, enjoy:</p>
<pre>
2009-04-03 17:45:15 Measure Stuff 4 total 1 active 2 ratings 1 stars
2009-04-03 17:45:15 Measure Stuff Lite 3006 total 995 active 28 ratings 2 stars
2009-04-03 17:45:15 RGB Probe 4 total 2 active 2 ratings 1 stars
2009-04-03 17:45:15 Thumb Maze 112 total 39 active 8 ratings 3 stars
2009-04-03 17:45:15 Thumb Maze Lite 16313 total 8813 active 172 ratings 3 stars
</pre>
<p>Uh oh, those numbers are not very good at all! So far my plan to live off Android looks doomed, but maybe things will pick up in the future. Two of the apps appear twice because there is a paid version and a free one. Can you tell which is which? =). Also, I think there is something wrong with RGB Probe. I&#8217;ve gotten a couple of e-mails saying the download failed.</p>
<p>So I hope folks will find this script useful. Obviously, use of this code is completely at your own risk. Screen scrapers are an arguably questionable enterprise, so don&#8217;t blame me if you hose your Firefox profile or Google gets mad at you.</p>
<p>Also, if anyone knows the cURL incantation that will do the same thing sans Firefox, I&#8217;d love to hear it. I kept getting a 302 response and never quite figured it out. I&#8217;ve taken several suggestions based on other Google services that &#8216;should work&#8217;, but for some reason don&#8217;t.</p>
<p>There are certainly pros and cons to screen scraping through the browser; I&#8217;ll only point out two advantages: First, you get &#8216;real&#8217; JavaScript executed right in Firefox. With many of the big data sites being Ajax-heavy, simply fetching the HTML without executing the JS only gets you halfway there. Second, it is possible to detect and block screen scrapers by looking for unusual or suspicious request patterns. I don&#8217;t know if any sites actually do this, but it <em>could</em> be done. For example, a simple fetch via wget looks different to a server than a fetch with Firefox, and it goes beyond User-Agents. The CSS, images, JavaScript, and such will also be fetched in a particular way, and a server can look for anything unusual in the order or timing with which resources are requested. Sound crazy? You&#8217;re right! It probably is and I&#8217;m not sure anybody actually does this. In fact, it very possibly wouldn&#8217;t work well at all in practice. For one, it could screw up text-only browsers. But I think it is still within the realm of possibility..</p>
<p>Now for balance, two downsides: First, the browser needs a window to run in. This means it is kinda slow, hijacks your computer for a few seconds, and doesn&#8217;t really lend itself to parallelization. Second, tools like cURL and wget and many language-specific libraries are practically standard, whereas this approach needs Firefox, the MozRepl plugin, and a dedicated profile set up first.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.craiget.com/2009/04/get-android-market-stats-with-python-mozrepl-and-beautifulsoup/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>The Amazing World of COM</title>
		<link>http://www.craiget.com/2009/03/the-amazing-world-of-com/</link>
		<comments>http://www.craiget.com/2009/03/the-amazing-world-of-com/#comments</comments>
		<pubDate>Thu, 05 Mar 2009 00:05:59 +0000</pubDate>
		<dc:creator>craiget</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[com]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.craigethomas.com/blog/?p=58</guid>
		<description><![CDATA[COM is one of those things that I&#8217;d heard about but never really needed. Well yesterday I stumbled on a script for converting Word documents to PDF using COM and that got me thinking &#8212; &#8220;Can I save myself TONS of time by writing a few scripts to do the really boring and repetitive parts [...]]]></description>
			<content:encoded><![CDATA[<p>COM is one of those things that I&#8217;d heard about but never really needed. Well yesterday I stumbled on a script for converting Word documents to PDF using COM and that got me thinking &#8212; &#8220;Can I save myself TONS of time by writing a few scripts to do the really boring and repetitive parts of my job?&#8221;. Indeed, the answer is yes.</p>
<p>Since Python is my first choice when there is a choice, I was pleased to discover the win32com package. Following is a very simple script using COM to convert PowerPoint slideshows from PPT to PPS (PowerPoint Show):</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span>, win32com.<span style="color: black;">client</span>
doc_template_name = <span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">abspath</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'test.ppt'</span><span style="color: black;">&#41;</span>
doc_final_name = <span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">abspath</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'test.pps'</span><span style="color: black;">&#41;</span>
app = win32com.<span style="color: black;">client</span>.<span style="color: black;">Dispatch</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;PowerPoint.Application&quot;</span><span style="color: black;">&#41;</span>
app.<span style="color: black;">Visible</span> = <span style="color: #008000;">True</span>
doc_template = app.<span style="color: black;">Presentations</span>.<span style="color: black;">Open</span><span style="color: black;">&#40;</span>doc_template_name<span style="color: black;">&#41;</span>
doc_final = app.<span style="color: black;">Presentations</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>
doc_final.<span style="color: black;">SaveAs</span><span style="color: black;">&#40;</span>doc_final_name<span style="color: black;">&#41;</span>
doc_template.<span style="color: black;">Close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
app.<span style="color: black;">Quit</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>This script is heavily influenced by the many wonderful examples at: <a href="http://win32com.goermezer.de/">http://win32com.goermezer.de/</a></p>
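<p>The script above handles a single hard-coded file. To point it at a whole folder, the file-name bookkeeping might look something like the sketch below. The <code>pps_targets</code> helper is hypothetical and deliberately leaves out the COM calls, so it runs anywhere (Python 3 assumed):</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">import os


def pps_targets(folder):
    """Return (ppt_path, pps_path) pairs for every .ppt file directly inside folder."""
    pairs = []
    for name in sorted(os.listdir(folder)):
        base, ext = os.path.splitext(name)
        # Match the extension case-insensitively so TEST.PPT is picked up too
        if ext.lower() == ".ppt":
            src = os.path.abspath(os.path.join(folder, name))
            dst = os.path.abspath(os.path.join(folder, base + ".pps"))
            pairs.append((src, dst))
    return pairs
</pre></div></div>

<p>Each pair could then go through the same <code>Presentations.Open</code> and <code>SaveAs</code> calls as in the script, ideally with one PowerPoint instance kept open for the whole batch.</p>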
<p>(This was supposed to be published more than a month ago.. I just noticed it was marked as a Draft, danggit)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.craiget.com/2009/03/the-amazing-world-of-com/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
