tag:blogger.com,1999:blog-1697307561385480651Tue, 18 Mar 2014 23:52:32 +0000PythonMathAppEngineGoogle App EngineNumber TheoryAlgebraBinary Quadratic FormConwayConway's TopographDeferred LibraryGoogle CodesiteTask Queue APIJavascriptDecoratorProject EulerContinued FractionsExceptionGDataGoogle CalendarMetaclassOAuthOAuth2.0PiPythonicRequest HandlerSquare RootjQuery1and1.comAPIAnalysisApproximationBenchmarkBirth RatesChristmasClass as ObjectCommit HookComparisonCompizConfig FilesContinuedDNSDNS LookupDirichletDivvyDjangoDomain Name SystemDynamic LanguageEnvironment VariablesFamily SizeFibonacciFinanceFinite FieldFractionsGCalGitGoogleGoogle AppsHosts fileIP AddressISPIn-App PaymentsInferior GoodsInfinite Nested RadicalInterest RateInternet ProtocolInternet Service ProviderJITJavascript EngineJokeJust-in Time CompileLinear AlgebraLinuxMac OS XMacbookMail APIMathematicsMortgageNested RadicalNewton-Raphson MethodNumerical AnalysisOOPOpen SourcePerformancePractical JokePrankPrime NumberPrivate KeyProtectPyPyQuadratic IrrationalRadicalRoot TwoSantaSocial EconomicsStack OverflowTangentUNIXV8VerificationWindow Managergdata-python-clientgoogle-api-python-clientnode.jsnytimes.compeople.combossylobsterhttp://blog.bossylobster.com/noreply@blogger.com (Danny Hermes)Blogger31125tag:blogger.com,1999:blog-1697307561385480651.post-5713612631146165863Mon, 25 Nov 2013 22:50:00 +00002013-11-25T15:01:02.743-08:00ApproximationInfinite Nested RadicalMathNested RadicalPiRadicalRoot TwoSquare RootTrigonometry and Nested RadicalsEarly last month, I was chatting with one of my officemates about a curious problem I had studied in high school. I hadn't written any of the results down, so much of the discussion involved me rediscovering the results and proving them with much more powerful tools than I once possessed.<br /><br />Before writing about the problem I had played around with, I want to give a <strike>brief</strike> motivation. For as long as humans have been doing mathematics, finding values of \(\pi\) has been deemed worthwhile (or every generation has just found it worthwhile to waste time computing digits).<br /><br />One such way the Greeks (particularly <a href="http://www.math.utah.edu/~alfeld/Archimedes/Archimedes.html" target="_blank">Archmides</a>) computed \(\pi\) was by approximating a circle by a regular polygon and letting the number of sides grow large enough so that the error between the area of the unit circle (\(\pi \cdot 1^2\)) and the area of the polygon would be smaller than some fixed threshold. Usually these thresholds were picked to ensure that the first \(k\) digits were fully accurate (for some appropriate value of \(k\)).<br /><br />In many introductory Calculus courses, this problem is introduced exactly when the limit is introduced and students are forced to think about the <a href="http://www.qbyte.org/puzzles/p045s.html" target="_blank">area problem</a> in the regular polygon:<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-VRz8LKn081A/UpPU9TzebwI/AAAAAAAANJs/qK36fAX2aOY/s1600/p045.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" src="http://1.bp.blogspot.com/-VRz8LKn081A/UpPU9TzebwI/AAAAAAAANJs/qK36fAX2aOY/s1600/p045.png" title="From http://www.qbyte.org/puzzles/p045s.html" /></a></div><div class="separator" style="clear: both; text-align: center;"></div>Given \(N\) sides, the area is \(N \cdot T_N\) where \(T_N\) is the area of each individual triangle given by one side of the polygon and the circumcenter.<br /><br />Call one such triangle \(\Delta ABC\) and let \(BC\) be the side that is also a side of the polygon while the other sides have \(\left|AB\right| = \left|AC\right| = 1\) since the polygon is inscribed in a unit circle. The angle \(\angle BAC = \frac{2\pi}{N}\) since each of the triangles has the same internal angle and there are \(N\) of them. If we can find the perpendicular height \(h\) from \(AB\) to \(C\), the area will be \(\frac{1}{2} h \left|AB\right| = \frac{h}{2}\). But we also know that<br />\[\sin\left(\angle BAC\right) = \frac{h}{\left|AC\right|} \Rightarrow h = \sin\left(\frac{2\pi}{N}\right).\] Combining all of these, we can approximate \(\pi\) with the area:<br />\[\pi \approx \frac{N}{2} \sin\left(\frac{2\pi}{N}\right) = \pi \frac{\sin\left(\frac{2\pi}{N}\right)}{\frac{2 \pi}{N}}. \] As I've shown my <a href="http://math.berkeley.edu/courses/choosing/lowerdivcourses/math1A" target="_blank">Math 1A</a> students, we see that<br />\[\lim_{N \to \infty} \pi \frac{\sin\left(\frac{2\pi}{N}\right)}{\frac{2 \pi}{N}} = \pi \lim_{x \to 0} \frac{\sin(x)}{x} = \pi\] so these are indeed good approximations.<br /><h3>Theory is Nice, But I Thought We Were Computing Something</h3>Unfortunately for us (and Archimedes), computing \(\sin\left(\frac{2\pi}{N}\right)\) is not quite as simple as dividing by \(N\), so often special values of \(N\) were chosen. In fact, starting from \(N\) and then using \(2N\), the areas could be computed via a special way of averaging the previous areas. Lucky for us, such a method is equivalent to the trusty <a href="http://en.wikipedia.org/wiki/List_of_trigonometric_identities#Double-angle.2C_triple-angle.2C_and_half-angle_formulae" target="_blank">half angle identities</a> (courtesy of <a href="http://en.wikipedia.org/wiki/Abraham_de_Moivre" target="_blank">Abraham De Moivre</a>). To keep track of these polygons with a power of two as the number of sides, we call \(A_n = \frac{2^n}{2} \sin\left(\frac{2\pi}{2^n}\right)\).<br /><br />Starting out with the simplest polygon, the square with \(N = 2^2\) sides, we have<br />\[A_2 = 2 \sin\left(\frac{\pi}{2}\right) = 2.\] Jumping to the <a href="http://en.wikipedia.org/wiki/Octagon" target="_blank">octagon</a> (no not that "<a href="https://www.google.com/search?q=%22the+octagon%22&tbm=isch" target="_blank">The Octagon</a>"), we have<br />\[A_3 = 4 \sin\left(\frac{\pi}{4}\right) = 4 \frac{\sqrt{2}}{2} = 2 \sqrt{2}.\] So far, the toughest thing we've had to deal with is a \(45^{\circ}\) angle and haven't yet had to lean on Abraham (<a href="http://www.nocturnar.com/imagenes/abraham-de-moivre-mathematician-abraham-de-moivre.jpg" target="_blank">him</a>, <a href="http://foglobe.com/data_images/main/abraham-lincoln/abraham-lincoln-03.jpg" target="_blank">not him</a>) for help. The <a href="http://en.wikipedia.org/wiki/Hexadecagon" target="_blank">hexadecagon</a> wants to change that:<br />\[A_4 = 8 \sin\left(\frac{\pi}{8}\right) = 8 \sqrt{\frac{1 - \cos\left(\frac{\pi}{4}\right)}{2}} = 8 \sqrt{\frac{2 - \sqrt{2}}{4}} = 4 \sqrt{2 - \sqrt{2}}.\] <div>To really drill home the point (and motivate my next post) we'll compute this for the \(32\)-gon (past the point where polygons have worthwhile names):</div><div>\[A_5 = 16 \sin\left(\frac{\pi}{16}\right) = 16 \sqrt{\frac{1 - \cos\left(\frac{\pi}{8}\right)}{2}}.\] Before, we could rely on the fact that we know that a \(45-45-90\) triangle looked like, but now, we come across \(\cos\left(\frac{\pi}{8}\right)\), a value which we haven't seen before. Luckily, Abraham has help here as well:<br />\[\cos\left(\frac{\pi}{8}\right) = \sqrt{\frac{1 + \cos\left(\frac{\pi}{4}\right)}{2}} = \sqrt{\frac{2 + \sqrt{2}}{4}} = \frac{1}{2} \sqrt{2 + \sqrt{2}}\] which lets us compute<br />\[A_5 = 16 \sqrt{\frac{1 - \frac{1}{2} \sqrt{2 + \sqrt{2}}}{2}} = 8 \sqrt{2 - \sqrt{2 + \sqrt{2}}}.\]</div><div><br /></div><div>So why have I put you through all this? If we wave our hands like a <a href="http://imgs.tuts.dragoart.com/how-to-draw-fantasia-wizard-mickey_1_000000008546_5.jpg" target="_blank">magician</a>, we can see this pattern continues and for the general \(n\)</div><div>\[A_n = 2^{n - 2} \sqrt{2 - \sqrt{2 + \sqrt{2 + \sqrt{\cdots + \sqrt{2}}}}}\]</div><div>where there are \(n - 3\) nested radicals with the \(\oplus\) sign and only one minus sign at the beginning.</div><div><br /></div><div>This motivates us to study two questions, what is the limiting behavior of such a nested radical:</div>\[\sqrt{2 + s_1 \sqrt{2 + s_2 \sqrt{ \cdots }}}\] as the signs \(s_1, s_2, \ldots\) takes values in \(\left\{-1, 1\right\}\). Recasting in terms of the discussion above, we want to know how close we are to \(\pi\) as we increase the number of sides. <div><br /></div><div>When I was in high school, I just loved to <a href="http://blog.verdebmx.com/wp-content/uploads/2008/07/computer.jpg" target="_blank">nerd out</a> on any and all math problems, so I studied this just for fun. Having heard about the unfathomable brain of <a href="http://en.wikipedia.org/wiki/Srinivasa_Ramanujan" target="_blank">Ramanujan</a> and the fun <a href="http://www.isibang.ac.in/~sury/ramanujanday.pdf" target="_blank">work he had done</a> with infinitely <a href="http://en.wikipedia.org/wiki/Nested_radical" target="_blank">nested radicals</a>, I wanted to examine which sequences of signs \((s_1, s_2, \ldots)\) produced an infinite radical that converged and what the convergence behavior was.</div><div><br /></div><div>I'm fairly certain my original questions came from an Illinois Council of Teachers of Mathematics (<a href="http://www.ictm.org/contest.html" target="_blank">ICTM</a>) contest problem along the lines of</div><div>\[\text{Find the value of the infinite nested radical } \sqrt{2 + \sqrt{2 + \cdots}}\] or maybe the slightly more difficult \[\text{Find the value of the infinite nested radical } \sqrt{2 - \sqrt{2 + \sqrt{2 - \sqrt{2 + \cdots}}}}.\] <a href="http://www.search-best-cartoon.com/cartoon-moose/armed-cartoon-moose.jpg" target="_blank">Armed</a> with my <a href="http://img1.targetimg1.com/wcsstore/TargetSAS//img/p/93/50/93505.jpg" target="_blank">TI-83</a>, I set out to do some hardcore programming and figure it out. It took me around a month of off-and-on tinkering. This second time around as a mathematical grown-up, it took me the first half of a plane ride from SFO to Dallas.</div><div><br /></div><div>In the next few weeks/months, I hope to write a few blog posts, including math, proofs and some real code on what answers I came up with and what other questions I have.</div>http://blog.bossylobster.com/2013/11/trigonometry-and-nested-radicals.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-2850542803010530774Tue, 10 Sep 2013 23:46:00 +00002013-09-10T17:04:47.588-07:00DirichletMathMathematicsNumber TheoryPrime NumberCalculating a Greatest Common Divisor with Dirichlet's HelpHaving just left Google and started my PhD in Applied Mathematics at <a href="http://math.berkeley.edu/" target="_blank">Berkeley</a>, I thought it might be appropriate to write some (more) math-related blog posts. Many of these posts, I jotted down on napkins and various other places on the web and just haven't had time to post until now.<br /><br />For today, I'm posting a result which was somewhat fun to figure out with/for one of my <a href="https://picasaweb.google.com/101796704659729637490/WhereHasYourMathTShirtBeen#5802889644579484306" target="_blank">buddies</a> from <a href="http://www.lsa.umich.edu/math/" target="_blank">Michigan Math</a>. I'd also like to point out that he is absolutely kicking ass at Brown.<br /><br />While trying to determine if<br />\[J(B_n)_{\text{Tor}}\left(\mathbb{Q}\right) \stackrel{?}{=} \mathbb{Z}/2\mathbb{Z} \] where \(J(B_n)\) is the Jacobian of the curve \(B_n\) given by \(y^2 = (x + 2) \cdot f^n(x)\) where \(f^n\) denotes \(\underbrace{f \circ \cdots \circ f}_{n \text{ times}}\) and \(f(x) = x^2 - 2\).<br /><br />Now, his and my interests diverged some time ago, so I can't appreciate what steps took him from this to the problem I got to help with. However, he was able to show (trivially maybe?) that this was equivalent to showing that<br />\[\gcd\left(5^{2^n} + 1, 13^{2^n} + 1, \ldots, p^{2^n} + 1, \ldots \right) = 2 \qquad (1)\] where the \(n\) in the exponents is the same as that in \(B_n\) and where the values we are using in our greatest common divisor (e.g. \(5, 13\) and \(p\) above) are all of the primes \(p \equiv 5 \bmod{8}\).<br /><br />My buddy, being sadistic and for some reason angry with me, passed me along the stronger statement:<br />\[\gcd\left(5^{2^n} + 1, 13^{2^n} + 1\right) = 2 \qquad (2)\] which I of course struggled with and tried to beat down with tricks like \(5^2 + 12^2 = 13^2\). After a few days of this struggle, he confessed that he was trying to ruin my life and told me about the weaker version \((1)\).<br /><br />When he sent me the email informing me of this, I read it at 8am, drove down to Santa Clara for <a href="https://us.pycon.org/2013/" target="_blank">PyCon</a> and by the time I arrived at 8:45am I had figured the weaker case \((1)\) out. This felt much better than the days of struggle and made me want to write about my victory (which I'm doing now). Though, before we actually demonstrate the weaker fact \((1)\) I will admit that I am not in fact tall. Instead I stood on the shoulders of Dirichlet and <a href="http://www.youtube.com/watch?v=A6f-6l0W-0o#t=35s" target="_blank">called myself tall</a>. Everything else is bookkeeping.<br /><h2><span style="font-size: large;">Let's Start the Math</span></h2>First, if \(n = 0\), we see trivially that<br />\[\gcd\left(5^{2^0} + 1, 13^{2^0} + 1\right) = \gcd\left(6, 14\right) = 2\] and all the remaining terms are divisible by \(2\) hence the \(\gcd\) over all the primes must be \(2\).<br /><br />Now, if \(n > 0\), we will show that \(2\) divides our \(\gcd\), but \(4\) does not and that no odd prime can divide this \(\gcd\). First, for \(2\), note that<br />\[p^{2^n} + 1 \equiv \left(\pm 1\right)^{2^n} + 1 \equiv 2 \bmod{4}\] since our primes are odd. Thus they are all divisible by \(2\) and none by \(4\).<br /><br />Now assume some odd prime \(p^{\ast}\) divides all of the quantities in question. We'll show no such \(p^{\ast}\) can exist by contradiction.<br /><br />In much the same way we showed the \(\gcd\) wasn't divisible by \(4\), we seek to find a contradiction in some modulus. But since we are starting with \(p^{2^n} + 1 \equiv 0 \bmod{p^{\ast}}\), if we can find some such \(p\) with \(p \equiv 1 \bmod{p^{\ast}}\), then we'd have our contradiction from<br />\[0 \equiv p^{2^n} + 1 \equiv 1^{2^n} + 1 \equiv 2 \bmod{p^{\ast}}\] which can't occur since \(p^{\ast}\) is an odd prime.<br /><br />With this in mind, along with a subsequence of the arithmetic progression \(\left\{5, 13, 21, 29, \ldots\right\}\), it seems that using <a href="http://en.wikipedia.org/wiki/Dirichlet's_theorem_on_arithmetic_progressions" target="_blank">Dirichlet's theorem on arithmetic progressions</a> may be a good strategy. However, this sequence only tells us about the residue modulo \(8\), but we also want to know about the residue modulo \(p^{\ast}\). Naturally, we look for a subsequence in<br />\[\mathbb{Z}/\mathbb{8Z} \times \mathbb{Z}/\mathbb{p^{\ast}Z}\] corresponding to the residue pair \((5 \bmod{8}, 1 \bmod{p^{\ast}})\). Due to the <a href="http://en.wikipedia.org/wiki/Chinese_remainder_theorem" target="_blank">Chinese remainder theorem</a> this corresponds to a unique residue modulo \(8p^{\ast}\).<br /><br />Since this residue \(r\) has \(r \equiv 1 \bmod{p^{\ast}}\), we must have<br />\[r \in \left\{1, 1 + p^{\ast}, 1 + 2p^{\ast}, \ldots, 1 + 7p^{\ast}\right\} .\] But since \(1 + kp^{\ast} \equiv r \equiv 5 \bmod{8}\), we have \(kp^{\ast} \equiv 4 \bmod{8}\) and \(k \equiv 4\left(p^{\ast}\right)^{-1} \bmod{8}\) since \(p^{\ast}\) is odd and invertible mod \(8\). But this also means its inverse is odd, hence \(k \equiv 4\cdot(2k' + 1) \equiv 4 \bmod{8}\). Thus we have \(1 + 4 p^{\ast} \in \mathbb{Z}/8p^{\ast}\mathbb{Z}\) corresponding to our residue pair. Thus every element in the arithmetic progression \(S = \left\{(1 + 4p^{\ast}) + (8p^{\ast})k \right\}_{k=0}^{\infty}\) is congruent to \(1 + 4 p^{\ast} \bmod{8p^{\ast}}\) and hence \(5 \bmod{8}\) and \(1 \bmod{p^{\ast}}\).<br /><br />What's more, since \(5 \in \left(\mathbb{Z}/8\mathbb{Z}\right)^{\times}\) and \(1 \in \left(\mathbb{Z}/p^{\ast}\mathbb{Z}\right)^{\times}\), we have \(1 + 4 p^{\ast} \in \left(\mathbb{Z}/8p^{\ast}\mathbb{Z}\right)^{\times}\) (again by the Chinese remainder theorem). Thus the arithmetic progression \(S\) satisfies the hypothesis of Dirichlet's theorem. Hence there must at least one prime \(p\) occurring in the progression (since there are infinitely many). But that also means \(p\) occurs in \(\left\{5, 13, 29, 37, \ldots\right\}\) hence we've reached our desired contradiction. RAA.<br /><h2><span style="font-size: large;">Now What?</span></h2>We still don't know if the strong version \((2)\)<br />\[\gcd\left(5^{2^n} + 1, 13^{2^n} + 1, \ldots, p^{2^n} + 1, \ldots \right) = 2\] By similar arguments as above, if any odd prime \(p^{\ast}\) divides this \(\gcd\), then we have<br />\[5^{2^n} \equiv -1 \bmod{p^{\ast}}\] hence there is an element of order \(2^{n + 1}\). This means the order of the multiplicative group \(\varphi\left(p^{\ast}\right) = p^{\ast} - 1\) is divisible by \(2^{n + 1}\). Beyond that, who knows? We're still thinking about it (but only passively, more important things to do).http://blog.bossylobster.com/2013/09/calculating-greatest-common-divisor.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-4992345193811489391Mon, 19 Aug 2013 00:54:00 +00002013-08-18T18:47:52.565-07:00AlgebraFibonacciFinite FieldLinear AlgebraMathNumber TheorySome Fibonacci Fun with PrimesI haven't written in way too long and just wanted to post this fun little proof.<br /><br /><b>Assertion:</b> Let \(F_n\) be the \(n\)th Fibonacci number defined by \(F_n = F_{n-1} + F_{n-2}\), \(F_0 = 0, F_1 = 1\). Show that for an odd prime \(p\neq 5\), we have \(p\) divides \(F_{p^2−1}\).<br /><br /><b>Proof:</b> We do this by working inside \(\mathbb{F}_p\) instead of working in \(\mathbb{R}\). The recurrence is given by<br /><br />\[ \left( \begin{array}{cc}<br />1 & 1 \\<br />1 & 0 \end{array} \right)<br />\left( \begin{array}{c}<br />F_{n-1} \\<br />F_{n-2} \end{array} \right)<br />=<br />\left( \begin{array}{c}<br />F_{n-1} + F_{n-2} \\<br />F_{n-1} \end{array} \right)<br />=<br />\left( \begin{array}{c}<br />F_n \\<br />F_{n-1} \end{array} \right)\] and in general<br />\[ \left( \begin{array}{cc}<br />1 & 1 \\<br />1 & 0 \end{array} \right)^{n}<br />\left( \begin{array}{c}<br />1 \\<br />0 \end{array} \right)<br />=<br />\left( \begin{array}{cc}<br />1 & 1 \\<br />1 & 0 \end{array} \right)^{n}<br />\left( \begin{array}{c}<br />F_1 \\<br />F_0 \end{array} \right)<br />=<br />\left( \begin{array}{c}<br />F_{n + 1} \\<br />F_n \end{array} \right)\] The matrix \(A = \left(\begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right)\) has characteristic polynomial<br />\[\chi_A(t) = (1 - t)(0 - t) - (1)(1) = t^2 - t - 1\] If this polynomial has distinct roots, then \(A\) is diagonalizable (this is sufficient, but not necessary). Assuming the converse we have \(\chi_A(t) = (t - \alpha)^2\) for some \(\alpha \in \mathbb{F}_p\); we can assume \(\alpha \in \mathbb{F}_p\) since \(-2\alpha = -1\) is the coefficient of \(t\), which means \(\alpha = 2^{-1} \) (we are fine with this since \(p\) odd means that \(2^{-1}\) exists). In order for this to be a root of \(\chi_A\), we must have<br />\[0 \equiv 4 \cdot \chi_A\left(2^{-1}\right) \equiv 4 \cdot \left(2^{-2} - 2^{-1} - 1\right) \equiv 1 - 2 - 4 \equiv -5 \bmod{p}.\] Since \(p \neq 5\) is prime, this is not possible, hence we reached a contradiction and \(\chi_A\) does not have a repeated root.<br /><br />Thus we may write \(\chi_A(t) = (t - \alpha)(t - \beta)\) for \(\alpha, \beta \in \mathbb{F}_{p^2}\) (it's possible that \(\chi_A\) is irreducible over \(\mathbb{F}_p\), but due to degree considerations it <b>must</b> split completely over \(\mathbb{F}_{p^2}\)). Using this, we may write<br /><br />\[A = P \left(\begin{array}{cc} \alpha & 0 \\ 0 & \beta \end{array} \right) P^{-1}\] for some \(P \in GL_{2} \left(\mathbb{F}_{p^2}\right)\) and so<br /><br />\[A^{p^2 - 1} = P \left(\begin{array}{cc} \alpha & 0 \\ 0 & \beta \end{array} \right)^{p^2 - 1} P^{-1}<br />= P \left(\begin{array}{cc} \alpha^{p^2 - 1} & 0 \\ 0 & \beta^{p^2 - 1} \end{array} \right)P^{-1}\] Since \(\chi_A(0) = 0 - 0 - 1 \neq 0\) we know \(\alpha\) and \(\beta\) are nonzero, hence \(\alpha^{p^2 - 1} = \beta^{p^2 - 1} = 1 \in \mathbb{F}_{p^2} \). Thus \(A^{p^2 - 1} = P I_2 P^{-1} = I_2\) and so<br /><br />\[\left( \begin{array}{c}<br />F_p \\<br />F_{p^2 - 1} \end{array} \right)<br />=<br />\left( \begin{array}{cc}<br />1 & 1 \\<br />1 & 0 \end{array} \right)^{p^2 - 1}<br />\left( \begin{array}{c}<br />1 \\<br />0 \end{array} \right)<br />=<br />I_2 \left( \begin{array}{c}<br />1 \\<br />0 \end{array} \right)<br />=<br />\left( \begin{array}{c}<br />1 \\<br />0 \end{array} \right)\] so we have \(F_{p^2 - 1} = 0\) in \(\mathbb{F}_p\) as desired.http://blog.bossylobster.com/2013/08/some-fibonacci-fun-with-primes.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-3716043890755082571Mon, 24 Dec 2012 17:37:00 +00002012-12-24T09:43:31.626-08:00AppEngineDecoratorGDatagdata-python-clientGoogle App EngineGoogle Calendargoogle-api-python-clientOAuthOAuth2.0PythonBridging OAuth 2.0 objects between GData and DiscoveryMy colleague <a class="g-profile" href="http://plus.google.com/110554344789668969711" target="_blank">+Takashi Matsuo</a> and I recently gave a <a href="http://www.youtube.com/watch?v=HoUdWBzUZ-M" target="_blank">talk</a> about using <span style="color: lime; font-family: Courier New, Courier, monospace;">OAuth2Decorator</span> (from the <span style="color: lime; font-family: Courier New, Courier, monospace;">google-api-python-client</span> <a href="http://code.google.com/p/google-api-python-client/" target="_blank">library</a>) with request handlers in<a href="https://developers.google.com/appengine/" target="_blank"> Google App Engine</a>. Shortly after, a <a href="http://stackoverflow.com/questions/13981641" target="_blank">Stack Overflow question</a> sprung up asking about the right way to use the decorator and, as a follow up, if the decorator could be used with the <a href="https://developers.google.com/google-apps/provisioning/" target="_blank">Google Apps Provisioning API</a>. As I mentioned in my answer,<br /><blockquote class="tr_bq">The Google Apps Provisioning API is a <a href="https://developers.google.com/gdata/docs/2.0/reference" target="_blank">Google Data API</a>...As a result, you'll need to use the <span style="color: lime; font-family: Courier New, Courier, monospace;">gdata-python-client</span> <a href="http://code.google.com/p/gdata-python-client/" target="_blank">library</a> to use the Provisioning API. Unfortunately, you'll need to manually convert from a <a href="http://code.google.com/p/google-api-python-client/source/browse/oauth2client/client.py?r=efd0ccd31d6c16ddf9f65ba5c31c7033749be0e1#349" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">oauth2client.client.OAuth2Credentials</span> object</a> to a <a href="http://code.google.com/p/gdata-python-client/source/browse/src/gdata/gauth.py?r=cf0208e89433800c713495654774f36d84e894b3#1143" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">gdata.gauth.OAuth2Token</span> object</a> to use the same token for either one.</blockquote>Instead of making everyone and their brother write their own, I thought I'd take a stab at it and write about it here. The general philosophy I took was that the token subclass should be 100% based on an <span style="color: lime; font-family: Courier New, Courier, monospace;">OAuth2Credentials</span> object: <br /><ul><li>the token constructor simply takes an <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object</li><li>the token refresh updates the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object set on the token</li><li>values of the current token can be updated directly from the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object set on the token</li></ul>Starting from the top, we'll use two imports:<br /><pre class="prettyprint" style="background-color: white;">import httplib2<br />from gdata.gauth import OAuth2Token</pre>The first is needed to refresh an <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object using the mechanics native to <span style="color: lime; font-family: 'Courier New', Courier, monospace;">google-api-python-client</span>, and the second is needed so we may subclass the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">gdata-python-client</span> native token class.<br /><br />As I mentioned, the values should be updated directly from an <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object, so in our constructor, we first initialize the values to <span style="color: lime; font-family: Courier New, Courier, monospace;">None</span> and then call our update method to actual set the values. This allows us to write less code, because, <a href="http://en.wikipedia.org/wiki/Don't_repeat_yourself" target="_blank">repeating is bad</a> (I think someone told me that once?). <br /><pre class="prettyprint" style="background-color: white;">class OAuth2TokenFromCredentials(OAuth2Token):<br /> def __init__(self, credentials):<br /> self.credentials = credentials<br /> super(OAuth2TokenFromCredentials, self).__init__(None, None, None, None)<br /> self.UpdateFromCredentials()<br /></pre>We can get away with passing four <span style="color: lime; font-family: Courier New, Courier, monospace;">None</span>s to the superclass constructor, as it only has four positional arguments: <span style="color: lime; font-family: Courier New, Courier, monospace;">client_id</span>, <span style="color: lime; font-family: 'Courier New', Courier, monospace;">client_secret</span>, <span style="color: lime; font-family: Courier New, Courier, monospace;">scope</span>, and <span style="color: lime; font-family: Courier New, Courier, monospace;">user_agent</span>. Three of those have equivalents on the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object, but there is no place for <span style="color: lime; font-family: 'Courier New', Courier, monospace;">scope</span> because that part of the token exchange handled elsewhere (<span style="color: lime; font-family: Courier New, Courier, monospace;">OAuth2WebServerFlow</span>) in the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">google-api-python-client</span> library.<br /><pre class="prettyprint" style="background-color: white;"> def UpdateFromCredentials(self):<br /> self.client_id = self.credentials.client_id<br /> self.client_secret = self.credentials.client_secret<br /> self.user_agent = self.credentials.user_agent<br /> ...</pre>Similarly, the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object only implements the refresh part of the OAuth 2.0 flow, so only has the token URI, hence <span style="color: lime; font-family: Courier New, Courier, monospace;">auth_uri</span>, <span style="color: lime; font-family: 'Courier New', Courier, monospace;">revoke_uri</span> and <span style="color: lime; font-family: Courier New, Courier, monospace;">redirect</span><span style="color: lime; font-family: 'Courier New', Courier, monospace;">_uri</span> will not be set either. However, the token URI and the token data are the same for both.<br /><pre class="prettyprint" style="background-color: white;"> ...<br /> self.token_uri = self.credentials.token_uri<br /> self.access_token = self.credentials.access_token<br /> self.refresh_token = self.credentials.refresh_token<br /> ...<br /></pre>Finally, we copy the extra fields which may be set outside of a constructor:<br /><pre class="prettyprint" style="background-color: white;"> ...<br /> self.token_expiry = self.credentials.token_expiry<br /> self._invalid = self.credentials.invalid<br /></pre>Since <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> doesn't deal with all parts of the OAuth 2.0 process, we disable those methods from <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Token</span> that do.<br /><pre class="prettyprint" style="background-color: white;"> def generate_authorize_url(self, *args, **kwargs): raise NotImplementedError<br /> def get_access_token(self, *args, **kwargs): raise NotImplementedError<br /> def revoke(self, *args, **kwargs): raise NotImplementedError<br /> def _extract_tokens(self, *args, **kwargs): raise NotImplementedError<br /></pre>Finally, the last method which needs to be implemented is <span style="color: lime; font-family: Courier New, Courier, monospace;">_refresh</span>, which should refresh the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object and then update the current GData token after the refresh. Instead of using the passed in request object, we use one from <span style="color: lime; font-family: Courier New, Courier, monospace;">httplib2</span> as we mentioned in imports.<br /><pre class="prettyprint" style="background-color: white;"> def _refresh(self, unused_request):<br /> self.credentials._refresh(httplib2.Http().request)<br /> self.UpdateFromCredentials()<br /></pre>After refreshing the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">OAuth2Credentials</span> object, we can update the current token using the same method called in the constructor.<br /><br />Using this class, we can simultaneously call a <a href="https://developers.google.com/discovery/v1/getting_started#background" target="_blank">discovery-based API</a> and a GData API: <br /><pre class="prettyprint" style="background-color: white;">from apiclient.discovery import build<br />from gdata.contacts.client import ContactsClient<br /><br />service = build('calendar', 'v3', developerKey='...')<br /><br />class MainHandler(webapp2.RequestHandler):<br /> @decorator.oauth_required<br /> def get(self):<br /> auth_token = OAuth2TokenFromCredentials(decorator.credentials)<br /> contacts_client = ContactsClient()<br /> auth_token.authorize(contacts_client)<br /> contacts = contacts_client.get_contacts()<br /> ...<br /> events = service.events().list(calendarId='primary').execute(<br /> http=decorator.http())<br /> ...<br /></pre>http://blog.bossylobster.com/2012/12/bridging-oauth-20-objects-between-gdata.htmlnoreply@blogger.com (Danny Hermes)3tag:blogger.com,1999:blog-1697307561385480651.post-999321618493917715Mon, 10 Sep 2012 14:56:00 +00002012-11-20T16:01:13.532-08:00AppEngineDeferred LibraryGoogle App EngineGoogle CodesiteJavascriptjQueryPythonTask Queue APILast to Cross the Finish Line: Part ThreeRecently, my colleague <a href="https://plus.google.com/115640166224745944209" target="_blank">+Fred Sauer</a> and I gave a tech talk called "Last Across the Finish Line: Asynchronous <a href="https://developers.google.com/appengine/docs/python/taskqueue/overview" target="_blank">Tasks</a> with <a href="https://appengine.google.com/" target="_blank">App Engine</a>". This is part three in a three part series where I will share our <a href="http://www.forbes.com/pictures/ekij45gdh/learnings/#gallerycontent" target="_blank">learnings</a> and give some helpful references to the <a href="https://developers.google.com/appengine/docs/" target="_blank">App Engine documentation</a>.<br /><br />Check out the <a href="http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-two.html" target="_blank">previous post</a> if you haven't already. In this section, we'll define the <span style="color: lime; font-family: Courier New, Courier, monospace;">PopulateBatch</span> function and explore the <a href="https://developers.google.com/appengine/docs/python/ndb/" target="_blank">ndb models</a> and <a href="https://developers.google.com/appengine/docs/python/taskqueue/" target="_blank">Task Queue</a> operations that make it work.<br /><h2><span style="font-size: large;">Imports</span></h2>Before defining the <a href="https://developers.google.com/appengine/docs/python/ndb/" target="_blank">models</a> and helper functions in <a href="http://code.google.com/p/gae-last-across-the-finish-line/source/browse/models.py" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">models.py</span></a>, let's first review the imports:<br /><pre class="prettyprint" style="background-color: white;">import json<br /><br />from google.appengine.api import channel<br />from google.appengine.ext.deferred import defer<br />from google.appengine.ext import ndb<br /></pre>Again, we import <a href="http://docs.python.org/library/json.html" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">json</span></a> and <span style="color: lime; font-family: Courier New, Courier, monospace;">channel</span> for serialization and message passing. We import the <span style="color: lime; font-family: Courier New, Courier, monospace;">defer</span> function from the <a href="https://developers.google.com/appengine/articles/deferred" target="_blank">deferred library</a> to abstract away task creation and take advantage of the ability to "defer" a function call to another thread of execution. Finally, we import <span style="color: lime; font-family: Courier New, Courier, monospace;">ndb</span> as a means for interacting with the App Engine <a href="https://developers.google.com/appengine/docs/python/datastore/overview" target="_blank">Datastore</a>. <br /><h2><span style="font-size: large;">Method Wrapper Built for Tasks</span></h2>As we saw in the <span style="color: lime; font-family: Courier New, Courier, monospace;">BeginWork</span> handler in <a href="http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-two.html" target="_blank">part two</a>, units of work are passed to <span style="color: lime; font-family: Courier New, Courier, monospace;">PopulateBatch</span> as 3-tuples containing a method, the positional arguments and the keyword arguments to that method.<br /><br />In order to keep our task from hanging indefinitely due to unseen errors and to implicitly include the work unit in the batch, we define a wrapper around these method calls:<br /><pre class="prettyprint" style="background-color: white;">def AlwaysComplete(task, method, *args, **kwargs):<br /> try:<br /> method(*args, **kwargs)<br /> except: # TODO: Consider failing differently.<br /> pass<br /> finally:<br /> defer(task.Complete)</pre>As you can see, we catch any and all errors thrown by our method and don't retry the method if it fails. In our example, if the call <span style="color: lime; font-family: Courier New, Courier, monospace;">method(*args, **kwargs)</span> fails, the data won’t be sent through the channel and the given square will not show up in the quilt. However, since we catch these exceptions, the batch will complete and the spinner will disappear with this square still missing.<br /><br />This part is likely going to be customized to the specific work involved, but for our case, we didn't want individual failures to cause the whole batch to fail. In addition, we implicitly link the work unit with a special type of task object in the datastore.<br /><br />In the <span style="color: lime; font-family: Courier New, Courier, monospace;">finally</span> section of the error catch, we defer the <span style="color: lime; font-family: Courier New, Courier, monospace;">Complete</span> method on the task corresponding to this work unit. We defer the call to this complete method in order to avoid any errors (possibly from a failed datastore action) that the method may cause. If it were to throw an error, since <span style="color: lime; font-family: Courier New, Courier, monospace;">AlwaysComplete</span> is called in a deferred task, the task would retry and our worker unit would execute (or fail) again, which is bad if our user interface is not <a href="http://en.wikipedia.org/wiki/Idempotence#Computer_science_meaning" target="_blank">idempotent</a>.<br /><h2><span style="font-size: large;">Task Model</span></h2>As we saw above, we need a datastore model to represent tasks within a batch. We start out initially with a model having only one attribute — a boolean representing whether or not the task has completed. <br /><pre class="prettyprint" style="background-color: white;">class BatchTask(ndb.Model):<br /> # Very important that the default value True of `indexed` is used here<br /> # since we need to query on BatchTask.completed<br /> completed = ndb.BooleanProperty(default=False)</pre>As we know, we'll need to define a <span style="color: lime; font-family: Courier New, Courier, monospace;">Complete</span> method in order to use the task in <span style="color: lime; font-family: Courier New, Courier, monospace;">AlwaysComplete</span>, but before doing so, we'll define another method which will put the task object in the datastore and pass a unit of work to <span style="color: lime; font-family: Courier New, Courier, monospace;">AlwaysComplete</span>:<br /><pre class="prettyprint" style="background-color: white;"> @ndb.transactional<br /> def Populate(self, method, *args, **kwargs):<br /> self.put()<br /> kwargs['_transactional'] = True<br /> defer(AlwaysComplete, self.key, method, *args, **kwargs)</pre>In this <span style="color: lime; font-family: Courier New, Courier, monospace;">Populate</span> method, we first put the object in the datastore <a href="https://developers.google.com/appengine/docs/python/datastore/transactions" target="_blank">transactionally</a> by using the <span style="color: lime; font-family: Courier New, Courier, monospace;">ndb.transactional</span> decorator. By adding the <span style="color: lime; font-family: Courier New, Courier, monospace;">_transactional</span> keyword to the keyword arguments, <span style="color: lime; font-family: Courier New, Courier, monospace;">defer</span> <a href="http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/deferred/deferred.py?r=277#250" target="_blank">strips away</a> the underscore and creates a <a href="https://developers.google.com/appengine/docs/python/taskqueue/overview#Tasks_within_Transactions" target="_blank">transactional task</a>. By doing this<br /><blockquote class="tr_bq">"the task is only enqueued — and guaranteed to be enqueued — if the transaction is committed successfully."</blockquote>We need this deferred task to be enqueued transactionally for consistency of the <span style="color: lime; font-family: Courier New, Courier, monospace;">completed</span> boolean attribute. The datastore put in <span style="color: lime; font-family: Courier New, Courier, monospace;">Populate</span> uses the default value of <span style="color: lime; font-family: Courier New, Courier, monospace;">False</span>, but after <span style="color: lime; font-family: Courier New, Courier, monospace;">Complete</span> is called we want to set this boolean to <span style="color: lime; font-family: Courier New, Courier, monospace;">True</span>. If this value was not <a href="https://developers.google.com/appengine/docs/python/datastore/transactions#Isolation_and_Consistency" target="_blank">consistent</a>, we may have a race condition that resulted in a completed task in the datastore being marked as incomplete. As we'll see later, we rely on this consistency for a query that will help us determine if our batch is done.<br /><br />To signal that a unit of work has completed, we define the <span style="color: lime; font-family: Courier New, Courier, monospace;">Complete</span> method on the task object: <br /><pre class="prettyprint" style="background-color: white;"> @ndb.transactional<br /> def Complete(self):<br /> self.completed = True<br /> self.put()<br /><br /> batcher_parent = self.key.parent().get()<br /> defer(batcher_parent.CheckComplete, _transactional=True)</pre>It performs two functions. First, it sets <span style="color: lime; font-family: Courier New, Courier, monospace;">completed</span> to <span style="color: lime; font-family: Courier New, Courier, monospace;">True</span> in a transaction. Second, it retrieves the parent <a href="https://developers.google.com/appengine/docs/python/ndb/entities" target="_blank">entity</a> of the task object and defers the <span style="color: lime; font-family: Courier New, Courier, monospace;">CheckComplete</span> method on this parent. As we will see in more depth in the <span style="color: lime; font-family: Courier New, Courier, monospace;">PopulateBatch</span> function, we use a special type of batch parent object to create an <a href="https://developers.google.com/appengine/docs/python/datastore/entities#Transactions_and_Entity_Groups" target="_blank">entity group</a> containing all the worker tasks for the batch. We don't want to check if the batch has completed until the datastore put has succeeded, so we defer the call to call to <span style="color: lime; font-family: Courier New, Courier, monospace;">CheckComplete</span> transactionally, just as we did with <span style="color: lime; font-family: Courier New, Courier, monospace;">AlwaysComplete</span> in the <span style="color: lime; font-family: Courier New, Courier, monospace;">Populate</span> method.<br /><br /><b>Note</b>: <i>It may seem that these </i><span style="color: lime; font-family: Courier New, Courier, monospace;">get</span><i> calls to retrieve the parent via </i><span style="color: lime; font-family: Courier New, Courier, monospace;">self.key.parent().get()</span><i> are using more bandwidth than necessary. However, we are relying here on the power of </i><span style="color: lime; font-family: Courier New, Courier, monospace;">ndb</span><i>. Using a combination of instance caching and <a href="https://developers.google.com/appengine/docs/python/memcache/overview" target="_blank">memcache</a>, most (if not all) of these gets will use the cache and will not incur the cost of a round-trip to the datastore.</i><br /><h2><span style="font-size: large;">Batch Parent Model</span></h2>Given what we rely on in <span style="color: lime; font-family: Courier New, Courier, monospace;">BatchTask</span>, we need to define a special type of datastore object that will act as the parent entity for a batch. Since we are going to use it to check when a batch is complete, we define the boolean attribute <span style="color: lime; font-family: Courier New, Courier, monospace;">all_tasks_loaded</span> to signal whether or not all worker tasks from the batch have begun. We can use this as a short circuit in our <span style="color: lime; font-family: Courier New, Courier, monospace;">CheckComplete</span> method (or as a guard against premature completion).<br /><pre class="prettyprint" style="background-color: white;">class TaskBatcher(ndb.Model):<br /> all_tasks_loaded = ndb.BooleanProperty(default=False, indexed=False)</pre>To check if a batch is complete, we first determine if all tasks have loaded. If that is the case, we perform an <a href="https://developers.google.com/appengine/docs/python/datastore/queries#Ancestor_Queries" target="_blank">ancestor query</a> that simply attempts to fetch the first worker task in the entity group which has not yet completed. If such a task does not exist, we know the batch has completed, and so start to clean up the task and batch parent objects from the datastore.<br /><pre class="prettyprint" style="background-color: white;"> def CheckComplete(self):<br /> # Does not need to be transactional since it doesn't change data<br /> session_id = self.key.id()<br /> if self.all_tasks_loaded:<br /> incomplete = BatchTask.query(BatchTask.completed == False,<br /> ancestor=self.key).fetch(1)<br /> if len(incomplete) == 0:<br /> channel.send_message(session_id, json.dumps({'status': 'complete'}))<br /> self.CleanUp()<br /> return<br /><br /> channel.send_message(session_id, json.dumps({'status': 'incomplete'}))</pre>We again do the utmost at this step to ensure <a href="https://www.blogger.com/"><span id="goog_842417011"></span>consistency<span id="goog_842417012"></span></a> by using an ancestor query:<br /><blockquote class="tr_bq">"There are scenarios in which any pending modifications are guaranteed to be completely applied...any ancestor queries in the High Replication datastore. In both cases, query results will always be current and consistent."</blockquote>After checking if a batch is complete, we need to communicate the status back to the client. We'll rely on <span style="color: lime; font-family: Courier New, Courier, monospace;">PopulateBatch</span> to create instances of <span style="color: lime; font-family: Courier New, Courier, monospace;">TaskBatcher</span> with the ID of the session corresponding to the batch as the datastore key. We send a status complete or incomplete message to the client using the session ID for the channel. In order to correctly handle these messages on the client, we'll need to update the <span style="color: lime; font-family: Courier New, Courier, monospace;">onmessage</span> handler (defined in <a href="http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-two.html" target="_blank">part two</a>) to account for status updates:<br /><pre class="prettyprint" style="background-color: white;"><span style="opacity: 0.4;">socket.onmessage = function(msg) {<br /> var response = JSON.parse(msg.data);</span><br /><b> if (response.status !== undefined) {<br /> setStatus(response.status);<br /> }</b><span style="opacity: 0.4;"> else {<br /> var squareIndex = 8*response.row + response.column;<br /> var squareId = '#square' + squareIndex.toString();<br /> $(squareId).css('background-color', response.color);<br /> }<br />}</span></pre>Just as the <span style="color: lime; font-family: Courier New, Courier, monospace;">setStatus</span> method revealed the progress spinner when work began, it will remove the spinner when the status is complete.<br /><br />We'll next define the <span style="color: lime; font-family: Courier New, Courier, monospace;">CleanUp</span> method that is called when the batch is complete:<br /><pre class="prettyprint" style="background-color: white;"> def CleanUp(self):<br /> children = BatchTask.query(ancestor=self.key).iter(keys_only=True)<br /> ndb.delete_multi(children)<br /> self.key.delete()</pre>This method uses the key from the batch parent to perform another ancestor query and creates an object which can <a href="https://developers.google.com/appengine/docs/python/ndb/queries#iterators" target="_blank">iterate over all the keys</a> of the tasks in the batch. By using the <span style="color: lime; font-family: Courier New, Courier, monospace;">delete_multi</span> function provided by <span style="color: lime; font-family: 'Courier New', Courier, monospace;">ndb</span>, we are able to delete these in parallel rather than waiting for each to complete. After deleting all the tasks, the batcher deletes itself and clean up is done. Since the <span style="color: lime; font-family: Courier New, Courier, monospace;">TaskBatcher.CheckComplete</span> spawns <span style="color: lime; font-family: 'Courier New', Courier, monospace;">CleanUp</span> in a deferred task, if the deletes time out, the task will try again until all tasks in the batch are deleted.<br /><br />As a final method on <span style="color: lime; font-family: Courier New, Courier, monospace;">TaskBatcher</span>, we define something similar to <span style="color: lime; font-family: Courier New, Courier, monospace;">BatchTask.Populate</span> that is triggered after all workers in the batch have been added:<br /><pre class="prettyprint" style="background-color: white;"> @ndb.transactional<br /> def Ready(self):<br /> self.all_tasks_loaded = True<br /> self.put()<br /> self.CheckComplete()</pre>This simply signals that all tasks from the batch have loaded by setting <span style="color: lime; font-family: Courier New, Courier, monospace;">all_tasks_loaded</span> to <span style="color: lime; font-family: Courier New, Courier, monospace;">True</span> and calls <span style="color: lime; font-family: Courier New, Courier, monospace;">CheckComplete</span> in case all the tasks in the batch have already completed. This check is necessary because if all worker tasks complete before <span style="color: lime; font-family: Courier New, Courier, monospace;">all_tasks_loaded</span> is <span style="color: lime; font-family: Courier New, Courier, monospace;">True</span>, then none of the checks initiated by those tasks would signal completion. We use a transaction to avoid a race condition with the initial datastore put — a put which is a signal that all tasks have <b><i>not</i></b> loaded.<br /><h2><span style="font-size: large;">Populating a Batch</span></h2>With our two models in hand, we are finally ready to define the <span style="color: lime; font-family: Courier New, Courier, monospace;">PopulateBatch</span> function used (in <a href="http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-two.html" target="_blank">part two</a>) by the <span style="color: lime; font-family: Courier New, Courier, monospace;">BeginWork</span> handler. We want users of this function to be able to call it directly, but don't want it to block the process they call it in, so we wrap the real function in a function that will simply defer the work: <br /><pre class="prettyprint" style="background-color: white;">def PopulateBatch(session_id, work):<br /> defer(_PopulateBatch, session_id, work)</pre>In the actual function, we first create a <span style="color: lime; font-family: Courier New, Courier, monospace;">TaskBatcher</span> object using the session ID as the key and put it into the datastore using the default value of <span style="color: lime; font-family: Courier New, Courier, monospace;">False</span> for <span style="color: lime; font-family: Courier New, Courier, monospace;">all_tasks_loaded</span>. Since this is a single synchronous <span style="color: lime; font-family: Courier New, Courier, monospace;">put</span>, it blocks the thread of execution and we can be sure our parent is in the datastore before members of the entity group (the task objects) are created.<br /><pre class="prettyprint" style="background-color: white;">def _PopulateBatch(session_id, work):<br /> batcher_key = ndb.Key(TaskBatcher, session_id)<br /> batcher = TaskBatcher(key=batcher_key)<br /> batcher.put()</pre>After doing this, we loop through all the 3-tuples in the passed in batch of <span style="color: lime; font-family: Courier New, Courier, monospace;">work</span>. For each unit of work, we create a task using the batcher as parent and then call the <span style="color: lime; font-family: Courier New, Courier, monospace;">Populate</span> method on the task using the method, positional arguments and keyword arguments provided in the unit of work.<br /><pre class="prettyprint" style="background-color: white;"> for method, args, kwargs in work:<br /> task = BatchTask(parent=batcher_key)<br /> task.Populate(method, *args, **kwargs)</pre>Finally, to signal that all tasks in the batch have been added, we call the <span style="color: lime; font-family: Courier New, Courier, monospace;">Ready</span> method on the batch parent:<br /><pre class="prettyprint" style="background-color: white;"> batcher.Ready()</pre><b>Note:</b> <i>This approach can cause performance issues as the number of tasks grows, since contentious puts within the entity group can cause task completions to stall or retry. I (or my colleagues) will be following up with two posts on the following topics: </i><br /><ul><i><li>using task tagging and pull queues to achieve a similar result, but reducing contention</li><li>exploring ways to extend this model to a hierarchical model where tasks may have subtasks</li></i></ul><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2012/09/last-to-cross-finish-line-part-three.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-1785625669927912191Wed, 29 Aug 2012 20:34:00 +00002012-11-20T15:37:00.039-08:00AppEngineDeferred LibraryGoogle App EngineGoogle CodesiteJavascriptjQueryPythonTask Queue APILast to Cross the Finish Line: Part TwoRecently, my colleague <a href="https://plus.google.com/115640166224745944209" target="_blank">+Fred Sauer</a> and I gave a tech talk called "Last Across the Finish Line: Asynchronous <a href="https://developers.google.com/appengine/docs/python/taskqueue/overview" target="_blank">Tasks</a> with <a href="https://appengine.google.com/" target="_blank">App Engine</a>". This is part two in a three part series where I will share our <a href="http://www.forbes.com/pictures/ekij45gdh/learnings/#gallerycontent" target="_blank">learnings</a> and give some helpful references to the <a href="https://developers.google.com/appengine/docs/" target="_blank">App Engine documentation</a>.<br /><br />Check out the <a href="http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-one.html" target="_blank">previous post</a> if you haven't already. In this section, we'll cover the two <a href="https://developers.google.com/appengine/docs/python/tools/webapp/running" target="_blank">WSGI handlers</a> in <a href="http://code.google.com/p/gae-last-across-the-finish-line/source/browse/main.py" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">main.py</span></a> serving requests for our application and the client side code that communicates with our application.<br /><h2><span style="font-size: large;">Imports</span></h2>Before defining the handlers, let's first review the imports:<br /><pre class="prettyprint" style="background-color: white;">import json<br /><br />from google.appengine.api import channel<br />from google.appengine.api import users<br />from google.appengine.ext.webapp.util import login_required<br />import webapp2<br />from webapp2_extras import jinja2<br /><br />from display import RandomRowColumnOrdering<br />from display import SendColor<br />from models import PopulateBatch<br /></pre>We import <a href="http://docs.python.org/library/json.html" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">json</span></a> for serialization of messages. Specific to App Engine, we import <span style="color: lime; font-family: Courier New, Courier, monospace;">channel</span> to use the <a href="https://developers.google.com/appengine/docs/python/channel/" target="_blank">Channel API</a>, <a href="https://developers.google.com/appengine/docs/python/users/" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">users</span></a> and <a href="https://developers.google.com/appengine/docs/python/tools/webapp/utilmodule" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">login_required</span></a> for authenticating users within a request, <a href="https://developers.google.com/appengine/docs/python/gettingstartedpython27/usingwebapp" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">webapp2</span></a> for creating <a href="http://webapp-improved.appspot.com/guide/app.html" target="_blank">WSGI Handlers</a> and <a href="https://developers.google.com/appengine/docs/python/gettingstartedpython27/templates" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">jinja2</span></a> for templating.<br /><br />Finally, we import four functions from the two other modules defined within our project. From the <a href="http://code.google.com/p/gae-last-across-the-finish-line/source/browse/display.py" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">display</span></a> module, we import the <span style="color: lime; font-family: Courier New, Courier, monospace;">SendColor</span> function that we explored in part one and the <span style="color: lime; font-family: Courier New, Courier, monospace;">RandomRowColumnOrdering</span> function, which generates all possible row, column pairs in a random order. From the as of yet undiscussed <a href="http://code.google.com/p/gae-last-across-the-finish-line/source/browse/models.py" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">models</span></a> module we import the <span style="color: lime; font-family: Courier New, Courier, monospace;">PopulateBatch</span> function, which takes a session ID and a batch of work to be done and spawns workers to carry out the batch of work.<br /><h2><span style="font-size: large;">Handlers</span></h2>This module defines two handlers: the main page for the user interface and an <a href="http://en.wikipedia.org/wiki/Ajax_(programming)" target="_blank">AJAX</a> handler which will begin spawning the workers.<br /><br />For the main page we use <span style="color: lime; font-family: Courier New, Courier, monospace;">jinja2</span> templates to render from the template <a href="http://code.google.com/p/gae-last-across-the-finish-line/source/browse/templates/main.html" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">main.html</span></a> in the <span style="color: lime; font-family: Courier New, Courier, monospace;">templates</span> folder:<br /><pre class="prettyprint" style="background-color: white;">class MainPage(webapp2.RequestHandler):<br /> def RenderResponse(self, template, **context):<br /> jinja2_renderer = jinja2.get_jinja2(app=self.app)<br /> rendered_value = jinja2_renderer.render_template(template, **context)<br /> self.response.write(rendered_value)<br /> @login_required<br /> def get(self):<br /> user_id = users.get_current_user().user_id()<br /> token = channel.create_channel(user_id)<br /> self.RenderResponse('main.html', token=token, table_id='pixels',<br /> rows=8, columns=8)</pre>In <span style="color: lime; font-family: Courier New, Courier, monospace;">get</span> — the actual handler serving the <a href="http://en.wikipedia.org/wiki/GET_(HTTP)#Request_methods" target="_blank">GET</a> request from the browser — we use the <span style="color: lime; font-family: Courier New, Courier, monospace;">login_required</span> decorator to make sure the user is signed in, and then create a channel for message passing using the ID of the signed in user. The template takes an HTML ID, rows and columns to create an HTML table as the "quilt" that the user will see. We pass the created token for the channel, an HTML ID for the table and the rows and columns to the template by simply specifying them as keyword arguments.<br /><br />For the handler which will spawn the workers, we use <span style="color: lime; font-family: Courier New, Courier, monospace;">RandomRowColumnOrdering</span> to generate row, column pairs. Using each pair along with the <span style="color: lime; font-family: Courier New, Courier, monospace;">SendColor</span> function and the user ID (as a proxy for session ID) for message passing, we add a unit of work to the batch<br /><pre class="prettyprint" style="background-color: white;">class BeginWork(webapp2.RequestHandler):<br /> # Can't use login_required decorator here because it is not<br /> # supported for <a href="http://en.wikipedia.org/wiki/POST_(HTTP)#Request_methods" target="_blank">POST</a> requests<br /> def post(self):<br /> response = {'batch_populated': False}<br /> try:<br /> # Will raise an AttributeError if no current user<br /> user_id = users.get_current_user().user_id()<br /> # TODO: return 400 if not logged in <br /> work = []<br /> for row, column in RandomRowColumnOrdering(8, 8):<br /> args = (row, column, user_id)<br /> work.append((SendColor, args, {})) # No keyword args<br /> PopulateBatch(user_id, work)<br /> response['batch_populated'] = True<br /> except:<br /> # TODO: Consider logging traceback.format_exception(*sys.exc_info()) here<br /> pass<br /> self.response.write(json.dumps(response))</pre>Finally, for routing applications within our app, we define:<br /><pre class="prettyprint" style="background-color: white;">app = webapp2.WSGIApplication([('/begin-work', BeginWork),<br /> ('/', MainPage)],<br /> debug=True)</pre>and specify <br /><pre class="prettyprint" style="background-color: white;">handlers:<br />- url: /.*<br /> script: main.app</pre>in <span style="color: lime; font-family: Courier New, Courier, monospace;">app.yaml</span>; to use WSGI apps, the App Engine runtime must be <span style="color: lime; font-family: Courier New, Courier, monospace;">python27</span>. <br /><h2><span style="font-size: large;">Client Side Javascript and jQuery</span></h2>In the template <a href="http://code.google.com/p/gae-last-across-the-finish-line/source/browse/templates/main.html" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">main.html</span></a> we use <a href="http://jquery.com/" target="_blank">jQuery</a> to make AJAX requests and manage the CSS for each square in our "quilt". We also define some other Javascript functions for interacting with the App Engine Channel API. In the HTML <span style="color: lime; font-family: Courier New, Courier, monospace;"><head></span> element we load the <a href="https://developers.google.com/appengine/docs/python/channel/javascript" target="_blank">Channel Javascript API</a>, and in the <span style="color: lime; font-family: Courier New, Courier, monospace;"><body></span> element we open a channel using the <span style="color: lime; font-family: Courier New, Courier, monospace;">{{ token }}</span> passed in to the template:<br /><pre class="prettyprint" style="background-color: white;"><head><br /> <script src="/_ah/channel/jsapi"></script><br /></head><br /><body><br /> <script type="text/javascript"><br /> channel = new goog.appengine.Channel('{{ token }}');<br /> socket = channel.open();<br /> socket.onerror = function() { console.log('Socket error'); };<br /> socket.onclose = function() { console.log('Socket closed'); };<br /> </script><br /></body></pre>In addition to <span style="color: lime; font-family: Courier New, Courier, monospace;">onerror</span> and <span style="color: lime; font-family: Courier New, Courier, monospace;">onclose</span>, we define more complex functions for the <span style="color: lime; font-family: Courier New, Courier, monospace;">onopen</span> and <span style="color: lime; font-family: Courier New, Courier, monospace;">onmessage</span> callbacks.<br /><br />First, when the socket has been opened, we send a POST request to <span style="color: lime; font-family: Courier New, Courier, monospace;">/begin-work</span> to signal that the channel is ready for communication. If the response indicates that the batch of workers has been initialized successfully, we call a method <span style="color: lime; font-family: Courier New, Courier, monospace;">setStatus</span> which will reveal the progress spinner:<br /><pre class="prettyprint" style="background-color: white;">socket.onopen = function() {<br /> $.post('/begin-work', function(data) {<br /> var response = JSON.parse(data);<br /> if (response.batch_populated) {<br /> setStatus('Loading began');<br /> }<br /> });<br />}</pre>As we defined in part one, each <span style="color: lime; font-family: Courier New, Courier, monospace;">SendColor</span> worker sends back a <a href="https://developers.google.com/appengine/docs/python/channel/overview#Life_of_a_Typical_Channel_Message" target="_blank">message</a> along the channel representing a row, column pair and a color. On message receipt, we use these messages to set the background color of the corresponding square to the color provided:<br /><pre class="prettyprint" style="background-color: white;">socket.onmessage = function(msg) {<br /> var response = JSON.parse(msg.data);<br /> var squareIndex = 8*response.row + response.column;<br /> var squareId = '#square' + squareIndex.toString();<br /> $(squareId).css('background-color', response.color);<br />}</pre>As you can see from <span style="color: lime; font-family: Courier New, Courier, monospace;">squareId</span>, each square in the table generated by the template has an HMTL ID so we can specifically target it.<br /><h2><span style="font-size: large;">Next...</span></h2>In the <a href="http://blog.bossylobster.com/2012/09/last-to-cross-finish-line-part-three.html" target="_blank">final post</a>, we'll define the <span style="color: lime; font-family: Courier New, Courier, monospace;">PopulateBatch</span> function and explore the <a href="https://developers.google.com/appengine/docs/python/ndb/" target="_blank">ndb models</a> and <a href="https://developers.google.com/appengine/docs/python/taskqueue/" target="_blank">Task Queue</a> operations that make it work.<a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-two.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-6513171994227710563Mon, 27 Aug 2012 19:53:00 +00002012-11-20T15:38:11.322-08:00AppEngineDeferred LibraryGoogle App EngineGoogle CodesiteJavascriptPythonTask Queue APILast to Cross the Finish Line: Part OneRecently, my colleague <a href="https://plus.google.com/115640166224745944209" target="_blank">+Fred Sauer</a> and I gave a tech talk called "Last Across the Finish Line: Asynchronous <a href="https://developers.google.com/appengine/docs/python/taskqueue/overview" target="_blank">Tasks</a> with <a href="https://appengine.google.com/" target="_blank">App Engine</a>". This is part one in a three part series where I will share our <a href="http://www.forbes.com/pictures/ekij45gdh/learnings/#gallerycontent" target="_blank">learnings</a> and give some helpful references to the <a href="https://developers.google.com/appengine/docs/" target="_blank">App Engine documentation</a>.<br /><h2><span style="font-size: large;">Intro</span></h2>Before I dive in, a quick overview of our approach:<br /><ul><li>"Fan out; Fan in" First spread tasks over independent workers; then gather the results back together</li><li>Use task queues to perform background work in parallel <ul><li>Tasks have built-in retries</li><li>Can respond quickly to the client, making UI more responsive</li></ul></li><li style="margin-top: -12px;">Operate asynchronously when individual tasks can be executed independently, hence can be run concurrently <ul><li>If tasks are too work intensive to run synchronously, (attempt to) break work into small independent pieces</li></ul></li><li style="margin-top: -12px;">Break work into smaller tasks, for example: <ul><li>rendering media (sounds, images, video)</li><li>retrieving and parsing data from an external service (Google Drive, Cloud Storage, GitHub, ...)</li></ul></li><li style="margin-top: -12px;">Keep track of all workers; notify client when work is complete</li></ul>Before talking about the sample, let's check it out in action: <br /><div style="text-align: center;"><iframe allowfullscreen="allowfullscreen" frameborder="0" height="360" src="https://www.youtube.com/embed/tEDDVmgN-iU" width="640"></iframe></div>We are randomly generating a color in a worker and sending it back to the client to fill in a square in the "quilt". (Thanks to <a href="https://plus.google.com/103073491679741548297" target="_blank">+Iein Valdez</a> for this term.) In this example, think of each square as a (most likely more complex) compute task.<br /><h2><span style="font-size: large;">Application Overview</span></h2>The <a href="https://github.com/GoogleCloudPlatform/appengine-last-across-the-finish-line-python" target="_blank">application</a> has a simple structure: <br /><pre class="prettyprint" style="background-color: white;">gae-last-across-the-finish-line/<br />|-- app.yaml<br />|-- display.py<br />|-- main.py<br />|-- models.py<br />+-- templates/<br /> +-- main.html</pre>We'll inspect each of the Python modules <span style="color: lime; font-family: Courier New, Courier, monospace;">display.py</span>, <span style="color: lime; font-family: Courier New, Courier, monospace;">main.py</span> and <span style="color: lime; font-family: Courier New, Courier, monospace;">models.py</span> individually and explore how they interact with one another. In addition to this, we'll briefly inspect the HTML and Javascript contained in the template <span style="color: lime; font-family: Courier New, Courier, monospace;">main.html</span>, to understand how the workers pass messages back to the client.<br /><br />In this post, I will explain the actual background work we did and briefly touch on the methods for communicating with the client, but won't get into client side code or the generic code for running the workers and watching them all as they cross the finish line. In the second post, we’ll examine the client side code and in the third, we’ll discuss the models that orchestrate the work.<br /><h2><span style="font-size: large;">Workers</span></h2>These worker methods are defined in <a href="http://code.google.com/p/gae-last-across-the-finish-line/source/browse/display.py" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">display.py</span></a>. To generate the random colors, we simply choose a hexadecimal digit six different times and throw a <span style="color: lime; font-family: Courier New, Courier, monospace;">#</span> on at the beginning:<br /><pre class="prettyprint" style="background-color: white;">import random<br /><br />HEX_DIGITS = '0123456789ABCDEF'<br /><br />def RandHexColor(length=6):<br /> result = [random.choice(HEX_DIGITS) for _ in range(length)]<br /> return '#' + ''.join(result)</pre>With <span style="color: lime; font-family: Courier New, Courier, monospace;">RandHexColor</span> in hand, we define a worker that will take a row and column to be colored and a session ID that will identify the client requesting the work. This worker will generate a random color and then send it to the specified client along with the row and column. To pass messages to the client, we use the <a href="https://developers.google.com/appengine/docs/python/channel/" target="_blank">Channel API</a> and serialize our messages using the <a href="http://docs.python.org/library/json.html" target="_blank"><span style="color: lime; font-family: Courier New, Courier, monospace;">json</span></a> library in Python.<br /><pre class="prettyprint" style="background-color: white;">import json<br />from google.appengine.api import channel<br /><br />def SendColor(row, column, session_id):<br /> color = RandHexColor(length=6)<br /> color_dict = {'row': row, 'column': column, 'color': color}<br /> channel.send_message(session_id, json.dumps(color_dict))</pre><h2><span style="font-size: large;">Next...</span></h2>In the <a href="http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-two.html" target="_blank">next post</a>, we'll explore the <a href="https://developers.google.com/appengine/docs/python/tools/webapp/running" target="_blank">WSGI handlers</a> that run the application and the client side code that handles the messages from the workers. <a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2012/08/last-to-cross-finish-line-part-one.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-3795861107212585506Sun, 19 Aug 2012 22:56:00 +00002012-09-10T08:00:30.067-07:00AppEngineDecoratorDeferred LibraryEnvironment VariablesGoogle App EnginePythonTask Queue APIA Decorator for App Engine Deferred TasksI happen to be a big fan of the <a href="https://developers.google.com/appengine/articles/deferred" target="_blank">deferred library</a> for both Python runtimes in <a href="https://developers.google.com/appengine/" target="_blank">Google App Engine</a>. If an application needs to queue up work, breaking the work into easy to understand units by writing worker methods and then deferring the work into tasks is a breeze using the deferred library. For the majority of cases, if fine grained control over the method of execution is not needed, using the deferred library is a great (and in my opinion, the correct) abstraction.<br /><br />Maybe I am just biased because I have made a few <a href="http://blog.bossylobster.com/2012/03/where-have-i-been.html" target="_blank">changes</a> to the deferred library over the past few months? One such change I made added a <a href="http://code.google.com/p/googleappengine/issues/detail?id=6412" target="_blank">feature</a> that allows a task to fail once without having an impact on subsequent retries; this can be accomplished by raising a <span style="color: lime; font-family: Courier New, Courier, monospace;">SingularTaskFailure</span>. Over this weekend, I found that I wanted to use this feature for a special type* of worker. Since I wanted to utilize this unique exception, I wanted to make sure that this worker <b><i>only</i></b> ran in a deferred task.<br /><br />Initially I thought I was lost, since any <a href="http://docs.python.org/library/pickle.html" target="_blank">pickled</a> method wouldn't directly have access to the <a href="https://developers.google.com/appengine/docs/python/taskqueue/overview-push#Task_Request_Headers" target="_blank">task queue specific headers</a> from the request. But luckily, many of these headers persist as <a href="http://en.wikipedia.org/wiki/Environment_variable" target="_blank">environment variables</a>, so can be accessed via <span style="color: lime; font-family: Courier New, Courier, monospace;">os.environ</span> or <span style="color: lime; font-family: Courier New, Courier, monospace;">os.getenv</span>, yippee! Being a good little (Python) boy, I decided to abstract this requirement into a <a href="http://stackoverflow.com/questions/739654/understanding-python-decorators#1594484" target="_blank">decorator</a> and let the function do it's own work in peace.<br /><br />Upon realizing the usefulness of such a decorator, I decided to write about it, so here it is:<br /><pre class="prettyprint" style="background-color: white;">import functools<br />import os<br /><br />from google.appengine.ext.deferred import defer<br />from google.appengine.ext.deferred.deferred import _DEFAULT_QUEUE as DEFAULT_QUEUE<br />from google.appengine.ext.deferred.deferred import _DEFAULT_URL as DEFERRED_URL<br /><br />QUEUE_KEY = 'HTTP_X_APPENGINE_QUEUENAME'<br />URL_KEY = 'PATH_INFO'<br /><br />def DeferredWorkerDecorator(method):<br /> @functools.wraps(method)<br /> def DeferredOnlyMethod(*args, **kwargs):<br /> path_info = os.environ.get(URL_KEY, '')<br /> if path_info != DEFERRED_URL:<br /> raise EnvironmentError('Wrong path of execution: {}'.format(path_info))<br /> queue_name = os.environ.get(QUEUE_KEY, '')<br /> if queue_name != DEFAULT_QUEUE:<br /> raise EnvironmentError('Wrong queue name: {}'.format(queue_name))<br /><br /> return method(*args, **kwargs)<br /><br /> return DeferredOnlyMethod<br /></pre>This decorator first checks if the environment variable <span style="color: lime; font-family: Courier New, Courier, monospace;">PATH_INFO</span> is set to the default value for the deferred queue: <span style="color: lime; font-family: Courier New, Courier, monospace;">/_ah/queue/deferred</span>. If this is not the case (or if the environment variable is not set), an <span style="color: lime; font-family: Courier New, Courier, monospace;">EnvironmentError</span> is raised. Then the environment variable <span style="color: lime; font-family: Courier New, Courier, monospace;">HTTP_X_APPENGINE_QUEUENAME</span> is checked against the name of the default queue: <span style="color: lime; font-family: Courier New, Courier, monospace;">default</span>. Again, if this is incorrect or unset, an <span style="color: lime; font-family: Courier New, Courier, monospace;">EnvironmentError</span> is raised. If both these checks pass, the decorated method is called with its arguments and the value is returned.<br /><br />To use this decorator:<br /><pre class="prettyprint" style="background-color: white;">import time<br /><br />from google.appengine.ext.deferred import SingularTaskFailure<br /><br />@DeferredWorkerDecorator<br />def WorkerMethod():<br /> if too_busy():<br /> time.sleep(30)<br /> raise SingularTaskFailure<br /><br /> # do work<br /><br />WorkerMethod() # This will fail with an EnvironmentError<br />defer(WorkerMethod) # This will perform the work, but in it's own task</pre>In case you want to extend this, here is a more "complete" list of some helpful values that you may be able to retrieve from environment variables: <br /><pre class="prettyprint" style="background-color: white;">HTTP_X_APPENGINE_TASKRETRYCOUNT<br />HTTP_X_APPENGINE_QUEUENAME<br />HTTP_X_APPENGINE_TASKNAME<br />HTTP_X_APPENGINE_TASKEXECUTIONCOUNT<br />HTTP_X_APPENGINE_TASKETA<br />HTTP_X_APPENGINE_COUNTRY<br />HTTP_X_APPENGINE_CURRENT_NAMESPACE<br />PATH_INFO</pre><br />*<b>Specialized Worker</b>: <i>I had two different reasons to raise a </i><span style="color: lime; font-family: Courier New, Courier, monospace;">SingularTaskFailure</span><i> in my worker. First, I was polling for resources that may not have been online, so wanted the task to sleep and then restart (after raising the one time failure). Second, I was using a special sentinel in the datastore to determine if the current user had any other job in progress. Again, I wanted to sleep and try again until the current user's other job had completed.</i>http://blog.bossylobster.com/2012/08/a-decorator-for-appengine-deferred-tasks.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-6895780823757405955Fri, 18 May 2012 02:45:00 +00002012-05-17T20:37:59.210-07:00AnalysisContinued FractionsMathPiTangentLife of π: Continued Fractions and Infinite SeriesThis is from a talk I gave to the <a href="http://www.math.ucsc.edu/">UC Santa Cruz Math</a> Club back in February. I have had the slides <a href="http://brucefong.files.wordpress.com/2008/09/cluttered_desk.jpg">cluttering my desk</a> since I gave the talk as a reminder of putting them up on the web, and today I finally cleaned them off! <br /><br />An interested tidbit about these slides: I gave the <a href="http://www.math.lsa.umich.edu/mathclub/fall2008/103008.pdf">same talk</a> one other time as an <a href="http://www.math.lsa.umich.edu/mathclub/">undergraduate</a>. As an undergraduate I gave the talk the day after a root canal. This February (at UCSC), one day after the talk I also had a root canal. What's the moral of the story? Don't give this talk. <br /><br />No seriously, don't give this talk. Though the fact is cool and the math is not too bad to grasp, the stuff from slides 106 to 113 goes way over an audience's collective head just because they have <a href="http://ellay2013.files.wordpress.com/2009/09/sleeping_in_class.jpg">stopped paying attention</a> at that point. Too many variables! <br /><br /><center><div style="width:680px" id="__ss_12977566"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/bossylobster/life-of-continued-fractions-and-infinite-series" title="Life of π: Continued Fractions and Infinite Series">Life of π: Continued Fractions and Infinite Series</a></strong><object id="__sse12977566" width="680" height="568"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=beamerpipresentation-120517213119-phpapp01&stripped_title=life-of-continued-fractions-and-infinite-series&userName=bossylobster" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><param name="wmode" value="transparent"/><embed name="__sse12977566" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=beamerpipresentation-120517213119-phpapp01&stripped_title=life-of-continued-fractions-and-infinite-series&userName=bossylobster" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" wmode="transparent" width="680" height="568"></embed></object><div style="padding:5px 0 12px">View more <a href="http://www.slideshare.net/">presentations</a> from <a href="http://www.slideshare.net/bossylobster">bossylobster</a>.</div></div></center><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2012/05/life-of-continued-fractions-and.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-5262988849454237220Wed, 16 May 2012 02:43:00 +00002012-05-15T19:44:04.274-07:00FinanceInterest RateMortgageNewton-Raphson MethodNumerical AnalysisPythonReverse Calculating An Interest RateI was recently playing around with some loan data and only happened to have the term (or length, or duration) of the loan, the amount of the recurring payment (in this case monthly) and the remaining principal owed on the loan. I figured there was an easy way to get at the interest rate, but wasn't sure how. After some badgering from my coworker <a href="https://plus.google.com/104679465567407024302">+Paul</a>, I searched the web and found a <a href="http://www.calcamo.net/loancalculator/quickcalculations/loan-rate.php5">tool</a> from <a href="http://www.calcamo.net/">CALCAmo</a> (a site just for calculating amortizations).<br /><br />Problem solved, right? Wrong. I wanted to know why; I had to <a href="http://t.qkme.me/35co7h.jpg">go deeper</a>. So I did a bit of math and a bit of programming and I was where I needed to be. I'll break the following down into parts before going on full steam.<br /><ul><li>Break down the amortization schedule in terms of the variables we have and the one we want</li><li>Determine a function we want to find zeroes of</li><li>Write some code to implement the Newton-Raphson method</li><li>Utilize the Newton-Raphson code to find an interest rate</li><li>Bonus: Analyze the function to make sure we are right</li></ul><b><span style="font-size: large;">Step I: Break Down the <a href="http://en.wikipedia.org/wiki/Amortization_schedule">Amortization Schedule</a></span></b><br /><br />We can do this using the series \(\left\{P_i\right\}_i\) of principal owed, which varies over time and will go to zero once paid off. In this series, \(P_0\) is the principal owed currently and \(P_i\) is the principal owed after \(i\) payments have been made. (Assuming monthly payments, this will be after \(i\) months.) If the term is \(T\) periods, then we have \(P_T = 0\).<br /><br />We have already introduced the term (\(T\)); we also need the value of the recurring (again, usually monthly) payment \(R\), the interest rate \(r\) and the initial principal owed \(P_0 = P\).<br /><br /><b>Time-Relationship between Principal Values</b><br /><br />If after \(i\) periods, \(P_i\) is owed, then after one period has elapsed, we will owe \(P_i \cdot m\) where \(m = m(r)\) is some multiplier based on the length of the term. For example if each period is one month, then we divide our rate by \(12\) for the interest and add \(1\) to note that we are adding to existing principal: \[m(r) = 1 + \frac{r}{12}.\] In addition to the interest, we will have paid off \(R\) hence \[P_{i + 1} = P_i \cdot m - R.\] <b>Formula for \(P_i\)</b><br /><br />Using this, we can actually determine \(P_i\) strictly in terms of \(m, R\) and \(P\). First, note that \[P_2 = P_1 \cdot m - R = (P_0 \cdot m - R) \cdot m - R = P \cdot m^2 - R(m + 1)\] since \(P_0 = P\). We can show inductively that \[P_i = P \cdot m^i - R \cdot \sum_{j = 0}^{i - 1} m^j.\] We already have the base case \(i = 1\), by definition. Assuming it holds for \(i\), we see that \[P_{i + 1} = P_i \cdot m - R = m \cdot \left(P \cdot m^i - R \cdot \sum_{j = 0}^{i - 1} m^j\right) - R = P \cdot m^{i + 1} - R \cdot \sum_{j = 1}^{i} m^j - R,\] and our induction is complete. (We bump the index \(j\) since we are multiplying each \(m^j\) by \(m\).)<br />Each term in the series is related to the previous one (except \(P_0\), since time can't be negative in this case). <br /><br /><b><span style="font-size: large;">Step II: Determine a Function we want to find Zeroes of</span></b><br /><br />Since we know \(P_T = 0\) and \(P_T = P \cdot m^T - R \cdot \sum_{j = 0}^{T - 1} m^j\), we actually have a polynomial in place that will let us solve for \(m\) and in so doing, solve for \(r\).<br /><br />To make our lives a tad easier, we'll do some rearranging. First, note that \[\sum_{j = 0}^{T - 1} m^j = m^{T - 1} + \cdots + m + 1 = \frac{m^T - 1}{m - 1}.\] We calculate this sum of a geometric series here, but I'll just refer you to the <a href="http://en.wikipedia.org/wiki/Geometric_series">Wikipedia page</a> instead. With this reduction we want to solve \[0 = P \cdot m^T - R \cdot \frac{m^T - 1}{m - 1} \Longleftrightarrow P \cdot m^T \cdot (m - 1) = R \cdot (m^T - 1).\] With that, we have accomplished Step II, we have found a function (parameterized by \(P, T\) and \(R\) which we can use zeroes from to find our interest rate: \[f_{P, T, R}(m) = P \cdot m^T \cdot (m - 1) - R \cdot (m^T - 1) = P \cdot m^{T + 1} - (P + R) \cdot m^T + R.\] <b><span style="font-size: large;">Step III: Write some code to implement the <a href="http://en.wikipedia.org/wiki/Newton's_method">Newton-Raphson method</a></span></b><br /><br />We use the Newton-Raphson method to get super-duper-close to a zero of the function. For in-depth coverage, see the Wikipedia page on the Newton-Raphson method, but I'll give some cursory coverage below. The methods used to show that a fixed point is found are not necessary for the intuition behind the method.<br /><br /><b>Intuition behind the method</b><br /><br />For the intuition, assume we know (and can compute) a function \(f\), its derivative \(f'\) and a value \(x\). Assume there is some zero \(y\) nearby \(x\). Since they are close, we can approximate the slope of the line between the points \((x, f(x)\) and \((y, f(y)\) with the derivative nearby. Since we know \(x\), we use \(f'(x)\) and intuit that \[f'(x) = \text{slope} = \frac{f(y) - f(x)}{y - x} \Rightarrow y - x = \frac{f(y) - f(x)}{f'(x)}.\] But, since we know that \(y\) is a zero, \(f(y) - f(x) = -f(x)\) hence \[y - x = \frac{-f(x)}{f'(x)} \Rightarrow y = x - \frac{f(x)}{f'(x)}.\] Using this method, one can start with a given value \(x_0 = x\) and compute better and better approximations of a zero via the iteration above that determines \(y\). We use a sequence to do so: \[x_{i + 1} = x_i - \frac{f(x_i)}{f'(x_i)}\] and stop calculating the \(x_i\) either after \(f(x_i)\) is below a preset threshold or after the fineness of the approximation \(\left|x_i - x_{i + 1}\right|\) goes below a (likely different) preset threshold. Again, there is much that can be said about these approximations, but we are trying to accomplish things today, not theorize.<br /><br /><b>Programming Newton-Raphson</b><br /><br />To perform Newton-Raphson, we'll implement a Python function that takes the initial guess (\(x_0\)) and the functions \(f\) and \(f'\). We'll also (arbitrarily) stop after the value \(f(x_i)\) drops below \(10^{-8}\) in absolute value.<br /><pre class="prettyprint" style="background-color: white;">def newton_raphson_method(guess, f, f_prime):<br /> def next_value(value):<br /> return value - f(value)*1.0/f_prime(value)<br /><br /> current = guess<br /> while abs(f(current)) > 10**(-8):<br /> current = next_value(current)<br /><br /> return current</pre>As you can see, once we have <span style="color: lime; font-family: 'Courier New', Courier, monospace;">f</span> and <span style="color: lime; font-family: 'Courier New', Courier, monospace;">f_prime</span>, everything else is easy because all the work in calculating the next value (via <span style="color: lime; font-family: 'Courier New', Courier, monospace;">next_value</span>) is done by the functions.<br /><br /><b><span style="font-size: large;">Step IV: Utilize the Newton-Raphson code to find an Interest Rate</span></b><br /><br />We first need to implement \(f_{P, T, R}(m) = P \cdot m^{T + 1} - (P + R) \cdot m^T + R\) and \(f'_{P, T, R}\) in Python. Before doing so, we do a simple derivative calculation: \[f_{P, T, R}'(m) = P \cdot (T + 1) \cdot m^T - (P + R) \cdot T \cdot m^{T - 1}.\] With these <a href="http://dictionary.reference.com/browse/formulae">formulae</a> in hand, we write a function which will spit out the corresponding <span style="color: lime; font-family: 'Courier New', Courier, monospace;">f</span> and <span style="color: lime; font-family: 'Courier New', Courier, monospace;">f_prime</span> given the parameters \(P\) (<span style="color: lime; font-family: 'Courier New', Courier, monospace;">principal</span>), \(T\) (<span style="color: lime; font-family: 'Courier New', Courier, monospace;">term</span>) and \(R\) (<span style="color: lime; font-family: 'Courier New', Courier, monospace;">payment</span>):<br /><pre class="prettyprint" style="background-color: white;">def generate_polynomials(principal, term, payment):<br /> def f(m):<br /> return (principal*(m**(term + 1)) - (principal + payment)*(m**term) +<br /> payment)<br /><br /> def f_prime(m):<br /> return (principal*(term + 1)*(m**term) -<br /> (principal + payment)*term*(m**(term - 1)))<br /><br /> return (f, f_prime)</pre>Note that these functions only take a single argument (<span style="color: lime; font-family: 'Courier New', Courier, monospace;">m</span>), but we are able to use the other parameters from the parent scope beyond the life of the call to <span style="color: lime; font-family: 'Courier New', Courier, monospace;">generate_polynomials</span> due to <a href="http://en.wikipedia.org/wiki/Closure_(computer_science)">closure</a> in Python.<br /><br />In order to solve, we need an initial <span style="color: lime; font-family: 'Courier New', Courier, monospace;">guess</span>, but we need to know the relationship between \(m\) and \(r\) before we can determine what sort of <span style="color: lime; font-family: 'Courier New', Courier, monospace;">guess</span> makes sense. In addition, once a value for \(m\) is returned from Newton-Raphson, we need to be able to turn it into an \(r\) value so functions <span style="color: lime; font-family: 'Courier New', Courier, monospace;">m</span> and <span style="color: lime; font-family: 'Courier New', Courier, monospace;">m_inverse</span> should be implemented. For our dummy case here, we'll assume monthly payments (and compounding):<br /><pre class="prettyprint" style="background-color: white;">def m(r):<br /> return 1 + r/12.0<br /><br />def m_inverse(m_value):<br /> return 12.0*(m_value - 1)</pre>Using these, and assuming that an interest rate of <b>10%</b> is a good guess, we can put all the pieces together:<br /><pre class="prettyprint" style="background-color: white;">def solve_for_interest_rate(principal, term, payment, m, m_inverse):<br /> f, f_prime = generate_polynomials(principal, term, payment)<br /><br /> guess_m = m(0.10) # ten percent as a decimal<br /> m_value = newton_raphson_method(guess_m, f, f_prime)<br /> return m_inverse(m_value)</pre>To check that this makes sense, let's plug in some values. Using the <a href="http://www.bankrate.com/calculators/mortgages/mortgage-calculator.aspx">bankrate.com loan calculator</a>, if we have a 30-year loan (with \(12 \cdot 30 = 360\) months of payments) of $100,000 with an interest rate of 7%, the monthly payment would be $665.30. Plugging this into our pipeline:<br /><pre class="prettyprint" style="background-color: white;">>>> principal = 100000<br />>>> term = 360<br />>>> payment = 665.30<br />>>> solve_for_interest_rate(principal, term, payment, m, m_inverse)<br />0.0699996284703</pre>And we see the rate of 7% is approximated quite well!<br /><br /><b><span style="font-size: large;">Bonus: Analyze the function to make sure we are right</span></b><br /><br />Coming soon. We will analyze the derivate and concavity to make sure that our guess yield the correct (and unique) zero. <a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2012/05/reverse-calculating-interest-rate.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-8522018487491594240Sat, 07 Apr 2012 20:40:00 +00002012-05-15T17:33:06.370-07:00DNSDNS LookupDomain Name SystemHosts fileInternet ProtocolInternet Service ProviderIP AddressISPMacbooknytimes.compeople.comPractical JokePrankUNIXSilly Pranks on your Friends<span style="font-size: large;">Disclaimer: These are silly little pranks, but I don't encourage messing with someone's computing environment without letting them know you have done so.</span><br /><br /><b>First Prank:</b><br /><br />I have a friend who really likes to read <a href="http://people.com/">people.com</a>, so I figured I would "enrich" her life a bit with another source of daily news :)<br /><br />I decided to play around with her <a href="http://en.wikipedia.org/wiki/Hosts_(file)#Purpose">hosts file</a>, so that when she visited <a href="http://people.com/">people.com</a>, she really got the <a href="http://nytimes.com/">New York Times</a> (the realest news I could think of at that time, though there are plenty of fine candidates).<br /><br />To quote the Wikipedia article on hosts files:<br /><blockquote class="tr_bq">"The hosts file...assists in addressing network nodes in a computer network. It is a common part of an operating system's Internet Protocol (IP) implementation, and serves the function of translating human-friendly hostnames into numeric protocol addresses, called IP addresses, that identify and locate a host in an IP network."</blockquote>More importantly: "the /etc/hosts file...allows you to add entries that traditionally your computer will look up first before trying your server DNS." (<a href="http://www.justincarmony.com/blog/2011/07/27/mac-os-x-lion-etc-hosts-bugs-and-dns-resolution/">source</a>) This means that even though the <a href="http://en.wikipedia.org/wiki/Domain_Name_System">DNS Lookup</a> provided by her <a href="http://en.wikipedia.org/wiki/Internet_service_provider">ISP</a> could resolve people.com, her browser would get an <a href="http://en.wikipedia.org/wiki/IP_address">IP address</a> from the hosts file first and hence will render the New York Times page for <a href="http://people.com/">people.com</a>.<br /><br />The first step to do this was to find the IP address for the replacement site: <br /><pre class="prettyprint" style="background-color: white;">$ ping www.nytimes.com<br />PING www.nytimes.com (199.239.136.200): 56 data bytes<br />64 bytes from 199.239.136.200: icmp_seq=0 ttl=64 time=0.062 ms<br />64 bytes from 199.239.136.200: icmp_seq=1 ttl=64 time=0.054 ms<br />...<br /></pre>For the second (and final) step, I just needed to add an entry to the hosts file. After <a href="http://en.wikipedia.org/wiki/Hosts_(file)#Location_in_the_file_system">locating</a> the file on her Macbook in <span style="color: lime; font-family: 'Courier New', Courier, monospace;">/etc/hosts</span>, I updated the contents: <br /><pre class="prettyprint" style="background-color: white;">##<br /># Host Database<br />#<br /># localhost is used to configure the loopback interface<br /># when the system is booting. Do not change this entry.<br />##<br />127.0.0.1 localhost<br />255.255.255.255 broadcasthost<br />::1 localhost<br />fe80::1%lo0 localhost<br /><b>199.239.136.200 people.com # New entry</b><br /></pre>Voilà! With that, the prank was complete and the next time she visited <a href="http://people.com/">people.com</a>, the got the contents of <a href="http://nytimes.com/">nytimes.com</a> in her browser.<br /><br /><b>Second Prank coming soon.</b><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2012/04/silly-pranks-on-your-friends.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-2020225366519691581Thu, 29 Mar 2012 01:09:00 +00002012-03-28T18:09:10.993-07:00AppEngineGoogle App EngineGoogle CodesiteOpen SourcePythonStack OverflowWhere have I been?Well, it's been a bit crazy and I haven't written a blog post in ages. I have several brewing, but had just been too busy at work (and a ton of travel for personal fun) to really have the excess time to write.<br /><br />This return post will not have much content but will announce that I'm a big boy now.<br /><br />In the 1.6.3 release of the App Engine SDK (and hence runtime), three nifty changes of mine were included. Two of them even made the <a href="http://code.google.com/p/googleappengine/wiki/SdkReleaseNotes#Version_1.6.3_-_February_28,_2012">Release Notes</a>:<br /><ul><li>Code that inherits from the deferred library's <span style="color: lime; font-family: 'Courier New', Courier, monospace;">TaskHandler</span> can now define custom handling of exceptions.</li><ul><li><a href="http://code.google.com/p/googleappengine/issues/detail?id=6478">http://code.google.com/p/googleappengine/issues/detail?id=6478</a></li></ul><li>Fixed an issue so that a deferred task retries like a push queue task when using the <span style="color: lime; font-family: 'Courier New', Courier, monospace;">SingularTaskFailure</span> exception:</li><ul><li><a href="http://code.google.com/p/googleappengine/issues/detail?id=6412">http://code.google.com/p/googleappengine/issues/detail?id=6412</a></li></ul></ul>In addition, the one that was most confusing to fix didn't make it into any set of Release Notes, but I "closed" the <a href="http://stackoverflow.com/questions/8304854/gql-query-with-key-in-list-of-keys">issue</a> that I originally opened on StackOverflow. Checkout the <a href="http://code.google.com/p/googleappengine/source/diff?spec=svn241&r=241&format=side&path=/trunk/python/google/appengine/ext/gql/__init__.py&old_path=/trunk/python/google/appengine/ext/gql/__init__.py">diff</a> to see the details. I'll follow up with a summary of each of the fixes in a later post. <a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2012/03/where-have-i-been.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-3336994532609298536Wed, 30 Nov 2011 19:35:00 +00002011-11-30T14:51:59.163-08:00AppEngineClass as ObjectDecoratorExceptionGoogle App EngineMetaclassOOPPythonPythonicRequest HandlerA Python Metaclass for "extra bad" errors in Google App EngineSo now here we are, having tried to <a href="http://blog.bossylobster.com/2011/11/handling-errors-in-google-app-engineand.html">handle errors in Google App Engine...and failed</a> all because silly <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">DeadlineExceededError</span> <a href="http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/runtime/__init__.py#32" target="_blank">jumps over</a> <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">Exception</span> in the inheritance chain and goes right for <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">BaseException</span>. How can we catch these in our handlers while staying <a href="http://docs.python.org/glossary.html#term-pythonic" target="_blank">Pythonic*</a>?<br /><br />First and foremost, in the case of a timeout, we need to explicitly catch a DeadlineExceededError. To do so, we can use a <a href="http://stackoverflow.com/questions/739654/understanding-python-decorators#1594484" target="_blank">decorator</a> (hey, that's Pythonic) in each and every handler for each and every HTTP verb. (Again, <a href="http://troll.me/images/war-cat/prepare-yourself-for-war.jpg" target="_blank">prepare yourselves</a>, a bunch of code is about to happen. See the necessary <a href="http://blog.bossylobster.com/2011/11/python-metaclass-for-extra-bad-errors.html#imports">imports</a> at the bottom of the post.)<br /><pre class="prettyprint" style="background-color: white;">def deadline_decorator(method):<br /><br /> def wrapped_method(self, *args, **kwargs):<br /> try:<br /> method(self, *args, **kwargs)<br /> except DeadlineExceededError:<br /> traceback_info = ''.join(format_exception(*sys.exc_info()))<br /> email_admins(traceback_info, defer_now=True)<br /><br /> serve_500(self)<br /><br /> return wrapped_method</pre>Unfortunately, having to manually<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/decorate_all_the_functions.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/decorate_all_the_functions.jpg" /></a></div>is not so Pythonic. At this point I was stuck and wanted to give up, but <a href="https://plus.google.com/u/0/114760865724135687241/posts/GJjXjq5zXhU" target="_blank">asked for some advice</a> on <a href="http://www.google.com/+" target="_blank">G+</a> and actually got what I needed from the all knowing <a href="https://plus.google.com/u/0/118327176775959145936/posts" target="_blank">Ali Afshar</a>. What did I need? <a href="http://stackoverflow.com/questions/100003/what-is-a-metaclass-in-python#6581949" target="_blank">Metaclasses</a>.<br /><br />Before showing the super simple metaclass I wrote, you need to know one thing from StackOverflow user <a href="http://stackoverflow.com/users/9951/kevin-samuel" target="_blank">Kevin Samuel</a>:<br /><blockquote>The main purpose of a metaclass is to change the class automatically, when it's created.</blockquote>With the <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">__new__</span> method, the <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">type</span> object in Python actually constructs a class (which is also an object) by taking into account the name of the class, the parents (or bases) and the class attritubutes. So, we can make a metaclass by subclassing <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">type</span> and overriding <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">__new__</span>:<br /><pre class="prettyprint" style="background-color: white;">class DecorateHttpVerbsMetaclass(type):<br /><br /> def __new__(cls, name, bases, cls_attr):<br /> verbs = ['get', 'post', 'put', 'delete']<br /> for verb in verbs:<br /> if verb in cls_attr and isinstance(cls_attr[verb], function):<br /> cls_attr[verb] = deadline_decorator(cls_attr[verb])<br /><br /> return super(DecorateHttpVerbsMetaclass, cls).__new__(cls, name,<br /> bases, cls_attr)</pre>In <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">DecorateHttpVerbsMetaclass</span>, we look for four (of the nine) HTTP <a href="http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods" target="_blank">verbs</a>, because heck, only seven are supported in <a href="http://code.google.com/appengine/docs/python/tools/webapp/requesthandlerclass.html" target="_blank">RequestHandler</a>, and we're not that crazy. If the class has one of the verbs as an attribute <i><b>and if</b></i> the attribute is a function, we <a href="http://troll.me/images/misc-corrupted-husband/i-try-to-decorate-the-house-he-puts-spiderman-images-everywhere.jpg" target="_blank">decorate</a> it with <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">deadline_decorator</span>.<br /><br />Now, we can rewrite our subclass of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">RequestHandler</span> with one extra line:<br /><pre class="prettyprint" style="background-color: white;">class ExtendedHandler(RequestHandler):<br /> __metaclass__ = DecorateHttpVerbsMetaclass<br /><br /> def handle_exception(self, exception, debug_mode):<br /> traceback_info = ''.join(format_exception(*sys.exc_info()))<br /> email_admins(traceback_info, defer_now=True)<br /><br /> serve_500(self)</pre>By doing this, when the <i><b>class</b></i> <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">ExtendedHandler</span> is built (as an <b><i>object</i></b>), all of its attributes and all of its parent classes (or bases) attributes are checked and possibly updated by our metaclass.<br /><br />And now you and James Nekbehrd can feel <a href="http://www.youtube.com/watch?v=NisCkxU544c" target="_blank">like a boss</a> when your app handles errors.<br /><br /><b><a href="http://blog.bossylobster.com/2011/11/python-metaclass-for-extra-bad-errors.html#imports" id="imports" style="color: white;">Imports:</a></b><br /><pre class="prettyprint" style="background-color: white;">from google.appengine.api import mail<br />from google.appengine.ext.deferred import defer<br />from google.appengine.ext.webapp import RequestHandler<br />from google.appengine.runtime import DeadlineExceededError<br />import sys<br />from traceback import format_exception<br />from SOME_APP_SPECIFIC_LIBRARY import serve_500<br />from <a href="http://blog.bossylobster.com/2011/11/handling-errors-in-google-app-engineand.html">LAST_POST</a> import email_admins</pre><b>*Pythonic:</b><br /><blockquote><i>An idea or piece of code which closely follows the most common idioms of the Python language, rather than implementing code using concepts common to other languages.</i></blockquote><b>Notes:</b><br /><ul><li><i>Using</i> <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">grep -r "Exception)" . | grep "class "</span> <i>I have convinced myself (for now) that the only errors AppEngine will throw that do not inherit from </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">Exception</span><i> are </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">DeadlineExceededError</span><i>, </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">SystemExit</span><i>, and </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">KeyboardInterrupt</span><i> so that is why I only catch the timeout.</i></li><li><i>You can also use </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">webapp2</span><i> to <a target="_blank" href="http://stackoverflow.com/questions/6853257/how-can-i-setup-a-global-deadlineexceedederror-handler">catch 500 errors</a>, even when </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">handle_exception</span><i> fails to catch them.</i></li></ul><br /><b>Disclaimer:</b> <i>Just because you know what a metaclass is doesn't mean you should use one:</i><br /><ul><li><i>"Don't do stuff like this though, what is your use case?" -Ali Afshar</i></li><li><i>"Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don't (the people who actually need them know with certainty that they need them, and don't need an explanation about why)." -Python Guru Tim Peters</i></li><li><i>"The main use case for a metaclass is creating an API." -Kevin Samuel</i></li></ul><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/11/python-metaclass-for-extra-bad-errors.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-5037167046921612503Sun, 27 Nov 2011 23:43:00 +00002011-11-30T14:52:50.693-08:00AppEngineDeferred LibraryExceptionGoogle App EngineMail APIMetaclassPythonPythonicRequest HandlerTask Queue APIHandling errors in Google App Engine...and failingAfter spending a nontrivial amount of my nights and weekends working on an AppEngine app, I wanted a good way to monitor the logs without checking in on them every day. After a particularly frustrating weekend of updates that exposed unnoticed bugs that had yet to be triggered by the app, I set out to find such a way. I set out to find a <a href="http://docs.python.org/glossary.html#term-pythonic" target="_blank">Pythonic*</a> way.<br /><br />Since I knew the <a href="http://code.google.com/appengine/docs/python/mail/" target="_blank">App Engine Mail API</a> was <a href="http://t.qkme.me/355773.jpg" target="_blank">super easy</a> to configure, I figured I would just email myself every time there was an exception, before serving my default 500 error page. To do so, I just needed to subclass the default <a href="http://code.google.com/appengine/docs/python/tools/webapp/requesthandlerclass.html" target="_blank">RequestHandler</a> with my own <a href="http://code.google.com/appengine/docs/python/tools/webapp/requesthandlerclass.html#RequestHandler_handle_exception" target="_blank"><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">handle_exception</span></a> method. (OK, <a href="http://troll.me/images/war-cat/prepare-yourself-for-war.jpg" target="_blank">prepare yourselves</a>, a bunch of code is about to happen. See the necessary <a href="http://blog.bossylobster.com/2011/11/handling-errors-in-google-app-engineand.html#imports">imports</a> at the bottom of the post.)<br /><pre class="prettyprint" style="background-color: white;">class ExtendedHandler(RequestHandler):<br /><br /> def handle_exception(self, exception, debug_mode):<br /> traceback_info = ''.join(format_exception(*sys.exc_info()))<br /> email_admins(traceback_info, defer_now=True)<br /><br /> serve_500(self)</pre>Awesome! By making all my handlers inherit from <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">ExtendedHandler</span>, I can use the native Python modules <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">traceback</span> and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">sys</span> to get the traceback and my handy dandy <br /><pre class="prettyprint" style="background-color: white;">def email_admins(error_msg, defer_now=False):<br /> if defer_now:<br /> defer(email_admins, error_msg, defer_now=False)<br /> return<br /><br /> sender = 'YOUR APP Errors <errors@your_app_id_here.appspotmail.com>'<br /> to = 'Robert Admin <bob@example.com>, James Nekbehrd <jim@example.com>'<br /> subject = 'YOUR APP Error: Admin Notify'<br /> body = '\n'.join(['Dearest Admin,',<br /> '',<br /> 'An error has occurred in YOUR APP:',<br /> error_msg,<br /> ''])<br /><br /> mail.send_mail(sender=sender, to=to,<br /> subject=subject, body=body)</pre>to send out the email in the <a href="http://code.google.com/appengine/articles/deferred.html" target="_blank">deferred queue**</a> so as not to hold up the handler serving the page. <a href="http://www.realdigitalmedia.com/digital-signage-blog/wp-content/uploads/2011/04/Mission-accomplished.jpg" target="_blank">Mission accomplished</a>, right? <span class="Apple-style-span" style="font-size: large;">WRONG!</span><br /><br />Unfortunately, <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">handle_exception</span> <a href="http://code.google.com/p/googleappengine/issues/detail?id=2110" target="_blank">only handles</a> the "right" kind of exceptions. That is, exceptions which inherit directly from Python's <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">Exception</span>. From the <a href="http://media.comicvine.com/uploads/3/37572/1705127-sea_horse_your_argument_is_invalid_super.jpg" target="_blank">horse</a>'s <a href="http://docs.python.org/tutorial/errors.html#user-defined-exceptions" target="_blank">mouth</a>:<br /><blockquote>Exceptions should typically be derived from the <a href="http://docs.python.org/library/exceptions.html#exceptions.Exception" target="_blank">Exception</a> class, either directly or indirectly.</blockquote>But. <a href="http://www.youtube.com/watch?v=a1Y73sPHKxw" target="_blank">But</a>! If the app fails because a request times out, a <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">DeadlineExceededError</span> is thrown and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">handle_exception</span> falls on its face. Why? Because <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">DeadlineExceededError</span> <a href="http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/runtime/__init__.py#32" target="_blank">inherits</a> directly from <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">Exception</span>'s parent class: <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">BaseException</span>. (<a href="http://vipdictionary.com/img/gasp_by_dokuro-png.jpg" target="_blank">Gasp</a>)<br /><br />It's OK little ones, in my <a href="http://blog.bossylobster.com/2011/11/python-metaclass-for-extra-bad-errors.html">next post</a> I explain how I did it while keeping my code Pythonic by using <a href="http://stackoverflow.com/questions/100003/what-is-a-metaclass-in-python#6581949" target="_blank">metaclasses</a>.<br /><br /><b><a href="http://www.blogger.com/blogger.g?blogID=1697307561385480651" id="imports" style="color: white;">Imports:</a></b><br /><pre class="prettyprint" style="background-color: white;">from google.appengine.api import mail<br />from google.appengine.ext.deferred import defer<br />from google.appengine.ext.webapp import RequestHandler<br />import sys<br />from traceback import format_exception<br />from SOME_APP_SPECIFIC_LIBRARY import serve_500</pre><b>*Pythonic:</b><br /><blockquote><i>An idea or piece of code which closely follows the most common idioms of the Python language, rather than implementing code using concepts common to other languages.</i></blockquote>**<b>Deferred Queue</b>: <i>Make sure to enable the deferred library in your </i><span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">app.yaml</span><i> by using </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">deferred: on</span><i> in your builtins.</i><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/11/handling-errors-in-google-app-engineand.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-7379753153473554486Thu, 17 Nov 2011 18:11:00 +00002011-11-27T15:43:05.500-08:00APIChristmasGCalGDataGoogle CalendarOAuthOAuth2.0PythonSantaQuick and Dirty: Santa's ComingI have been wanting to write a post for awhile, but was travelling for a <a href="https://sites.google.com/site/barcelonadevfest/">work event</a> and while on the road I decided to be lazy.<br /><br />Since I just so happen to use a few <a href="http://code.google.com/apis/gdata/">GData APIs</a> occasionally in my day to day work, most of the post ideas revolve around quirky things I have done or want to do with the APIs. Also, due to my obscene love for Python, all my mashups seem to end up using the <a href="http://code.google.com/p/gdata-python-client/">Python Client for GData</a>.<br /><br /><b>Back-story:</b> As I was finalizing travel and gifts for my winter holiday back home, I called an old friend to let him know I'd be home in 40 days. After relaying this information to a few other people, I noted to my girlfriend that it would be nice if a computer would remind me of the count every day. This is where this quick and dirty pair of scripts come in to remind me when Santa is coming.<br /><br /><b>Pre-work — Account Settings: </b>To allow an app to make requests on my behalf, I signed up to <a href="https://accounts.google.com/ManageDomains">Manage my Domain</a> for use with Google Apps, etc. For illustration purposes, let's say I used <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">http://example.com</span> (in reality, I used a pre-existing App of mine, I really just needed an OAuth token for one time use, no real safety concerns there). After adding this domain in the management page, I am able to get my <i>"OAuth Consumer Key"</i> and <i>"OAuth Consumer Secret"</i> which we'll say are <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">EXAMPLE_KEY</span> and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">EXAMPLE_SECRET</span> in this example. Also in the management page, I set my <i>"OAuth 2.0 Redirect URIs"</i> and made sure my app can serve that page (even if it is a 404). Again for illustration, let's pretend I used <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">http://example.com/verify</span>.<br /><br />After doing this settings pre-work, I have two scripts to do the work for me.<br /><br /><b>First script — get the OAuth Token:</b><br /><pre class="prettyprint" style="background-color: white;">import gdata.calendar.client<br />import gdata.gauth<br /><br />gcal = gdata.calendar.client.CalendarClient()<br />oauth_callback = 'http://example.com/verify'<br />scopes = ['https://www.google.com/calendar/feeds/']<br />consumer_key = 'EXAMPLE_KEY'<br />consumer_secret = 'EXAMPLE_SECRET'<br />request_token = gcal.get_oauth_token(scopes, oauth_callback,<br /> consumer_key, consumer_secret)<br />out_str = ('Please visit https://www.google.com/accounts/OAuthAuthorize'<br /> 'Token?hd=default&oauth_token=%s' % request_token.token)<br />print out_str<br />follow = raw_input('Please entry the follow link after authorizing:\n')<br />gdata.gauth.authorize_request_token(request_token, follow)<br />gcal.auth_token = gcal.get_access_token(request_token)<br />print 'TOKEN:', gcal.auth_token.token<br />print 'TOKEN_SECRET:', gcal.auth_token.token_secret<br /></pre>This script "spoofs" the OAuth handshake by asking the user (me) to go directly to the <a href="https://www.google.com/accounts/OAuthAuthorizeToken">OAuth Authorize page</a>. After doing so and authorizing the App, I am redirected to <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">http://example.com/verify</span> with query parameters for <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">oauth_verifier</span> and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">oauth_token</span>. These are then used by the <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">gauth</span> section of the GData library to finish the OAuth handshake. Once the handshake is complete, the script prints out a necessary token and token secret to be used by the second script. I would advise piping the output to a file, augmenting the script to write them to a file, or writing these down (this is a joke, please don't write down 40 plus character goop that was <i><b>produced by your computer</b></i>). For the next script, let's pretend our token is <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">FAKE_TOKEN</span> and our token secret is <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">FAKE_TOKEN_SECRET</span>.<br /><br /><b>Second script — insert the events:</b><br /><pre class="prettyprint" style="background-color: white;"># General libraries<br />from datetime import date<br />from datetime import timedelta<br /><br /># Third-party libraries<br />import atom<br />import gdata.gauth<br />import gdata.calendar.client<br />import gdata.calendar.data<br /><br />gcal = gdata.calendar.client.CalendarClient()<br />auth_token = gdata.gauth.OAuthHmacToken(consumer_key='EXAMPLE_KEY',<br /> consumer_secret='EXAMPLE_SECRET',<br /> token='FAKE_TOKEN',<br /> token_secret='FAKE_TOKEN_SECRET',<br /> auth_state=3)<br />gcal.auth_token = auth_token<br /><br />today = date.today()<br />days_left = (date(year=2011, month=12, day=23) - today).days<br /><br />while days_left >= 0:<br /> event = gdata.calendar.data.CalendarEventEntry()<br /> if days_left > 1:<br /> msg = '%s Days Until Home for Christmas' % days_left<br /> elif days_left == 1:<br /> msg = '1 Day Until Home for Christmas'<br /> elif days_left == 0:<br /> msg = 'Going Home for Christmas'<br /> event.title = atom.data.Title(msg)<br /><br /> # When<br /> start_time = '2011-%02d-%02dT08:00:00.000-08:00' % (today.month, today.day)<br /> end_time = '2011-%02d-%02dT09:00:00.000-08:00' % (today.month, today.day)<br /> event.when.append(gdata.calendar.data.When(<br /> start=start_time,<br /> end=end_time,<br /> reminder=[gdata.data.Reminder(hours='1')]))<br /><br /> gcal.InsertEvent(event)<br /><br /> today += timedelta(days=1)<br /> days_left -= 1<br /></pre>This script first authenticates by using the key/secret pair for the application (retrieved from the settings page) and the key/secret pair for the user token (that we obtained from the first script). To authenticate, we explicitly construct an HMAC-SHA1 signed token in the final auth state (3) of two-legged OAuth and then set the token on our calendar client (<span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">gcal</span>).<br /><br />After authenticating, we start with today and figure out the number of days in the countdown given my return date of December 23, 2011. With these in hand, we can loop through until there are no days left, creating a <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">CalendarEventEntry</span> with title as the number of days left in the countdown and occurring from 8am to 9am PST (UTC -8). Notice also I include a <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">gdata.data.Reminder</span> so I get an email at 7am every morning (60 minutes before the event) updating my brain on the length of the countdown!<br /><br /><b>Cleanup:</b> Be sure to go to your <a href="https://accounts.google.com/IssuedAuthSubTokens">issued tokens page</a> and revoke access to the App (e.g. <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">http://example.com</span>) after doing this to avoid any unwanted security issues.<br /><div><br /><b>References: </b><i>I have never read this, but I'm sure the documentation on <a href="http://code.google.com/apis/accounts/docs/RegistrationForWebAppsAuto.html">Registration for Web-Based Applications</a> is very helpful.</i><br /><br /><b>Notes:</b><br /><ul><li><i>You can annoy other people by inviting them to these events for them as well. To do so, simply add the following two lines before inserting the event</i><br /><pre class="prettyprint" style="background-color: white;">who_add = gdata.calendar.data.EventWho(email='name@mail.com')<br />event.who.append(who_add)</pre></li><li><i>Sometimes inserting an item results in a RedirectError, so it may be safer to try the insert multiple times with a helper function such as the following:</i><br /><pre class="prettyprint" style="background-color: white;">def try_insert(attempts, gcal, event):<br /> from time import sleep<br /> from gdata.client import RedirectError<br /><br /> while attempts > 0:<br /> try:<br /> gcal.InsertEvent(event)<br /> break<br /> except RedirectError:<br /> attempts -= 1<br /> sleep(3)<br /> pass<br /><br /> if attempts == 0:<br /> print 'Insert "%s" failed' % event.title.text</pre></li><li><i>In what I swear was a complete coincidence, v3 of the Calendar API was <a href="http://googleappsdeveloper.blogspot.com/2011/11/introducing-next-version-of-google.html">announced</a> today. I will try to use the <a href="https://code.google.com/apis/calendar/v3/getting_started.html">new documentation</a> to redo this quick and dirty example with v3.</i></li></ul></div><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/11/quick-and-dirty-santas-coming.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-2299860739349152544Tue, 18 Oct 2011 05:25:00 +00002011-11-27T15:42:35.795-08:00This Lobster, not so Bossy<div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/baby_lobster_collegehumor.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/baby_lobster_collegehumor.jpg" /></a></div><div class="separator" style="clear: both; text-align: center;"></div><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/10/this-lobster-not-so-bossy.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-2847274450152667006Wed, 05 Oct 2011 22:15:00 +00002011-11-27T13:58:00.938-08:00AppEngineCommit HookConfig FilesGitPrivate KeyProtectPythonProtecting Sensitive Information in Public Git Repositories<style type="text/css"> a.anchor-twitter, a:visited.anchor-twitter { font-weight: bolder; font-style: normal; text-decoration: none; outline: none; } iframe { width: 300px; height: 20px; } .bump-left { margin-left: 1em; } #post-container { font: 14px/1.231 helvetica,arial,clean,sans-serif; display: inline-block; position: relative; width: 640px; text-align: left; } #container-bg { background: url(http://www.bossylobster.com/images/blog/twitter-bg.png) no-repeat #EBEBEB; padding: 20px; margin: 8px 0; } #post-bg { background: #fff; color: #000; padding: 10px 12px 2px 12px; margin: 0; min-height: 60px; font-size: 18px; line-height: 22px; -moz-border-radius: 5px; -webkit-border-radius:5px; -moz-box-shadow:0 2px 2px rgba(0,0,0,0.2); -webkit-box-shadow:0 2px 2px rgba(0,0,0,0.2); box-shadow:0 2px 2px rgba(0,0,0,0.2); } #top-span { width: 100%; margin-bottom: 12px; padding-top: 8px; height: 40px; } #follow-span { float: right; width: 300px; font-size: 12px; text-align: right; } #name-span { line-height: 19px; } #id-img { float: left; margin: 0px 7px 0px 0px; width: 38px; height: 38px; padding: 0; border: none; } </style>On the (much too early) bus to work this morning, I was reading my Twitter feed and saw an <a href="https://twitter.com/#!/robhawkes/status/121593545202216960">interesting question</a> from <a href="https://twitter.com/#!/robhawkes">Rob Hawkes</a>:<br /><center> <div id="post-container"><div id="container-bg"><div id="post-bg"><span id="top-span"> <span id="follow-span"> <iframe allowtransparency="true" frameborder="0" scrolling="no" src="http://platform.twitter.com/widgets/follow_button.html#_=1319487235961&align=right&button=blue&id=twitter_tweet_button_2&lang=en&link_color=%230084B4&screen_name=robhawkes&show_count=false&show_screen_name=&text_color="></iframe> </span> <span id="name-span"> <a class="anchor-twitter" href="http://twitter.com/intent/user?screen_name=robhawkes" title="Rob Hawkes"> <img alt="Rob Hawkes" id="id-img" src="http://www.bossylobster.com/images/blog/robhawkes.jpg" /> </a> <strong> <a class="anchor-twitter" href="http://twitter.com/intent/user?screen_name=robhawkes" style="color: #0084b4;" title="Rob Hawkes">@robhawkes</a> </strong> <span style="color: #999999; font-size: 14px;"><br />Rob Hawkes</span> </span> </span> <br /><div style="margin: 1em 0em .5em 0em;">How do you handle config files in your apps when you use Git? I keep accidentally adding config files with sensitive data to Git. :( </div><div style="font-size: 12px;"><a class="anchor-twitter" href="https://twitter.com/#!/robhawkes/status/121593545202216960" style="color: #0084b4;" target="_blank" title="tweeted on October 5, 2011 7:32 am">October 5, 2011 7:32 am</a> via <a class="anchor-twitter" href="http://www.tweetdeck.com/" rel="nofollow" style="color: #0084b4;" target="blank">TweetDeck</a> <a class="anchor-twitter" href="https://twitter.com/intent/tweet?in_reply_to=121593545202216960" style="color: #0084b4;" title="Reply"> <strong class="bump-left">Reply</strong> </a> <a class="anchor-twitter" href="https://twitter.com/intent/retweet?tweet_id=121593545202216960" style="color: #0084b4;" title="Retweet"> <strong class="bump-left">Retweet</strong> </a> <a class="anchor-twitter" href="https://twitter.com/intent/favorite?tweet_id=121593545202216960" style="color: #0084b4;" title="Favorite"> <strong class="bump-left">Favorite</strong> </a> </div></div></div></div></center>Rob's Twitter followers made all kinds of recommendations and Rob eventually decided it was a solved problem, declaring "Best method I've found so far is creating a temporary config file and keeping that in git, then .gitignoring the real one." and then "Thanks for the config file tips! In the end I went with a "config.example.js" file stored in <a href="http://git-scm.com/">Git</a> and a "config.js" file that is ignored." For those following along at home, they mean the same thing.<br /><br />As Rob was probably intending, this can be used for deploying an app on your personal server, or for a sample App on a PaaS like <a href="http://code.google.com/appengine/">Google App Engine</a> or <a href="http://www.heroku.com/">Heroku</a>. When testing such an app, the ability to have a native environment locally is a huge convenience, but the overhead of remembering which private keys need to be hidden is a headache and sometimes completely neglected. But it shouldn't be, because git never forgets!<br /><br />Anyone who has used git for any substantial amount of time probably initially conceived of this hack when on first thought. (This is no insult to Rob, just the inevitability of the pattern.) But, by the time Rob posted his solution, I had moved on from this and came up a solution that I think does the trick a bit more thoroughly. I envisioned a solution which assumes people who checkout my code will want to keep their config in a specified path that is already in the repo; of course, I also wanted to share this with the interwebs.<br /><br /><span class="Apple-style-span">Anyhow, this is quick and dirty. First, create <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js</span> and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js </span>in the root directory of your git repository (the same directory that <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">.git</span> lives in). I intend </span><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js</span><span class="Apple-style-span"> to be the local copy with my actual passwords and keys and </span><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js </span>to hold the master contents that actually show up in the public repo. For example, the contents of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js are</span>:<br /><div style="text-align: center;"><span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">var SECRET = 'Nnkrndkmn978489MDkjw';</span></div>and the contents of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js </span>are:<br /><div style="text-align: center;"><span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">var SECRET = 'SECRET';</span></div>Since I <i><b>don't</b></i> want a duplicate in my repo, I put a rule in my <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">.gitignore</span> <a href="http://progit.org/book/ch2-2.html#ignoring_files">file</a> to ignore <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js</span>. (For those unfamiliar, this can be done just by including <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js </span>on its own line in the <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">.gitignore </span>file.) After doing so, I set up two <a href="http://progit.org/book/ch7-3.html">git hooks</a>, a pre-commit and post-commit hook.<br /><br />To "<i>install</i>" the hooks, just add the files <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">pre-commit </span>and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">post-commit </span>to the <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">.git/hooks</span> subdirectory in your repo. They are nearly identical files, with a one-line difference. Both files simply swap the contents of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js</span><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;"> </span>and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js</span>, while <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">pre-commit </span>also adds <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js</span> to the changelist. First I'll give you the contents of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">pre-commit</span>, and then explain why it's cool/safe:<br /><pre class="prettyprint" style="background-color: white;">#!/usr/bin/env python<br /><br />import os<br /><br />hooks_dir = os.path.dirname(os.path.abspath(__file__))<br />relative_dir = os.path.join(hooks_dir, '../..')<br />project_root = os.path.abspath(relative_dir)<br /><br />git_included_config = os.path.join(project_root, 'config.js')<br />confidential_config = os.path.join(project_root, '_config.js')<br /><br />with open(git_included_config, 'rU') as fh:<br /> git_included_contents = fh.read()<br /><br />with open(confidential_config, 'rU') as fh:<br /> confidential_contents = fh.read()<br /><br />with open(git_included_config, 'w') as fh:<br /> fh.write(confidential_contents)<br /><br />with open(confidential_config, 'w') as fh:<br /> fh.write(git_included_contents)<br /><br />os.system('git add %s' % git_included_config)</pre>(Also note the contents of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">post-commit </span>are exactly the same, except without the final statement: <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">os.system('git add %s' % git_included_config)</span>.)<br /><br />So what is happening in this file:<br /><ol><li>Uses the Python <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">os</span> module to determine the absolute path to the root directory in your project by using the absolute path of the hook file, backing up two directories and again find that absolute path.</li><li>Determines the two files which need to swap contents</li><li>Loads the contents into string variables and then writes them to the opposite files</li><li>(only in <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">pre-commit</span>) Adds the included file to the changelist before the commit occurs.</li></ol>Step 4 is actually the secret sauce. It puts cleaned, non-sensitive data into the checked in <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js </span>file and then updates the changelist before making a commit, to ensure only the non-sensitive data goes in. Though you could do this yourself by making an initial commit with clean data and then never <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">git add</span>ing the file with your actual data, these hooks prevent an accident and allow you to update your local <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js </span>file with more fields as your config spec changes.<br /><br />But wait bossylobster, you say, what if one of the hooks doesn't occur? You are right! As <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">pre-commit </span>stands above, if the changelist is empty we have problems. Since the pre-commit hook changes <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js</span> to the same value in HEAD, git will tell us either "<i><b>nothing to commit</b></i>" or "<i><b>no changes added to commit</b></i>". In this case, the commit will exit and the post-commit hook will never occur. <b><span class="Apple-style-span" style="font-size: large;">This is very bad</span></b>, since the contents of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">config.js </span>and <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">_config.js </span>will be switched but not switched back. So, to account for this, we need to append the following code to the end of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">pre-commit</span>:<br /><pre class="prettyprint" style="background-color: white;">with os.popen('git st') as fh:<br /> git_status = fh.read()<br /><br />if ('nothing to commit' in git_status or<br /> 'no changes added to commit' in git_status or<br /> 'nothing added to commit' in git_status):<br /> import sys<br /><br /> msg = '# From pre-commit hook: No commit necessary, ' \<br /> 'sensitive config unchanged. #'<br /> hash_head = '#' * len(msg)<br /> print ('%s\n%s\n%s\n\n' % (hash_head, msg, hash_head)),<br /><br /> with open(git_included_config, 'w') as fh:<br /> fh.write(git_included_contents)<br /><br /> with open(confidential_config, 'w') as fh:<br /> fh.write(confidential_contents)<br /><br /> sys.exit(1)</pre>For final versions see the <a href="http://www.bossylobster.com/scripts/pre-commit">pre-commit</a> and <a href="http://www.bossylobster.com/scripts/post-commit">post-commit</a> files. Thanks again to <a href="https://twitter.com/#!/robhawkes">Rob Hawkes</a> for the idea/work break over lunch! <a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a><br /><br /><b>Update</b>: <i>One of Rob's followers, <a href="https://twitter.com/#!/nrocy">Paul King</a>, found and <a href="https://twitter.com/#!/nrocy/status/124468167086051328">tweeted</a> a very different alternative that is also pretty cool. Check out the <a href="http://archive.robwilkerson.org/2010/03/02/git-tip-ignore-changes-to-tracked-files/">post</a> he found by <a href="https://twitter.com/#!/robwilkerson">Rob Wilkerson</a>.</i><br /><br /><b>Update</b>: <i>I swapped out a screen shot of the tweet for a CSS-ified version, inspired by and based on a design used on Mashable.</i><br /><br /><b>Update</b>: <i>Some change in git </i><i>causes empty commits to be allowed. </i><i>I either didn't notice this before or it just showed up in git. So I added </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">sys.exit(1)</span><i> to force the pre-commit script to fail when nothing is changed and added a check for the phrase </i><span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">nothing added to commit</span><i> as well.</i>http://blog.bossylobster.com/2011/10/protecting.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-3062186777406786809Tue, 30 Aug 2011 00:21:00 +00002011-11-18T08:47:07.545-08:00Binary Quadratic FormConwayConway's TopographMathNumber TheoryProject EulerFinding (Fibonacci) Golden Nuggets Part 2This is the <i>mostly code</i> second half of a <a href="http://blog.bossylobster.com/2011/08/finding-fibonacci-golden-nuggets.html">two part post</a> that delivers on a promise of meaningful uses of some theory I overviewed in my last set of posts. If you see words like topograph, river, and base and you aren't sure what I mean, you may want to read that last <a href="http://blog.bossylobster.com/2011/08/conways-topograph-part-3.html">set of posts</a>.<br /><br />In the first half of this post, I outlined a solution to Project Euler <a href="http://projecteuler.net/index.php?section=problems&id=137">problem 137</a> and will continue with the solution here. Stop reading now if you don't want to be spoiled. There was no code in the first post, so this post will be mostly code, providing a pretty useful abstraction for dealing with binary quadratic forms.<br /><br />In the very specific solution, I was able to use one picture to completely classify all integer solutions to the equation \(5 x^2 - y^2 = 4\) due to some dumb luck. In the solution, we were able to use "Since every edge protruding from the river on the positive side has a value of 4 on a side...by the climbing lemma, we know all values above those on the river have value greater than 4," but this is no help when trying to find solutions to \(5 x^2 - y^2 = 9\), for example.<br /><br />To answer the question \(5 x^2 - y^2 = 9\), we'll use the same pretty picture, but emphasize different parts of it. As you can see below, to classify all the values, we only need to travel from the initial base<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/golden_nugget_first_base.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/golden_nugget_first_base.png" /></a></div>along the river until we arrive at an identical base as the blue circles indicate below:<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/golden_nugget_next.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="390" src="http://www.bossylobster.com/images/blog/golden_nugget_next.png" width="640" /></a></div>As noted above, for problem 137, we luckily were concerned about finding values \(4\) or \(1\), and the climbing lemma saved us from leaving the river. However, as I've noted above with <span class="Apple-style-span" style="color: #6fa8dc;">#1</span>,<span class="Apple-style-span" style="color: #6fa8dc;"> #2</span>,<span class="Apple-style-span" style="color: #6fa8dc;"> #3</span>, and <span class="Apple-style-span" style="color: #6fa8dc;">#4</span>, there are four <i>tributaries</i> coming from the river where we can consider larger values. Using the <i>Arithmetic Progression Rule</i>, we find values \(19\), \(11\), \(11\), and \(19\) as the first set of values above the river. From this point, we can stop checking for solutions to \(f(x, y) = 9\) since the climbing lemma says all further values above the tributaries will be \(11\) or greater. Thus, the only solutions come via scaling solutions of \(f(x, y) = 1\) by a factor of \(3\) (using homogeneity of a quadratic form).<br /><br />Now for the (Python) code.<br /><br />First, the data structure will be representative of a base along the river, but will also include the previous and next faces (on the shared superbases) and so we'll call it a <i>juncture </i>(my term, not Conway's). Each face in a juncture needs to be represented by both the pair \((x, y)\) and the value that \(f\) takes on this face. For our sanity, we organize a juncture as a tuple \((B, P, N, F)\), (in that order)<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/juncture.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/juncture.png" /></a></div>where \(P\) and \(N\) form a base straddling the river, with \(P\) taking the positive value and \(N\) negative, as well as \(B\) the face "back" according to our orientation and \(F\) the face "forward". Note, depending on the value of the form at \(F\), the river may "turn left" or "turn right" at the superbase formed by \(P\), \(N\) and \(F\).<br /><br />To move "along the river until we arrive at an identical base", we need a way to move "forward" (according to our imposed orientation) to the next juncture on the river. Moving along the river, we'll often come to superbases \((B, N, P)\) and need to calculate the forward face \(F\). To calculate \(F\), assume we have <a href="http://code.google.com/p/dhermes-project-euler/source/browse/python_code/conway_topograph.py#33">already written</a> a <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">plus</span> function that determines the vector at \(F\) by adding the vectors from \(P\) and \(N\) and determines the value at \(F\) by using the arithmetic progression rule with the values at all three faces in the superbase. Using this helper function, we can define a way to get the next juncture by turning left or right:<br /><pre class="prettyprint" style="background-color: white;">def next_juncture_on_river(juncture):<br /> B, P, N, F = juncture<br /> forward_val = F[1]<br /> if forward_val < 0:<br /> # turn left<br /> NEXT = plus(P, F, N[1])<br /> return (N, P, F, NEXT)<br /> elif forward_val > 0:<br /> # turn right<br /> NEXT = plus(N, F, P[1])<br /> return (P, F, N, NEXT)<br /> else:<br /> raise Exception("No infinite river here.")</pre><div id="footnote">Next, to know when to stop crawling on the river, we need to know when we have returned to an identical juncture, so we define:<br /><pre class="prettyprint" style="background-color: white;">def juncture_ident(juncture1, juncture2):<br /> B1, P1, N1, F1 = juncture1<br /> B2, P2, N2, F2 = juncture2<br /> return ((B1[1] == B2[1]) and (P1[1] == P2[1]) and<br /> (N1[1] == N2[1]) and (F1[1] == F2[1]))</pre>Using these functions, we can first find the recurrence that will take us from a base of solutions to all solutions and second, keep track of the positive faces on the river to generalize the solution of \(f(x, y) = z\). For both of these problems, we impose a simplification for the sake of illustration. We will only be considering quadratic forms \[f(x, y) = a x^2 + b y^2\] where \(a > 0\), \(b < 0\) and \(\sqrt{\left|\frac{a}{b}\right|}\) is not rational. This guarantees the existence of a river. We will pass such forms as an argument <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">form=(a, b)</span> to our functions. We start our river at the juncture defined by the trivial base \((1, 0), (0, 1)\)<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/trivial_base.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/trivial_base.png" /></a></div>and crawl the river using the functions defined above. (<i><b>Note</b>:</i> \(f(1, -1) = a(1)^2 + b(-1)^2 = a + b\), <i>etc.</i>)<br /><br />To find the recurrence, we need just walk along the river until we get an identical juncture where the trivial base is replaced by the base \((p, q), (r, s)\). Using the same terminology as in <a href="http://blog.bossylobster.com/2011/08/finding-fibonacci-golden-nuggets.html">part one</a>, let the base vectors define a sequence \(\left\{(u_k, v_k)\right\}_{k \geq 0}\) with \(u_0 = (1, 0)\) and \(v_0 = (0, 1)\), then we have a recurrence \begin{align*}u_{k + 1} &= p u_k + q v_k \\ v_{k + 1} &= r u_k + s v_k. \end{align*} Using this -- \(u_{k + 2} - p u_{k + 1} - s(u_{k + 1} - p u_k) = q v_{k + 1} - s (q v_k) = q(r u_k)\) -- hence \(u\) satisfies the recurrence \(u_{k + 2} = (r q - p s)u_k + (p + s)u_{k + 1}\). (You can check that \(v\) satisfies this as well.) Hence our function to spit out the recurrence coefficients is:<br /><pre class="prettyprint" style="background-color: white;">def get_recurrence(form):<br /> a, b = form<br /> B = ((1, -1), a + b)<br /> P = ((1, 0), a)<br /> N = ((0, 1), b)<br /> F = ((1, 1), a + b)<br /> J_init = (B, P, N, F)<br /> J_curr = next_juncture_on_river(J_init)<br /> while not juncture_ident(J_init, J_curr):<br /> J_curr = next_juncture_on_river(J_curr)<br /><br /> final_B, final_P, final_N, final_F = J_curr<br /> p, q = final_P[0]<br /> r, s = final_N[0]<br /> return (r*q - p*s, p + s)</pre>For solving \(f(x, y) = z\), (\(z\) positive) we need to consider all the positive tributaries coming out of the river and let them grow and grow until the climbing lemma tells us we no longer need to consider values larger than \(z\). In order to consider tributaries, we describe a new kind of juncture. Instead of having a positive/negative base in the center of the juncture, we have two consecutive faces from the positive side<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/positive_root.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="208" src="http://www.bossylobster.com/images/blog/positive_root.png" width="640" /></a></div>and have the negative from across the river as the "back" face. With this definition, we first write a function to return all tributaries:<br /><pre class="prettyprint" style="background-color: white;">def all_positive_tributaries(form):<br /> # ...Initialization logic...<br /><br /> new_positives = []<br /> J_curr = next_juncture_on_river(J_init)<br /> while not juncture_ident(J_init, J_curr):<br /> # we add a new positive if the forward<br /> # value is positive<br /> forward = J_curr[-1]<br /> if forward[1] > 0:<br /> new_positives.append(J_curr)<br /> J_curr = next_juncture_on_river(J_curr)<br /><br /> # For each (B, P, N, F) in new_positives, we want to<br /> # transform to a juncture with positive values, which will<br /> # be (N, P_1, P_2, P_F)<br /> result = []<br /> for new_positive in new_positives:<br /> B, P, N, F = new_positive<br /> new_face = plus(P, F, N[1])<br /> tributary = (N, P, F, new_face)<br /> result.append(tributary)<br /> return result</pre>For each tributary, we can climb up away from the river until our values are too large. So we write a helper function to take a given tributary and a max value and recursively "climb" the topograph until we exceed the value. This function will naively return all possible faces (value and vector) without checking the actual values.<br /><pre class="prettyprint" style="background-color: white;">def seek_up_to_val(juncture, max_value):<br /> N, P_1, P_2, P_F = juncture<br /> if P_F[1] > max_value:<br /> return []<br /> result = [P_F]<br /><br /> turn_left = plus(P_1, P_F, P_2[1])<br /> J_left = (P_2, P_F, P_1, turn_left)<br /> result.extend(seek_up_to_val(J_left, max_value))<br /><br /> turn_right = plus(P_2, P_F, P_1[1])<br /> J_right = (P_1, P_F, P_2, turn_right)<br /> result.extend(seek_up_to_val(J_right, max_value))<br /> return result</pre>Finally, we can combine these two helper functions into a function which will find all solutions to \(f(x, y) = z\) above the river. We may have a pair (or pairs) \((x, y)\) on the topograph where \(f(x, y) = \frac{z}{k^2}\) for some integer \(k\); if so, this gives rise to a solution \((kx, ky)\) which we'll be sure to account for in our function.<br /><pre class="prettyprint" style="background-color: white;"><span class="Apple-style-span">def all_values_on_form(form, value):<br /> # Use a helper (factors) to get all positive integer factors of value<br /> factor_list = factors(value)<br /> # Use another helper (is_square) to determine which factors can be<br /> # written as value/k^2 for some integer k<br /> valid_factors = [factor for factor in factor_list<br /> if is_square(value/factor)]<br /><br /> tributaries = all_positive_tributaries(form)<br /> found = set()<br /> for tributary in tributaries:<br /> candidates = seek_up_to_val(tributary, value)<br /> found.update([candidate for candidate in candidates<br /> if candidate[1] in valid_factors])<br /> # Since each tributary is of the form (N, P_1, P_2, P_F) for<br /> # P_1, P_2 on the river, we need only consider P_1 and P_2 since<br /> # those faces above are in candidates. But P_2 will always be in<br /> # next tributary, so we need not count it. You may assume this ignores<br /> # the very final tributary, but here P_2 actually lies in the <br /> # second period of the river<br /> N, P_1, P_2, F = tributary<br /> if P_1[1] in valid_factors:<br /> found.add(P_1)<br /><br /> # Finally we must scale up factors to account for<br /> # the reduction by a square multiple<br /> result = []<br /> for face in found:<br /> (x, y), val = face<br /> if val < value:<br /> ratio = int(sqrt(value/val))<br /> x *= ratio<br /> y *= ratio<br /> result.append((x, y))<br /><br /> return result</span></pre>Combining <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">all_values_on_form</span> with <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">get_recurrence</span>, we can parameterize every existing solution!<br /><br />As far as Project Euler is concerned, in addition to Problem 137, I was able to write lightning fast solutions to <a href="http://projecteuler.net/index.php?section=problems&id=66">Problem 66</a>, <a href="http://projecteuler.net/index.php?section=problems&id=94">Problem 94</a>, <a href="http://projecteuler.net/index.php?section=problems&id=100">Problem 100</a>, <a href="http://projecteuler.net/index.php?section=problems&id=138">Problem 138</a> and <a href="http://projecteuler.net/index.php?section=problems&id=140">Problem 140</a> using tools based on the above -- a general purpose library for solving binary quadratic forms over integers!</div><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/08/finding-fibonacci-golden-nuggets-part-2.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-8793933354039507148Mon, 29 Aug 2011 01:14:00 +00002011-11-18T08:46:58.971-08:00Binary Quadratic FormConwayConway's TopographMathNumber TheoryProject EulerFinding (Fibonacci) Golden Nuggets Part 1As I mentioned in my last set of posts, the content would go somewhere and this post will be the first to deliver on that. In fact this is the <i>all math, no code</i> first half of a two part post that will deliver. If you see words like topograph, river, and base and you aren't sure what I mean, you may want to read that last <a href="http://blog.bossylobster.com/2011/08/conways-topograph-part-3.html">set of posts</a>.<br /><br />In this post, I outline a solution to Project Euler <a href="http://projecteuler.net/index.php?section=problems&id=137">problem 137</a>, so stop reading now if you don't want to be spoiled. There is no code here, but the <a href="http://blog.bossylobster.com/2011/08/finding-fibonacci-golden-nuggets-part-2.html">second half</a> of this post has a pretty useful abstraction.<br /><br />The problems asks us to consider \[A_F(z) = z F_1 + z^2 F_2 + z^3 F_3 + \ldots,\] the generating polynomial for the Fibonacci sequence<a href="http://www.blogger.com/post-edit.g?blogID=1697307561385480651&postID=8793933354039507148#footnote" style="color: white; text-decoration: none;">*</a>. The problem defines (without stating so), a sequence \(\left\{z_n\right\}_{n > 0}\) where \(A_F(z_n) = n\) and asks us to find the values \(n\) for which \(z_n\) is rational. Such a value \(n\) is called a <i>golden nugget</i>. As noted in the problem statement, \(A_F(\frac{1}{2}) = 2\), hence \(z_2 = \frac{1}{2}\) is rational and \(2\) is the first golden nugget.<br /><br />As a first step, we determine a criterion for \(n\) to be a golden nugget by analyzing the equation \(A_F(z) = n\). With the recurrence relation given by the Fibonacci sequence as inspiration, we consider \begin{align*}(z + z^2)A_F(z) = z^2 F_1 &+ z^3 F_2 + z^4 F_3 + \ldots \\ &+ z^3 F_1 + z^4 F_2 + z^5 F_3 + \ldots. \end{align*}Due to the fact that \(F_2 = F_1 = 1\) and the nature of the recurrence relation, we have \[(z +z^2)A_F(z) = z^2 F_2 + z^3 F_3 + z^4 F_4 + \ldots = A_F(z) -z\] which implies \[A_F(z) = \frac{z}{1 - z - z^2}.\] Now solving \(A_F(z) = n\), we have \[z = n - n z - n z^2 \Rightarrow n z^2 + (n + 1)z - n = 0.\] In order for \(n\) to be a golden nugget, we must have the solutions \(z\) rational. This only occurs if the discriminant \[(n + 1)^2 - 4(n)(-n) = 5 n^2 + 2 n + 1\] in the quadratic is the square of a rational.<br /><br />So we now positive seek values \(n\) such that \(5 n^2 + 2 n + 1 = m^2\) for some integer \(m\) (the value \(m\) must be an integer since a rational square root of an integer can only be an integer.) This equation is equivalent to \[25n^2 + 10n + 5 = 5m^2\] which is equivalent to \[5m^2 - (5 n + 1)^2 = 4.\] This is where Conway's topograph comes in. We'll use the topograph with the quadratic form \(f(x, y) = 5 x^2 - y^2\) to find our solutions. First note, a solution is only valuable if \(y \equiv 1 \bmod{5}\) since \(y = 5 n + 1\) must hold. Also, since \(\sqrt{5}\) is irrational, \(f\) can never take the value \(0\), but \(f(1, 0) = 5\) and \(f(0, 1) = -1\), hence there is a river on the topograph and the values of \(f\) occur in a periodic fashion. Finally, since we take pairs \((x, y)\) that occur on the topograph, we know \(x\) and \(y\) share no factors. Hence solutions \(f(x, y) = 4\) can come either come from pairs on the topograph or by taking a pair which satisfies \(f(x, y) = 1\) and scaling up by a factor of \(2\) (we will have \(f(2x, 2y) = 2^2 \cdot 1 = 4\) due to the homogeneity of \(f\)).<br /><br />Starting from the trivial base \(u = (1, 0)\) and \(v = (0, 1)\), the full period of the river has length \(8\) as seen below:<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/golden_nugget.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="366" src="http://www.bossylobster.com/images/blog/golden_nugget.png" width="640" /></a></div>(<i><b>Note</b>: in the above, the values denoted as combinations of \(u\) and \(v\) are the vectors for each face on the topograph while the numbers are the values of \(f\) on these vectors.</i>) Since every edge protruding from the river on the positive side has a value of \(4\) on a side (the value pairs are \((5, 4)\), \((4, 1)\), \((1, 4)\), and \((4, 5)\)), by the climbing lemma, we know all values above those on the river have value greater than \(4\). Thus, the only solutions we are concerned with -- \(f(x, y) = 1\) or \(4\) -- must appear on the river. Notice on the river, the trivial base \((u, v)\) is replaced by the base \((9 u + 20 v, 4 u + 9 v)\). This actually gives us a concrete recurrence for the river and with it we can get a complete understanding of our solution set.<br /><br />When we start from the trivial base, we only need consider moving to the right (orientation provided by the above picture) along the river since we only care about the absolute value of the coordinates (\(n\) comes from \(y\), so it must be positive). As such, we have a sequence of bases \(\left\{(u_k, v_k)\right\}_{k \geq 0}\) with \(u_0 = (1, 0)\), \(v_0 = (0, 1)\) and recurrence \begin{align*}u_{k + 1} &= 9 u_k + 20 v_k \\ v_{k + 1} &= 4 u_k + 9 v_k. \end{align*} This implies that both \(\left\{u_k\right\}\) and \(\left\{v_k\right\}\) satisfy the same relation \begin{align*}u_{k + 2} - 9 u_{k + 1} - 9(u_{k + 1} - 9 u_k) &= 20 v_{k + 1} - 9(20 v_k) = 20(4 u_k) \\ v_{k + 2} - 9 v_{k + 1} - 9(v_{k + 1} - 9 v_k) &= 4 u_{k + 1} - 9(4 u_k) = 4(20 v_k). \end{align*} With these recurrences, you can take the three base solutions on the river and quickly find each successive golden nugget. Since each value is a coordinate in a vector, it satisfies the same linear recurrence as the vector. Also, since each of the solution vectors occur as linear combinations of \(u_k\) and \(v_k\), they must satisfy the same recurrence as well.<br /><br />Since the recurrence is degree two, we need the first two values to determine the entire sequence. For the first solution we start with \(u_0 + v_0 = (1, 1)\) and \(u_1 + v_1 = (13, 29)\); for the second solution we start with \(u_0 + 2 v_0) = (2, 4)\) and \(u_1 + 2 v_1 = (17, 38)\); and for the third solution we start with \(5 u_0 + 11 v_0 = (5, 11)\) and \(5 u_1 + 11 v_1 = (89, 199)\). For the second solution, since \(f(1, 2) = 1\), we use homogeneity to scale up to \((2, 4)\) and \((34, 76)\) to start us off. With these values, we take the second coordinate along the recurrence and get the following values:<br /><br /><center><table border="1" style="border-collapse: collapse;"><tbody><tr><th>n</th><th>First</th><th>Second</th><th>Third</th></tr><tr><td>0</td><td><div style="text-align: center;">1</div></td><td><div style="text-align: center;">4</div></td><td><div style="text-align: center;">11</div></td></tr><tr><td>1</td><td><div style="text-align: center;">29</div></td><td><div style="text-align: center;">76</div></td><td><div style="text-align: center;">199</div></td></tr><tr><td>2</td><td><div style="text-align: center;">521</div></td><td><div style="text-align: center;">1364</div></td><td><div style="text-align: center;">3571</div></td></tr><tr><td>3</td><td><div style="text-align: center;">9349</div></td><td><div style="text-align: center;">24476</div></td><td><div style="text-align: center;">64079</div></td></tr><tr><td>4</td><td><div style="text-align: center;">167761</div></td><td><div style="text-align: center;">439204</div></td><td><div style="text-align: center;">1149851</div></td></tr><tr><td>5</td><td><div style="text-align: center;">3010349</div></td><td><div style="text-align: center;">7881196</div></td><td><div style="text-align: center;">20633239</div></td></tr><tr><td>6</td><td><div style="text-align: center;">54018521</div></td><td><div style="text-align: center;">141422324</div></td><td><div style="text-align: center;">370248451</div></td></tr><tr><td>7</td><td><div style="text-align: center;">969323029</div></td><td><div style="text-align: center;">2537720636</div></td><td><div style="text-align: center;">6643838879</div></td></tr><tr><td>8</td><td><div style="text-align: center;">17393796001</div></td><td><div style="text-align: center;">45537549124</div></td><td><div style="text-align: center;">119218851371</div></td></tr><tr><td>9</td><td><div style="text-align: center;">312119004989</div></td><td><div style="text-align: center;">817138163596</div></td><td><div style="text-align: center;">2139295485799</div></td></tr><tr><td>10</td><td><div style="text-align: center;">5600748293801</div></td><td><div style="text-align: center;">14662949395604</div></td><td><div style="text-align: center;">38388099893011</div></td></tr></tbody></table></center><br />We don't get our fifteenth golden nugget candidate (value must be one more than a multiple of \(5\)) until \(5600748293801\), which yields \(\boxed{n = 1120149658760}\). By no means did I do this by hand in real life; I didn't make a pretty representation of the river either. I just wanted to make the idea clear without any code. To get to the code (and the way you should approach this stuff), move on to the <a href="http://blog.bossylobster.com/2011/08/finding-fibonacci-golden-nuggets-part-2.html">second half</a> of this post.<br /><br /><div id="footnote"><i>*The Fibonacci sequence is given by \(F_0 = 0\), \(F_1 = 1\), and \(F_{n} = F_{n - 1} + F_{n - 2}\).</i></div><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/08/finding-fibonacci-golden-nuggets.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-7232903132197972116Tue, 23 Aug 2011 23:29:00 +00002011-11-18T08:47:47.716-08:00AlgebraBinary Quadratic FormConwayConway's TopographMathNumber TheoryConway's Topograph Part 3This is the third (continued from <a href="http://blog.bossylobster.com/2011/08/conways-topograph-part-2.html">Part 2</a>) in a series of three blog posts. In the following we'll investigate a few properties of an object called Conway’s topograph. <a href="http://en.wikipedia.org/wiki/John_Horton_Conway">John Conway</a> conjured up a way to understand a binary quadratic form – a very important algebraic object – in a geometric context. This is by no means original work, just my interpretation of some key points from his <a href="http://www.amazon.com/Sensual-Quadratic-Carus-Mathematical-Monographs/dp/0883850303">The Sensual (Quadratic) Form</a> that I'll need for some other posts.<br /><br /><hr /><br /><b>Definition</b>: For the form \(f(x, y) = a x^2 + h x y + b y^2\), we define the <i>discriminant</i> as the value \(ab - \left(\frac{1}{2}h\right)^2\). \(\Box\)<br /><br />The base \((1, 0)\) and \((0, 1)\) take values \(a\) and \(b\) on the form in the Definition above and are part of a sequence with common difference \(h\). In fact, if we know the values \(a'\), \(b'\) and the difference \(h'\) at any base (edge in the topograph), the value \(a'b' - \left(\frac{1}{2}h'\right)^2\) is independent of the base and the direction (left or right) which determines the sign of \(h'\) and hence equal to the discriminant. To see this, first note the sign of \(h'\) is immaterial since it is squared. Also, consider the two other bases (edges) in the superbase. As in the proof of the climbing lemma, one base takes values \(a' = a\) and \(b' = a + b + h\) with common difference \(h' = 2a + h\) which gives<br />\begin{align*}<br />a'b' - \left(\frac{1}{2}h'\right)^2 &= a(a + b + h) - \frac{1}{4}\left(2a + h\right)^2 \\<br />&= a^2 + a b + a h - \left(a^2 + a h + \frac{1}{4} h^2\right) \\<br />&= ab - \left(\frac{1}{2}h\right)^2.<br />\end{align*}<br />Similarly the other base in the given superbase gives<br />\begin{align*}<br />a'b' - \left(\frac{1}{2}h'\right)^2 &= (a + b + h)b - \frac{1}{4}\left(2b + h\right)^2 \\<br />&= b^2 + a b + b h - \left(b^2 + b h + \frac{1}{4} h^2\right) \\<br />&= ab - \left(\frac{1}{2}h\right)^2.<br />\end{align*}<br />Having showed that there are no cycles when starting from a given superbase, our work in understanding the topograph is not complete. We haven't actually showed that we can get from one superbase to any other superbases within the topograph. To show this, we'll use the discriminant and the following.<br /><br /><b>Definition</b>: A superbase \(W\) is called a well if all the edges at \(W\) point away from \(W\).<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_well.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/conway_well.png" /></a>\(\Box\)</div><br />Notice a well is dependent on the values, hence depends on the form \(f\). In a positive--valued topograph, we may find a well by traveling along the topograph in the opposite direction of the edges. Eventually, we must encounter a superbase where all arrows point out (as above), leaving us nowhere to travel and thus becoming our well. This is because, assuming the topograph is positive--valued, we can only decrease in value for so long (eventually the values must approach the minimum).<br /><br /><b>Lemma</b>: (The Well Lemma) For a positive--valued form \(f\) and a well (with respect to \(f\)) \(W\), the three values \(f\) takes on the faces in \(W\) are the smallest values that \(f\) takes on the topograph.<br /><br /><b>Proof</b>: Using the labels from the well in the definition above, the <i>Arithmetic Progression Rule</i> for our differences gives<br />\[2\alpha = b + c - a, \quad 2\beta = c + a - b, \quad 2\gamma = a + b - c\]<br />and solving,<br />\[a = \beta + \gamma, \quad b = \alpha + \gamma, \quad c = \beta + \alpha.\]<br />Let the superbase \(W = \left\{e_1, e_2, e_3\right\}\). Since \(W\) is a superbase, we may write any vector as<br />\[v = m_1 e_1 + m_2 e_2 + m_3 e_3\]<br />for \(m_1\), \(m_2\), \(m_3 \in \mathbf{Z}\). Also due to the fact that \(W\) is a superbase, \(e_1 + e_2 + e_3 = (0, 0)\) and so we may also write<br />\[v = (m_1 - k) e_1 + (m_2 - k) e_2 + (m_3 - k) e_3\]<br />for \(k \in \mathbf{Z}\). From this it is clear only the differences of the \(m_i\) matter. With this as our inspiration we write<br />\[f(v) = \alpha(m_2 - m_3)^2 + \beta(m_1 - m_3)^2 + \gamma(m_1 - m_2)^2,\]<br />a formula discovered by Selling.<br /><br />To verify this, notice both sides of the equation are quadratic forms in \(v\) and<br />\begin{align*}<br />f(e_1) = a &= \beta + \gamma = \alpha \cdot 0^2 + \beta \cdot 1^2 + \gamma \cdot 1^2 \\<br />f(e_2) = b &= \alpha + \gamma = \alpha \cdot 1^2 + \beta \cdot 0^2 + \gamma \cdot (-1)^2 \\<br />f(e_3) = c &= \beta + \alpha = \alpha \cdot (-1)^2 + \beta \cdot (-1)^2 + \gamma \cdot 0^2.<br />\end{align*}<br />hence they must be equal since both sides are quadratics that agree on more than two points.<br /><br />If two of the \(m_i\) are equal, then \(v\) must be an integral multiple of the third vector, hence the value \(f(v)\) will be at least as large as the value of \(f\) on the third vector. If not, all the differences must be non--zero (hence greater than or equal to \(1\) in absolute value, since integers), thus<br />\[f(v) \geq \alpha \cdot 1^2 + \beta \cdot 1^2 + \gamma \cdot 1^2\]<br />which is greater than or equal to each of \(a = \beta + \gamma\), \(b = \alpha + \gamma\), and \(c = \beta + \alpha\) since all of \(\alpha\), \(\beta\), and \(\gamma\) are non--negative. \(\Box\)<br /><br /><b>Corollary</b>: The topograph is connected; one may travel along the topograph from any given superbase to any other.<br /><br /><b>Proof</b>: Using the same quadratic form \(f\) as we did to show the topograph had no cycles, we can show it is connected. Any arbitrary superbase is on the topograph, hence must be in some connected component of the topograph, but there may be more than one component. Since \(f\) is positive--valued, we must have some well in this component. But, by the above, the values at a well must be the absolute lowest values \(f\) takes on the topograph. This implies the well must take the values \(1\), \(1\), \(1\) and shows all superbases must be in the same component. \(\Box\)<br /><br />From this point, we will concentrate on a special type of form relevant to our discussion. For a form \(f\) which takes both positive and negative values, but never \(0\), the topograph has a special path that separates the which separates the faces where takes a positive value and those where \(f\) takes a negative value.<br /><br /><b>Claim</b>: If a form \(f\) takes both positive and negative values, but not zero, then there is a unique path of connected edges separating the positive and negative values. What's more, the values that occur on this river do so periodically.<br /><br /><b>Proof</b>: Since the topograph is connected, there must be some edge where positive and negative values meet. As we proceed along adjacent edges, we can choose to follow a path of edges which will separate positive and negative (each subsequent value must be positive or negative, allowing us to "turn" left or right).<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_river.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="277" src="http://www.bossylobster.com/images/blog/conway_river.png" width="640" /></a></div>On first sight, there is no reason that this path should be unique. However, with the climbing lemma in mind, starting on the positive side of the path and moving away from the negative values, we must have only positive values. Using the logic of the climbing lemma instead with negative values, we similarly see that starting on the negative side and more away from the positive values will yield all negative numbers below the path. Hence nowhere above the path can positive and negative values meet and similarly below. Thus the path must be unique.<br /><br />To show this path is periodic, we must utilize the discriminant. For each edge along the path, we have some positive value \(a\) and a negative \(b\) (by definition of the path) and the common difference \(h\). Thus the determinant \(D\) must be negative since the product \(ab\) is, hence<br />\[\left|D\right| = \left|ab\right| + \left(\frac{1}{2}h\right)^2.\]<br />Thus, both \(\left(\frac{1}{2}h\right)^2\) and \(\left|ab\right|\) are bounded by \(\left|D\right|\). So \(a\), \(b\) and \(h\) are bounded (by \(\left|D\right|\). Thus we have finitely many possible triples \((a, b, h)\), hence some value must be repeated in the path. This forces the path to be periodic since the triple starting from one triple \((a, b, h)\) determines next triple along the path and hence the entire path.<br /><br />This path is so crucial that we give it it's own name.<br /><br /><b>Definition</b>: If a form \(f\) takes both positive and negative values, but not zero, we call the path separating the positive and negative values the <b>river</b>. \(\Box\)<br /><br />Thanks for reading, I'll make use of all this in a few days!<br /><br /><b>Update</b>: <i>This material is intentionally aimed at an intermediate (think college freshman/high school senior) audience. One can go deeper with it, and I'd love to get more technical off the post.</i><br /><br /><i><span class="Apple-style-span" style="font-style: normal;"><b>Update</b>: <i>All images were created with the <a href="http://www.texample.net/tikz/examples/">tikz</a> LaTeX library and can be compiled with native LaTeX if pgf is installed.</i></span></i><br /><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/08/conways-topograph-part-3.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-8265201298971491629Tue, 23 Aug 2011 18:11:00 +00002011-11-18T08:47:41.279-08:00AlgebraBinary Quadratic FormConwayConway's TopographMathNumber TheoryConway's Topograph Part 2This is the second (continued from <a href="http://blog.bossylobster.com/2011/08/conways-topograph-part-1.html">Part 1</a>) in a series of three blog posts. In the following we'll investigate a few properties of an object called Conway’s topograph. <a href="http://en.wikipedia.org/wiki/John_Horton_Conway">John Conway</a> conjured up a way to understand a binary quadratic form – a very important algebraic object – in a geometric context. This is by no means original work, just my interpretation of some key points from his <a href="http://www.amazon.com/Sensual-Quadratic-Carus-Mathematical-Monographs/dp/0883850303">The Sensual (Quadratic) Form</a> that I'll need for some other posts.<br /><br /><hr /><br />In the following, as mentioned in Part 1, "when referring to a base/superbase, we are referring to the lax equivalent of these notions."<br /><br />To begin to form the topograph, note each superbase \(\left\{e_1, e_2, e_3\right\}\) contains only three bases<br />\[\left\{e_1, e_2\right\}, \left\{e_2, e_3\right\}, \left\{e_3, e_1\right\}\]<br />as subsets. Going the other direction, a base \(\left\{e_1, e_2\right\}\) can only possibly be contained as a subset of two superbases:<br />\[\langle e_1, e_2, (e_1 + e_2)\rangle, \langle e_1, e_2, (e_1 - e_2)\rangle.\]<br />With these two facts in hand, we can begin to form the geometric structure of the topograph. The interactions between bases and superbases (as well as the individual vectors themselves) give us the form. In the graph, we join each superbase (\(\bigcirc\)) to the three bases (\(\square\)) in it.<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_edges_nodes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/conway_edges_nodes.png" /></a></div>Each edge connecting two superbases (\(\bigcirc\)) represents a base and we mark each of these edges with a (\(\square\)) in the middle. Since each base can only be in two superbases, we have well--defined endpoints for each base (edge). Similarly, since each superbase contains three bases as subsets, each superbase (endpoint) has three bases (edges) coming out of it.<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_face.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/conway_face.png" /></a></div>As we traverse each edge (base) surrounding a given vector (\(e_1\) above), we move from superbase (vertex) to superbase (vertex), and form a face. Starting from a base \(e_1, e_2\), traveling along each of the new faces encountered we begin to form the full (labeled) topograph below:<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_growing_graph.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="571" src="http://www.bossylobster.com/images/blog/conway_growing_graph.png" width="640" /></a></div>Notice the <i>values</i> of \(f\) on the combinations of \(e_1\) and \(e_2\) is immaterial to the above discussion, hence the shape of the topograph doesn't depend on \(f\).<br /><br />If we know the values of \(f\) at some superbase, it is actually possibly to find the values of \(f\) at vectors (faces) we encounter on the topograph without actually knowing \(f\).<br /><br /><b>Claim</b>: For vectors \(v_1, v_2\),<br />\[f(v_1 + v_2) + f(v_1 - v_2) = 2\left(f(v_1) + f(v_2)\right)\]<br /><b>Proof</b>: Exercise. (If you really can't get it, let me know in the comments.) \(\Box\)<br /><br />This implies that if<br />\[a = f(v_1), \quad b = f(v_2), \quad c = f(v_1 + v_2), \quad d = f(v_1 - v_2)\]<br />then \(d\), \(a + b\), \(c\) form an arithmetic progression with common difference \(h\). This so--called <i>Arithmetic Progression Rule</i> allows us to mark each edge with a direction based on the value of \(h\). Hence if \(d < a + b < c\), we have \(h > 0\) and the following directed edge:<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_directed_edge.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/conway_directed_edge.png" /></a></div><div class="separator" style="clear: both; text-align: left;"></div><div class="separator" style="clear: both; text-align: left;">Obviously starting from a base \(e_1, e_2\), one wonders if it is possible to move to any pair \((x, y)\) with \(x\) and \(y\) coprime along the topograph. It turns out that we can; the topograph forms a structure called a tree, and all nodes are connected.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>Lemma</b>: (Climbing Lemma) Given a superbase \(Q\) with the surrounding faces taking values \(a\), \(b\), and \(c\) as below, if the \(a\), \(b\) and the common difference \(h\) are all positive, then \(c\) is positive and the other two edges at \(Q\) point away from \(Q\).</div><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_directed_edge_superbase.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/conway_directed_edge_superbase.png" /></a></div><b>Proof</b>: First \(c\) is positive because \(h = c - (a + b)\), hence \(c = a + b + h > 0\). The two other edges at \(Q\) have common differences \((a + c) - b\) and \((b + c) - a\). Since \(c = a + b + h\) is greater than both \(a\) and \(b\), these differences are positive. \(\Box\)<br /><br />Notice also that this establishes two new triples \((a, a + b + h, 2 a + h)\) and \((b, a + b + h, 2 b + h)\) that continue to point away from each successive superbase and hence <i>climb</i> the topograph. We can use this lemma (along with a specific form) to show that there are no cycles in the topograph, i.e. the topograph doesn't loop back on itself.<br /><br />Consider the form which takes the following values at a given superbase:<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_no_cycle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/conway_no_cycle.png" /></a></div>Due to the symmetry, we may consider traveling along an edge in any direction from this superbase identically. Picking an arbitrary direction, we reach the following superbase:<br /><div class="separator" style="clear: both; text-align: center;"><a href="http://www.bossylobster.com/images/blog/conway_connected.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.bossylobster.com/images/blog/conway_connected.png" /></a></div>Since the values must increase indefinitely as laid out by the climbing lemma, the form can't loop back on itself; if it were to, it would need to loop back to a smaller value. Since this holds in all directions from the original well, there are no cycles.<br /><br />Follow along to <a href="http://blog.bossylobster.com/2011/08/conways-topograph-part-3.html">Part 3</a>.<br /><br /><b>Update</b>: <i>This material is intentionally aimed at an intermediate (think college freshman/high school senior) audience. One can go deeper with it, and I'd love to get more technical off the post.</i><br /><br /><b>Update</b>: <i>All images were created with the <a href="http://www.texample.net/tikz/examples/">tikz</a> LaTeX library and can be compiled with native LaTeX if pgf is installed.</i><br /><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/08/conways-topograph-part-2.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-7641034392870464146Tue, 23 Aug 2011 17:30:00 +00002011-11-18T08:47:34.440-08:00AlgebraBinary Quadratic FormConwayConway's TopographMathNumber TheoryConway's Topograph Part 1This is the first in a series of three blog posts. In the following we'll investigate a few properties of an object called Conway’s topograph. <a href="http://en.wikipedia.org/wiki/John_Horton_Conway">John Conway</a> conjured up a way to understand a binary quadratic form – a very important algebraic object – in a geometric context. This is by no means original work, just my interpretation of some key points from his <a href="http://www.amazon.com/Sensual-Quadratic-Carus-Mathematical-Monographs/dp/0883850303">The Sensual (Quadratic) Form</a> that I'll need for some other posts.<br /><br /><hr /><br /><div class="p1"></div><div class="p1"><b>Definition</b>: A binary quadratic form \(f\) is an equation of the form:</div><div class="p1">\[f(x, y) = A x^2 + H x y + B y^2.\]</div><div class="p1">That is, a function of two variables which is homogeneous of degree two. The coefficients \(A\), \(H\), and \(B\) and variables \(x\) and \(y\) are often real numbers, rational numbers or integers. <span class="Apple-style-span">\(\Box\)</span></div><div class="p1"><span class="Apple-style-span"><br /></span></div><div class="p1"><span class="Apple-style-span"></span></div><div class="p1"><span class="Apple-style-span">When we require the coefficients \(A\), \(H\), and \(B\) as well as the variables \(x, y\) to be integers, we get an integer--valued form. In his <i>Disquisitiones Arithmeticae</i>, Gauss asked (and largely answered) the fundamental question: what integer values can each form take? For example, you may have seen the form</span></div><div class="p1"><span class="Apple-style-span">\[f(x, y) = x^2 + y^2,\]</span></div><div class="p1"><span class="Apple-style-span">where it was determined that the only primes (Gaussian primes) occuring were \(2\) and those odd primes congruent to 1 modulo 4.</span></div><div class="p1"><span class="Apple-style-span"><br /></span></div><div class="p1"><span class="Apple-style-span">As each form \(f\) is homogenous degree two, \(f(\lambda x, \lambda y) = \lambda^2 f(x, y)\). As a result, if we can understand the values of \(f\) for pairs \((x, y)\) which don't share any factors, we can understand the entire set of values that \(f\) takes. Also, letting \(\lambda = -1\), there is no change in the value of \(f\) since \(\lambda^2 = 1\), hence it suffices to think of \(v = (x, y)\) as \(\pm v\), i.e. \(\left\{(x, y), (-x, -y)\right\}\).</span></div><div class="p1"><span class="Apple-style-span"><br /></span></div><div class="p1"><span class="Apple-style-span">For integers \(x\) and \(y\), any point \((x, y)\) can be expressed as an integral linear combination of the vectors \(e_1 = (1, 0)\) and \(e_2 = (0, 1)\). So if we like, we can express all relevant inputs for \(f\) in terms of two vectors. However, instead considering \(e_2 = (1, 1)\), we have</span></div><div class="p1"><span class="Apple-style-span">\[(x - y) \cdot e_1 + y \cdot e_2 = (x, y)\]</span></div><div class="p1"><span class="Apple-style-span">and realize a different pair \(e_1, e_2\) which again yield all possible integer valued vectors as integral linear combinations.</span></div><div class="p1"><span class="Apple-style-span"><br /></span></div><div class="p1"></div><div class="p1"><span class="Apple-style-span"><b>Definition</b>: A <i>strict base</i> is an ordered pair \((e_1, e_2)\) whose integral linear combinations are exactly all vectors with integer coordinates. A <i>lax base</i> is a set \(\left\{\pm e_1, \pm e_2\right\}\) obtained from a strict base. \(\Box\)</span></div><div class="p1"><span class="Apple-style-span"><br /></span></div><div class="p1"><span class="Apple-style-span"><b>Definition</b>: A <i>strict superbase</i> is an ordered triple \((e_1, e_2, e_3)\), for which \(e_1 + e_2 + e_3 = (0, 0)\) and \((e_1, e_2)\) is a strict base (i.e., with strict vectors), and a <i>lax superbase</i> is a set \(\langle\pm e_1, \pm e_2, \pm e_3\rangle\) where \((e_1, e_2, e_3)\) is a strict superbase. \(\Box\)</span></div><div class="p1"><span class="Apple-style-span"><br /></span></div><div class="p1"><span class="Apple-style-span">For our (and Conway's) purposes, it is useful to consider the lax notions and leave the strict notions as an afterthought since a binary quadratic form is unchanged given a sign change. From here forward, for a vector \(v\), we use the notation \(v\) interchangeably with \(\pm v\) and when referring to a base/superbase, we are referring to the lax equivalent of these notions.</span><br /><br />Follow along to <a href="http://blog.bossylobster.com/2011/08/conways-topograph-part-2.html">Part 2</a>.<br /><br /><b>Update</b>: <i>This material is intentionally aimed at an intermediate (think college freshman/high school senior) audience. One can go deeper with it, and I'd love to get more technical off the post.</i></div><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/08/conways-topograph-part-1.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-1506288107151567633Wed, 17 Aug 2011 15:55:00 +00002011-11-28T13:55:06.254-08:00BenchmarkComparisonDynamic LanguageJavascriptJavascript EngineJITJust-in Time Compilenode.jsPerformanceProject EulerPyPyPythonV8The Lesson V8 Can Teach Python and Other Dynamic LanguagesBeing unable to completely give up math for computers, I am naturally drawn to <a href="http://projecteuler.net/">Project Euler</a> and as a result solved a <a href="http://code.google.com/p/dhermes-project-euler/source/browse/#git%2Fpython_code%2Fcomplete">ridiculous number</a> of the problems posted there while learning Python. A few months ago (March 13), after reading <a href="http://jsninja.com/">Secrets of a Javascript Ninja</a>, I decided to begin converting my <a href="https://github.com/dhermes/ProjectEuler/commit/663ee638c6b8255d00b84173b0ecad1af2c53af1">solutions to Javascript</a>. A month and a half later I <a href="https://github.com/dhermes/ProjectEuler/commit/72c092ccf82c3933944584c2479d2e7ca0ef06f7">came back</a> to it, and then finally two months after that, I <a href="https://github.com/dhermes/ProjectEuler/commit/f19f85978aeeac3310b2175812d53bbea884d73b">began to take it seriously</a>.<br /><br />After making this decision, I noticed the prime Sieve of Eratosthenes was mighty fast when I ran it in Chrome, maybe even faster than my beloved Python. I tabled the thought for a little, but never really forgot it. So a few weeks ago (early August 2011), I <i>finally</i> got a working install of <a href="http://nodejs.org/">node</a> running on my machine and was able to make more of this thought. (I say <i>finally installed</i> because on two previous tries I gave up because of conflicts with my version of gcc, coupled with the fact that I had no good reason to use node.)<br /><br />When I originally did the conversion, I had skipped problem 8, because my implementation required pulling in the problem data as text from a file. While hanging out with <a href="http://twitter.com/#!/borismus">Boris</a> and <a href="https://twitter.com/#!/ebidel">Eric</a> from on the Chrome Developer Relations team, I decided to give it another go on node (xhr requests not allowed) and found it to be quite simple with <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">readFileSync</span> in the node native <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">fs</span> module. After witnessing this, over this weekend, I decided to harness the power of <a href="http://code.google.com/p/v8/">V8</a> -- the Javascript engine that powers Chrome and node -- and run all my scripts locally with node. So over a two day period, I <a href="http://code.google.com/p/dhermes-project-euler/source/detail?r=87b2cf2128be9d13d3b374d8eba9cb4ad808c982">hack-hack-hacked</a> my way into converting the Python solutions for problems 11 through 50 (the remaining unconverted) into their Javascript equivalents, while also converting a good portion of my hefty <a href="http://code.google.com/p/dhermes-project-euler/source/browse/python_code/functions.py">functions</a> module.<br /><br />Once this was done, I had also found I could replace most of the nice parts about Python with my own equivalent. For example, I was able to replace functionality I needed from the Python <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">set</span> datatype with<br /><pre class="prettyprint" style="background-color: white;">function uniq(arr) {<br /> var result = {};<br /> for (var i = 0, val; val = arr[i]; i++) {<br /> result[val] = true;<br /> }<br /><br /> return Object.keys(result);<br />};<br /></pre>and I was able to replace the (amazingly) useful Python handling of <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">long</span> integers with a non-native node package called <a href="https://github.com/substack/node-bigint">bigint</a> that uses libgmp among other usings. Of course, for Python's secret sauce -- the list comprehension -- I was able to substitute enough <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">filter</span>, <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">reduce</span> and <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">map</span> statements to almost make it seem like I had never left Pythonland. After doing all this, I also ended up writing my own <a href="http://code.google.com/p/dhermes-project-euler/source/browse/js/operator.js">operator.js</a> to replace the wonderful Python native module <a href="http://docs.python.org/library/operator.html">operator</a>, and my own <a href="http://code.google.com/p/dhermes-project-euler/source/browse/js/timer.js">timer.js</a> to stand in for the Python native module <a href="http://docs.python.org/library/time.html">time</a>.<br /><br />Finally, I had working code and could do a side by side comparison of V8 and the Python interpreter. <b>Update</b>: <i>I added a column for <a href="http://pypy.org/">PyPy</a>, a just in time implementation of Python. </i>Here is what I found (averaging the runtime over 10 separate calls to each function, the results are):<br /><br /><center><table border="1" style="border-collapse: collapse;"><tbody><tr><th align="left">Problem</th><th align="left">Answer</th><th align="left">Python</th><th align="left">Javascript</th><th align="left">Ratio (PY/JS)</th><th align="left">PyPy</th></tr><tr><td>1*</td><td>233168</td><td>1804ms</td><td>1215ms</td><td>1.48</td><td>385ms</td></tr><tr><td>2*</td><td>4613732</td><td>247ms</td><td>102ms</td><td>2.42</td><td>85ms</td></tr><tr><td>3*</td><td>6857</td><td>4725ms</td><td>1508ms</td><td>3.13</td><td>582ms</td></tr><tr><td>4</td><td>906609</td><td>8708ms</td><td>149ms</td><td>58.44</td><td>282ms</td></tr><tr><td>5*</td><td>232792560</td><td>136ms</td><td>186ms</td><td>0.73</td><td>114ms</td></tr><tr><td>6*</td><td>25164150</td><td>10ms</td><td>4ms</td><td>2.50</td><td>6ms</td></tr><tr><td>7</td><td>104743</td><td>656ms</td><td>12ms</td><td>54.67</td><td>11ms</td></tr><tr><td>8*</td><td>40824</td><td>18045ms</td><td>5014ms</td><td>3.60</td><td>7042ms</td></tr><tr><td>9</td><td>31875000</td><td>610ms</td><td>3ms</td><td>203.33</td><td>8ms</td></tr><tr><td>10</td><td>142913828922</td><td>6628ms</td><td>167ms</td><td>39.69</td><td>116ms</td></tr><tr><td>11</td><td>70600674</td><td>49ms</td><td>2ms</td><td>24.50</td><td>11ms</td></tr><tr><td>12</td><td>76576500</td><td>5127ms</td><td>203ms</td><td>25.26</td><td>100ms</td></tr><tr><td>13*</td><td>5537376230</td><td>1795ms</td><td>10710ms</td><td>0.17</td><td>1423ms</td></tr><tr><td>14</td><td>837799</td><td>5572ms</td><td>1712ms</td><td>3.25</td><td>362ms</td></tr><tr><td>15*</td><td>137846528820</td><td>54ms</td><td>18ms</td><td>3.00</td><td>55ms</td></tr><tr><td>16*</td><td>1366</td><td>1844ms</td><td>265ms</td><td>6.96</td><td>462ms</td></tr><tr><td>17</td><td>21124</td><td>87ms</td><td>4ms</td><td>21.75</td><td>7ms</td></tr><tr><td>18*</td><td>1074</td><td>2291ms</td><td>1790ms</td><td>1.28</td><td>1090ms</td></tr><tr><td>19*</td><td>171</td><td>2254ms</td><td>336ms</td><td>6.71</td><td>342ms</td></tr><tr><td>20*</td><td>648</td><td>1061ms</td><td>9154ms</td><td>0.12</td><td>374ms</td></tr><tr><td>21</td><td>31626</td><td>18910ms</td><td>1038ms</td><td>18.22</td><td>728ms</td></tr><tr><td>22</td><td>871198282</td><td>188ms</td><td>7ms</td><td>26.86</td><td>8ms</td></tr><tr><td>23</td><td>4179871</td><td>83318ms</td><td>1120ms</td><td>74.39</td><td>1295ms</td></tr><tr><td>24*</td><td>2783915460</td><td>206ms</td><td>210ms</td><td>0.98</td><td>139ms</td></tr><tr><td>25</td><td>4782</td><td>5865ms</td><td>35ms</td><td>167.57</td><td>232ms</td></tr><tr><td>26</td><td>983</td><td>28ms</td><td>18ms</td><td>1.56</td><td>4ms</td></tr><tr><td>27</td><td>-59231</td><td>645738ms</td><td>22536ms</td><td>28.65</td><td>28288ms</td></tr><tr><td>28*</td><td>669171001</td><td>8509ms</td><td>1037ms</td><td>8.21</td><td>981ms</td></tr><tr><td>29</td><td>9183</td><td>184ms</td><td>96ms</td><td>1.92</td><td>20ms</td></tr><tr><td>30</td><td>443839</td><td>52167ms</td><td>1037ms</td><td>50.31</td><td>877ms</td></tr><tr><td>31</td><td>73682</td><td>9606ms</td><td>257ms</td><td>37.38</td><td>154ms</td></tr><tr><td>32</td><td>45228</td><td>206888ms</td><td>12096ms</td><td>17.10</td><td>4266ms</td></tr><tr><td>33</td><td>100</td><td>300ms</td><td>6ms</td><td>50.00</td><td>15ms</td></tr><tr><td>34</td><td>40730</td><td>7462ms</td><td>2447ms</td><td>3.05</td><td>247ms</td></tr><tr><td>35</td><td>55</td><td>8617ms</td><td>848ms</td><td>10.16</td><td>242ms</td></tr><tr><td>36</td><td>872187</td><td>189788ms</td><td>2183ms</td><td>86.94</td><td>3532ms</td></tr><tr><td>37</td><td>748317</td><td>2389022ms</td><td>71845ms</td><td>33.25</td><td>61551ms</td></tr><tr><td>38</td><td>932718654</td><td>506ms</td><td>10ms</td><td>50.60</td><td>12ms</td></tr><tr><td>39</td><td>840</td><td>178ms</td><td>6ms</td><td>29.67</td><td>12ms</td></tr><tr><td>40*</td><td>210</td><td>326ms</td><td>202ms</td><td>1.61</td><td>119ms</td></tr><tr><td>41</td><td>7652413</td><td>2627ms</td><td>133ms</td><td>19.75</td><td>65ms</td></tr><tr><td>42</td><td>162</td><td>65ms</td><td>7ms</td><td>9.29</td><td>8ms</td></tr><tr><td>43</td><td>16695334890</td><td>38ms</td><td>2ms</td><td>19.00</td><td>2ms</td></tr><tr><td>44</td><td>5482660</td><td>384013ms</td><td>27744ms</td><td>13.84</td><td>6621ms</td></tr><tr><td>45*</td><td>1533776805</td><td>17ms</td><td>4ms</td><td>4.25</td><td>8ms</td></tr><tr><td>46</td><td>5777</td><td>2864ms</td><td>202ms</td><td>14.18</td><td>65ms</td></tr><tr><td>47</td><td>134043</td><td>400967ms</td><td>12838ms</td><td>31.23</td><td>4425ms</td></tr><tr><td>48</td><td>9110846700</td><td>46ms</td><td>16ms</td><td>2.88</td><td>6ms</td></tr><tr><td>49</td><td>296962999629</td><td>115ms</td><td>8ms</td><td>14.38</td><td>13ms</td></tr><tr><td>50</td><td>997651</td><td>3277ms</td><td>80ms</td><td>40.96</td><td>51ms</td></tr><tr><td colspan="6">*These were very quick to run, so the runtimes are the time taken to run 10000 times.</td></tr></tbody></table></center><br />As you'll notice, standard Python gets its butt kicked. I was kind of saddened by this, but in the end, just giddy that our web is faster because of it (90% of my life is digital) and also that we can do scripting faster on the server side (attribute to <a href="http://twitter.com/#!/borismus">Boris Smus</a>) because of the node project (thanks Ryan Dahl).<br /><br />Standard Python is actually slower in 46 of the 50 problems. In 28 of the 46 node is faster by a factor of 10 or greater, in 9 of those 28 by a factor of 50 or greater and in 2 of the 9 by a factor of 100 or greater! The only 4 in which Python was faster were from the <i><b>n = 10000</b></i> sample. In fact, I was able to pinpoint exactly why:<br /><ul><li>#5 - My own Javascript implementation of <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">gcd</span> is slower than the native (<span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">from fractions import gcd</span>) Python library (resulting in a difference of 50 ms over 10000 iterations)</li><li>#13 - The node package <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">bigint</span> is slower than the Python native <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">long int</span> (Javascript is slower by a factor of 6)</li><li>#20 - The node package <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">bigint</span> is slower than the Python native <span class="Apple-style-span" style="color: lime; font-family: 'Courier New', Courier, monospace;">long int</span> (Javascript is slower by a factor of 8.5)</li><li>#24 - Having to perform two <span class="Apple-style-span" style="background-color: white; color: purple; font-family: 'Courier New', Courier, monospace;">slice</span>s is slower in Javascript than in Python and there is no good way to just remove one element (resulting in a difference of 4 ms over 10000 iterations; a little bit about that <a href="http://ejohn.org/blog/javascript-array-remove/">here</a>)</li></ul><div>So what, you ask, is that lesson I speak of? Well, Javascript didn't used to be this fast. How did it get that way? The brilliant and inspired people behind V8 rethought the Javascript compile steps and after much work, we now have code that is closer to the metal (attribute to: <a href="https://twitter.com/#!/ebidel">Eric Bidelman</a>, i.e. closer to machine code) than we had ever had before. The use of just-in-time compilation and other incredible techniques has taken a formerly slow and clunky language (Javascript) which was used well beyond its original intended scope, and turned it into a damn fast dynamic language. Hopefully, this movement will make its way to Python and other dynamic languages and we can all have our code end up this close to the metal.<br /><br /><b>Update</b>: <i>In response to the comments, I ran the same code on the same machine, but with PyPy in place of Python. This is the direction I hope standard Python goes in and commend the guys pumping out faster and faster just in time implementations over at PyPy. I went through and counted 20 node wins, 29 PyPy wins and 1 tie. (I reserve the right to update the post with more detailed metrics.) While I do commend them, the results don't really belong in this post because PyPy is still an offshoot. (However, as I understand, they both have ties to C, as PyPy uses GCC to compile to bytecode and V8 is written in C++. Feel free to supplement my knowledge in the comments.)</i><br /><br /><b>Update</b>: <i>All benchmarking was run on my Mac Pro Desktop with a 3.2 GHz Quad-Core Intel Xeon processor and 4 cores for a total of 12 GB 1066 MHz DDR3 memory. I used Python version 2.6.1, node version 0.4.9, and PyPy version 1.5 (running on top of Python 2.7.1 with GCC 4.0.1).</i></div><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/08/lesson-v8-can-teach-python-and-other.htmlnoreply@blogger.com (Danny Hermes)29tag:blogger.com,1999:blog-1697307561385480651.post-4255405360798955481Sat, 23 Jul 2011 02:02:00 +00002011-11-18T08:44:51.442-08:00GoogleGoogle CodesiteIn-App PaymentsWoo hooIt's official, I am a grown up!<br /><a href="http://code.google.com/apis/inapppayments/articles/index.html" target="_blank">http://code.google.com/apis/inapppayments/articles/index.html</a><br /><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/07/woo-hoo.htmlnoreply@blogger.com (Danny Hermes)0tag:blogger.com,1999:blog-1697307561385480651.post-8570118492196452081Mon, 18 Jul 2011 21:19:00 +00002011-11-18T08:48:35.324-08:00AlgebraContinued FractionsMathQuadratic IrrationalSquare RootContinued fraction expansions of irrational square rootsI had no idea (until this Thursday, July 16) that I had never seen a proof of the fact that the continued fraction expansion of \(\sqrt{D}\) is periodic whenever \(D\) is not a perfect square. But have no fear, I found out about something called a <i>reduced quadratic irrational</i> and now have a proof. Here we go.<br /><br /><b>Definition</b>: An irrational root \(\alpha\) of a quadratic equation with integer coefficients is called <i>reduced</i> if \(\alpha > 1\) and its conjugate \(\tilde{\alpha}\) satisfies \(-1 < \tilde{\alpha} < 0\). \(\Box\)<br /><br />Solutions (since assumed real) of such quadratics can be written as $$\alpha = \frac{\sqrt{D} + P}{Q}$$ where \(D, P, Q \in \mathbf{Z}\) and \(D, Q > 0\). It is also possible (though not required) to ensure that \(Q\) divides \(D - P^2\). This is actually a necessary assumption for some of the stuff I do, is mentioned <a href="http://en.wikipedia.org/wiki/Periodic_continued_fraction#Relation_to_quadratic_irrationals" target="_blank">here</a> and generally frustrated the heck out of me, so that. As an example for some enlightenment, notice $$\alpha = \frac{2 + \sqrt{7}}{4}$$ is reduced but \(4\) does not divide \(7 - 2^2\). However, if we write this as \(\frac{8 + \sqrt{112}}{16}\), we have our desired condition.<br /><br /><b>Definition</b>: We say a reduced quadratic irrational \(\alpha\) is <i>associated</i> to \(D\) if we can write $$\alpha = \frac{P + \sqrt{D}}{Q}$$ and \(Q\) divides \(D - P^2\). \(\Box\)<br /><b><br /></b><br /><b>Lemma 1</b>: Transforming a reduced irrational root \(\alpha\) associated to \(D\) into its integer part and fractional part via $$\alpha = \lfloor \alpha \rfloor + \frac{1}{\alpha'},$$ the resulting quadratic irrational \(\alpha'\) is reduced and associated to \(D\) as well. (This is what one does during continued fraction expansion, and as I did with \(\sqrt{2}\) during my <a href="http://blog.bossylobster.com/2011/07/continued-fractions-for-greater-good.html">last post</a>.)<br /><br /><b>Proof</b>: Letting $$\alpha = \frac{\sqrt{D} + P}{Q}$$ and \(X = \lfloor \alpha \rfloor\), we have $$\frac{1}{\alpha'} = \frac{\sqrt{D} - (QX - P)}{Q}.$$<br /><br /><ul><li>Since \(\sqrt{D}\) is irrational, we must have \(\frac{1}{\alpha'} > 0\) and since \(\frac{1}{\alpha'}\) is the fractional part we know $$0 < \frac{1}{\alpha'} < 1 \Rightarrow \alpha' > 1.$$ </li><li>Transforming $$\alpha' = \frac{Q}{\sqrt{D} - (QX - P)} \cdot \frac{\sqrt{D} + (QX - P)}{\sqrt{D} + (QX - P)} = \frac{\sqrt{D} + (QX - P)}{\frac{1}{Q}\left(D - (QX - P)^2\right)},$$ we have \(P' = QX - P\) and \(Q' = \frac{1}{Q}\left(D - (QX - P)^2\right)\) and need to show \(Q' \in \mathbf{Z}\). But \(D - (QX - P)^2 \equiv D - P^2 \bmod{Q}\) and since \(\alpha\) is associated to \(D\), \(Q\) must divide this quantity, hence \(Q'\) is an integer.</li><li>Since \(X = \lfloor\frac{\sqrt{D} + P}{Q}\rfloor\) is an integer and \(\alpha\) is irrational, we know \(X < \frac{\sqrt{D} + P}{Q}\) hence \(P' = QX - P < \sqrt{D}\) forcing \(\tilde{\alpha}' < 0\).</li><li>Since \(\alpha > 1\) we know \(X \geq 1 \Leftrightarrow 0 \leq X - 1\). Thus \begin{align*}\tilde{\alpha} = \frac{P - \sqrt{D}}{Q} &< 0 \leq X - 1 \\ \Rightarrow Q &< \sqrt{D} + (QX - P) \\ \Rightarrow Q(\sqrt{D} - (QX - P)) &< D - (QX - P)^2 \\ \Rightarrow -\tilde{\alpha}' = \frac{\sqrt{D} - (QX - P)}{\frac{1}{Q}\left(D - (QX - P)^2\right)} &< 1 \end{align*} hence \(\tilde{\alpha}' > -1\) and \(\alpha'\) is reduced.</li><li>Since \(Q' = \frac{1}{Q}\left(D - (P')^2\right)\), we know $$D - (P')^2 \equiv Q Q' \equiv 0 \bmod{Q'}$$ hence \(\alpha'\) is associated to \(D\).</li></ul><br /><b><span class="Apple-style-span" style="font-weight: normal;">Thus \(\alpha'\) is both reduced and associated to \(D\). \(\Box\)</span></b><br /><b><br /></b><br /><b>Lemma 2</b>: There are finitely many reduced quadratic irrationals associated to a fixed \(D\).<br /><br /><b>Proof</b>: As above write an arbitrary reduced irrational as \(\alpha = \frac{\sqrt{D} + P}{Q}\). Since \(\alpha > 1\) and \(\tilde{\alpha} > -1\), we know \(\alpha + \tilde{\alpha} = \frac{2P}{Q} > 0\) hence with the assumption \(Q > 0\) we have \(P > 0\). Since \(\tilde{\alpha} < 0\) we also have \(P < \sqrt{D}\). Also, since \(\alpha > 1\) by assumption we have \(Q < P + \sqrt{D} < 2\sqrt{D}\) thus there are finitely many choices for both \(P\) and \(Q\), forcing finitely many reduced quadratic irrationals associated to a fixed \(D\) (this amount is strictly bounded above by \(2D\)). \(\Box\)<br /><br /><b>Claim</b>: The continued fraction expansion of \(\sqrt{D}\) is periodic whenever \(D\) is not a perfect square.<br /><br /><b>Proof</b>: We'll use Lemma 1 to establish a series of reduced quadratic irrationals associated to \(D\) and then use Lemma 2 to assert this series must repeat (hence be periodic) due to the finite number of such irrationals.<br /><br />Write \(a_0 = \lfloor \sqrt{D} \rfloor\) and \(\sqrt{D} = a_0 + \frac{1}{\alpha_0}\). From here, we will prove<br /><br /><ul><li>\(\alpha_0\) is a reduced quadratic irrational associated to \(D\).</li><li>By defining \(a_{i+1} = \lfloor \alpha_i \rfloor\) and \(\alpha_i = a_{i + 1} + \frac{1}{\alpha_{i + 1}}\), \(\alpha_{i + 1}\) is also a reduced quadratic irrational associated to \(D\) (assuming all \(\alpha\) up until \(i\) are as well).</li></ul><br /><br />Since \(\frac{1}{\alpha_0}\) is the fractional part of the irrational \(\sqrt{D}\), we have $$0 < \frac{1}{\alpha_0} < 1 \Rightarrow \alpha_0 > 1.$$ By simple algebra, we have $$\alpha_0 = \frac{a_0 + \sqrt{D}}{D - a_0^2}, \qquad \tilde{\alpha_0} = \frac{a_0 - \sqrt{D}}{D - a_0^2}.$$ Since \(a_0\) is the floor, we know \(a_0 - \sqrt{D} < 0 \Rightarrow \tilde{\alpha_0} < 0\). Since \(D \in \mathbf{Z} \Rightarrow \sqrt{D} > 1\) and \(\sqrt{D} > a_0\), we have $$1 < \sqrt{D} + a_0 \Rightarrow \sqrt{D} - a_0 < D - a_0^2 \Rightarrow a_0 - \sqrt{D} > -(D - a_0^2) \Rightarrow \tilde{\alpha_0} > -1.$$ Thus \(\alpha_0\) is a reduced quadratic irrational. Since \(P_0 = a_0\) and \(Q_0 = D - a_0^2 = D - P_0^2\), \(Q_0\) clearly divides \(D - P_0^2\) so \(\alpha_0\) is associated to \(D\) as well.<br /><br />Following the recurrence defined, since each \(\alpha_i\) is a reduced quadratic irrational, each \(a_i \geq 1\). Also, by Lemma 1, each \(\alpha_{i + 1}\) is reduced and associated to \(D\) since \(\alpha_0\) is. By Lemma 2, we only have finitely many choices for these, hence there must be some smallest \(k\) for which \(\alpha_k = \alpha_0\). Since \(\alpha_{i + 1}\) is determined completely by \(\alpha_i\) we will then have \(\alpha_{k + j} = \alpha_j\) for all \(j > 0\), hence the \(\alpha_i\) are periodic. Similarly, as the \(a_i\) for \(i > 0\) are determined completely by \(\alpha_{i - 1}\), the \(a_i\) must be periodic as well, forcing the continued fraction expansion $$\sqrt{D} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ddots}}$$ to be periodic.\(\Box\)<br /><br /><b>Update:</b> I <a href="http://www.proofwiki.org/wiki/Continued_Fraction_Expansion_of_Irrational_Square_Root" target="_blank">posted this</a> on <img border="0" src="http://2.bp.blogspot.com/-W5Bi5WJPoZw/TiUQmeH06-I/AAAAAAAAA3U/RI-_kAcBlo4/s1600/175px-Logo.png" style="background-color: white;" valign="middle" /><br /><a href="https://profiles.google.com/114760865724135687241" rel="author" style="display: none;">About Bossy Lobster</a>http://blog.bossylobster.com/2011/07/continued-fraction-expansions-of.htmlnoreply@blogger.com (Danny Hermes)0