Jekyll2018-04-02T14:35:01+00:00http://harrisongoldste.in/Harrison GoldsteinHarrison Goldstein is currently pursuing a master’s degree in Computer Science at Cornell University. His main area of interest is programming languages, both theory and applications.
The (Regular) Language of Dance2018-04-02T00:00:00+00:002018-04-02T00:00:00+00:00http://harrisongoldste.in/languages/2018/04/02/language-of-dance<p>Over the last month or so, I’ve gotten really into swing dancing. It’s a new and
exciting challenge for me—I’m not the most coordinated person, and dancing is
a unique kind of social interaction. For the most part, this has taken me well
out of my comfort zone. Luckily, as is often the case with learning new things,
I’ve found ways to relate these new ideas to things I already know. In
particular, I realized that if you reeeally squint, swing dancing looks kind of
like a regular language.</p>
<h3 id="crash-course-in-swing-dancing">Crash Course in Swing Dancing</h3>
<div style="text-align: center; margin: 20px">
<img src="../../../../img/swingout.gif" />
</div>
<p>The dance that I’ve been learning is usually called
<a href="https://en.wikipedia.org/wiki/East_Coast_Swing">East Coast Swing</a>. It’s a
really fun social dance that is set to quick swing music.</p>
<p>The basic East Coast Swing step is a six-count pattern—depending on the tempo
of the music, the pattern is either “rock step, triple step, triple step” or
“rock step, step, step”. Often, dancers will also incorporate elements of <em>Lindy
Hop</em>, a very closely related dance, which uses the eight-count pattern “rock
step, triple step, step, step, triple step”. The latter is what’s happening in
the gif above.</p>
<p>Of course, there’s a lot more to dancing than footwork. In swing, partners dance
in a number of different holds and positions, and there are also tons of
different moves that make the dance fun and interesting. I highly recommend
checking out some YouTube videos of people dancing to get an idea of what I
mean.</p>
<h3 id="a-dancing-machine">A Dancing Machine</h3>
<p>When I started learning different swing moves, I realized that each move was
sort of a “transition” from one “state” to another. For example, the partners
might be dancing in closed position and the lead might use a “tuck-turn” to
transition to open position. More subtly, a left-side pass from open position
might leave the couple back in open position, but with the leader’s hand on top
of the follow’s hand (normal open position has the leader’s hands under the
follow’s).</p>
<p>As a computer scientist, when I hear “states” and “transitions” I immediately
think of
<a href="https://en.wikipedia.org/wiki/Deterministic_finite_automaton">finite automata</a>.
A finite automaton is a mathematical structure that is often written out as a
graph like the ones below. The nodes are states, and the edges represent
transitions. When a node <em>p</em> has an edge labeled <em>a</em> to another node <em>q</em>, you’re
allowed to “do <em>a</em>” to transition from <em>p</em> to <em>q</em>. Here is an automaton that I
built based on the first few moves that I learned when I started doing
swing.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup></p>
<div style="text-align: center; margin: 20px">
<img width="90%" src="../../../../img/dance.svg" />
</div>
<p>As you can see, even with just a few moves and states, the diagram gets pretty
complicated pretty quickly. Focusing in on a particular state often helps to
clear things up; for example, looking at “closed”, we see that a basic step
would keep the couple in closed position, and a tuck-turn would transition to
open position.</p>
<h3 id="keeping-count">Keeping Count</h3>
<p>As we increase the complexity of the moves that we allow, there are a number of
interesting things that we can add to the model. Here’s a slightly more
complicated automaton that incorporates some 8-count “Lindy Hop” moves.</p>
<div style="text-align: center; margin: 20px">
<img width="90%" src="../../../../img/dance2.svg" />
</div>
<p>The main thing to notice is that when I added 8-count moves, I added extra
states. Strictly speaking, dancing 8-count moves doesn’t correspond to a
different <em>physical</em> position, but it does correspond to a different <em>mental</em>
position. When leading, it is important to be able to communicate your intent.
Your partner needs to know that you want to start dancing 8-count patterns after
a while of doing 6-count, and vice versa. Since my partners and I are usually
beginners, I try to use a simple move (like a basic) to go between standard East
Coast and Lindy.</p>
<h3 id="speaking-my-language">Speaking My Language</h3>
<p>It should be clear that if we focus in on only the moves and positions, we can
represent the space of East Coast Swing routines using a finite automaton. At
any given point in the dance, all of the “valid” moves would be available as
transitions, and other ones wouldn’t. It is a well known fact in computer
science that every finite automaton corresponds to a (regular)
<em>language</em>—formally some set of sequences of symbols (strings) that satisfies
some property. In this case, the “symbols” are swing moves, and the property is
that the whole sequence makes sense as a dance. This means that we literally
have a “language” of swing dance, and it corresponds exactly to routines.</p>
<p>Taking this a step further, we can examine exactly which kinds of strings are in
the language of East Coast Swing. According to my “beginner” automaton above,
the string</p>
<blockquote>
<p>basic, basic, tuck-turn, left-side pass, right-side pass</p>
</blockquote>
<p>is a valid dance, but</p>
<blockquote>
<p>right-side pass, basic</p>
</blockquote>
<p>is not (since you can’t do a right-side pass from closed position) and neither
is</p>
<blockquote>
<p>tuck-turn, left-side pass, left-side pass</p>
</blockquote>
<p>because the lead’s hand would get all flipped around.</p>
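<p>To make the three examples above concrete, here is a minimal sketch of a “beginner” automaton in Rust. The state names and the exact transition table are assumptions reconstructed from the prose (the real diagram has more moves), and <code class="highlighter-rouge">None</code> stands in for the implicit trap state.</p>

```rust
// Hypothetical state names; the diagram in the post may label these differently.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Position {
    Closed,
    Open,
    // Open position with the lead's hand on top (after a left-side pass).
    OpenHandOnTop,
}

/// One step of the automaton. `None` plays the role of the implicit trap state.
fn step(state: Position, mv: &str) -> Option<Position> {
    use Position::*;
    match (state, mv) {
        (Closed, "basic") => Some(Closed),
        (Closed, "tuck-turn") => Some(Open),
        (Open, "left-side pass") => Some(OpenHandOnTop),
        // Assumption: a right-side pass restores the normal hand hold.
        (OpenHandOnTop, "right-side pass") => Some(Open),
        _ => None,
    }
}

/// A routine is in the language if every move is a valid transition,
/// starting from closed position. Every non-trap state is accepting.
fn is_valid_dance(moves: &[&str]) -> bool {
    moves
        .iter()
        .try_fold(Position::Closed, |state, mv| step(state, mv))
        .is_some()
}
```

<p>With this table, <code class="highlighter-rouge">is_valid_dance</code> accepts the first routine above and rejects the other two: one fails on its first move, the other on the second left-side pass.</p>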
<p>When I lead a dance, I can keep a model like this in my head. Obviously, I’m not
actually simulating the automaton in earnest, nor am I really thinking about
strings and languages, but thinking about moves this way gives me a framework
for deciding what to do next. If the “strings” that my partner and I dance are
in the language of swing, I can be sure that the dance will feel natural.</p>
<h3 id="next-steps-">Next Steps <img height="30px" src="../../../../img/steps.jpg" /></h3>
<p>I have a lot of cool ideas for ways to use this language model of dance, and
I’ll probably write another post exploring one or more of those options later.
For now, I’ll just mention a couple of my ideas.</p>
<p><strong>Non-determinism.</strong></p>
<p>The automata above are both (mostly) deterministic. This means that the current
state completely determines the effect of a move. Put more simply, it means that
there are no two arrows with the same label coming from the same source state.</p>
<p>Technically speaking, labeling both “closed” and “closed, 8-count” as start
states is a form of non-determinism, but this is resolved as soon as the dance
starts, so it isn’t very interesting. A more interesting form of non-determinism
is “<script type="math/tex">\varepsilon</script>-transitions”. An <script type="math/tex">\varepsilon</script>-transition is a transition
that happens without a symbol being read, or, in this case, without a move being
done. It turns out that these transitions don’t actually change the set of
languages that automata can represent, but they do offer a compact and intuitive
way to consider state changes that don’t depend on the input/moves.</p>
<p>If I allowed myself to use <script type="math/tex">\varepsilon</script>-transitions, I could encode things
like the music speeding up, or even something as silly as <em>getting tired</em>. In
both of these cases, we might want to restrict ourselves to simpler moves that
might be less time-consuming and labor-intensive. In the model, this would
correspond to a set of <script type="math/tex">\varepsilon</script>-transitions into a copy of the original
machine with the same states but fewer transitions.</p>
<p><strong>Matching up with music.</strong></p>
<p>Dancing is rarely done without music, and swing is no exception. A large part of
“getting good” at swing is developing musicality—a sense of how the music
should affect your choice of moves. If we want to express musicality in our
automata model, one place to start might be representing the music as an
automaton as well.</p>
<div style="text-align: center; margin: 20px">
<img width="60%" src="../../../../img/dance3.svg" />
</div>
<p>This simple automaton counts out the four beats in a measure of swing music.
Let’s say we want to enforce that an 8-count move should always start at the
beginning of a measure. This won’t always happen, since 6-count moves often
end mid-measure, but it usually feels better.</p>
<p>One way we could do this would be to simulate both automata simultaneously. Each
time we do a 6-count move, we advance the music automaton 6 steps, and each time
we do an 8-count move, we advance it 8 steps. Now, we can make sure that 8-count
moves start at the beginning of a measure by making sure that the music
automaton is in state 1 before we do an 8-count move. I haven’t quite worked out
a formal way of doing this yet, but I’m sure there’s a clean way to work the
music into the model.</p>
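<p>One informal way to sketch the simultaneous simulation is to track the music automaton’s state as a beat counter modulo 4. This is only an illustration, not the formal construction: the <code class="highlighter-rouge">Move</code> type and the function name are made up for the example, and it assumes the routine starts on beat 1.</p>

```rust
// Hypothetical move type: only the length of a move matters here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Move {
    SixCount,
    EightCount,
}

const BEATS_PER_MEASURE: u32 = 4;

/// Returns true if every 8-count move in the routine starts on beat 1.
fn eight_counts_start_on_one(routine: &[Move]) -> bool {
    // State of the music automaton, stored as 0..=3 for beats 1..=4.
    // Assumption: the routine begins at the start of a measure.
    let mut beat = 0;
    for mv in routine {
        let length = match mv {
            Move::SixCount => 6,
            Move::EightCount => {
                if beat != 0 {
                    return false; // an 8-count move would start mid-measure
                }
                8
            }
        };
        // Advance the music automaton by the length of the move.
        beat = (beat + length) % BEATS_PER_MEASURE;
    }
    true
}
```

<p>Since 6 leaves a remainder of 2 when divided by 4, a single 6-count move puts the dancers mid-measure, and a second one brings them back to beat 1. Two 6-count moves followed by an 8-count move pass the check; one 6-count move followed by an 8-count move does not.</p>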
<h3 id="conclusion">Conclusion</h3>
<p>I am fully aware that I made a ton of simplifications when talking about
dancing. There’s a lot more to swing than strings of moves, and even if you do
ignore all of the human and musical aspects, it is still way more complex than I
made it out to be. My point wasn’t to fully capture all of swing dance in one
computer science formalism—I just wanted to explore the space and see if I
could learn something.</p>
<p>Dancing is fun<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>, and computer science is fun<sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>. Putting them together is a
natural way for me to explore each in a little bit more depth, and in this case,
I think I got something pretty cool out of it.</p>
<p><br />
<br /></p>
<hr />
<p><br /></p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>If you are already familiar with automata, you’ll notice that I’m ignoring
accepting states. We could theoretically use accepting states to decide
which moves are “fun” to end a routine on and which aren’t. Here, for the
sake of simplicity, I’m going to assume that every state is accepting except
for the implicit “trap” state. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>For sufficiently musical definitions of fun. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>For sufficiently nerdy definitions of fun. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Derivatives of Regular Expressions2017-09-30T00:00:00+00:002017-09-30T00:00:00+00:00http://harrisongoldste.in/languages/2017/09/30/derivatives-of-regular-expressions<blockquote>
<p>Quick disclaimer: The ideas in this blog post are not my original work. I am
paraphrasing from lectures given by both Nate Foster and Dexter Kozen at
Cornell University, and adding some of my own intuition and insights where I
think it is helpful. My intent is to increase awareness of a cool thing that I
am excited about, not to pass any of this work off as my own.</p>
</blockquote>
<p>Regular expressions come up a lot in computer science. From a theory
perspective, they are a compact and intuitive way to understand regular
languages. In practice, they allow programmers to recognize phone numbers,
search for files, and even parse HTML.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> Up until about a month ago, I
thought I knew everything I wanted to know about regular expressions, and then I
discovered Brzozowski derivatives.</p>
<p>Before I start, let’s take a step back and define exactly what we mean by
regular expressions. Here’s a nice inductive definition:</p>
<script type="math/tex; mode=display">r ::= \varnothing \mid a \mid r_1 + r_2 \mid r_1 r_2 \mid r^* \quad\quad a
\in \Sigma</script>
<p>Note that this definition is minimal—I don’t include things like <script type="math/tex">r^+</script> or
<script type="math/tex">r?</script> because they can be written in terms of the other operators. One bit of
common notation that I <em>will</em> use is <script type="math/tex">\varepsilon</script> instead of
<script type="math/tex">\varnothing^*</script>; both are the regular expression denoting the empty string.</p>
<p>At this point, I’ll assume you have a general understanding of how to interpret
regular expressions; so if I write <script type="math/tex">a^*b^* + c</script>, you should know that it
denotes any string that is either zero or more <script type="math/tex">a</script>’s followed by zero or more
<script type="math/tex">b</script>’s, or just <script type="math/tex">c</script>.</p>
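<p>If it helps to see the grammar as a data type, here is one possible Rust encoding. The type and function names are my own choices for illustration; the grammar itself is the one given above.</p>

```rust
/// One constructor per production of the grammar.
#[derive(Clone, Debug, PartialEq)]
enum Regex {
    Empty,                       // ∅, matches nothing
    Char(char),                  // a single character a ∈ Σ
    Alt(Box<Regex>, Box<Regex>), // r1 + r2
    Seq(Box<Regex>, Box<Regex>), // r1 r2
    Star(Box<Regex>),            // r*
}

/// ε is not a primitive: it is written as ∅*.
fn epsilon() -> Regex {
    Regex::Star(Box::new(Regex::Empty))
}

/// Builds the example expression a*b* + c.
fn example() -> Regex {
    use Regex::*;
    let a_star = Star(Box::new(Char('a')));
    let b_star = Star(Box::new(Char('b')));
    Alt(
        Box::new(Seq(Box::new(a_star), Box::new(b_star))),
        Box::new(Char('c')),
    )
}
```

<p>The <code class="highlighter-rouge">Box</code> wrappers are only there because Rust needs recursive types to have a known size.</p>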
<h3 id="the-brzozowski-derivative">The Brzozowski Derivative</h3>
<p>With notation out of the way, we can start to look at what a Brzozowski
derivative is.<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup> Intuitively, it is a way of partially interpreting a regular
expression. The derivative of <script type="math/tex">r</script> with respect to a character <script type="math/tex">a</script>,
<script type="math/tex">D_a(r)</script>, is a new regular expression that matches all strings from <script type="math/tex">r</script> that
started with an <script type="math/tex">a</script>, but without the <script type="math/tex">a</script>. We “take <script type="math/tex">a</script> off the front of
<script type="math/tex">r</script>”. For example,</p>
<script type="math/tex; mode=display">D_b(foo + bar + baz) = ar + az</script>
<p>Since <script type="math/tex">foo</script> doesn’t start with a <script type="math/tex">b</script>, we dropped that part of the expression
altogether. For each of the other pieces, we just took a <script type="math/tex">b</script> off of the front.</p>
<p>Now that we understand what we’re going for, let’s actually define a way to
compute <script type="math/tex">D_a(r)</script>. We’ll do it inductively, step by step.</p>
<hr />
<script type="math/tex; mode=display">D_a(\varnothing) = \varnothing</script>
<p>This one should be pretty obvious. If you take <script type="math/tex">a</script> off of every string in
<script type="math/tex">\varnothing</script>… well there were no strings to begin with.</p>
<hr />
<script type="math/tex; mode=display">% <![CDATA[
D_a(c) = \begin{cases}
\varepsilon & a = c \\
\varnothing & a \neq c
\end{cases} %]]></script>
<p>The idea here is that if you try to take <script type="math/tex">a</script> off of the string <script type="math/tex">a</script>, you get
an empty string back, and if you try to take <script type="math/tex">a</script> off of the string <script type="math/tex">c</script>
(where <script type="math/tex">c</script> is some character that isn’t <script type="math/tex">a</script>), you just can’t do it.</p>
<hr />
<script type="math/tex; mode=display">D_a(r_1 + r_2) = D_a(r_1) + D_a(r_2)</script>
<p>If you want to take an <script type="math/tex">a</script> off the front of an alternation, you can either
take it off of the first expression, or off of the second.</p>
<hr />
<script type="math/tex; mode=display">D_a(r_1r_2) = D_a(r_1)r_2 + E(r_1)D_a(r_2)</script>
<p>Uh oh. What does <script type="math/tex">E(r)</script> mean? It’s actually totally straightforward, and I’ll
define it in detail soon. For now, just know that <script type="math/tex">E(r) = \varepsilon</script> if
<script type="math/tex">r</script> can denote the empty string, and <script type="math/tex">\varnothing</script> otherwise. With that in
mind, this statement says that taking <script type="math/tex">a</script> off of a concatenation either means
taking <script type="math/tex">a</script> off of the first expression, or <strong>if the first expression can be
empty</strong> taking <script type="math/tex">a</script> off of the second expression.</p>
<hr />
<script type="math/tex; mode=display">D_a(r^*) = D_a(r)r^*</script>
<p>Finally, we can say that taking an <script type="math/tex">a</script> off of a sequence of <script type="math/tex">r</script>’s means
taking <script type="math/tex">a</script> off of the first <script type="math/tex">r</script>, and leaving a sequence of <script type="math/tex">r</script>’s after
that. This looks a little silly, but if you play around with it for a bit, it
should make sense.<sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup></p>
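<p>As a small worked example of the last two rules, here is the derivative of <script type="math/tex">a^*b</script> with respect to <script type="math/tex">a</script>. It uses <script type="math/tex">E(a^*) = \varepsilon</script> (since <script type="math/tex">a^*</script> can denote the empty string) and simplifies with the identities <script type="math/tex">\varepsilon r = r</script>, <script type="math/tex">r \varnothing = \varnothing</script>, and <script type="math/tex">r + \varnothing = r</script>:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
D_a(a^*b) &= D_a(a^*)b + E(a^*)D_a(b) \\
&= D_a(a)a^*b + \varepsilon\varnothing \\
&= \varepsilon a^*b + \varnothing \\
&= a^*b
\end{aligned} %]]></script>
<p>This agrees with the intuition: the strings in <script type="math/tex">a^*b</script> that start with an <script type="math/tex">a</script>, with that <script type="math/tex">a</script> removed, are again exactly the strings in <script type="math/tex">a^*b</script>.</p>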
<h3 id="making-observations">Making Observations</h3>
<p>Let’s go back and define <script type="math/tex">E(r)</script>, which we’ll call the observation function.
Remember that it “observes” whether <script type="math/tex">r</script> can denote the empty string, and
returns <script type="math/tex">\varepsilon</script> or <script type="math/tex">\varnothing</script> accordingly. Here’s the definition:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
E(\varnothing) &= \varnothing \\
E(a) &= \varnothing \\
E(r_1 + r_2) &= E(r_1) + E(r_2) \\
E(r_1r_2) &= E(r_1)E(r_2) \\
E(r^*) &= \varepsilon
\end{aligned} %]]></script>
<p>The only tricky thing here is convincing yourself that the <script type="math/tex">+</script> and <script type="math/tex">\cdot</script>
cases work. These facts might help:<sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup></p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
\varnothing + r &= r \\
r + \varnothing &= r \\
\varnothing r &= \varnothing \\
r \varnothing &= \varnothing \\
\varepsilon r &= r \\
r \varepsilon &= r
\end{aligned} %]]></script>
<p>It turns out that <script type="math/tex">E</script> will be more important than just helping us define the
derivative. We can actually use the observation function to tell us about which
strings match a given expression.</p>
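<p>In code, it is convenient to let the observation function return a boolean rather than a regular expression, with <code class="highlighter-rouge">true</code> standing for <script type="math/tex">\varepsilon</script> and <code class="highlighter-rouge">false</code> for <script type="math/tex">\varnothing</script>. The sketch below makes that assumption; the names <code class="highlighter-rouge">Regex</code> and <code class="highlighter-rouge">nullable</code> are mine. Under this reading, <script type="math/tex">+</script> becomes “or” and concatenation becomes “and”.</p>

```rust
/// The regular expression grammar from earlier in the post.
enum Regex {
    Empty,                       // ∅
    Char(char),                  // a ∈ Σ
    Alt(Box<Regex>, Box<Regex>), // r1 + r2
    Seq(Box<Regex>, Box<Regex>), // r1 r2
    Star(Box<Regex>),            // r*
}

/// True exactly when E(r) is (denotationally) ε,
/// i.e. when r can denote the empty string.
fn nullable(r: &Regex) -> bool {
    match r {
        Regex::Empty => false,
        Regex::Char(_) => false,
        // ∅ + r = r, so one nullable side is enough.
        Regex::Alt(r1, r2) => nullable(r1) || nullable(r2),
        // ∅ r = ∅, so both sides must be nullable.
        Regex::Seq(r1, r2) => nullable(r1) && nullable(r2),
        Regex::Star(_) => true,
    }
}
```

<p>The comments on the <code class="highlighter-rouge">Alt</code> and <code class="highlighter-rouge">Seq</code> cases point at the identities that justify them.</p>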
<h3 id="matching-strings">Matching Strings</h3>
<p>We’re finally ready to implement a regular expression matcher. Let’s extend
our derivative function from earlier to handle entire strings:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
\textbf{D}_{\varepsilon}(r) &= r \\
\textbf{D}_{ax}(r) &= \textbf{D}_x(D_a(r))
\end{aligned} %]]></script>
<p>You can think of this as taking a derivative with respect to each character of
the string, in order, and accumulating the result. I now claim that <script type="math/tex">r</script>
matches a string <script type="math/tex">x</script> if and only if</p>
<script type="math/tex; mode=display">E(\textbf{D}_x(r)) = \varepsilon</script>
<p>So how does this work? Well, <script type="math/tex">\textbf{D}_x(r)</script> goes character-by-character in
<script type="math/tex">x</script>, taking each character off of <script type="math/tex">r</script>. This means that by the end, we will
have a regular expression that matches everything left in <script type="math/tex">r</script> after taking the
string <script type="math/tex">x</script> off the front.</p>
<p>If we take <script type="math/tex">x</script> off of the strings in <script type="math/tex">r</script> and that set contains the empty
string, then it must be the case that <script type="math/tex">x</script> was in <script type="math/tex">r</script> to start with!
Conversely, if we know that <script type="math/tex">r</script> matched <script type="math/tex">x</script> to start with, then removing
<script type="math/tex">x</script> from <script type="math/tex">x</script> would leave us with <script type="math/tex">\varepsilon</script>.</p>
<p>Practically, this means that we can use Brzozowski derivatives to write regular
expression matchers in code! I have a
<a href="https://gist.github.com/hgoldstein95/0fe2def7591b44391521d988f28abf03">Haskell implementation</a>
as a gist on GitHub that you can check out, and I am also currently writing a
verified version in Coq.</p>
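<p>For readers who would rather see the idea in Rust, here is a small self-contained sketch of the same approach. It is not a port of the gist, and all of the names are my own. It treats the observation function as a boolean and performs no simplification, so the intermediate expressions can grow with the length of the input.</p>

```rust
#[derive(Clone, Debug, PartialEq)]
enum Regex {
    Empty,                       // ∅
    Char(char),                  // a ∈ Σ
    Alt(Box<Regex>, Box<Regex>), // r1 + r2
    Seq(Box<Regex>, Box<Regex>), // r1 r2
    Star(Box<Regex>),            // r*
}

/// ε, written as ∅*.
fn eps() -> Regex {
    Regex::Star(Box::new(Regex::Empty))
}

/// The observation function E, as a boolean (true stands for ε).
fn nullable(r: &Regex) -> bool {
    match r {
        Regex::Empty | Regex::Char(_) => false,
        Regex::Alt(r1, r2) => nullable(r1) || nullable(r2),
        Regex::Seq(r1, r2) => nullable(r1) && nullable(r2),
        Regex::Star(_) => true,
    }
}

/// The derivative D_a(r), following the rules above case by case.
fn deriv(a: char, r: &Regex) -> Regex {
    use Regex::*;
    match r {
        Empty => Empty,
        Char(c) => if *c == a { eps() } else { Empty },
        Alt(r1, r2) => Alt(Box::new(deriv(a, r1)), Box::new(deriv(a, r2))),
        Seq(r1, r2) => {
            let left = Seq(Box::new(deriv(a, r1)), r2.clone());
            if nullable(r1) {
                // E(r1) = ε, so the second summand is just D_a(r2).
                Alt(Box::new(left), Box::new(deriv(a, r2)))
            } else {
                // E(r1) = ∅, so the second summand vanishes.
                left
            }
        }
        Star(inner) => Seq(Box::new(deriv(a, inner)), Box::new(r.clone())),
    }
}

/// r matches x if and only if the derivative with respect to x is nullable.
fn is_match(r: &Regex, input: &str) -> bool {
    let residual = input.chars().fold(r.clone(), |acc, c| deriv(c, &acc));
    nullable(&residual)
}
```

<p>Applied to the expression <script type="math/tex">a^*b^* + c</script> from earlier, <code class="highlighter-rouge">is_match</code> accepts the empty string, <code class="highlighter-rouge">aabbb</code>, and <code class="highlighter-rouge">c</code>, and rejects <code class="highlighter-rouge">ba</code> and <code class="highlighter-rouge">ac</code>.</p>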
<h3 id="why-im-excited">Why I’m Excited</h3>
<p>When I first learned about regular expressions formally, we were given a process
for implementing them:</p>
<ol>
<li>Transform the regular expression into an <script type="math/tex">\varepsilon</script>-NFA, using a
Thompson construction.</li>
<li>Turn that <script type="math/tex">\varepsilon</script>-NFA into a normal NFA.</li>
<li>Determinize the NFA to get a DFA.</li>
<li>Run the DFA on the input string.</li>
</ol>
<p>There are things that I love about this algorithm too. It relies on the amazing
result that regular expressions, NFAs, and DFAs are all the same, and the
Thompson construction itself is really brilliant. But there’s just something
that feels so nice and PL-ey about the derivative approach. Rather than deal
with intermediate representations and stateful algorithms, we can just define
our desired result by induction, and write pure functions that capture our
intent. The Brzozowski derivatives are also totally <em>symbolic</em>. The whole
process is just replacing symbols with other symbols, which obviates the need
for any complex reasoning.</p>
<p>Ultimately, this algorithm captures the reason that I study programming
languages. For me, doing computer science isn’t about just solving the
problem.<sup id="fnref:5"><a href="#fn:5" class="footnote">5</a></sup> It’s about seeing the structure of the problem that you are working
with, and letting that structure guide you to an answer. It’s about avoiding
complex decision procedures in favor of symbolic manipulations that simplify and
transform your goal. At the end of the day, Brzozowski derivatives are just a
different way of looking at regular expressions—but I think they’re a really
freaking cool way of looking at regular expressions, so I wrote a blog post.</p>
<p><br />
<br /></p>
<hr />
<p><br /></p>
<p>Notes:</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>For those of you who don’t get the joke, this
<a href="https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454">stack overflow answer</a>
is a must-read. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Technically, this is the Brzozowski <em>Syntactic</em> Derivative. There is also
a Semantic Derivative that deals with DFAs and their denotations. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>If you reeeeally squint at these last two definitions you might see
something familiar. The concatenation and star rules here are similar in
structure to the product and power rules for derivatives in calculus. I
doubt this is just a coincidence. If I find a satisfying reason why, I’ll
probably write another post about it. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>I’m being sort of sloppy with my notation around equality. What I really
mean is that <script type="math/tex">[\![\varnothing + r ]\!] = [\![ r ]\!]</script>, etc., so <script type="math/tex">E(r)</script>
might not actually be equal to <script type="math/tex">\varepsilon</script> or <script type="math/tex">\varnothing</script>, but it
will always be denotationally equal to one or the other. <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>To be clear, there’s nothing wrong with “just solving the problem”—in
fact, that’s usually a far more effective approach. <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Sitcom Drama and Compilers2017-05-26T00:00:00+00:002017-05-26T00:00:00+00:00http://harrisongoldste.in/languages/2017/05/26/sitcoms-and-compilers<p>Let’s say we’re watching a sitcom. Peter is our lovable, naive protagonist, and
he’s dating Mallory. Mallory isn’t good for Peter, and always tries to convince
him to do the wrong thing. We don’t like Mallory.</p>
<h2 id="scene-1">Scene 1</h2>
<p>Peter and Mallory have been dating for a few months, and Mallory asks Peter to
share a very personal <em>memory</em> with her. Peter isn’t sure, so he goes to talk to
his friends Charlie and Rob. As always, Charlie is very trusting. He says</p>
<blockquote>
<p>“Of course you can share that memory with her. You trust her, and so do I!”</p>
</blockquote>
<p>Rob, on the other hand, sees a problem. He notices that Mallory could
potentially turn this good <em>memory</em> into a bad one for Peter. She might <em>mutate</em>
it in Peter’s mind, totally ruining it! He recommends that Peter keep the memory
to himself.</p>
<blockquote>
<p>“You’re just saying that because you don’t like her!”</p>
</blockquote>
<p>Peter exclaims. But Rob is calm and holds his ground. He reiterates that sharing
memories can be dangerous, and suggests other ways for the couple to connect.</p>
<p>In the end, Peter appreciates that Rob was looking out for him, and decides to
keep Mallory’s access to his memories somewhat restricted.</p>
<h3 id="scene"><em>Scene</em></h3>
<p>In the world of programming, Charlie is played by the C compiler, and Rob is the
compiler for <a href="https://www.rust-lang.org/en-US/">Rust</a>. Rust is an up-and-coming
systems language that leverages <em>ownership</em> (a form of affine typing) to guarantee memory-safety
without a garbage collector. Whereas C is happy to let you share references to
memory whenever you want, Rust prevents a whole host of bugs by restricting how
memory can be shared. Specifically, if data is passed to a function that intends
to mutate it, the reference needs to be explicitly marked as mutable. The
following code won’t compile, since <code class="highlighter-rouge">x</code> is never marked as mutable.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">42</span><span class="p">;</span>
<span class="nf">do_mutation</span><span class="p">(</span><span class="o">&</span><span class="n">x</span><span class="p">);</span></code></pre></figure>
<p>This code works without a problem.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="k">mut</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">42</span><span class="p">;</span>
<span class="nf">do_mutation</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">x</span><span class="p">);</span></code></pre></figure>
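<p>The snippets above leave out <code class="highlighter-rouge">do_mutation</code> itself. Here is a complete version to experiment with; the function body and the <code class="highlighter-rouge">demo</code> wrapper are hypothetical, chosen only so that the mutation is observable.</p>

```rust
// Hypothetical body for `do_mutation`: it needs a mutable reference
// because it writes through it.
fn do_mutation(x: &mut i32) {
    *x += 1;
}

// Hypothetical wrapper so the effect of the call can be inspected.
fn demo() -> i32 {
    // Both the binding and the borrow are explicitly marked `mut`.
    let mut x = 42;
    do_mutation(&mut x);
    x
}
```

<p>Dropping either <code class="highlighter-rouge">mut</code> in <code class="highlighter-rouge">demo</code> turns this into a compile-time error rather than a surprise at run time.</p>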
<p>There are actually a lot of other cool features in Rust, so if you haven’t seen
it, I highly recommend checking it out.</p>
<p>In any case, I prefer to have friends like Rob and the Rust compiler at times
like this. Sure, they don’t trust my judgment as easily, but why should they? I
make mistakes all of the time, and I want a friend who watches my back.</p>
<h2 id="scene-2">Scene 2</h2>
<p>Peter still doesn’t realize that Mallory is trouble, and now they’ve been dating
for a while. Mallory asks Peter if they can move in together, but, again, Peter
isn’t sure. This time he talks to his friends Henry and Ike.</p>
<p>Henry is a hopeless romantic, and doesn’t really see how bad Mallory is for
Peter. He thinks that there’s no harm in them living together. Ike disagrees!
He says</p>
<blockquote>
<p>“Peter, what if you realize that you don’t like Mallory in a few months? Won’t
it be hard for you to break up?”</p>
</blockquote>
<p>Ike realizes that, in this case, it’s possible that the relationship could never
<em>terminate</em>. Furthermore, Ike realizes that this isn’t the right <em>type</em> of
relationship for Peter, because he understands what Peter <em>values</em>.</p>
<p>In this case too, Peter gets upset:</p>
<blockquote>
<p>“Why can’t I just see what living together is like? Everyone else says it’s
worth a try!”</p>
</blockquote>
<p>But Ike reminds Peter that while it’s great for a relationship to last forever,
that situation should never happen by accident. A month later, Peter is happy
that Ike saved him from a nearly impossible breakup.</p>
<h3 id="scene-1"><em>Scene</em></h3>
<p>Here, Henry is the Haskell compiler, and Ike is
the <a href="https://www.idris-lang.org/">Idris</a> compiler. Idris is a <em>dependently
typed</em> language, which means that its types can contain actual program values.
(Note that we could also have cast Agda or Coq as Ike, but I really like Idris
because it attempts to be more of a general purpose language.) In this case,
Haskell doesn’t mind that you’ve written a non-terminating program. Also, even
though it is statically typed with a very powerful type system, Haskell types
can’t depend on program values.</p>
<p>Idris, on the other hand, has a totality checker, which means that it can tell
if a program will terminate or not. Obviously, a completely correct totality
checker would solve the halting problem, but it turns out that writing a
<em>mostly</em> correct totality checker is possible, using the right heuristics. In
the case of Idris, the totality checker decides if a program <em>might not</em> halt,
occasionally drawing an overly conservative conclusion. If the programmer is
confident that his or her program does, in fact, halt, he or she can provide a
proof that it does. Also, if a program is supposed to loop forever (e.g. a
server or REPL), it can be marked as <code class="highlighter-rouge">partial</code>.</p>
<p>In addition to totality checking, Idris has a dependent type system that allows
you to make strong guarantees about your programs. Types can contain values, so
things like “a list of length <em>n</em>”, or even “a balanced binary tree” can be
expressed in types. Again, I prefer this kind of experience to the alternative.
Of course, there are plenty of times that I end up fighting with the Idris
compiler because it won’t let me do something, but often I realize that the
thing that I was trying to do was a mistake anyway. I’d take Ike as a friend
over Henry any day.</p>
<h2 id="in-all-seriousness">In all seriousness…</h2>
<p>As silly as all of this is, I do really believe that a compiler (like a good
friend), should do everything it can to keep you from making avoidable mistakes.
Even the most experienced programmers write bugs, and if the compiler doesn’t
catch them, the client will. Personally, I hope that languages with
super-powered type systems like Rust and Idris start to gain footing in
real-world settings. I’ll feel much more confident in other people’s programs
when I know that they’ve been consulting with the Robs and Ikes of the world.</p>Encryption and Adjunctions2017-05-26T00:00:00+00:002017-05-26T00:00:00+00:00http://harrisongoldste.in/category-theory/2017/05/26/encryption-and-adjunctions<blockquote>
<p>This post goes pretty deep into Category Theory, fairly quickly. For
explanations of some of the concepts used in this post, I highly recommend
<a href="https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/">this blog</a>.
The post on <a href="https://bartoszmilewski.com/2016/04/18/adjunctions/">adjunctions</a>
is especially relevant.</p>
</blockquote>
<p>In Category Theory, a monad is (loosely) defined as a functor <script type="math/tex">T</script> with two
natural transformations:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
\eta &: Id \to T \\
\mu &: T \circ T \to T
\end{aligned} %]]></script>
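<p>To make the two transformations concrete, here is a minimal Haskell sketch for
one particular choice of functor, <code class="highlighter-rouge">Maybe</code>. The names <code class="highlighter-rouge">eta</code> and <code class="highlighter-rouge">mu</code> are
chosen only to mirror the math; in Haskell they are usually spelled <code class="highlighter-rouge">pure</code>
(or <code class="highlighter-rouge">return</code>) and <code class="highlighter-rouge">join</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import Control.Monad (join)

-- eta : Id -> T, specialised to T = Maybe
eta :: a -> Maybe a
eta = Just

-- mu : T . T -> T, specialised to T = Maybe
mu :: Maybe (Maybe a) -> Maybe a
mu = join

main :: IO ()
main = do
  -- wrapping twice and flattening once leaves a single layer
  print (mu (eta (eta 'x')))               -- Just 'x'
  -- an inner failure survives the flattening
  print (mu (Just Nothing) :: Maybe Char)  -- Nothing</code></pre></figure>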
<p>I wondered what it would look like to replace the <script type="math/tex">\mu</script> with a slightly
different natural transformation:</p>
<script type="math/tex; mode=display">\mu' : T \circ T \to Id</script>
<p>My intuition was that this might be useful for some kind of security
application. If the functor represented some kind of encryption, then <script type="math/tex">\eta</script>
allows one party to encrypt some data, and <script type="math/tex">\mu'</script> allows the data to be used
after being properly decrypted.</p>
<p>In practice, however, this didn’t quite make sense. The biggest problem was that
encryption and decryption are inverses, but they aren’t symmetric. It seemed
that if this was going to work, I’d need two functors (call them <script type="math/tex">L</script> and
<script type="math/tex">R</script>), and a natural transformation:</p>
<script type="math/tex; mode=display">\epsilon : L \circ R \to Id</script>
<p>One party can encrypt some data using <script type="math/tex">R</script>, and the other can apply <script type="math/tex">L</script> and
use <script type="math/tex">\epsilon</script> to retrieve the data.</p>
<p>If you know some Category Theory, you might know where this is going:
adjunctions! An adjunction is a pair of functors <script type="math/tex">L</script> and <script type="math/tex">R</script> with the
following natural transformations:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
\eta &: Id \to R \circ L \\
\epsilon &: L \circ R \to Id
\end{aligned} %]]></script>
<p>We write <script type="math/tex">L \dashv R</script> to express this condition. Note that <script type="math/tex">\epsilon</script> is
exactly what we wrote before! If we want to represent encryption as a pair of
functors, it would be smart to choose two <em>adjoint</em> functors. (If you’re
curious, the <script type="math/tex">\eta</script> can actually be understood as the same <script type="math/tex">\eta</script> from the
monad definition. For any adjunction, <script type="math/tex">R \circ L</script> is a monad.)</p>
<p>Since we’d like this construct to be useful, we want candidates for <script type="math/tex">L</script> and
<script type="math/tex">R</script> that are endofunctors (functors from a category to itself) in the category
where objects are <strong>types</strong>, and morphisms are <strong>pure functions</strong>. (This
category is often called <script type="math/tex">Hask</script>, named after the Haskell programming
language.) One such adjunction is</p>
<script type="math/tex; mode=display">(X, -) \dashv (X \to -)</script>
<p>where <script type="math/tex">L</script> is the product (or <code class="highlighter-rouge">Tuple</code>) functor, and <script type="math/tex">R</script> is the exponential
(or <code class="highlighter-rouge">Reader</code>) functor. Using actual code (I’m
using <a href="https://www.idris-lang.org/">Idris</a> here), the functors are just data
types, and are written as follows:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">L</span> <span class="n">k</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">MkL</span> <span class="n">k</span> <span class="n">a</span>
<span class="kr">data</span> <span class="kt">R</span> <span class="n">k</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">MkR</span> <span class="p">(</span><span class="n">k</span> <span class="o">-></span> <span class="n">a</span><span class="p">)</span></code></pre></figure>
<p>We can convince ourselves that these functors are adjoint by implementing
<script type="math/tex">\eta</script> and <script type="math/tex">\epsilon</script>, which are polymorphic functions that are
conventionally called <code class="highlighter-rouge">unit</code> and <code class="highlighter-rouge">counit</code> respectively:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">unit</span> <span class="o">:</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">R</span> <span class="n">k</span> <span class="p">(</span><span class="kt">L</span> <span class="n">k</span> <span class="n">a</span><span class="p">)</span>
<span class="n">unit</span> <span class="n">x</span> <span class="o">=</span> <span class="kt">MkR</span> <span class="p">(</span><span class="nf">\</span><span class="n">y</span> <span class="o">=></span> <span class="kt">MkL</span> <span class="n">y</span> <span class="n">x</span><span class="p">)</span>
<span class="n">counit</span> <span class="o">:</span> <span class="kt">L</span> <span class="n">k</span> <span class="p">(</span><span class="kt">R</span> <span class="n">k</span> <span class="n">a</span><span class="p">)</span> <span class="o">-></span> <span class="n">a</span>
<span class="n">counit</span> <span class="p">(</span><span class="kt">MkL</span> <span class="n">y</span> <span class="p">(</span><span class="kt">MkR</span> <span class="n">f</span><span class="p">))</span> <span class="o">=</span> <span class="n">f</span> <span class="n">y</span></code></pre></figure>
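<p>As a sanity check on the earlier aside that <script type="math/tex">R \circ L</script> is a monad,
here is a rough Haskell transliteration of the definitions above (Haskell rather
than Idris, so <code class="highlighter-rouge">::</code> and <code class="highlighter-rouge">-></code> replace <code class="highlighter-rouge">:</code> and <code class="highlighter-rouge">=></code>). For this particular
adjunction the composite is the familiar state monad, and its <script type="math/tex">\mu</script> falls
out of <code class="highlighter-rouge">counit</code>. The helpers <code class="highlighter-rouge">mu</code>, <code class="highlighter-rouge">run</code>, and <code class="highlighter-rouge">tick</code> are hypothetical names
introduced only for this sketch.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">data L k a = MkL k a
data R k a = MkR (k -> a)

instance Functor (L k) where
  fmap f (MkL k a) = MkL k (f a)

instance Functor (R k) where
  fmap f (MkR g) = MkR (f . g)

unit :: a -> R k (L k a)
unit x = MkR (\y -> MkL y x)

counit :: L k (R k a) -> a
counit (MkL y (MkR f)) = f y

-- mu for the composite R . L: apply counit underneath the outer R
mu :: R k (L k (R k (L k a))) -> R k (L k a)
mu = fmap counit

-- run a computation in R k (L k a) from an initial state
run :: R k (L k a) -> k -> (k, a)
run (MkR f) s = case f s of MkL s' a -> (s', a)

-- hypothetical stateful step: return the counter, then increment it
tick :: R Int (L Int Int)
tick = MkR (\n -> MkL (n + 1) n)

main :: IO ()
main = do
  print (run (unit 'x') (0 :: Int))                   -- (0,'x')
  -- run tick, discard its result, then run tick again
  print (run (mu (fmap (fmap (const tick)) tick)) 5)  -- (7,6)</code></pre></figure>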
<p>But what does any of this have to do with encryption? To answer that question,
we really need to figure out better names for <code class="highlighter-rouge">L</code> and <code class="highlighter-rouge">R</code>. Let’s start with <code class="highlighter-rouge">R</code>.
The key insight here is that <code class="highlighter-rouge">R</code> sort of <em>hides</em> data behind a function call. It
takes a value of type <code class="highlighter-rouge">a</code>, and requires that we have a value of type <code class="highlighter-rouge">k</code> if we
want our value back. Let’s rename <code class="highlighter-rouge">R</code> to <code class="highlighter-rouge">Encrypted</code>, and write a function
<code class="highlighter-rouge">encrypt</code> as follows:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Encrypted</span> <span class="n">k</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">MkEncrypted</span> <span class="p">(</span><span class="n">k</span> <span class="o">-></span> <span class="n">a</span><span class="p">)</span>
<span class="n">encrypt</span> <span class="o">:</span> <span class="p">(</span><span class="n">k</span> <span class="o">:</span> <span class="kt">Type</span><span class="p">)</span> <span class="o">-></span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Encrypted</span> <span class="n">k</span> <span class="n">a</span>
<span class="n">encrypt</span> <span class="kr">_</span> <span class="n">x</span> <span class="o">=</span> <span class="kt">MkEncrypted</span> <span class="o">$</span> <span class="nf">\</span><span class="kr">_</span> <span class="o">=></span> <span class="n">x</span></code></pre></figure>
<p>This function is the reason that I opted to use a dependently typed language
like Idris over a more standard language. In order to get any use out of this
function, we need to be able to specify what type <code class="highlighter-rouge">k</code> actually is; that requires
passing a type to <code class="highlighter-rouge">encrypt</code> as if it were data.</p>
<p>Now that we have the <code class="highlighter-rouge">Encrypted</code> functor, we can make a guess at what <code class="highlighter-rouge">L</code> is
supposed to be. The name I settled on (somewhat unsurprisingly) was <code class="highlighter-rouge">Decrypter</code>;
this is because the key contained within the tuple can be used to decrypt some
encrypted value.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Decrypter</span> <span class="n">k</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">MkDecrypter</span> <span class="n">k</span> <span class="n">a</span></code></pre></figure>
<p>If we rewrite <code class="highlighter-rouge">counit</code> from before, we can finally get:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">decrypt</span> <span class="o">:</span> <span class="kt">Decrypter</span> <span class="n">k</span> <span class="p">(</span><span class="kt">Encrypted</span> <span class="n">k</span> <span class="n">a</span><span class="p">)</span> <span class="o">-></span> <span class="n">a</span>
<span class="n">decrypt</span> <span class="p">(</span><span class="kt">MkDecrypter</span> <span class="n">x</span> <span class="p">(</span><span class="kt">MkEncrypted</span> <span class="n">f</span><span class="p">))</span> <span class="o">=</span> <span class="n">f</span> <span class="n">x</span></code></pre></figure>
<p>With all of this machinery, we can:</p>
<ol>
<li>Pick a type <code class="highlighter-rouge">k</code>; in a dependently typed language, this type can be a proof of
some sort.</li>
<li>Call <code class="highlighter-rouge">encrypt k</code> on some value of type <code class="highlighter-rouge">a</code> to get an <code class="highlighter-rouge">Encrypted k a</code>.</li>
<li>Use the <code class="highlighter-rouge">MkDecrypter</code> constructor, along with a valid value of type <code class="highlighter-rouge">k</code> to
make a <code class="highlighter-rouge">Decrypter k (Encrypted k a)</code>.</li>
<li>Call <code class="highlighter-rouge">decrypt</code> to get the original value out.</li>
</ol>
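<p>The four steps above can be sketched end to end. This version is a Haskell
approximation rather than the Idris from the post: Haskell cannot take <code class="highlighter-rouge">k</code> as
an ordinary argument, so a type annotation stands in for <code class="highlighter-rouge">encrypt k</code>. The
<code class="highlighter-rouge">Password</code> type and its contents are hypothetical, made up for illustration.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">data Encrypted k a = MkEncrypted (k -> a)
data Decrypter k a = MkDecrypter k a

-- the key type k is fixed by the caller's type annotation
encrypt :: a -> Encrypted k a
encrypt x = MkEncrypted (\_ -> x)

decrypt :: Decrypter k (Encrypted k a) -> a
decrypt (MkDecrypter key (MkEncrypted f)) = f key

-- step 1: pick a key type (hypothetical)
newtype Password = Password String

main :: IO ()
main = do
  -- step 2: encrypt a value under that key type
  let secret = encrypt "hello" :: Encrypted Password String
  -- step 3: pair a key with the encrypted value
  let bundle = MkDecrypter (Password "hunter2") secret
  -- step 4: decrypt to get the original value back
  putStrLn (decrypt bundle)  -- hello</code></pre></figure>
<p>Note that any value of type <code class="highlighter-rouge">k</code> unlocks the data here, since <code class="highlighter-rouge">encrypt</code> ignores
the key itself. The guarantee comes entirely from how hard it is to construct a
value of type <code class="highlighter-rouge">k</code>.</p>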
<h2 id="conclusion">Conclusion</h2>
<p>My main purpose with this exploration was to gain a deeper understanding of how
Category Theory interacts with real-world programming problems. While the end
result is not particularly useful, it does give some interesting insight into
what a <em>proof-relevant</em> encryption system would look like.</p>
<p>In addition, I found it extremely interesting how two inverse concepts like
encryption and decryption map nicely onto adjoint functors. While it is easy to
see that adjoint functors behave like inverses conceptually, it is exciting to see how
they model those behaviors in practice.</p>This post goes pretty deep into Category Theory, fairly quickly. For explanations of some of the concepts used in this post, I highly recommend this blog. The post on adjunctions is especially relevant.Elm, First Impressions2017-05-25T00:00:00+00:002017-05-25T00:00:00+00:00http://harrisongoldste.in/languages/2017/05/25/elm-first-impressions<p>I like exploring new programming languages and paradigms in my spare time. Here
are some of my thoughts on Elm.</p>
<p>Elm is a purely functional, strongly typed language for web development. It’s a
very opinionated language, with a very powerful run-time that is designed to make
writing web applications easy. There are some things that I really like about
Elm, and some things that I find frustrating. Your mileage may vary.</p>
<h2 id="pros">Pros</h2>
<h3 id="the-elm-architecture">The Elm Architecture</h3>
<p>All Elm applications are written with the same general design pattern. The
general structure is similar to things like Flux and Redux (the latter was actually
inspired by Elm):</p>
<ul>
<li><code class="highlighter-rouge">model</code>: A single object, encapsulating the entire state of the application.</li>
<li><code class="highlighter-rouge">update</code>: A pure function that takes a message and a model and produces a new
model.</li>
<li><code class="highlighter-rouge">view</code>: A pure function that takes a model and produces instructions on how to
render the application.</li>
</ul>
<p>This pattern is called “The Elm Architecture”, and the run-time supports it
directly. Once you specify these three components, the run-time sets up a model
and renders a view. Then, it listens for messages from the view, passes each
one to the <code class="highlighter-rouge">update</code> function, changes the model accordingly, and re-renders only
the parts of the view that changed.</p>
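<p>The shape of this loop can be sketched in plain Haskell (not Elm, and not using
any Elm API). The counter application and the <code class="highlighter-rouge">runApp</code> driver below are
hypothetical stand-ins: the real run-time listens for events and re-renders only
what changed, whereas this sketch simply renders the whole model after every
message.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">-- model: the entire state of a (hypothetical) counter application
newtype Model = Model Int

-- messages the view can send
data Msg = Increment | Decrement

-- update: a pure function from a message and a model to a new model
update :: Msg -> Model -> Model
update Increment (Model n) = Model (n + 1)
update Decrement (Model n) = Model (n - 1)

-- view: a pure function from a model to rendering instructions
view :: Model -> String
view (Model n) = "count: " ++ show n

-- stand-in for the run-time: thread each message through update,
-- rendering the initial model and every model after it
runApp :: Model -> [Msg] -> [String]
runApp initial msgs = map view (scanl (flip update) initial msgs)

main :: IO ()
main = mapM_ putStrLn (runApp (Model 0) [Increment, Increment, Decrement])
-- count: 0
-- count: 1
-- count: 2
-- count: 1</code></pre></figure>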
<p>I really like this approach because it manages abstraction in a really
intelligent way. On one hand, I have access to (and am expected to deal with)
all of the application-specific parts of my project. As a programmer, I need to
specify the application state, how that state changes, and what that state
“looks like”. On the other hand, machinery that is especially general (the
wiring) is taken out of the programmer’s control completely. (There isn’t a lot
of configuration in Elm; in general, if the run-time wants to handle something,
you’re expected to let it.)</p>
<p>A nice side effect of this is that Elm is actually really fast. In some sense,
the Architecture encompasses all of the slowest parts of the application, which
leaves the run-time free to heavily optimize those pieces.</p>
<h3 id="static-typing">Static Typing</h3>
<p>The other <strong>major</strong> benefit of Elm is that it is statically typed. This means
that the compiler (and not the Chrome developer console) catches your mistakes.
I could go on for a long time about the benefits of a good type system, but I’ll
leave that for another blog post.</p>
<h2 id="cons">Cons</h2>
<h3 id="no-type-classes">No Type Classes</h3>
<p>Since Elm looks so much like Haskell, I often expect it to behave like Haskell.
While it does most of the time, sometimes it falls short. One large place this
happens is with type classes; since Elm does not support type classes it
misses out on some of the really nice features that come along with them.</p>
<p>For example, rather than use <code class="highlighter-rouge">do</code> notation to deal with monads, we need to
explicitly bind arguments into monadic functions (in Elm, most types define a
function called <code class="highlighter-rouge">andThen</code> for this purpose). Keep in mind that this problem
is related to type classes because Haskell’s <code class="highlighter-rouge">do</code> is tied to the <code class="highlighter-rouge">Monad</code> type
class; anything that implements <code class="highlighter-rouge">Monad</code> supports <code class="highlighter-rouge">do</code> notation.</p>
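<p>For comparison, here is the same computation written twice in Haskell: once with
<code class="highlighter-rouge">do</code> notation, and once with explicit binds, which is roughly the style that
chaining <code class="highlighter-rouge">andThen</code> calls forces in Elm. The functions <code class="highlighter-rouge">safeDiv</code>, <code class="highlighter-rouge">withDo</code>, and
<code class="highlighter-rouge">withBind</code> are hypothetical examples.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import Text.Read (readMaybe)

-- division that fails instead of crashing on a zero divisor
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- with do notation: each step reads like a plain assignment
withDo :: String -> String -> Maybe Int
withDo a b = do
  x &lt;- readMaybe a
  y &lt;- readMaybe b
  safeDiv x y

-- without it: every intermediate value is bound by hand
withBind :: String -> String -> Maybe Int
withBind a b =
  readMaybe a >>= \x ->
  readMaybe b >>= \y ->
  safeDiv x y

main :: IO ()
main = do
  print (withDo "10" "2")    -- Just 5
  print (withBind "10" "0")  -- Nothing</code></pre></figure>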
<p>Things like <code class="highlighter-rouge">do</code> notation would be nice to have, but in the end, their absence isn’t such
a big deal. One thing that is a big deal is how Elm deals with comparisons. In
Haskell, we have <code class="highlighter-rouge">Ord a</code>, which allows a user to define comparisons for their own
types. Elm uses something called <code class="highlighter-rouge">comparable</code>, which does the same job as <code class="highlighter-rouge">Ord</code>
without being a proper type class. Basically, a function</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">f</span> <span class="o">:</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Int</span></code></pre></figure>
<p>can take any argument at all, but a function</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">g</span> <span class="o">:</span> <span class="n">comparable</span> <span class="o">-></span> <span class="kt">Int</span></code></pre></figure>
<p>can only take an argument that permits comparisons. Unfortunately, the only
types that are <code class="highlighter-rouge">comparable</code> are <code class="highlighter-rouge">Int</code>, <code class="highlighter-rouge">Float</code>, <code class="highlighter-rouge">Time</code>, <code class="highlighter-rouge">Char</code>, and <code class="highlighter-rouge">String</code>
(plus lists and tuples built from those). That’s it. There’s no way to make a user-defined type comparable, since
<code class="highlighter-rouge">comparable</code> is just a built-in language construct and not a formal type class.
This is especially frustrating since the built in type <code class="highlighter-rouge">Dict</code> (a dictionary
based on a balanced binary tree) has the following interface:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">get</span> <span class="o">:</span> <span class="n">comparable</span> <span class="o">-></span> <span class="kt">Dict</span> <span class="n">comparable</span> <span class="n">v</span> <span class="o">-></span> <span class="kt">Maybe</span> <span class="n">v</span></code></pre></figure>
<p>The result is that no user defined types can ever be the key of a dictionary,
even if there is a perfectly reasonable way to compare them.</p>
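<p>For contrast, this is what the Haskell side looks like when a user-defined type
gets an <code class="highlighter-rouge">Ord</code> instance. The <code class="highlighter-rouge">Suit</code> type and the <code class="highlighter-rouge">scores</code> map are hypothetical,
and the sketch assumes the <code class="highlighter-rouge">containers</code> package that ships with GHC.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import qualified Data.Map as Map

-- deriving Ord is all it takes to make Suit usable as a key
data Suit = Clubs | Diamonds | Hearts | Spades
  deriving (Eq, Ord, Show)

scores :: Map.Map Suit Int
scores = Map.fromList [(Hearts, 3), (Spades, 5)]

main :: IO ()
main = do
  print (Map.lookup Hearts scores)  -- Just 3
  print (Map.lookup Clubs scores)   -- Nothing</code></pre></figure>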
<h2 id="conclusion">Conclusion</h2>
<p>Overall, I really like Elm. It’s been fun to work with, and it’s definitely
mature enough to be usable for some projects. It has some drawbacks, and I’d
hesitate to put it into production just yet, but it’s certainly heading in the
right direction.</p>I like exploring new programming languages and paradigms in my spare time. Here are some of my thoughts on Elm.Why not try Spacemacs?2017-05-24T00:00:00+00:002017-05-24T00:00:00+00:00http://harrisongoldste.in/misc/2017/05/24/why-not-try-spacemacs<blockquote>
<p><strong>Disclaimer</strong>: I hope this post does not offend anyone. I do mention both
<strong>vim</strong> and <strong>emacs</strong>, so reader discretion is advised.</p>
</blockquote>
<p>When I first started programming, my teacher forced us to write our code in
Windows notepad. Not even <em>notepad++</em>. Notepad.</p>
<p>Looking back, I suppose I understand his rationale; if we learned to write code
with all of the assistance that an IDE provides, we wouldn’t understand what is
actually going on. (Clicking “Run” is very different from typing <code class="highlighter-rouge">javac</code> at the
command line, and I’m glad that I started off with that deeper understanding.)
By the time that I was headed to college, I understood that I could write code
even without the bells and whistles of IDEs and code editors.</p>
<p>But that didn’t mean I could write <em>good</em> code.</p>
<p>Since those days, I have learned to rely on and appreciate the tools that are
available to help me do my work. Writing code is constant multitasking: a
programmer is simultaneously solving high level problems and dealing with the
nuance of implementing the solutions. Any experienced programmer will tell you
that it takes a lot of focus to write good code, so I am glad that there is such
a rich ecosystem of tools to make the process easier. I have used a number of
code editors and IDEs, from Eclipse to Sublime Text to, most recently, Vim. For
the last year or so, I learned to love Vim and its amazingly efficient editing
style. It’s how I write most of my code, as well as most things that aren’t
code. It’s completely changed the way that I think about programming.</p>
<p>Lately, though, I have gotten a little bit frustrated with Vim. While I
appreciate its philosophy of configurability, it has become increasingly
frustrating to actually configure all of the behaviors that I want. I’ve also
had more and more need for advanced code editing features like debugging and
linting, which are difficult or impossible to actually get set up in Vim.</p>
<p>Enter Emacs.</p>
<p>A few weeks ago, some lucky YouTube video recommendations led me to watch a few
talks on Emacs. From what I could see, it had all of the powerful features that
I was looking for, along with the same level of “hackability” that I had gotten
used to in Vim. Even more importantly, Emacs has <code class="highlighter-rouge">evil-mode</code>. Evil stands for
<strong>E</strong>xtensible <strong>vi L</strong>ayer, and is basically a full implementation of Vim in
Emacs Lisp. (Emacs uses a dialect of Lisp to configure behavior instead of a
custom language like vimscript.) That meant that I could have my cake and eat
it too: the editing style of Vim with the power of Emacs.</p>
<p>There was just one problem left to tackle: Emacs pinky. I’ll be honest, I don’t
have the largest hands in the world, and the thought of reaching for control or
alt any time I wanted to do something was not particularly appealing.</p>
<p>Enter spacemacs.</p>
<p><a href="http://spacemacs.org">Spacemacs</a> is a custom Emacs distribution that is built
on <code class="highlighter-rouge">evil-mode</code> and configured with Vim users in mind. Basically, Spacemacs
changes almost all of the Emacs key-bindings to the spacebar (the “space”, in
spacemacs) followed by a short string of characters. For example, <code class="highlighter-rouge">SPC f s</code>
saves the current buffer, and <code class="highlighter-rouge">SPC g s</code> displays an interactive window with Git
status information. In keeping with Vim’s philosophy, these bindings are all
mnemonic; “f s” corresponds to “file, save”, “g s” for “git, status”, etc. It
was shocking how quick it was to get used to, and before I knew it I had written
a couple thousand lines of code (and this blog post) in spacemacs.</p>
<p>So, if you happen to be like me (comfortable with Vim, but looking for a more
powerful code editor), give spacemacs a try. I can say for sure that it is the
first editor in a while that I’ve felt actually has what I need.</p>Disclaimer: I hope this post does not offend anyone. I do mention both vim and emacs, so reader discretion is advised.