tag:blogger.com,1999:blog-50125652552251085172024-02-19T07:48:02.888-05:00Shayne Fletcher"Hooked" on programmingShayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comBlogger106125tag:blogger.com,1999:blog-5012565255225108517.post-46259734495766378332023-01-14T13:27:00.000-05:002023-01-14T13:27:17.230-05:00Cabal package macros (MIN_VERSION_xyz)<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>-</title>
<style>
html {
line-height: 1.5;
font-family: Georgia, serif;
font-size: 20px;
color: #1a1a1a;
background-color: #fdfdfd;
}
body {
margin: 0 auto;
max-width: 36em;
padding-left: 50px;
padding-right: 50px;
padding-top: 50px;
padding-bottom: 50px;
hyphens: auto;
overflow-wrap: break-word;
text-rendering: optimizeLegibility;
font-kerning: normal;
}
@media (max-width: 600px) {
body {
font-size: 0.9em;
padding: 1em;
}
h1 {
font-size: 1.8em;
}
}
@media print {
body {
background-color: transparent;
color: black;
font-size: 12pt;
}
p, h2, h3 {
orphans: 3;
widows: 3;
}
h2, h3, h4 {
page-break-after: avoid;
}
}
p {
margin: 1em 0;
}
a {
color: #1a1a1a;
}
a:visited {
color: #1a1a1a;
}
img {
max-width: 100%;
}
h1, h2, h3, h4, h5, h6 {
margin-top: 1.4em;
}
h5, h6 {
font-size: 1em;
font-style: italic;
}
h6 {
font-weight: normal;
}
ol, ul {
padding-left: 1.7em;
margin-top: 1em;
}
li > ol, li > ul {
margin-top: 0;
}
blockquote {
margin: 1em 0 1em 1.7em;
padding-left: 1em;
border-left: 2px solid #e6e6e6;
color: #606060;
}
code {
font-family: Menlo, Monaco, 'Lucida Console', Consolas, monospace;
font-size: 85%;
margin: 0;
}
pre {
margin: 1em 0;
overflow: auto;
}
pre code {
padding: 0;
overflow: visible;
overflow-wrap: normal;
}
.sourceCode {
background-color: transparent;
overflow: visible;
}
hr {
background-color: #1a1a1a;
border: none;
height: 1px;
margin: 1em 0;
}
table {
margin: 1em 0;
border-collapse: collapse;
width: 100%;
overflow-x: auto;
display: block;
font-variant-numeric: lining-nums tabular-nums;
}
table caption {
margin-bottom: 0.75em;
}
tbody {
margin-top: 0.5em;
border-top: 1px solid #1a1a1a;
border-bottom: 1px solid #1a1a1a;
}
th {
border-top: 1px solid #1a1a1a;
padding: 0.25em 0.5em 0.25em 0.5em;
}
td {
padding: 0.125em 0.5em 0.25em 0.5em;
}
header {
margin-bottom: 4em;
text-align: center;
}
#TOC li {
list-style: none;
}
#TOC ul {
padding-left: 1.3em;
}
#TOC > ul {
padding-left: 0;
}
#TOC a:not(:hover) {
text-decoration: none;
}
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { color: #008000; } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { color: #008000; font-weight: bold; } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
</style>
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<h1 id="cabal-package-macros-min_version_xyz">Cabal package macros
(<code>MIN_VERSION_xyz</code>)</h1>
<p><code>cabal build ...</code> generates <code>cabal_macros.h</code>
containing e.g. for v 3.5.0 a definition like,</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co">/* package hlint-3.5 */</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="pp">#ifndef VERSION_hlint</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="pp">#define VERSION_hlint "3.5"</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="pp">#endif </span><span class="co">/* VERSION_hlint */</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="pp">#ifndef MIN_VERSION_hlint</span></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="pp">#define MIN_VERSION_hlint(x,y,z) (\</span></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="pp"> (x) < 3 || \</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a><span class="pp"> (x) == 3 && (y) < 5 || \</span></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="pp"> (x) == 3 && (y) == 5 && (z) <= 0)</span></span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="pp">#endif </span><span class="co">/* MIN_VERSION_hlint */</span></span></code></pre></div>
<p>This macro is a compile time predicate. Use to test the
<code>hlint</code> configured package version is at least
<code>x.y.z</code>.</p>
<p>We might for example test if the configured package version is at
least 3.6.0 by</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>MIN_VERSION_hlint<span class="op">(</span><span class="fl">3.6</span><span class="er">.0</span><span class="op">)</span></span></code></pre></div>
<p>By substitution into the macro body we arrive at</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="dv">3</span> <span class="op"><</span> <span class="dv">3</span> <span class="op">||</span> <span class="dv">3</span> <span class="op">==</span> <span class="dv">3</span> <span class="op">&&</span> <span class="dv">6</span> <span class="op"><</span> <span class="dv">5</span> <span class="op">||</span> <span class="dv">3</span> <span class="op">==</span> <span class="dv">3</span> and <span class="dv">6</span> <span class="op">==</span> <span class="dv">5</span> and <span class="dv">0</span> <span class="op"><=</span> <span class="dv">0</span></span></code></pre></div>
<p>to conclude, no it is not.</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-12448052015891743612022-08-07T17:16:00.003-04:002022-08-13T14:44:11.302-04:00Testing a new stack resolver<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>-</title>
<style>
html {
line-height: 1.5;
font-family: Georgia, serif;
font-size: 20px;
color: #1a1a1a;
background-color: #fdfdfd;
}
body {
margin: 0 auto;
max-width: 36em;
padding-left: 50px;
padding-right: 50px;
padding-top: 50px;
padding-bottom: 50px;
hyphens: auto;
overflow-wrap: break-word;
text-rendering: optimizeLegibility;
font-kerning: normal;
}
@media (max-width: 600px) {
body {
font-size: 0.9em;
padding: 1em;
}
}
@media print {
body {
background-color: transparent;
color: black;
font-size: 12pt;
}
p, h2, h3 {
orphans: 3;
widows: 3;
}
h2, h3, h4 {
page-break-after: avoid;
}
}
p {
margin: 1em 0;
}
a {
color: #1a1a1a;
}
a:visited {
color: #1a1a1a;
}
img {
max-width: 100%;
}
h1, h2, h3, h4, h5, h6 {
margin-top: 1.4em;
}
h5, h6 {
font-size: 1em;
font-style: italic;
}
h6 {
font-weight: normal;
}
ol, ul {
padding-left: 1.7em;
margin-top: 1em;
}
li > ol, li > ul {
margin-top: 0;
}
blockquote {
margin: 1em 0 1em 1.7em;
padding-left: 1em;
border-left: 2px solid #e6e6e6;
color: #606060;
}
code {
font-family: Menlo, Monaco, 'Lucida Console', Consolas, monospace;
font-size: 85%;
margin: 0;
}
pre {
margin: 1em 0;
overflow: auto;
}
pre code {
padding: 0;
overflow: visible;
overflow-wrap: normal;
}
.sourceCode {
background-color: transparent;
overflow: visible;
}
hr {
background-color: #1a1a1a;
border: none;
height: 1px;
margin: 1em 0;
}
table {
margin: 1em 0;
border-collapse: collapse;
width: 100%;
overflow-x: auto;
display: block;
font-variant-numeric: lining-nums tabular-nums;
}
table caption {
margin-bottom: 0.75em;
}
tbody {
margin-top: 0.5em;
border-top: 1px solid #1a1a1a;
border-bottom: 1px solid #1a1a1a;
}
th {
border-top: 1px solid #1a1a1a;
padding: 0.25em 0.5em 0.25em 0.5em;
}
td {
padding: 0.125em 0.5em 0.25em 0.5em;
}
header {
margin-bottom: 4em;
text-align: center;
}
#TOC li {
list-style: none;
}
#TOC a:not(:hover) {
text-decoration: none;
}
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
</style>
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<h1 id="testing-a-new-stack-resolver">Testing a new stack resolver</h1>
<p>When there’s been a new GHC release, it can take a little while for there to be a stack resolver for it. The following procedure can be used for some local testing if you want to get ahead of the game<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>.</p>
<h2 id="where-stack-looks-for-resolvers">Where stack looks for resolvers</h2>
<p>If you execute the command,</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a> <span class="va">$stack</span> setup <span class="at">--resolver</span> ghc-x.y.z</span></code></pre></div>
<p>you’ll see something like</p>
<pre><code> No setup information found for ghc-x.y.z on your platform.</code></pre>
<p>if there isn’t yet a resolver for ghc-x.y.z.</p>
<p>The default set of resolvers <code>stack</code> “knows about” are those enumerated in the file <a href="https://github.com/commercialhaskell/stackage-content/blob/master/stack/stack-setup-2.yaml">stack-setup-2.yaml</a> in the <a href="https://github.com/commercialhaskell/stackage-content">commericalhaskell/stackage-content</a> repository.</p>
<h2 id="create-a-local-stack-setup-file">Create a local stack setup file</h2>
<ul>
<li>Start by downloading the release binary package tarball of interest from <a href="https://www.haskell.org/ghc/">www.haskell.org</a> and note its
<ul>
<li>size (in bytes):</li>
</ul>
<div class="sourceCode" id="cb3"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a> <span class="fu">stat</span> <span class="at">-f%z</span> ghc-x.y.z-x86_64-apple-darwin.tar.xz</span></code></pre></div>
<ul>
<li><code>SHA1</code> hash:</li>
</ul>
<div class="sourceCode" id="cb4"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">shasum</span> <span class="at">-a</span> 1 ghc-x.y.z-x86_64-apple-darwin.tar.xz</span></code></pre></div>
<ul>
<li><code>SHA256</code> hash:</li>
</ul>
<div class="sourceCode" id="cb5"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">shasum</span> <span class="at">-a</span> 256 ghc-x.y.z-x86_64-apple-darwin.tar.xz</span></code></pre></div></li>
<li>Now, create <code>stack-setup-2-ghc-x.y.z.yaml</code> with contents along the lines of</li>
</ul>
<div class="sourceCode" id="cb6"><pre class="sourceCode yaml"><code class="sourceCode yaml"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">ghc</span><span class="kw">:</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">macosx</span><span class="kw">:</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">x.y.z</span><span class="kw">:</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">url</span><span class="kw">:</span><span class="at"> http://downloads.haskell.org/~ghc/x.y.z/ghc-x.y.z-x86_64-apple-darwin.tar.xz</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">content-length</span><span class="kw">:</span><span class="at"> </span><span class="dv">177152992</span></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">sha1</span><span class="kw">:</span><span class="at"> 2dbd726860ed2c0ea04c7aca29c22df20b952ee1</span></span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">sha256</span><span class="kw">:</span><span class="at"> f2e8366fd3754dd9388510792aba2d2abecb1c2f7f1e5555f6065c3c5e2ffec4</span></span></code></pre></div>
<p><em>Update: Here’s an even easier way.</em></p>
<div class="sourceCode" id="cb7"><pre class="sourceCode yaml"><code class="sourceCode yaml"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">ghc</span><span class="kw">:</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">macosx</span><span class="kw">:</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">x.y.z</span><span class="kw">:</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">url</span><span class="kw">:</span><span class="at"> /path/to/ghc-x.y.z-x86_64-apple-darwin.tar.xz</span></span></code></pre></div>
<h2 id="install-ghc-version-x.y.z-via-the-setup-file">Install GHC version <code>x.y.z</code> via the setup file</h2>
<p>The following <code>stack setup</code> command will use the setup file created above to download and install ghc-x.y.z:</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">stack</span> <span class="at">--setup-info-yaml</span> stack-setup-2-ghc-x.y.z.yaml <span class="at">--resolver</span> ghc-x.y.z setup</span></code></pre></div>
<p>Once <code>stack</code> has got GHC installed, there’s no further any need to pass a <code>setup-info-yaml</code> argument to subsequent <code>stack</code> commands, it’s ready to go!</p>
<section class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p><em>Kindly explained to me by Mike Pilgrem in this <a href="https://github.com/commercialhaskell/stack/issues/5761">ticket</a>. This note is biased towards my needs on macOS. See the linked issue for further details especially for Windows installations.</em><a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-33307607543955091042021-05-19T00:41:00.011-04:002021-06-01T22:43:07.804-04:00Annotations in GHC<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>annotations</title>
<style>
html {
line-height: 1.5;
font-family: Georgia, serif;
font-size: 20px;
color: #1a1a1a;
background-color: #fdfdfd;
}
body {
margin: 0 auto;
max-width: 36em;
padding-left: 50px;
padding-right: 50px;
padding-top: 50px;
padding-bottom: 50px;
hyphens: auto;
word-wrap: break-word;
text-rendering: optimizeLegibility;
font-kerning: normal;
}
@media (max-width: 600px) {
body {
font-size: 0.9em;
padding: 1em;
}
}
@media print {
body {
background-color: transparent;
color: black;
font-size: 12pt;
}
p, h2, h3 {
orphans: 3;
widows: 3;
}
h2, h3, h4 {
page-break-after: avoid;
}
}
p {
margin: 1em 0;
}
a {
color: #1a1a1a;
}
a:visited {
color: #1a1a1a;
}
img {
max-width: 100%;
}
h1, h2, h3, h4, h5, h6 {
margin-top: 1.4em;
}
h5, h6 {
font-size: 1em;
font-style: italic;
}
h6 {
font-weight: normal;
}
ol, ul {
padding-left: 1.7em;
margin-top: 1em;
}
li > ol, li > ul {
margin-top: 0;
}
blockquote {
margin: 1em 0 1em 1.7em;
padding-left: 1em;
border-left: 2px solid #e6e6e6;
color: #606060;
}
code {
font-family: Menlo, Monaco, 'Lucida Console', Consolas, monospace;
font-size: 85%;
margin: 0;
}
pre {
margin: 1em 0;
overflow: auto;
}
pre code {
padding: 0;
overflow: visible;
}
.sourceCode {
background-color: transparent;
overflow: visible;
}
hr {
background-color: #1a1a1a;
border: none;
height: 1px;
margin: 1em 0;
}
table {
margin: 1em 0;
border-collapse: collapse;
width: 100%;
overflow-x: auto;
display: block;
font-variant-numeric: lining-nums tabular-nums;
}
table caption {
margin-bottom: 0.75em;
}
tbody {
margin-top: 0.5em;
border-top: 1px solid #1a1a1a;
border-bottom: 1px solid #1a1a1a;
}
th {
border-top: 1px solid #1a1a1a;
padding: 0.25em 0.5em 0.25em 0.5em;
}
td {
padding: 0.125em 0.5em 0.25em 0.5em;
}
header {
margin-bottom: 4em;
text-align: center;
}
#TOC li {
list-style: none;
}
#TOC a:not(:hover) {
text-decoration: none;
}
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; } /* Alert */
code span.an { color: #008000; } /* Annotation */
code span.at { } /* Attribute */
code span.bu { } /* BuiltIn */
code span.cf { color: #0000ff; } /* ControlFlow */
code span.ch { color: #008080; } /* Char */
code span.cn { } /* Constant */
code span.co { color: #008000; } /* Comment */
code span.cv { color: #008000; } /* CommentVar */
code span.do { color: #008000; } /* Documentation */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.im { } /* Import */
code span.in { color: #008000; } /* Information */
code span.kw { color: #0000ff; } /* Keyword */
code span.op { } /* Operator */
code span.ot { color: #ff4000; } /* Other */
code span.pp { color: #ff4000; } /* Preprocessor */
code span.sc { color: #008080; } /* SpecialChar */
code span.ss { color: #008080; } /* SpecialString */
code span.st { color: #008080; } /* String */
code span.va { } /* Variable */
code span.vs { color: #008080; } /* VerbatimString */
code span.wa { color: #008000; font-weight: bold; } /* Warning */
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
</style>
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<h1 id="annotations-in-ghc">Annotations in GHC</h1>
<p>Starting with ghc-9.2.1, parse trees contain “annotations” (these are, for example, comments and the locations of keywords). This represents a non-trivial upgrade of GHC parse trees. If you work with GHC ASTs in your project, there will be no avoiding getting to know about them. This note is a summary overview of annotations: the where and how of their representations.</p>
<p>In-tree annotations enable exact-printing of GHC ASTs. This feature and the reformulation of the GHC AST with in-tree annotations to support it was conceived of and implemented by Alan Zimmerman (<span class="citation" data-cites="alan_zimm">@alan_zimm</span>). The achievement is of truly epic scale.</p>
<h2 id="annotations-on-syntactic-elements">Annotations on syntactic elements</h2>
<p>An <code>EpaLocation</code> is a span giving the exact location of a keyword in parsed source.</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">EpaLocation</span> <span class="ot">=</span> <span class="dt">EpaSpan</span> <span class="dt">RealSrcSpan</span> <span class="op">|</span> <span class="dt">EpaDelta</span> <span class="dt">DeltaPos</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">DeltaPos</span> <span class="ot">=</span> <span class="op">...</span></span></code></pre></div>
<p>The parser only inserts <code>EpaSpans</code>.</p>
<p>A <code>DotFieldOcc</code> arises in expressions like <code>(.e)</code> (field-selector) or <code>a.e</code> (field-selection) when <code>OverloadedRecordDot</code> is enabled. A <code>DotFieldOcc</code> value in the parse phase is associated with an <code>AnnFieldLabel</code> in its extension field (annotations in ghc-9.2.1 lean heavily on the facilities afforded by <a href="https://arxiv.org/abs/1610.04799">TTG</a>). The <code>AnnFieldLabel</code> contains the location of the ‘<code>.</code>’. <code>AnnFieldLabel</code> is an “annotation type”. You’ll recognize annotation types (there are many) by the convention that their names are prefixed <code>Ann</code>.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="co">-- GHC.Hs.Expr</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">AnnFieldLabel</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> <span class="ot">=</span> <span class="dt">AnnFieldLabel</span> {</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a><span class="ot"> afDot ::</span> <span class="dt">Maybe</span> <span class="dt">EpaLocation</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> }</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">XCDotFieldOcc</span> (<span class="dt">GhcPass</span> _) <span class="ot">=</span> <span class="dt">EpAnn</span> <span class="dt">AnnFieldLabel</span></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a><span class="co">-- Language.Haskell.Syntax.Expr</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">DotFieldOcc</span> p</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> <span class="ot">=</span> <span class="dt">DotFieldOcc</span></span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> {<span class="ot"> dfoExt ::</span> <span class="dt">XCDotFieldOcc</span> p</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a> ,<span class="ot"> dfoLabel ::</span> <span class="dt">XRec</span> p <span class="dt">FieldLabelString</span></span>
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a> }</span>
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a> <span class="op">|</span> <span class="dt">XDotFieldOcc</span> <span class="op">!</span>(<span class="dt">XXDotFieldOcc</span> p)</span></code></pre></div>
<p>(<em>What <code>XRec p FieldLabelString</code> means will be explained in the next section.</em>)</p>
<p>Note that the extension field <code>dfoExt</code> doesn’t contain a “raw” <code>AnnFieldLabel</code>, rather, it contains an <code>EpAnn AnnFieldLabel</code>.</p>
<p><code>EPAnn</code>, envelopes an annotation. It associates a base location for the start of the syntactic element containing the annotation along with any comments enclosed in the source span of the element to which the <code>EPAnn</code> is attached. <code>EpAnnUnsed</code> is used when an annotation is required but there’s no annotation available to envelope (e.g one obvious case being in generated code).</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">EpAnn</span> ann</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="ot">=</span> <span class="dt">EpAnn</span> {<span class="ot"> entry ::</span> <span class="dt">Anchor</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> ,<span class="ot"> anns ::</span> ann</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> ,<span class="ot"> comments ::</span> <span class="dt">EpAnnComments</span> }</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> <span class="op">|</span> <span class="dt">EpAnnNotUsed</span></span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">EpAnnComments</span> <span class="ot">=</span> <span class="op">...</span></span></code></pre></div>
<p>It’s the <code>Anchor</code> type where the base location is held.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">Anchor</span> <span class="ot">=</span> <span class="dt">Anchor</span> {<span class="ot"> anchor ::</span> <span class="dt">RealSrcSpan</span>,<span class="ot"> anchor_op ::</span> <span class="dt">AnchorOperation</span> }</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">AnchorOperator</span> <span class="ot">=</span> <span class="op">...</span></span></code></pre></div>
<h2 id="annotations-on-source-spans">Annotations on source spans</h2>
<p>Annotations don’t just get attached to syntactic elements, they frequently get attached to source spans too.</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">SrcSpanAnn'</span> a <span class="ot">=</span> <span class="dt">SrcSpanAnn</span> {<span class="ot"> ann ::</span> a,<span class="ot"> locA ::</span> <span class="dt">SrcSpan</span> }</span></code></pre></div>
<p>Usually <code>SrcSpanAnn'</code> is used with <code>EpAnn</code> and that combination is named a <code>SrcAnn</code>.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="kw">data</span> <span class="dt">SrcAnn</span> ann <span class="ot">=</span> <span class="dt">SrcSpanAnn'</span> (<span class="dt">EpAnn</span> ann)</span></code></pre></div>
<p>There are many annotation types. The most ubiquitous are <code>AnnListItem</code>, <code>NameAnn</code>, <code>AnnList</code>, <code>AnnPragma</code> and <code>AnnContext</code>. Their use is common enough that names are given to their <code>SrcAnn</code> types (which you recall, wrap them in <code>EpAnn</code> and associate them with a <code>SrcSpan</code>).</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">SrcSpanAnnA</span> <span class="ot">=</span> <span class="dt">SrcAnn</span> <span class="dt">AnnListItem</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">SrcSpanAnnN</span> <span class="ot">=</span> <span class="dt">SrcAnn</span> <span class="dt">NameAnn</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">SrcSpanAnnL</span> <span class="ot">=</span> <span class="dt">SrcAnn</span> <span class="dt">AnnList</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">SrcSpanAnnP</span> <span class="ot">=</span> <span class="dt">SrcAnn</span> <span class="dt">AnnPragma</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">SrcSpanAnnC</span> <span class="ot">=</span> <span class="dt">SrcAnn</span> <span class="dt">AnnContext</span></span></code></pre></div>
<p>Of these, <code>SrcSpanAnnA</code> is used as a sort of “default” annotation.</p>
<p>What do you do with generalized <code>SrcSpan</code> types like these? You locate things with them.</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">LocatedA</span> <span class="ot">=</span> <span class="dt">GenLocated</span> <span class="dt">SrcSpanAnnA</span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">LocatedN</span> <span class="ot">=</span> <span class="dt">GenLocated</span> <span class="dt">SrcSpanAnnN</span></span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">LocatedL</span> <span class="ot">=</span> <span class="dt">GenLocated</span> <span class="dt">SrcSpanAnnL</span></span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">LocatedP</span> <span class="ot">=</span> <span class="dt">GenLocated</span> <span class="dt">SrcSpanAnnP</span></span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">LocatedC</span> <span class="ot">=</span> <span class="dt">GenLocated</span> <span class="dt">SrcSpanAnnC</span></span></code></pre></div>
<p>These type synonyms are only for the most commonly used annoation types. The general case is <code>LocatedAn an</code>.</p>
<div class="sourceCode" id="cb9"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">LocatedAn</span> an <span class="ot">=</span> <span class="dt">GenLocated</span> (<span class="dt">SrcAnn</span> an)</span></code></pre></div>
<p>To recap, a <code>LocatedAn an</code> is a <code>GenLocated (SrcAnn an)</code> which is a <code>GenLocated (SrcSpanAnn' (EpAnn an))</code>.</p>
<h2 id="abstracting-over-locations">Abstracting over locations</h2>
<p>Syntax definitions are generalized with respect to location information. That is, rather than hard-coding <code>SrcSpan</code> into syntax type definitions as we used to, type families are used in their place so that the structure of the syntax including locations can be described without fixing concrete types for the locations where you’d once have had a source span type.</p>
<p>It works like this. In <code>Language.Haskell.Syntax.Extension</code> there is this definition:</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="kw">family</span> <span class="dt">XRec</span> p a <span class="ot">=</span> r <span class="op">|</span> r <span class="ot">-></span> a</span></code></pre></div>
<p>Locations are specified in terms of <code>XRec</code>s. For example in <code>Language.Haskell.Syntax.Expr</code> we have this:</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="dt">LHsExpr</span> p <span class="ot">=</span> <span class="dt">XRec</span> p (<span class="dt">HsExpr</span> p)</span></code></pre></div>
<p>How <code>XRec p (HsExpr p)</code> is mapped onto a specific type in GHC is achieved in the following way. First in <code>Language.Haskell.Syntax.Extension</code> there is the following definition:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="kw">family</span> <span class="dt">Anno</span> a <span class="ot">=</span> b</span></code></pre></div>
<p>Then, in <code>GHC.Hs.Extension</code> this definition:</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">XRec</span> (<span class="dt">GhcPass</span> p) a <span class="ot">=</span> <span class="dt">GenLocated</span> (<span class="dt">Anno</span> a) a</span></code></pre></div>
<p>Specific choices for each syntatic element can then be made for GHC’s use of the parse tree and phase. For example, in <code>GHC.Hs.Expr</code> we have the following.</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">Anno</span> (<span class="dt">HsExpr</span> (<span class="dt">GhcPass</span> pass)) <span class="ot">=</span> <span class="dt">SrcSpanAnnA</span></span></code></pre></div>
<p>To see how this works, consider what that means for the located expression type <code>LHsExpr GhcPs</code> in GHC. We have <code>LHsExpr GhcPs</code> is <code>XRec GhcPs (HsExpr GhcPs)</code> which is <code>GenLocated (Anno (HsExpr GhcPs)) (HsExpr GhcPs)</code> or <code>GenLocated SrcSpanAnnA (HsExpr GhcPs)</code> (or, <code>LocatedA (HsExpr GhcPs))</code> if you like).</p>
<p>Expanding further we have <code>GenLocated SrcSpanAnnA (HsExpr GhcPs)</code> is <code>GenLocated (SrcAnn AnnListItem) (HsExpr GhcPs)</code>. So ultimately, <code>LHsExpr GhcPs</code> is <code>GenLocated (SrcSpanAnn' (EpAnn AnnListItem)) (HsExpr GhcPs)</code>.</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-58748948526093319802021-04-09T17:33:00.003-04:002021-06-01T23:00:41.718-04:00arith-cxx-tagless-final<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>arith-cxx-final-tagless</title>
<style>
html {
line-height: 1.5;
font-family: Georgia, serif;
font-size: 20px;
color: #1a1a1a;
background-color: #fdfdfd;
}
body {
margin: 0 auto;
max-width: 36em;
padding-left: 50px;
padding-right: 50px;
padding-top: 50px;
padding-bottom: 50px;
hyphens: auto;
word-wrap: break-word;
text-rendering: optimizeLegibility;
font-kerning: normal;
}
@media (max-width: 600px) {
body {
font-size: 0.9em;
padding: 1em;
}
}
@media print {
body {
background-color: transparent;
color: black;
font-size: 12pt;
}
p, h2, h3 {
orphans: 3;
widows: 3;
}
h2, h3, h4 {
page-break-after: avoid;
}
}
p {
margin: 1em 0;
}
a {
color: #1a1a1a;
}
a:visited {
color: #1a1a1a;
}
img {
max-width: 100%;
}
h1, h2, h3, h4, h5, h6 {
margin-top: 1.4em;
}
h5, h6 {
font-size: 1em;
font-style: italic;
}
h6 {
font-weight: normal;
}
ol, ul {
padding-left: 1.7em;
margin-top: 1em;
}
li > ol, li > ul {
margin-top: 0;
}
blockquote {
margin: 1em 0 1em 1.7em;
padding-left: 1em;
border-left: 2px solid #e6e6e6;
color: #606060;
}
code {
font-family: Menlo, Monaco, 'Lucida Console', Consolas, monospace;
font-size: 85%;
margin: 0;
}
pre {
margin: 1em 0;
overflow: auto;
}
pre code {
padding: 0;
overflow: visible;
}
.sourceCode {
background-color: transparent;
overflow: visible;
}
hr {
background-color: #1a1a1a;
border: none;
height: 1px;
margin: 1em 0;
}
table {
margin: 1em 0;
border-collapse: collapse;
width: 100%;
overflow-x: auto;
display: block;
font-variant-numeric: lining-nums tabular-nums;
}
table caption {
margin-bottom: 0.75em;
}
tbody {
margin-top: 0.5em;
border-top: 1px solid #1a1a1a;
border-bottom: 1px solid #1a1a1a;
}
th {
border-top: 1px solid #1a1a1a;
padding: 0.25em 0.5em 0.25em 0.5em;
}
td {
padding: 0.125em 0.5em 0.25em 0.5em;
}
header {
margin-bottom: 4em;
text-align: center;
}
#TOC li {
list-style: none;
}
#TOC a:not(:hover) {
text-decoration: none;
}
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
</style>
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<h1 id="arith-cxx-tagless-final">arith-cxx-tagless-final</h1>
<p>The aim is to prove out the idea of an interpreter where the “front end” (lexical analysis) is in Rust and the “back end” (interpretation) is in C++.</p>
<ul>
<li>We’ll parse and evaluate a simple language of arithmetic expressions;
<ul>
<li>The parser is implemented using <a href="https://github.com/Geal/nom"><code>nom</code></a> (a Rust parser combinator library);</li>
</ul></li>
<li>The <a href="http://okmij.org/ftp/tagless-final/index.html">“tagless-final” idiom</a> is used to split the front and back ends;</li>
<li>The interop between C++ and Rust is expressed using the <a href="https://github.com/dtolnay/cxx">CXX library</a>.</li>
</ul>
<h2 id="parser">Parser</h2>
<p>Additive expression syntax we’ll define by this grammar:</p>
<pre><code> expr := term ('+' term)*
term := lit | '-' term | '(' expr ')'
lit := digits</code></pre>
<p>The key idea of the program is this: Don’t define an abstract syntax tree type, values of which are produced by parsing. Rather, as parsing unfolds, call functions defined by the the folllowing trait.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">trait</span> ExprSyn<span class="op">:</span> <span class="bu">Clone</span> <span class="op">{</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> lit(n<span class="op">:</span> <span class="dt">i64</span>) <span class="op">-></span> <span class="dt">Self</span><span class="op">;</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> neg(t<span class="op">:</span> <span class="dt">Self</span>) <span class="op">-></span> <span class="dt">Self</span><span class="op">;</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> add(u<span class="op">:</span> <span class="dt">Self</span><span class="op">,</span> v<span class="op">:</span> <span class="dt">Self</span>) <span class="op">-></span> <span class="dt">Self</span><span class="op">;</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>With that understood, we implement the parser as a Rust library in the following way.</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">mod</span> parse <span class="op">{</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="kw">use</span> <span class="pp">nom::</span><span class="op">{</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="pp">branch::</span>alt<span class="op">,</span></span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> <span class="pp">bytes::complete::</span>tag<span class="op">,</span></span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> <span class="pp">character::complete::</span><span class="dt">char</span><span class="op">,</span></span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> <span class="pp">character::complete::</span><span class="op">{</span>digit1 <span class="kw">as</span> digit<span class="op">,</span> space0 <span class="kw">as</span> space<span class="op">},</span></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> <span class="pp">combinator::</span><span class="op">{</span>map<span class="op">,</span> map_res<span class="op">},</span></span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> <span class="pp">multi::</span>fold_many0<span class="op">,</span></span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a> <span class="pp">sequence::</span><span class="op">{</span>delimited<span class="op">,</span> pair<span class="op">,</span> preceded<span class="op">},</span></span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a> IResult<span class="op">,</span></span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a> <span class="op">};</span></span>
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a> <span class="kw">use</span> <span class="kw">super</span><span class="pp">::</span>ExprSyn<span class="op">;</span></span>
<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a> <span class="kw">use</span> <span class="pp">std::</span><span class="dt">str</span><span class="pp">::</span><span class="bu">FromStr</span><span class="op">;</span></span>
<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a> <span class="kw">type</span> ParseResult<span class="op"><</span><span class="ot">'a</span><span class="op">,</span> E<span class="op">></span> <span class="op">=</span> IResult<span class="op"><&</span><span class="ot">'a</span> <span class="dt">str</span><span class="op">,</span> E<span class="op">>;</span></span>
<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-18"><a href="#cb3-18" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> lit<span class="op"><</span>E<span class="op">:</span> ExprSyn<span class="op">></span>(i<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>) <span class="op">-></span> ParseResult<span class="op"><</span>E<span class="op">></span> <span class="op">{</span></span>
<span id="cb3-19"><a href="#cb3-19" aria-hidden="true" tabindex="-1"></a> map_res(delimited(space<span class="op">,</span> digit<span class="op">,</span> space)<span class="op">,</span> <span class="op">|</span>x<span class="op">|</span> <span class="op">{</span></span>
<span id="cb3-20"><a href="#cb3-20" aria-hidden="true" tabindex="-1"></a> <span class="bu">FromStr</span><span class="pp">::</span>from_str(x)<span class="op">.</span>map(<span class="pp">E::</span>lit)</span>
<span id="cb3-21"><a href="#cb3-21" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span>)(i)</span>
<span id="cb3-22"><a href="#cb3-22" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb3-23"><a href="#cb3-23" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-24"><a href="#cb3-24" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> neg<span class="op"><</span>E<span class="op">:</span> ExprSyn<span class="op">></span>(i<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>) <span class="op">-></span> ParseResult<span class="op"><</span>E<span class="op">></span> <span class="op">{</span></span>
<span id="cb3-25"><a href="#cb3-25" aria-hidden="true" tabindex="-1"></a> map(delimited(space<span class="op">,</span> preceded(<span class="dt">char</span>(<span class="ch">'-'</span>)<span class="op">,</span> term)<span class="op">,</span> space)<span class="op">,</span> <span class="pp">E::</span>neg)(i)</span>
<span id="cb3-26"><a href="#cb3-26" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb3-27"><a href="#cb3-27" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-28"><a href="#cb3-28" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> par<span class="op"><</span>E<span class="op">:</span> ExprSyn<span class="op">></span>(i<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>) <span class="op">-></span> ParseResult<span class="op"><</span>E<span class="op">></span> <span class="op">{</span></span>
<span id="cb3-29"><a href="#cb3-29" aria-hidden="true" tabindex="-1"></a> delimited(space<span class="op">,</span> delimited(tag(<span class="st">"("</span>)<span class="op">,</span> expr<span class="op">,</span> tag(<span class="st">")"</span>))<span class="op">,</span> space)(i)</span>
<span id="cb3-30"><a href="#cb3-30" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb3-31"><a href="#cb3-31" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-32"><a href="#cb3-32" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> term<span class="op"><</span>E<span class="op">:</span> ExprSyn<span class="op">></span>(i<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>) <span class="op">-></span> ParseResult<span class="op"><</span>E<span class="op">></span> <span class="op">{</span></span>
<span id="cb3-33"><a href="#cb3-33" aria-hidden="true" tabindex="-1"></a> alt((lit<span class="op">,</span> neg<span class="op">,</span> par))(i)</span>
<span id="cb3-34"><a href="#cb3-34" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb3-35"><a href="#cb3-35" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-36"><a href="#cb3-36" aria-hidden="true" tabindex="-1"></a> <span class="kw">pub</span> <span class="kw">fn</span> expr<span class="op"><</span>E<span class="op">:</span> ExprSyn<span class="op">></span>(i<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>) <span class="op">-></span> ParseResult<span class="op"><</span>E<span class="op">></span> <span class="op">{</span></span>
<span id="cb3-37"><a href="#cb3-37" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> (i<span class="op">,</span> init) <span class="op">=</span> term(i)<span class="op">?;</span></span>
<span id="cb3-38"><a href="#cb3-38" aria-hidden="true" tabindex="-1"></a> fold_many0(pair(<span class="dt">char</span>(<span class="ch">'+'</span>)<span class="op">,</span> term)<span class="op">,</span> init<span class="op">,</span> <span class="op">|</span>acc<span class="op">,</span> (_<span class="op">,</span> val)<span class="op">:</span> (<span class="dt">char</span><span class="op">,</span> E)<span class="op">|</span> <span class="op">{</span></span>
<span id="cb3-39"><a href="#cb3-39" aria-hidden="true" tabindex="-1"></a> <span class="pp">E::</span>add(acc<span class="op">,</span> val)</span>
<span id="cb3-40"><a href="#cb3-40" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span>)(i)</span>
<span id="cb3-41"><a href="#cb3-41" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb3-42"><a href="#cb3-42" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>To wire that up to C++ we need express a CXX “bridge”.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">cxx::</span>SharedPtr<span class="op">;</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a><span class="at">#[</span><span class="pp">cxx::</span>bridge<span class="at">]</span></span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">mod</span> ffi <span class="op">{</span></span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> <span class="kw">unsafe</span> <span class="kw">extern</span> <span class="st">"C++"</span> <span class="op">{</span></span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> <span class="pp">include!</span>(<span class="st">"arith_final_tagless/include/cpp_repr.hpp"</span>)<span class="op">;</span></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> <span class="kw">type</span> Cpp_repr<span class="op">;</span></span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> lit(i<span class="op">:</span> <span class="dt">i64</span>) <span class="op">-></span> SharedPtr<span class="op"><</span>Cpp_repr<span class="op">>;</span></span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> neg(e<span class="op">:</span> SharedPtr<span class="op"><</span>Cpp_repr<span class="op">></span>) <span class="op">-></span> SharedPtr<span class="op"><</span>Cpp_repr<span class="op">>;</span></span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> add(l<span class="op">:</span> SharedPtr<span class="op"><</span>Cpp_repr<span class="op">>,</span> r<span class="op">:</span> SharedPtr<span class="op"><</span>Cpp_repr<span class="op">></span>) <span class="op">-></span> SharedPtr<span class="op"><</span>Cpp_repr<span class="op">>;</span></span>
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a> <span class="kw">extern</span> <span class="st">"Rust"</span> <span class="op">{</span></span>
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> parse(s<span class="op">:</span> <span class="dt">String</span>) <span class="op">-></span> <span class="dt">Result</span><span class="op"><</span>SharedPtr<span class="op"><</span>Cpp_repr<span class="op">>>;</span></span>
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb4-17"><a href="#cb4-17" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>The header file <code>cpp_repr.hpp</code> referenced by the bridge contains these prototypes.</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="pp">#pragma once</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="pp">#include </span><span class="im"><memory></span></span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a><span class="pp">#include </span><span class="im"><cstdint></span></span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> Cpp_repr <span class="op">{</span></span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">int64_t</span> expr<span class="op">;</span></span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a><span class="kw">using</span> <span class="dt">cpp_repr_t</span> <span class="op">=</span> <span class="bu">std::</span>shared_ptr<span class="op"><</span>Cpp_repr<span class="op">>;</span></span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a><span class="dt">cpp_repr_t</span> lit<span class="op">(</span><span class="dt">int64_t</span> i<span class="op">);</span></span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a><span class="dt">cpp_repr_t</span> neg<span class="op">(</span><span class="dt">cpp_repr_t</span> t<span class="op">);</span></span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a><span class="dt">cpp_repr_t</span> add<span class="op">(</span><span class="dt">cpp_repr_t</span> t<span class="op">,</span> <span class="dt">cpp_repr_t</span> u<span class="op">);</span></span></code></pre></div>
<p>The existence of that header is enough to finish off the Rust library.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="at">#[</span>allow<span class="at">(</span>non_camel_case_types<span class="at">)]</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">type</span> CppRepr_t <span class="op">=</span> SharedPtr<span class="op"><</span><span class="pp">ffi::</span>Cpp_repr<span class="op">>;</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="kw">impl</span> ExprSyn <span class="kw">for</span> CppRepr_t <span class="op">{</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> lit(i<span class="op">:</span> <span class="dt">i64</span>) <span class="op">-></span> CppRepr_t <span class="op">{</span></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> <span class="pp">ffi::</span>lit(i)</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> neg(t<span class="op">:</span> CppRepr_t) <span class="op">-></span> CppRepr_t <span class="op">{</span></span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> <span class="pp">ffi::</span>neg(t)</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> add(t1<span class="op">:</span> CppRepr_t<span class="op">,</span> t2<span class="op">:</span> CppRepr_t) <span class="op">-></span> CppRepr_t <span class="op">{</span></span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a> <span class="pp">ffi::</span>add(t1<span class="op">,</span> t2)</span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> parse(s<span class="op">:</span> <span class="dt">String</span>) <span class="op">-></span> <span class="dt">Result</span><span class="op"><</span>CppRepr_t<span class="op">,</span> <span class="dt">String</span><span class="op">></span> <span class="op">{</span></span>
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true" tabindex="-1"></a> <span class="kw">match</span> <span class="pp">parse::expr::</span><span class="op"><</span>CppRepr_t<span class="op">></span>(s<span class="op">.</span>as_str()) <span class="op">{</span></span>
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true" tabindex="-1"></a> <span class="cn">Ok</span>((_s<span class="op">,</span> rep)) <span class="op">=></span> <span class="cn">Ok</span>(rep)<span class="op">,</span></span>
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true" tabindex="-1"></a> <span class="cn">Err</span>(e) <span class="op">=></span> <span class="cn">Err</span>(<span class="pp">format!</span>(<span class="st">"{}"</span><span class="op">,</span> e))<span class="op">,</span></span>
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<h2 id="evaluator">Evaluator</h2>
<p>We put the C++ part of the implementation in <code>cpp_repr.cpp</code> in a separate library.</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="pp">#include </span><span class="im"><iostream></span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a><span class="pp">#include </span><span class="im">"arith_final_tagless/include/cpp_repr.hpp"</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a><span class="dt">cpp_repr_t</span> lit<span class="op">(</span><span class="dt">int64_t</span> i<span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> <span class="bu">std::</span>shared_ptr<span class="op"><</span>Cpp_repr<span class="op">>{</span><span class="kw">new</span> Cpp_repr<span class="op">{</span>i<span class="op">}};</span></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a><span class="dt">cpp_repr_t</span> neg<span class="op">(</span><span class="dt">cpp_repr_t</span> t<span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> <span class="bu">std::</span>shared_ptr<span class="op"><</span>Cpp_repr<span class="op">>{</span><span class="kw">new</span> Cpp_repr<span class="op">{-</span>t<span class="op">-></span>expr<span class="op">}};</span></span>
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a><span class="dt">cpp_repr_t</span> add<span class="op">(</span><span class="dt">cpp_repr_t</span> t<span class="op">,</span> <span class="dt">cpp_repr_t</span> u<span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> <span class="bu">std::</span>shared_ptr<span class="op"><</span>Cpp_repr<span class="op">>{</span><span class="kw">new</span> Cpp_repr<span class="op">{</span>t<span class="op">-></span>expr <span class="op">+</span> u<span class="op">-></span>expr<span class="op">}};</span></span>
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<h2 id="interpreter">Interpreter</h2>
<p>The REPL in <code>main.cpp</code> brings the Rust and C++ libraries together into an executable.</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="pp">#include </span><span class="im"><iostream></span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a><span class="pp">#include </span><span class="im">"rust/cxx.h"</span><span class="pp"> </span><span class="co">// 'rust::Error'</span></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a><span class="pp">#include </span><span class="im">"arith_final_tagless/src/arith_final_tagless.rs.h"</span><span class="pp"> </span><span class="co">// 'parse'</span></span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> main<span class="op">()</span> <span class="op">{</span></span>
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">char</span> <span class="at">const</span><span class="op">*</span> prompt <span class="op">=</span> <span class="st">"</span><span class="sc">\n</span><span class="st">% "</span><span class="op">;</span></span>
<span id="cb8-8"><a href="#cb8-8" aria-hidden="true" tabindex="-1"></a> <span class="bu">std::</span>cout<span class="op"> <<</span> <span class="st">"Additive expression evalutator (type CTRL+D to exit)"</span> <span class="op"><<</span> prompt<span class="op">;</span></span>
<span id="cb8-9"><a href="#cb8-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-10"><a href="#cb8-10" aria-hidden="true" tabindex="-1"></a> <span class="bu">std::</span>string<span class="op"> </span>line<span class="op">;</span></span>
<span id="cb8-11"><a href="#cb8-11" aria-hidden="true" tabindex="-1"></a> <span class="cf">while</span><span class="op">(</span><span class="bu">std::</span>getline<span class="op">(</span><span class="bu">std::</span>cin<span class="op">,</span> line<span class="op">))</span> <span class="op">{</span></span>
<span id="cb8-12"><a href="#cb8-12" aria-hidden="true" tabindex="-1"></a> <span class="cf">try</span> <span class="op">{</span></span>
<span id="cb8-13"><a href="#cb8-13" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> <span class="op">(</span><span class="kw">auto</span> p <span class="op">=</span> parse<span class="op">(</span>rust<span class="op">::</span>String<span class="op">{</span>line<span class="op">}))</span> <span class="op">{</span></span>
<span id="cb8-14"><a href="#cb8-14" aria-hidden="true" tabindex="-1"></a> <span class="bu">std::</span>cout<span class="op"> <<</span> line <span class="op"><<</span> <span class="st">" = "</span> <span class="op"><<</span> p<span class="op">-></span>expr <span class="op"><<</span> prompt<span class="op">;</span></span>
<span id="cb8-15"><a href="#cb8-15" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb8-16"><a href="#cb8-16" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span> <span class="cf">catch</span> <span class="op">(</span>rust<span class="op">::</span>Error <span class="at">const</span><span class="op">&</span> e<span class="op">)</span> <span class="op">{</span></span>
<span id="cb8-17"><a href="#cb8-17" aria-hidden="true" tabindex="-1"></a> <span class="bu">std::</span>cerr<span class="op"> <<</span> e<span class="op">.</span>what<span class="op">()</span> <span class="op"><<</span> prompt<span class="op">;</span></span>
<span id="cb8-18"><a href="#cb8-18" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb8-19"><a href="#cb8-19" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb8-20"><a href="#cb8-20" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-21"><a href="#cb8-21" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb8-22"><a href="#cb8-22" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Source and build scripts and so on for this project <a href="https://github.com/shayne-fletcher/zen/tree/master/rust/arith_final_tagless">here</a>.</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-75101633420807287082021-02-07T13:17:00.003-05:002021-03-11T10:55:22.924-05:00Configuring Cabal Build Flags<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>*markdown-output*</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
</head>
<body>
<h1 id="configuring-cabal-build-flags">Configuring Cabal build flags</h1>
<p>It’s always an emergency when I go looking for this information!</p>
<p>Suppose you are building HLint.</p>
<pre><code>mkdir ~/tmp && cd ~/tmp
curl -o hlint-3.2.7.tar.gz https://hackage.haskell.org/package/hlint-3.2.7/hlint-3.2.7.tar.gz
gunzip hlint-3.2.7.tar.gz && tar xvf hlint-3.2.7.tar</code></pre>
<p>Left to their own devices, HLint and <code>ghc-lib-parser-ex</code> will default to <code>auto</code> mode meaning, they will decide for themselves if they should depend on <code>ghc-lib-parser</code> or native ghc libraries.</p>
<p>Sometimes it’s desirable to force the situation though and explicitly make them do one or the other. How you do that? The answer is of course package Cabal flags. There are two scenarios: building with stack or building with cabal.</p>
<ul>
<li><p><code>stack.yaml</code></p>
<ul>
<li><p>Force link with <code>ghc-lib-parser</code></p>
<pre><code> flags:
hlint:
ghc-lib: true
ghc-lib-parser-ex:
auto: false
no-ghc-lib: false</code></pre></li>
<li><p>Force link with native ghc</p>
<pre><code> flags:
hlint:
ghc-lib: false
ghc-lib-parser-ex:
auto: false
no-ghc-lib: true</code></pre></li>
</ul></li>
<li><p><code>cabal.project</code></p>
<ul>
<li><p>Force link with <code>ghc-lib-parser</code></p>
<pre><code> packages: hlint-3.2.7
package hlint
flags: +ghc-lib
package ghc-lib-parser-ex
flags: -auto -no-ghc-lib</code></pre></li>
<li><p>Force link with native ghc</p>
<pre><code> packages: hlint-3.2.7
package hlint
flags: -ghc-lib
package ghc-lib-parser-ex
flags: -auto +no-ghc-lib</code></pre></li>
</ul></li>
</ul>
<p>When working with Cabal in the HLint repository one can configure with command line constraints like this:
<pre><code>cabal new-build --constraint "hlint -ghc-lib" --constraint "ghc-lib-parser-ex -auto +no-ghc-lib"</code></pre>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-35276768585138202142021-01-18T20:04:00.002-05:002021-01-18T20:04:54.791-05:00Two things in Rust<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>two_things_rust.html</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
</head>
<body>
<h1 id="two-things-in-rust">Two things in Rust</h1>
<p>Two things I needed to learn before Rust made sense to me.</p>
<h2 id="pattern-binding-modes">1 Pattern binding modes</h2>
<p>I don’t remember reading about this in <a href="https://doc.rust-lang.org/book/">the book</a>. Default binding modes come into play when <strong>non-reference</strong> patterns are encountered.</p>
<h4 id="example">Example</h4>
<div class="sourceCode" id="cb1"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> x <span class="op">=</span> <span class="cn">Some</span>(<span class="dv">3</span>)<span class="op">;</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> y<span class="op">:</span> <span class="op">&</span><span class="dt">Option</span><span class="op"><</span><span class="dt">i32</span><span class="op">></span> <span class="op">=</span> <span class="op">&</span>x<span class="op">;</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="kw">match</span> y <span class="op">{</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="cn">Some</span>(a) <span class="op">-></span> <span class="op">{</span></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="co">// `y` is deref'd and `a` is bound as `ref a`</span></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> <span class="cn">None</span> <span class="op">=></span> <span class="op">{}</span></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>The default binding mode starts as <code>move</code>. Each time a reference is matched using a <em>non-reference pattern</em>; it will automatically derefence the vaue and update the default binding mode - If the reference is <code>&val</code>, set the default binding mode to <code>ref</code> - If the reference is <code>&mut val</code>: if the current default binding is <code>ref</code>, it should remain <code>ref</code>. Otherwise, set the current binding mode to <code>ref mut</code>.</p>
<p>Full details are given in the <a href="https://github.com/rust-lang/rfcs/blob/master/text/2005-match-ergonomics.md">2005-match-ergonomics</a> rustlang RFC.</p>
<h3 id="example-1">Example</h3>
<ul>
<li>Example</li>
</ul>
<div class="sourceCode" id="cb2"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">match</span> (<span class="op">&</span><span class="cn">Some</span>(<span class="dv">3</span>)) <span class="op">{</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="cn">Some</span>(p) <span class="op">=></span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> <span class="co">// This pattern is a "non-reference pattern".</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> <span class="co">// Dereference the `&` and shift the default binding</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="co">// mode to `ref`. `p` is read as `ref p` and given type `i32`.</span></span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> x <span class="op">=></span> <span class="op">{</span></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> <span class="co">// In this arm, we are still in `move` mode by default, so `x`</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> <span class="co">// has type `&Option<32>`</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<ul>
<li>Desugared</li>
</ul>
<div class="sourceCode" id="cb3"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a> <span class="kw">match</span>(<span class="op">&</span><span class="cn">Some</span>(<span class="dv">3</span>)) <span class="op">{</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="op">&</span><span class="cn">Some</span>(<span class="kw">ref</span> p) <span class="op">=></span> <span class="op">{</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="op">...</span></span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> <span class="op">},</span></span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> x <span class="op">=></span> <span class="op">{</span></span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> <span class="op">...</span></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> <span class="op">},</span></span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span></code></pre></div>
<h2 id="implict-deref-coercisons-with-functions-and-methods">2. <a href="https://doc.rust-lang.org/book/ch15-02-deref.html#implicit-deref-coercions-with-functions-and-methods">Implict Deref Coercisons with Functions and Methods</a></h2>
<p>This is another ergonomics feature that saves on explicit <code>&</code>s and <code>*</code>s when writing function and method calls. When we pass a reference to a function or method call <code>deref</code> implicitly as needed to coerce to the parameter target type.</p>
<ul>
<li>Example:</li>
</ul>
<pre><code>fn hello(name: &str) {
println!("Hello, {}", name);
}
fn main() {
let m = MyBox::new(String::from("Rust"));
hello(&m);
}</code></pre>
<ul>
<li>Example:</li>
</ul>
<pre><code>pub struct Point {
x: Vec<i32>,
y: Vec<i32>,
}
impl Point {
pub fn x(&self) -> &Vec<i32> {
match self {
&Point { ref x, .. } => x,
}
}
}
fn use_i32(_: &i32) -> () {}
fn use_vi32(_: &Vec<i32>) -> () {}
fn use_str(_: &str) -> () {}
fn use_strr(_: &&String) -> () {}
fn main() {
let p: Point = Point {
x: vec![],
y: vec![],
};
let rp: &Point = &p;
let rrp: &&Point = &rp;
println!("p.x = {:?}", rrp.x());
let _s: &str = "foo";
let s: String = String::from(_s);
let bs: Box<String> = Box::new(s.clone());
let bsr: Box<&String> = Box::new(&s);
let bi32: Box<Vec<i32>> = Box::new(vec![]);
use_i32(&&&&1i32);
use_str(bs.deref());
use_str(&s);
use_strr(bsr.deref());
let r: &&String = &bsr;
let r2: &String = r.deref();
use_str(r2);
use_str(&bsr);
use_vi32(&bi32);
let p = Point {
x: vec![1, 2, 3],
y: vec![4, 5, 6],
};
match &p {
Point { x, y } => {
use_vi32(x);
use_vi32(y);
println!("{:?}, {:?}", x, y);
}
}
println!("{:?}, {:?}", p.x, p.y);
}</code></pre>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-5551637709877934812020-10-30T14:51:00.011-04:002021-06-06T04:51:42.334-04:00ghc-lib-parser module count<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>ghc-lib-parser module count</title>
</head>
<body>
<p>When a user installs a program like <a href="https://github.com/ndmitchell/hlint/blob/master/README.md">HLint</a> via Cabal, there's a good chance they'll pay a cost of building <code class="code">ghc-lib-parser</code>. The build time for <code class="code">ghc-lib-parser</code> is proportional to the number of modules that it contains, the less there are the faster it builds. The fewer there are, the better the user experience installing HLint.
</p>
<p>Back in September last year there was a bit of a <a href="https://mail.haskell.org/pipermail/ghc-devs/2019-September/018122.html">scare</a>. At that time, <code class="code">ghc-lib-parser</code> consisted of around 165 modules (and <code class="code">ghc-lib</code> had roughly 300). An MR landed that unintentionally resulted in <code class="code">ghc-lib-parser</code> needing 543 modules (and <code class="code">ghc-lib</code> getting just 25). Fortunately a quick refactoring sorted that out (thanks <a href="https://twitter.com/sgraf1337">@sgraf1337</a>!) As <a href="https://twitter.com/mpickering_">@mpickering_</a> correctly <a href="https://mail.haskell.org/pipermail/ghc-devs/2019-September/018125.html">pointed out</a> at that time "any fix should ensure adding a test to make sure it doesn't happen again". To that end, <a href="https://twitter.com/cocreature?s=20">@cocreature</a> and I managed to work out the details of one which we can have a look at in a minute.
</p>
<p>Since a year has now elapsed since the test was introduced, it's interesting to see how the module count has changed over time. TL;DR it's not too bad — the number of modules has increased from around 165 a year ago to about 230 today:
</p>
<p>
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCK3cpAw3nPWrQULJooj1LkT93mUnS_ojD4HjsxGNGAl7k9wH2RCmhumJ9HyEuyA598bRRyqLHk2N0LY0jc4DGeH8ANQmc9XzS8xp9vRii_OosL58N1nBXbKKCI1JdBz9aoozxHZh2IhhJ/s608/parser-deps.png" style="display: block; padding: 1em 0px; text-align: center;"><img alt="" border="0" data-original-height="377" data-original-width="608" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCK3cpAw3nPWrQULJooj1LkT93mUnS_ojD4HjsxGNGAl7k9wH2RCmhumJ9HyEuyA598bRRyqLHk2N0LY0jc4DGeH8ANQmc9XzS8xp9vRii_OosL58N1nBXbKKCI1JdBz9aoozxHZh2IhhJ/s320/parser-deps.png" width="320" /></a></div></p>
<p>So how does the test work? To be comfortable in the knowledge that the number of modules needed in <code class="code">ghc-lib-parser</code> will not signficantly change without anyone noticing, it's enough to have a program that counts them. That program when inserted into GHC's continuous integration system alerts the committer to a limit breach. How do we count them? We use the GHC API to compute the transitive closure of the dependencies of <code class="code">GHC.Parser</code> — the count is the cardinality of that. The code is only a few lines. The key to it is the function <a href="https://hackage.haskell.org/package/ghc-8.10.1/docs/HscMain.html#v:hscGetModuleInterface"><code class="code">hscGetModuleInterface</code></a>. You can read the source <a href="https://gitlab.haskell.org/ghc/ghc/-/blob/master/utils/count-deps/Main.hs">here</a>.
</p>
</body>
</html>
<br>Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-70492235346830092392020-04-17T17:48:00.005-04:002020-08-24T09:58:01.046-04:00Syntactic ambiguity resolution in the GHC parser<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
table, th, td {
border: 1px solid black;
border-collapse: collapse;
}
th, td {
padding: 10px;
}
</style>
<title>Syntactic ambiguity resolution in the GHC parser</title>
</head>
<body>
<h1></h1>
<p>
There are places in the Haskell grammar where it's not known apriori whether it's an expression a command or a pattern that is being parsed. This used to be handled by picking a parse (e.g. as an expression say) and if that choice later turned out to be wrong, "rejigging it" (transform the constructed parse tree to its analog in the pattern language). The problem with that approach is that it meant having conflated sub-languages meaning, for example, <code class="code">HsExpr</code> had to have pattern related constructors e.g. <code class="code">EWildPat</code>, <code class="code">EAsPat</code> (and further, these propogated into other compiler phases like the renamer and typechecker). This was the case until roughly a year ago before extraordinary work by Vladislav Zavialov who solved the ambiguity resolution issue by parsing into an abstraction with an overloaded representation:
<pre><code class=code>class DisambECP b where ...
newtype ECP = ECP { unECP :: forall b. DisambECP b => PV (Located b) }
</code></pre>
This innovation might be considered to have come at a cost for developers familiar with the "old" parser however. That is, dealing with understanding the apparent complexity introduced by the ambiguity resolution system. This post attempts to provide some intuition about how the system works and hopefully will lead to the realization that it's not that hard to understand after all!
</p>
<p>Because this post is about building intuition, there are details that are glossed over or omitted entirely: the reader is encouraged to read Vlad's detailed explanatory comments in <code class="code">RdrHsSyn.hs</code> when neccessary to address that.
</p>
<p>We start with something familiar - the GHC parser monad:
<pre><code class=code>newtype P a = P { unP :: PState -> ParseResult a }
</code></pre>
</p>
<p>
This fundamentally is a wrapper over a function <code class="code">PState -> ParseResult a</code>.
</p>
<p>
The (let's call it the) "ECP system" introduces a new (and as we'll see, very related) concept. The parser validator monad:
<pre><code class=code>newtype PV a = PV { unPV :: PV_Context -> PV_Accum -> PV_Result a }
</code></pre>
</p>
<p>So a parser validator is a function similar in spirit to a parser where:
<ul><li><code class="code">data PV_Context</code>: The type of essentially a wrapper around the lexer <code class="code">ParserFlags</code> value;</li>
<li><code class="code">data PV_Accum</code>: The type of state accumulated during parsing validation (like errors & warnings , comments, annotations);</li>
<li><code class="code">data PV_Result</code>: The parser validator function's result type that is, <code class=code>data PV_Result a = PV_Ok PV_Accum a | PV_Failed PV_Accum</code>.</li>
</ul>
</p>
<p>Of critical interest is how this type is made a monad.
<pre><code class=code>instance Functor PV where
fmap = liftM
instance Applicative PV where
pure a = a `seq` PV (\_ acc -> PV_Ok acc a)
(<*>) = ap
</code></pre>
</p>
<p>The above reveals that an expression like <code class="code">return e</code> where <code>e</code> is of type <code class="code">Located b</code>, constructs a function that given arguments <code class="code">ctx</code> and <code class="code">acc</code> returns <code class="code">e</code>. The moral equivalent of <code class="code">const</code>.
<pre><code class=code>instance Monad PV where
m >>= f = PV $ \ctx acc ->
case unPV m ctx acc of
PV_Ok acc' a -> unPV (f a) ctx acc'
PV_Failed acc' -> PV_Failed acc'
</code></pre>
</p>
<p>The bind operation composes <code class="code">PV</code> actions threading context and accumlators through the application of their contained functions: given an <code class="code">m :: PV a</code> and a function <code class="code">f :: a -> PV b</code>, then <code class="code">m >>= f</code> constructs a <code class="code">PV b</code> that wraps a function that composes <code class="code">f</code> with the function in <code class="code">m</code>.
</p>
<p><code class="code">PV</code> is a bit more than a monad, it also satisfies the <code class="code">MonadP</code> class for monads that support parsing-related operations providing the ability to query for active language extensions, store warnings, errors, comments and annotations.
<pre><code class=code>instance MonadP PV where
addError srcspan msg = ....
PV $ \ctx acc@PV_Accum{pv_messages=m} ->
let msg' = msg $$ pv_hint ctx in
PV_Ok acc{pv_messages=appendError srcspan msg' m} ()
addWarning option srcspan warning = ...
addFatalError srcspan msg =...
getBit ext =
PV $ \ctx acc ->
let b = ext `xtest` pExtsBitmap (pv_options ctx) in
PV_Ok acc $! b
addAnnotation (RealSrcSpan l _) a (RealSrcSpan v _) = ...
...
</code></pre>
</p>
<p>The function <code class="code">runPV</code> is the interpreter of a <code class="code">PV a</code>. To run a <code class="code">PV a</code> through this function is to produce a <code class="code">P a</code>.
<pre><code class=code>runPV :: PV a -> P a
</code></pre>
</p>
<p>That is, given a <code class="code">PV a</code> construct a function <code class="code">PState -> ParseResult a</code>.
<pre><code class=code>runPV m =
P $ \s ->
let
pv_ctx = PV_Context {...} -- init context from parse state 's'
pv_acc = PV_Accum {...} -- init local state from parse state 's'
-- Define a function that builds a parse state from local state
mkPState acc' =
s { messages = pv_messages acc'
, annotations = pv_annotations acc'
, comment_q = pv_comment_q acc'
, annotations_comments = pv_annotations_comments acc' }
in
-- Invoke the function in m with context and state, harvest its revised state and
-- turn its outcome into a ParseResult.
case unPV m pv_ctx pv_acc of
PV_Ok acc' a -> POk (mkPState acc') a
PV_Failed acc' -> PFailed (mkPState acc')
</code></pre>
</p>
<p>It is often the case that a production (or set of productions) might result different ASTs depending on the context. Ideally, we just want to write the productions once and reuse them across these different sub-languages (e.g. expressions vs. commands vs. patterns). For example, the production for a parenthesized "thing" is
<pre><code class=code>'(' texp ')'
</code></pre>
</p>
<p>In the context of a pattern we expect an AST with a <code class="code">ParPat _ p</code> node whereas in the context of an expression we want an AST with an <code class="code">HsPar _ e</code> node. To this end the <code class="code">DisambECP</code> class embodies an abstract set of operations for parse tree construction.
<pre><code class=code>class DisambECP b where
...
-- | Return a command without ambiguity, or fail in a non-command context.
ecpFromCmd' :: LHsCmd GhcPs -> PV (Located b)
-- | Return an expression without ambiguity, or fail in a non-expression context.
ecpFromExp' :: LHsExpr GhcPs -> PV (Located b)
... Lots of operations like this
mkHsOpAppPV :: SrcSpan -> Located b -> Located (InfixOp b) -> Located b -> PV (Located b)
mkHsVarPV :: Located RdrName -> PV (Located b)
...
</code></pre>
</p>
<p>The idea is that in the semantic actions of the grammar we construct and compose parser validators in terms of these abstract functions. Running the <code class="code">PV</code>s produces parsers and at the point of execution of parsers we know the context (the nature of the AST we expect to recive) and the concrete choices for each of the abstract functions is thereby fixed (and then, on evaluation, we get the parse result).
</p>
<p>The only wrinkle is in the return type of productions that produce parser validators. In general, they will have the form <code class="code">forall b. DisambECP b => PV (Located b)</code>. If they were monadic productions though we would be led to <code class="code">P (forall b. DisambECP b => PV (Located b)</code> and that dog don't hunt for GHC's lack of support for impredicative types. There is a standard work-around that can be employed though. This newtype is how impredicative types in monadic productions are avoided:
<pre><code class=code>newtype ECP = ECP { unECP :: forall b. DisambECP b => PV (Located b) }
</code></pre>
</p>
<p>So here, <code class="code">ECP</code> is a wrapper around a <code class="code">PV (Located b)</code> value where <code class="code">b</code> can be of any type that satisifies the constraints of <code class="code">class DisamECP</code>. So, in a production that looks like
<pre><code class=code>| ... {% return (ECP ...)}
</code></pre>
</p>
<p>
we are dealing with <code class="code">P ECP</code> whereas without a newtype we would be dealing with <code class="code">P (forall b. DisambECP b => PV (Located b))</code>.
</p>
<p>Now to produce a <code class="code">P (Located b)</code> from the <code class="code">PV (Located b)</code> in an <code class="code">ECP</code> we can use this expression (of type <code class=code>DisambECP b => ECP -> P (Located b)</code>):
<pre><code class="code">runPV (unECP p)
</code></pre>
</p>
<p>It takes an <code class="code">ECP</code> value, projects out the parser validator contained therein and "runs" it to produce a function from <code class="code">PState -> ParseResult a</code> (a parser action).
</p>
<p>From the <code class="code">DisabmECP</code> instance for <code class="code">HsExpr GhcPs</code>, here's <code class="code">ecpFromCmd'</code>:
<pre><code class=code> ecpFromCmd' (L l c) = do
addError l $ vcat
[ text "Arrow command found where an expression was expected:",
nest 2 (ppr c) ]
return (L l hsHoleExpr)
</code></pre>
</p>
<p>Makes perfect sense - you get a parser validator that when evaluated will store a (non-fatal) error and returns an expression "hole" (unbound variable called <code class="code">_</code>) so that parsing can continue.
</p>
<p>Continuing, the definition of <code class="code">ecpFromExp'</code>:
<pre><code class=code> ecpFromExp' = return
</code></pre>
</p>
<p>Also sensible. Simply calculate a function that returns its provided <code class="code">acc</code> argument together with the given constant expression under a <code class="code">PV_Ok</code> result (see the definition of <code class="code">pure</code> in the <code class="code">Appliciatve</code> instance for <code class="code">PV</code> given above).
</p>
<p>Parenthesizing an expression for this <code class="code">DisambECP</code> instance means wrapping a <code class="code">HsPar</code> around the given <code class="code">e</code>:
<pre><code class=code> mkHsParPV l e = return $ L l (HsPar noExtField e)
</code></pre>
</p>
<p>And so on. You get the idea.
</p>
<p>So how does this all fit together? Consider agin the production of parenthesized things:
<pre><code class=code> | '(' texp ')' { ECP $
unECP $2 >>= \ $2 ->
amms (mkHsParPV (comb2 $1 $>) $2) [mop $1,mcp $3] }
</code></pre>
</p>
<p>We note that the <code class="code">texp</code> production calculates an <code class="code">ECP</code>. Stripping away for simplicity the annotation and source code location calculations in the semantic action, in essence we are left with this.
<pre><code class=code>ECP $ unECP $2 >>= \ $2 -> mkHsParPV $2
</code></pre>
</p>
<p>The effect of <code class="code">unECP</code> is to project out the <code class="code">forall b. DisambECP b => PV (Located b)</code> value from the result of <code class="code">texp</code>. Recalling that <code class="code">unPV</code> projects out the function that the <code class="code">PV</code> wrapper shields and by substition of the definition of bind, we obtain roughly:
<pre><code class=code> ECP $ PV $ \ctx acc ->
case unPV (unECP $2) ctx acc of
PV_Ok acc' a -> unPV (mkHsParPV a) ctx acc'
PV_Failed acc' -> PV_Failed acc'
</code></pre>
</p>
<p>The net effet is we construct a new parser validatior (function) from the parser validator (function) returned from the <code class="code">texp</code> production that puts parenthesis around whatever that function when evaluated produces. If used in a context where <code class="code">texp</code> generates a <code class="code">LPat GhcPs</code> that'll be a <code class="code">ParPat</code> node, if an <code class="code">LHsExpr GhcPs</code>, then a <code class="code">HsPar</code> node.
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-24449178587052332312020-04-05T15:18:00.001-04:002020-04-05T21:45:53.404-04:00GHC: How whitespace sensitive operator lexing works<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
table, th, td {
border: 1px solid black;
border-collapse: collapse;
}
th, td {
padding: 10px;
}
</style>
<title>How whitespace sensitive operator lexing works</title>
</head>
<body>
<h1></h1>
<p>In GHC, Haskell operator occurrences get classified into one of four categories. For example, the occurrence of ⊕ in <code class="code">a ⊕ b</code> is "loose infix", in <code class="code">a⊕b</code> is "tight infix", in <code class="code">a ⊕b</code> is "prefix" and in <code class="code">a⊕ b</code>, "suffix"</p>
<p>The point of this is that certain operators can be ascribed different meanings depending on the classification of their occurrence and language extensions that may be in effect. For example, <code class="code">!</code> when encountered will lex as strictness annotation (token type <code class="code">ITbang</code>) if its occurrence is prefix (e.g. <code class="code">f !x = rhs</code>) or an ordinary operator (token type <code class="code">ITvarsym</code> ) if not (e.g. <code class="code">xs ! 3</code>). Another ready example is provided by operator <code class="code">@</code> which, according to whitespace considerations, may be a type application (prefix), an as-pattern (tight infix), an ordinary operator (loose infix) or a parse error (suffix).
<p>The implementation of this categorization relies upon two functions: <code class="code">followedByOpeningToken</code> and <code class="code">precededByClosingToken</code>. To explain further:
<ul>
<li>Identifiers, literals and opening brackets <code class="code">(</code>, <code class="code">(#</code>, <code class="code">[|</code>, <code class="code">[||</code>, <code class="code">[p|</code>, <code class="code">[t|</code>, <code class="code">{</code> are considered "opening tokens";</li>
<li>Identifiers, literals and closing brackets <code class="code">)</code>, <code class="code">#)</code>, <code class="code">]</code>, <code class="code">|]</code>, <code class="code">}</code> are considered "closing tokens";</li>
<li>Other tokens and whitespace are considered neither opening or closing.</li>
</ul>
</p>
<p>
The classification algorithm is defined by the following rules:
<table>
<tr><th><code class="code">precededByClosingToken</code></th><th><code class="code">followedByOpeningToken</code></th><th>occurrence</th></tr>
<tr><td align="center"><code class="code">False</code></td><td align="center"><code class="code">True</code></td><td align="center">prefix</td></tr>
<tr><td align="center"><code class="code">True</code></td><td align="center"><code class="code">False</code></td><td align="center">suffix</td></tr>
<tr><td align="center"><code class="code">True</code></td><td align="center"><code class="code">True</code></td><td align="center">tight infix</td></tr>
<tr><td align="center"><code class="code">False</code></td><td align="center"><code class="code">False</code></td><td align="center">loose infix</td></tr>
</table>
</p>
The implementation of <code class="code">precededByClosingToken</code> is very straightforward: look backwards one character in the lexing buffer.
<pre><code class="code">precededByClosingToken :: AlexAccPred ExtsBitmap
precededByClosingToken _ (AI _ buf) _ _ =
case prevChar buf '\n' of
'}' -> decodePrevNChars 1 buf /= "-"
')' -> True
']' -> True
'\"' -> True
'\'' -> True
'_' -> True
c -> isAlphaNum c
</code></pre>
Similarly, <code class="code">followedByOpeningToken</code>: look forwards one character in the lexing buffer.
<pre><code class="code">followedByOpeningToken :: AlexAccPred ExtsBitmap
followedByOpeningToken _ _ _ (AI _ buf)
| atEnd buf = False
| otherwise =
case nextChar buf of
('{', buf') -> nextCharIsNot buf' (== '-')
('(', _) -> True
('[', _) -> True
('\"', _) -> True
('\'', _) -> True
('_', _) -> True
(c, _) -> isAlphaNum c
</code></pre>
Armed by these rules, the lexing of operators looks like this:
<pre><code class="code"><0> {
@varsym / { precededByClosingToken `alexAndPred` followedByOpeningToken } { varsym_tight_infix }
@varsym / { followedByOpeningToken } { varsym_prefix }
@varsym / { precededByClosingToken } { varsym_suffix }
@varsym { varsym_loose_infix }
}
</code></pre>
</p>
<p>
The actions <code class="code">varsym_tight_infix</code>, <code class="code">varsym_prefix</code>, <code class="code">varsym_suffix</code> and <code class="code">varsym_loose_infix</code> are "fed" the operator and allow for language extension specific issuance of tokens (as opposed to issuance of general <code class="code">ITvarsym</code> tokens). For example, <code class="code">varsym_prefix</code> :
<pre><code class=code>varsym_prefix :: Action
varsym_prefix = sym $ \exts s ->
if | TypeApplicationsBit `xtest` exts, s == fsLit "@"
-> return ITtypeApp
| ...
| otherwise -> return (ITvarsym s)
</code></pre>
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-34628002012479107152020-03-01T07:34:00.000-05:002020-04-07T22:13:38.904-04:00GHC Haskell Pats and LPats<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>GHC Haskell Pats and LPats</title>
</head>
<body>
<p>In the <a href="http://www.jucs.org/jucs_23_1/trees_that_grow/jucs_23_01_0042_0062_najd.pdf">Trees that Grow</a> paper, it is explained that GHC has a single data type <i>HsSyn</i> that crosses several compiler phases; a second data type <i>TH.Syntax</i> for Template Haskell and that other Haskell libraries e.g. <code>haskell-src-exts</code> defnining yet others. Ideally, <i>HsSyn</i> would be reused in Template Haskell and these third-party libraries and motivates the flexibilities offered by the TTG (Trees That Grow) techniques.
</p>
<p>Before GHC 8.8, patterns and located patterns were related in the following way:
<pre><code class="code">type LPat = Located Pat
data Pat p
= ...
| LazyPat (XLazyPat p) (LPat p)
...
</code></pre>
That is, patterns with locations are represented by values of type <code class="code">LPat</code> and patterns themselves as values of type <code class="code">Pat</code>. Note that <code class="code">LPat</code> values contain <code class="code">Pat</code> values which in turn can contain <code class="code">LPat</code> values hence the name "ping pong style" being given to this idiom.
</p>
<p>
Since location annotations may (e.g. GHC native) or may not (e.g. <i>Template Haskell</i>) be present for a given application it is realized that "baking" locations into <i>HsSyn</i> is undesirable. For this reason, in 8.8 attempts were made to make their presence a strictly GHC "thing" in the following way:
<pre><code class="code">type LPat p = Pat p
data Pat p
= ...
| LazyPat (XLazyPat p) (LPat p)
| ...
| XPat (XXPat p)
type instance XXPat (GhcPass p) = Located (Pat (GhcPass p))
</code></pre>
That is, in GHC under this approach, locations are stored in the extension constructor - patterns with locations are wrapped in <code class="code">XPat</code> e.g. <code class="code">XPat noExt (L _ (VarPat noExt _))</code>. Of course, now, to get at the location you have to go through an indirection through <code class="code">XPat</code>. For this, the functions <code class="code">cL</code> and <code class="code">dL</code> (and the bi-directional pattern synonym <code class="code">LL</code>) were provided. Applications that don't want locations in the parse tree just don't make use of the <code class="code">XPat</code> constructor.
</p>
<p>It turned out that the 8.8 approach wasn't as good an idea as it seemed; it was a bit more complicated than it needed to be and had some unexpected implications for the existing GHC source code base. It was realized that this following alternative approach yields the same benefits and is what we find in 8.10 and beyond:
<pre><code class="code">type family XRec p (f :: * -> *) = r | r -> p f
type instance XRec (GhcPass p) f = Located (f (GhcPass p))
type LPat p = XRec p Pat
data Pat p
= ...
| LazyPat (XLazyPat p) (LPat p)
| ...
| XPat (XXPat p)
type instance XXPat (GhcPass _) = NoExtCon
</code></pre>
Thus for GHC, ping-pong style is restored and applications other than GHC can define the <code class="code">XRec</code> instance as simply <code class="code">f p</code> so that locations are absent.
</p>
<p>In practical terms, going from 8.8 to 8.10 <code class="code">LL</code> becomes <code class="code">L</code>, <code class="code">dL -></code> is removed and <code class="code">cL</code> is just <code class="code">L</code>.
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-65517900703424370552019-08-11T11:43:00.000-04:002019-08-11T12:55:15.055-04:00Partitions of a set<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Calculating the partitions of a set</title>
</head>
<body>
<p>Having "solved" a bunch of these divide & conquer problems, I'm the first to admit to having being lulled into a false sense of security. At first glance, the problem of this post seemed deceptively simple and consequently I struggled with it, sort of "hand-waving", not really engaging my brain and getting more and more frustrated how the dang thing wouldn't yield to my experience! I think the moral of the story is math doesn't care about your previous successes and so don't let your past practice trick you into laziness. Be guided by your experience but fully apply yourself to the problem at hand!</p>
<p>Suppose a set of two elements <em>{2, 3}</em>. There are only two ways it can be partitioned: <em>(23), (3)(2)</em>. For meaning, you might think of these two partitions like this : in the first partition, there is a connection between the elements <em>2</em> and <em>3</em>, in the second, <em>2</em> and <em>3</em> are isolated from each other.</p>
<p>Suppose a set of elements <em>{1, 2, 3}</em>. There are five partitions of this set : <em>(123), (23)(1), (13)(2), (3)(21), (3)(2)(1)</em> (I've carefully written them out this way to help with the elucidation). Maybe you want to break here and see if you can write an algorithm for calculating them before reading on?
</p>
<p>Observe that we can get the partitions of <em>{1, 2, 3}</em> from knowledge of the partitions of <em>{2, 3}</em> by looking at each partition of <em>{2, 3}</em> in turn and considering the partitions that would result by inclusion of the element <em>1</em>. So, for example, the partition <em>(23)</em> gives rise to the partitions <em>(123)</em> and <em>(23)(1)</em>. Similarly, the partition <em>(3)(2)</em> gives rise to the partitions <em>(13)(2)</em>, <em>(3)(21)</em> and <em>(3)(2)(1)</em>. We might characterize this process as computing new partitions of <em>{1, 2, 3}</em> from a partition <em>p</em> of <em>{2, 3}</em> as "extending" <em>p</em> .
</p>
<p>Suppose then we write a function <code class="code">extend x p</code> to capture the above idea. Let's start with the signature of <code class="code">extend</code>. What would it be? Taking <em>(23)(1)</em> as an exemplar, we see that a component of a partition can be represented as <code class="code">[a]</code> and so a partition itself then as <code class="code">[[a]]</code>. We know that <code class="code">extend</code> takes an element and a partition and returns a list of (new) partitions so it must have signature <code class="code">extend :: a -> [[a]] -> [[[a]]]</code> (yes, lists of lists of lists are somehow easy to get confused about).
</p>
<p>Now for writing the body of <code class="code">extend</code>. The base case is the easiest of course - extending the empty partition:
<pre><code class="code">extend x [] = [[[x]]]
</code></pre>
That is, a singleton list of partitions where that one partition has one component. The inductive case is the partition obtained by "pushing" <code class="code">x</code> into the first component of <code class="code">p</code> together with the extensions that leave the first component of <code class="code">p</code> alone.
<pre><code class="code">extend x (h : tl) = ((x : h) : tl) : map (h :) (extend x tl)
</code></pre>
</p>
<p>We can now phrase the function <code class="code">partition</code> with signature <code class="code">partition :: [a] -> [[[a]]]</code> like this:
<pre><code class="code">partition [] = [[]]
partition (h : tl) = concatMap (extend h) (partition tl)
</code></pre>
The base case says, the only partition of the empty set is the the empty partition.
</p>
<p>Wrapping it all up, the algorithm in entirety is
<pre><code class="code">partition :: [a] -> [[[a]]]
partition [] = [[]]
partition (h : tl) = concatMap (extend h) (partition tl)
where
extend :: a -> [[a]] -> [[[a]]]
extend x [] = [[[x]]]
extend x (h : tl) = ((x : h) : tl) : map (h :) (extend x tl)
</code></pre>
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-68872087512405501262019-06-29T14:40:00.002-04:002019-06-29T18:47:22.165-04:00Build GHC with stack and hadrian<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Building GHC with stack and hadrian</title>
</head>
<body>
<p>
By far the easiest way I know of to get a build of GHC is via the tools 'stack' and 'hadrian'<super>*</super>. The procedures below set out commands that I know first hand work<super>**</super> with machines provisioned by the CI systems Azure, Travis and Appveyor.
</p>
<h3>Setup</h3>
<ul><li>Ubuntu:
<pre><code class="code">curl -sSL https://get.haskellstack.org/ | sh
stack setup
</code></pre></li>
<li>macOS:
<pre><code class="code">/usr/bin/ruby -e \
"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install autoconf automake gmp
curl -sSL https://get.haskellstack.org/ | sh
stack setup
</code></pre></li>
<li>Windows:
<pre><code class="code">curl -sSL https://get.haskellstack.org/ | sh
stack setup
stack exec -- pacman -S autoconf automake-wrapper make patch python tar \
--noconfirm
</code></pre></li>
</ul>
<h3>Build</h3>
<ul><li>Ubuntu & macOS:
<pre><code class="code">git clone --recursive https://gitlab.haskell.org/ghc/ghc.git
cd ghc
hadrian/build.stack.sh --configure --flavour=quickest -j
</code></pre></li>
<li>Windows:
<pre><code class="code">git clone --recursive https://gitlab.haskell.org/ghc/ghc.git
cd ghc
hadrian/build.stack.bat --configure --flavour=quickest -j
</code></pre></li>
</ul>
<br/>
<hr/>
[*] The simplicitly and uniformity of these commands make me an advocate of these tools and in particular, the hadrian <code>--configure</code> flag.
<br/>
<br/>
[**] Well, that is to say mostly work. The above is the ideal and has worked me for me reliably for the last year. Recently though, for one reason or another, there seem to have been a lot of breakages. Your mileage may vary.
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-16927029071341867692019-06-28T13:58:00.002-04:002019-06-28T14:11:38.035-04:00Harvesting annotations from the GHC parser<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Harvesting annotations from the GHC parser</title>
</head>
<body>
<p>My <a href="http://blog.shaynefletcher.org/2019/06/have-ghc-parsing-respect-dynamic-pragmas.html">last post</a> on parsing in the presence of dynamic pragmas left us with this outline for calling the GHC parser.
<pre><code class="code"> flags <-
parsePragmasIntoDynFlags
(defaultDynFlags fakeSettings fakeLlvmConfig) file s
whenJust flags $ \flags ->
case parse file flags s of
PFailed s ->
report flags $ snd (getMessages s flags)
POk s m -> do
let (wrns, errs) = getMessages s flags
report flags wrns
report flags errs
when (null errs) $ analyzeModule flags m
</code></pre>
</p>
<p>Now, it's a fact that you'll not find in a GHC parse tree certain things like comments and the location of keywords (e.g. <code class="code">let</code>, <code class="code">in</code> and so on). Certainly, if you're writing refactoring tools (think programs like <a href="http://neilmitchell.blogspot.com/">Neil Mitchell</a>'s awesome <a href="https://github.com/ndmitchell/hlint#readme"><code>hlint</code></a> for example), access to these things is critical!
</p>
<p>So, how does one go about getting these program "annotations"? You guessed it... there's an API for that.
</p>
<p>If we assume the existence of a function <code class="code">analyzeModule :: DynFlags -> Located (HsModule GhcPs) -> ApiAnns -> IO ()</code> then, here's the gist of the code that exercises it:
<pre><code class="code"> POk s m -> do
let (wrns, errs) = getMessages s flags
report flags wrns
report flags errs
when (null errs) $ analyzeModule flags m (harvestAnns s)
</code></pre>
Here <code class="code">harvestAnns</code> is defined as
<pre><code class="code"> harvestAnns pst =
( Map.fromListWith (++) $ annotations pst
, Map.fromList ((noSrcSpan, comment_q pst) : annotations_comments pst)
)
</code></pre>
</p>
<p>
The type <a href="https://hackage.haskell.org/package/ghc-8.6.5/docs/ApiAnnotation.html#t:ApiAnns"><code class="code">ApiAnns</code></a> is a pair of maps : the first map contains keyword and punctuation locations, the second maps locations of comments to their values.
</p>
<p>You might think that's the end of this story but there's one twist left : the GHC lexer won't harvest comments by default - you have to tell it to do so by means of the <code class="code">Opt_KeepRawTokenStream</code> (general) flag (see the <a href="https://gitlab.haskell.org/ghc/ghc/wikis/api-annotations">GHC wiki</a> for details)!
</p>
<p>Taking the above into account, to parse with comments, the outline now becomes:
<pre><code class="code"> flags <-
parsePragmasIntoDynFlags
(defaultDynFlags fakeSettings fakeLlvmConfig) file s
whenJust flags $ \flags ->
case parse file (flags `gopt_set` Opt_KeepRawTokenStream)s of
PFailed s ->
report flags $ snd (getMessages s flags)
POk s m -> do
let (wrns, errs) = getMessages s flags
report flags wrns
report flags errs
when (null errs) $ analyzeModule flags m (harvestAnns s)
</code></pre>
</p>
<p>For a complete program demonstrating all of this see <a href="https://raw.githubusercontent.com/digital-asset/ghc-lib/master/examples/mini-hlint/src/Main.hs">this example</a> in the <a href="https://github.com/digital-asset/ghc-lib"><code>ghc-lib</code> repo</a>.
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-44859499828421860762019-06-02T11:57:00.000-04:002020-04-30T14:43:02.946-04:00Have GHC parsing respect dynamic pragmas<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Have GHC parsing respect dynamic pragmas</title>
</head>
<body>
<p>This post about <a href="https://blog.shaynefletcher.org/2019/05/handling-ghc-parser-errors-right.html">Handling GHC parse errors</a> shows that using <code class="code">qualified</code> in postpostive position is a syntax error unless the <a href="https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0049-module-qualified-syntax.rst"><code class="code">ImportQualifiedPost</code></a> language extension is enabled. In that post, it is explained that the program
<pre><code class="code">module M where
import Data.List qualified
</code></pre>
is invalid whereas,
<pre><code class="code">{#- LANGUAGE ImportQualifiedPost -#}
module M where
import Data.List qualified
</code></pre>
which enables the extension via a "dynamic pragma", is legit.
</p>
<p>
Perhaps surprisingly, running the second of these programs through the parsing code presented in that post continues to generate the error
<pre><code class="code"> Found `qualified' in postpositive position.
To allow this, enable language extension 'ImportQualifiedPost'
</code></pre>
Evidently, our parse-fu needs an upgrade to respect dynamic pragmas and that's what this post provides.
</p>
<p>This code exercises the GHC API to parse a module.
<pre><code class="code">parse :: String -> DynFlags -> String -> ParseResult (Located (HsModule GhcPs))
parse filename flags str =
unP Parser.parseModule parseState
where
location = mkRealSrcLoc (mkFastString filename) 1 1
buffer = stringToStringBuffer str
parseState = mkPState flags buffer location
</code></pre>
</p>
</p>
Note in the above, the second argument <code class="code">flags :: DynFlags</code>. In order for <code class="code">parse</code> to take into account extensions enabled by pragmas in the source argument <code class="code">str</code>, then <code class="code">flags</code> must be set up to do so <em>a priori</em>. That is, before jumping into <code class="code">parse</code>, a "first pass" must be made to sniff out flags. There is a GHC API for that. It's called <code class="code">parseDynamicFilePragma</code>.
</p>
<p>Here's a function to harvest flags from pragmas that makes that call to <code class="code">parseDynamicFilePragma</code>.
<pre><code class="code">parsePragmasIntoDynFlags :: DynFlags -> FilePath -> String -> IO (Maybe DynFlags)
parsePragmasIntoDynFlags flags filepath str =
catchErrors $ do
let opts = getOptions flags (stringToStringBuffer str) filepath
(flags, _, _) <- parseDynamicFilePragma flags opts
return $ Just flags
where
catchErrors :: IO (Maybe DynFlags) -> IO (Maybe DynFlags)
catchErrors act = handleGhcException reportErr
(handleSourceError reportErr act)
reportErr e = do putStrLn $ "error : " ++ show e; return Nothing
</code></pre>
The main contribution of this function is to account for the complication that <code class="code">parseDynamicFilePragma</code> can throw two kinds of exceptions : <a href="https://hackage.haskell.org/package/ghc-8.6.5/docs/GHC.html#t:GhcException"><code class="code">GhcException</code></a> and <a href="https://hackage.haskell.org/package/ghc-8.6.5/docs/HscTypes.html#t:SourceError"><code class="code">SourceError</code></a>. The GHC API functions <a href="https://hackage.haskell.org/package/ghc/docs/Panic.html#v:handleGhcException"><code class="code">handleGhcException</code></a> and <a href="https://hackage.haskell.org/package/ghc/docs/GHC.html#v:handleSourceError"><code class="code">handleSourceError</code></a> are the means to achieve that.
</p>
<p>Putting it all together then, here's an outline of how to parse in the presence of dynamic pragmas.
<pre><code class="code"> s <- readFile' file
flags <-
parsePragmasIntoDynFlags
(defaultDynFlags fakeSettings fakeLlvmConfig) file s
whenJust flags $ \flags ->
case parse file flags s of
PFailed s ->
report flags $ snd (getMessages s flags)
POk s m -> do
let (wrns, errs) = getMessages s flags
report flags wrns
report flags errs
when (null errs) $ analyzeModule flags m
</code></pre>
For a complete working program that utilizes this function, see <a href="https://raw.githubusercontent.com/digital-asset/ghc-lib/master/examples/mini-hlint/src/Main.hs">this example</a> in the <a href="https://github.com/digital-asset/ghc-lib"><code class="code">ghc-lib</code> repo</a>.
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-8000900326029022552019-05-10T22:03:00.000-04:002019-06-02T11:53:28.223-04:00Handling GHC parser errors right<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Handling GHC parser errors right</title>
</head>
<body>
<p>Did you know, a <code>POk</code> parse result from the GHC parser doesn't necessarily mean the parse was OK? This blog explains what's up with that. The source code below is from <a href="https://raw.githubusercontent.com/digital-asset/ghc-lib/master/examples/mini-hlint/src/Main.hs">this example</a> in the <a href="https://github.com/digital-asset/ghc-lib"><code>ghc-lib</code> repo</a>.
</p>
<p>Here is code that tries to make a parse tree of a Haskell module.
<pre><code class="code">parse :: String -> DynFlags -> String -> ParseResult (Located (HsModule GhcPs))
parse filename flags str =
unP Parser.parseModule parseState
where
location = mkRealSrcLoc (mkFastString filename) 1 1
buffer = stringToStringBuffer str
parseState = mkPState flags buffer location
</code></pre>
</p>
<p>The way to call the above code is like this.
<pre><code class="code">case parse file flags s of
PFailed s ->
report flags $ snd (getMessages s flags)
POk s m -> do
report flags $ fst (getMessages s flags)
analyzeModule flags m
</code></pre>
In the <code>PFailed s</code> case (where <code>s</code> is the parse state), the expression <code>snd (getMessages s flags)</code> retrieves the errors and we report them. In the <code>POk</code> case, we report warnings and do whatever it is we wanted to do with the parse tree <code>m</code> right?
</p>
<p>Not quite. The problem is that the parser produces two sorts of errors : "fatal" and "non-fatal". Thus far, we have only considered the "fatal" ones.
</p>
<p>Fatal errors are such that production of a parse tree is impossible. Non-fatal parse errors are those that don't prevent construction of a parse tree. A parse that generates non-fatal errors is going to associate with a parse tree in some way non-conforming to the Haskell language specification.
</p>
<p>The right way to write the <code>POk</code> case is like this.
<pre><code class="code">POk s m -> do
let (warns, errs) = getMessages s flags
report flags warns
report flags errs
when (null errs) $ analyzeModule flags m
</code></pre>
The key point is <code>analyzeModule</code> is called only if there are absolutely no parse errors at all.
</p>
<p>A non-fatal error example is provided by the <a href="https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0049-module-qualified-syntax.rst"><code>ImportQualifiedPost</code> language extension</a> (see <a href="https://blog.shaynefletcher.org/2019/02/adding-ghc-language-extension.html">this post</a> for how to add a GHC language extension). Specifically, it is only legal to write <code>import M qualified</code> if the extension is in effect via pragma or the option <code>-XImportQualifiedPost</code>. In the event this syntax is used when the extension is not in effect, the user should see an error like
<pre> test/MiniHlintTest_non_fatal_error.hs:6:18: error:
Found `qualified' in postpositive position.
To allow this, enable language extension 'ImportQualifiedPost'
</pre>
and further analysis of the parse abandoned.
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-88534136517428811612019-04-07T09:20:00.000-04:002019-04-07T12:13:52.655-04:00Announcing ghc-lib 0.20190404<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Announcing ghc-lib 0.20190404</title>
</head>
<body>
<p>On behalf of <a href="https://www.digitalasset.com/">Digital Asset</a> I am excited to share with you the latest release of <code>ghc-lib</code>.</p>
<p>As described in detail in the <code>ghc-lib</code> <a href="https://github.com/digital-asset/ghc-lib/blob/master/README.md">README</a>, the <code>ghc-lib</code> project lets you use the GHC API for parsing and analyzing Haskell code without tying you to a specific GHC version.
</p>
<h2>What's new</h2>
<p>The GHC <a href="https://gitlab.haskell.org/ghc">source code</a> in this release is updated to GHC HEAD as of April the 4<sup>th</sup>, 2019. Accordingly, the <a href="https://raw.githubusercontent.com/digital-asset/ghc-lib/master/examples/mini-hlint/src/Main.hs"><code>mini-hlint</code></a> example in the <a href="https://github.com/digital-asset/ghc-lib"><code>ghc-lib</code></a> repo was adjusted to accomodate GHC API changes to the <code class="code">ParseResult</code> datatype and parser error handling.</p>
<p>By far the biggest change though is this : the <code>ghc-lib</code> project now provides two packages, <code>ghc-lib-parser</code> and <code>ghc-lib</code>. The packages are released on Hackage, and can be installed as usual e.g. <code>cabal install ghc-lib</code>.
</p>
<p>Some projects don't require the ability to compile Haskell to GHC's <a href="https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/CoreSynType">Core language</a>. If lexical analysis alone is sufficient for your project's needs, then the <code>ghc-lib-parser</code> package alone will do for that. The build time for <code>ghc-lib-parser</code> is roughly half of the combined build times of <code>ghc-lib-parser</code> and <code>ghc-lib</code>. That is, in this case, switching to the new release will decrease the build time for your project. Note that if your project does require compiling Haskell to Core, then your project will now depend on both the <code>ghc-lib-parser</code> and <code>ghc-lib</code> packages.</p>
<p>The <code>ghc-lib</code> package "re-exports" the modules of the <code>ghc-lib-parser</code> package. So, if you depend upon the <code>ghc-lib</code> package, you'll get the <code>ghc-lib-parser</code> modules "for free". Sadly though, at this time, package import syntax (and we do recommend using package import syntax for these packages) doesn't quite work like you'd expect so that if you, <code class="code">import "ghc-lib" DynFlags</code> for example, this will fail because <code class="code">DynFlags</code> is in fact in the <code>ghc-lib-parser</code> package. In this case, you'd write, <code class="code">import "ghc-lib-parser" DynFlags</code> and all would be well. The <a href="https://raw.githubusercontent.com/digital-asset/ghc-lib/master/examples/mini-compile/src/Main.hs"></code>mini-compile</code></a> example in the <a href="https://github.com/digital-asset/ghc-lib"><code>ghc-lib</code></a> repo demonstrates mixing modules from both packages.
</p>
<p><a href="https://www.digitalasset.com/">Digital Asset</a> make extensive use of the <code>ghc-lib</code> packages in the <a href="https://daml.com/">DAML</a> smart contract language compiler and hope you continue to benefit from this project too!
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-82670883908875502792019-03-20T13:42:00.000-04:002019-04-06T12:10:34.544-04:00Bush fixing Travis and GitLab<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Bush fixing Travis and CI</title>
</head>
<body>
<p>Ever had one of those days?
<div class="separator" style="clear: both; text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLsuV7dNG6n6sltBDIAljCVt1UmHzennQrn3zXVE_P1ku93m724Ksd0LwwH7K25IpakT6NYNAg2_Kv0P1IkYfSmTl67OM74NxTzIO80R_czI2OLNp9SjlKAMtBb3XyXzY4extDkDuidp7H/s1600/commit-history.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLsuV7dNG6n6sltBDIAljCVt1UmHzennQrn3zXVE_P1ku93m724Ksd0LwwH7K25IpakT6NYNAg2_Kv0P1IkYfSmTl67OM74NxTzIO80R_czI2OLNp9SjlKAMtBb3XyXzY4extDkDuidp7H/s1600/commit-history.png" data-original-width="547" data-original-height="781" /></a></div>
You are not alone!
</p>
<p>This Saturday 9th March 2019, the GHC devs are going to announce that git://git.haskell.org/ghc.git has been decommissioned. The new official upstream GHC will be https://gitlab.haskell.org/ghc/ghc.
</p>
<p>Sadly (for us) this broke <a href="https://github.com/digital-asset/ghc-lib"><code>ghc-lib</code></a> CI's Travis linux configuration.
</p>
<p>What does our CI do? The <a href="https://github.com/digital-asset/ghc-lib/blob/master/CI.hs"><code>ghc-lib</code> CI script</a> pulls down the latest GHC sources and builds and tests them as a ghc-lib. The details of the problem are that Travis gives you a broken Ubuntu where cloning the official URL fails with a TLS “handshake error”. More generally, any Travis job that tries to <code>git clone</code> over the https protocol from a GitLab remote will fail the same way.
</p>
<p>This <a href="https://github.com/digital-asset/ghc-lib/blob/3f128afd4198a3f2434610a01d8cdc9cab76d3a0/.travis.yml">.travis.yml</a> shows a workaround. The idea is to spin up a container before install that doesn’t have this problem and clone from there. The essential bits are:
<pre><code class="code">services:
- docker
# [Why we git clone on linux here]
# At this time, `git clone https://gitlab.haskell.org/ghc/ghc.git`
# from within `CI.hs` does not work on on linux. This appears to be a
# known Travis/ubuntu SSL verification issue. We've tried many less
# drastic workarounds. This grand hack is the only way we've found so
# far that can be made to work.
before_install:
- |
if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then
docker pull alpine/git
docker run -ti --rm -v ${HOME}:/root -v $(pwd):/git \
alpine/git clone https://gitlab.haskell.org/ghc/ghc.git /git/ghc --recursive
fi
</code></pre>
</p>
<p>Note, MacOS docker services aren’t supported but that’s OK! The TLS handshake problem doesn’t exhibit in that configuration.
</p>
<p>Update : It turns out that while this issue exists in Ubuntu 14.04 which Travis uses by default, it is “fixed” in Ubuntu 16.04. So by writing <code class="copde">dist: xenial</code> in your <code>.travis.yml</code> file, the above workaround can be avoided.
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-80793838293812273682019-02-23T14:28:00.000-05:002019-04-06T12:10:47.626-04:00Adding a GHC Language Extension<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Adding a GHC Language Extension</title>
</head>
<body>
<p>This note summarizes the essential mechanics of adding a new language extension to GHC. The example code will illustrate adding a <code class="code">Foo</code> extension.
</p>
<h2>Implementing the extension</h2>
<p>The first step is to add a <code class="code">Foo</code> constructor to the <code class="code">Extension</code> type in <code>libraries/ghc-boot-th/GHC/LanguageExtensions/Type.hs</code>.
<pre><code class="code">data Extension
= Cpp
| OverlappingInstances
...
| Foo
</code></pre>
</p>
<p>The next job is to extend <code class="code">xFlagsDeps</code> in <code>compiler/main/DynFlags.hs</code>.
<pre><code class="code">xFlagsDeps = [
flagSpec "AllowAmbiguousTypes" LangExt.AllowAmbiguousTypes,
...
flagSpec "Foo" LangExt.Foo
]</code></pre>
</p>
<p>That's all it takes. With these two changes, it is now possible to enable <code class="code">Foo</code> in Haskell source files by writing <code class="code">{-# LANGUAGE Foo #-}</code> or from a compile command by passing the argument <code class="code">-XFoo</code>.
</p>
<h2>Testing for the extension</h2>
<h3>Lexer</h3>
<p>In <code>compiler/parser/Lexer.x</code>, locate <code class="code">data ExtBits</code> and add a constructor for <code class="code">Foo</code>.
<pre><code class="code">data ExtBits
= FfiBit
| ...
| FooBit
</code></pre>
Next, extend the <code class="code">where</code> clause of function <code class="code">mkParserFlags'</code> with a case for <code class="code">Foo</code>.
<pre><code class="code">langExtBits =
FfiBit `xoptBit` LangExt.ForeignFunctionInterface
.|. InterruptibleFfiBit `xoptBit` LangExt.InterruptibleFFI
...
.|. FooBit `xoptBit` LangExt.FooBit
</code>
</pre>The function <code class="code">xtest</code> is then the basic building block for testing if <code class="code">Foo</code> is enabled. For example, this specific function tests a bitmap for the on/off status of the <code class="code">Foo</code> bit.
<pre><code class="code">fooEnabled :: ExtsBitMap -> Bool
fooEnabled = xtest FooBit
</code></pre>In practice, testing for a language extension in the lexer is called from a function computing a lexer action. Suppose <code class="code">foo</code> to be such a function and the action it computes depends somehow on whether the <code class="code">Foo</code> language extension is in effect. Putting it all together, schematically it will have the following form.
<pre><code class="code">foo :: (FastString -> Token) -> Action
foo con span buf len = do
exts <- getExts
if FooBit `xtest` exts then
...
else
...
</code></pre>
</p>
<h3>Parser</h3>
<p>This utility computes a monadic expression testing for the on/off state of a bit in a parser state monad.
<pre><code class="code">extension :: (ExtsBitmap -> Bool) -> P Bool
extension p = P $ \s -> POk s (p $! (pExtsBitmap . options) s)
</code></pre>An expression of this kind can be evaluated in the semantic action of a parse rule in <code>compiler/parser/Parser.y</code>. Here's an example of how one might be used.
<pre><code class="code">foo :: { () }
: 'foo' {}
| {- empty -} {% do
foo_required <- extension fooEnabled
when foo_required $ do
loc <- fileSrcSpan
parseErrorSDoc loc $ text "Missing foo"
}
</code></pre>
</p>
<h3>Renaming, type-checking and de-sugaring</h3>
<p>All of renaming, typechecking and desurgaring occur in the contexts of <code class="code">TcRnIf _ _</code> monads. Function <code class="code">xoptM :: Extension -> TcRnIf gbl lcl Bool</code> is provided for extension testing in such contexts. Here's a schematic of how such a test might be used in a renaming function.
<pre><code class="code">import GHC.LanguageExtensions
updateFoos :: [AvailInfo] -> RnM (TcGlbEnv, TcLclEnv)
updateFoos info = do
(globals, locals) <- getEnvs
opt_Foo <- xoptM Foo
if not opt_Foo then
return (globals, locals)
else
let globals' = ...
locals' = ...
return (globals', locals')
</code></pre>
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-47196152031833964942018-06-10T14:29:00.000-04:002018-06-10T14:29:22.494-04:00Bucket Sort<html><head>
<style>.keyword { font-weight : bold ; color : Red }
.keywordsign { color : #C04600 }
.comment { color : Green }
.constructor { color : Blue }
.type { color : #5C6585 }
.string { color : Maroon }
.warning { color : Red ; font-weight : bold }
.info { margin-left : 3em; margin-right: 3em }
.param_info { margin-top: 4px; margin-left : 3em; margin-right : 3em }
.code { color : #465F91 ; background-color: #F5F5F5; }
pre { margin-bottom: 4px; font-family: monospace; background-color: #F5F5F5; }
pre.verbatim, pre.codepre { }</style>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<title>Bucket Sort</title>
</head>
<body>
<p>
Bucket sort assumes input generated by a random process that distributes elements uniformly over the interval <i>[0, 1)</i>.
</p>
<p>
The idea of bucket sort is to divide <i>[0, 1)</i> into <i>n</i> equal-sized subintervals or <i>buckets</i>, and then distribute the <i>n</i> input numbers into the buckets. To produce the output, sort the numbers in each bucket and then go through the buckets in order. Sorting a bucket can be done with insertion sort.
</p><pre><code class="code"><span class="keyword">let</span> <span class="keyword">rec</span> insert x = <span class="keyword">function</span>
<span class="keywordsign">|</span> [] <span class="keywordsign">-></span> [x]
<span class="keywordsign">|</span> h :: tl <span class="keyword">as</span> ls <span class="keywordsign">-></span>
<span class="keyword">if</span> x < h <span class="keyword">then</span> x :: ls <span class="keyword">else</span> h :: insert x tl
<span class="keyword">let</span> <span class="keyword">rec</span> insertion_sort = <span class="keyword">function</span>
<span class="keywordsign">|</span> [] <span class="keywordsign">|</span> [_] <span class="keyword">as</span> ls <span class="keywordsign">-></span> ls
<span class="keywordsign">|</span> h :: tl <span class="keywordsign">-></span> insert h (insertion_sort tl)
</code></pre>
<p></p>
<p>
</p><p>This code for bucket sort assumes the input is an <i>n</i>-element array <code class="code">a</code> and that each element <i>0 ≤ <code class="code">a.(i)</code> < 1</i>. The code requires an auxillary array <code class="code">b</code>.(<i>0 .. n - 1</i>) of lists (buckets).
</p>
<pre><code class="code"><span class="keyword">let</span> bucket_sort a =
<span class="keyword">let</span> n = <span class="constructor">Array</span>.length a <span class="keyword">in</span>
<span class="keyword">let</span> b = <span class="constructor">Array</span>.make n [] <span class="keyword">in</span>
<span class="constructor">Array</span>.iter
(<span class="keyword">fun</span> x <span class="keywordsign">-></span>
<span class="keyword">let</span> i =
int_of_float (
floor (float_of_int n *. x)
) <span class="keyword">in</span>
<span class="constructor">Array</span>.set b i (x :: <span class="constructor">Array</span>.get b i)
) a;
<span class="constructor">Array</span>.iteri
(<span class="keyword">fun</span> i l <span class="keywordsign">-></span>
<span class="constructor">Array</span>.set b i (insertion_sort l)
) b;
<span class="constructor">Array</span>.fold_left (<span class="keyword">fun</span> acc bucket <span class="keywordsign">-></span> acc @ bucket) [] b
;;
bucket_sort [| 0.78; 0.17; 0.39; 0.26; 0.72; 0.94
; 0.21; 0.12; 0.23; 0.68|]
</code></pre>
Bucket sort runs in linear time on the average.
<p></p>
<p>
</p><hr>
<p>
References:<br>
[1] "Introduction to Algorithms" Section 9.4:Bucket Sort -- Cormen et. al. (Second ed.) 2001.<br>
</p>
</body></html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-62977132280157040172018-05-20T14:26:00.000-04:002018-06-10T12:59:33.476-04:00Dijkstra's algorithm<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.keyword { font-weight : bold ; color : Red }
.keywordsign { color : #C04600 }
.comment { color : Green }
.constructor { color : Blue }
.type { color : #5C6585 }
.string { color : Maroon }
.warning { color : Red ; font-weight : bold }
.info { margin-left : 3em; margin-right: 3em }
.param_info { margin-top: 4px; margin-left : 3em; margin-right : 3em }
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Shortest Path</title>
</head>
<body>
<p>This article assumes familiarity with Dijkstra's shortest path algorithm. For a refresher, see [1]. The code assumes <code class="code">open Core</code> is in effect and is online <a href="https://github.com/shayne-fletcher/zen/tree/master/ocaml/dijkstra">here</a>.
</p>
<p>The first part of the program organizes our thoughts about what we are setting out to compute. The signature summarizes the notion (for our purposes) of a graph definition in modular form. A module implementing this signature defines a type <code class="code">vertex_t</code> for vertices, a type <code class="code">t</code> for graphs and type <code class="code">extern_t</code> : a representation of a <code class="code">t</code> for interaction between an implemening module and its "outside world".
<pre><code class="code"><span class="keyword">module</span> <span class="keyword">type</span> <span class="constructor">Graph_sig</span> = <span class="keyword">sig</span>
<span class="keyword">type</span> vertex_t [@@deriving sexp]
<span class="keyword">type</span> t [@@deriving sexp]
<span class="keyword">type</span> extern_t
<span class="keyword">type</span> load_error = [ <span class="keywordsign">`</span><span class="constructor">Duplicate_vertex</span> <span class="keyword">of</span> vertex_t ] [@@deriving sexp]
<span class="keyword">exception</span> <span class="constructor">Load_error</span> <span class="keyword">of</span> load_error [@@deriving sexp]
<span class="keyword">val</span> of_adjacency : extern_t <span class="keywordsign">-></span> [ <span class="keywordsign">`</span><span class="constructor">Ok</span> <span class="keyword">of</span> t <span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Load_error</span> <span class="keyword">of</span> load_error ]
<span class="keyword">val</span> to_adjacency : t <span class="keywordsign">-></span> extern_t
<span class="keyword">module</span> <span class="constructor">Dijkstra</span> : <span class="keyword">sig</span>
<span class="keyword">type</span> state
<span class="keyword">type</span> error = [
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Relax</span> <span class="keyword">of</span> vertex_t
] [@@deriving sexp]
<span class="keyword">exception</span> <span class="constructor">Error</span> <span class="keyword">of</span> error [@@deriving sexp]
<span class="keyword">val</span> dijkstra : vertex_t <span class="keywordsign">-></span> t <span class="keywordsign">-></span> [ <span class="keywordsign">`</span><span class="constructor">Ok</span> <span class="keyword">of</span> state <span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Error</span> <span class="keyword">of</span> error ]
<span class="keyword">val</span> d : state <span class="keywordsign">-></span> (vertex_t * float) list
<span class="keyword">val</span> shortest_paths : state <span class="keywordsign">-></span> (vertex_t * vertex_t list) list
<span class="keyword">end</span>
<span class="keyword">end</span>
</code></pre>A realization of <code class="code">Graph_sig</code> provides "conversion" functions <code class="code">of_adjacency</code>/<code class="code">to_adjacency</code> between the types <code class="code">extern_t</code> and <code class="code">t</code> and nests a module <code class="code">Dijkstra</code>. The signature of the sub-module <code class="code">Dijkstra</code> requires concrete modules provide a type <code class="code">state</code> and an implementation of Dijkstra's algorithm in terms of the function signature <code class="code">val dijkstra : vertex_t -> t -> [ `Ok of state | `Error of error ]</code>.
</p>
<p>For reusability, the strategy for implementing graphs will be generic programming via functors over modules implementing s vertex type.</p>
<p>An implementation of the module type <code class="code">GRAPH</code> defines a module type <code class="code">VERT</code> which is required to provide a comparable type <code class="code">t</code>. It further defines a module type <code class="code">S</code> that is exactly module type <code class="code">Graph_sig</code> above. Lastly, modules of type <code class="code">GRAPH</code> provide a functor <code class="code">Make</code> that maps any module of type <code class="code">VERT</code> to new module of type <code class="code">S</code> fixing <code class="code">extern_t</code> to an adjacency list representation in terms of the native OCaml type <code class="code">'a list</code> and <code class="code">float</code> to represent weights on edges.
<pre><code class="code"><span class="keyword">module</span> <span class="keyword">type</span> <span class="constructor">GRAPH</span> = <span class="keyword">sig</span>
<span class="keyword">module</span> <span class="keyword">type</span> <span class="constructor">VERT</span> = <span class="keyword">sig</span>
<span class="keyword">type</span> t[@@deriving sexp]
<span class="keyword">include</span> <span class="constructor">Comparable</span>.<span class="constructor">S</span> <span class="keyword">with</span> <span class="keyword">type</span> t := t
<span class="keyword">end</span>
<span class="keyword">module</span> <span class="keyword">type</span> <span class="constructor">S</span> = <span class="keyword">sig</span>
<span class="keyword">include</span> <span class="constructor">Graph_sig</span>
<span class="keyword">end</span>
<span class="keyword">module</span> <span class="constructor">Make</span> : <span class="keyword">functor</span> (<span class="constructor">V</span> : <span class="constructor">VERT</span>) <span class="keywordsign">-></span>
<span class="constructor">S</span> <span class="keyword">with</span> <span class="keyword">type</span> vertex_t = <span class="constructor">V</span>.t
<span class="keyword">and</span> <span class="keyword">type</span> extern_t = (<span class="constructor">V</span>.t * (<span class="constructor">V</span>.t * float) list) list
<span class="keyword">end</span>
</code></pre>
The two module types <code class="code">Graph_sig</code> and <code class="code">GRAPH</code> together provide the specification for the program. <code class="code">module Graph</code> in the next section implements this specification.
</p>
<p>Implementation of module <code class="code">Graph</code> is in outline this.
<pre><code class="code"><span class="keyword">module</span> <span class="constructor">Graph</span> : <span class="constructor">GRAPH</span> = <span class="keyword">struct</span>
<span class="keyword">module</span> <span class="keyword">type</span> <span class="constructor">VERT</span> = <span class="keyword">sig</span>
<span class="keyword">type</span> t[@@deriving sexp]
<span class="keyword">include</span> <span class="constructor">Comparable</span>.<span class="constructor">S</span> <span class="keyword">with</span> <span class="keyword">type</span> t := t
<span class="keyword">end</span>
<span class="keyword">module</span> <span class="keyword">type</span> <span class="constructor">S</span> = <span class="keyword">sig</span>
<span class="keyword">include</span> <span class="constructor">Graph_sig</span>
<span class="keyword">end</span>
<span class="keyword">module</span> <span class="constructor">Make</span> : <span class="keyword">functor</span> (<span class="constructor">V</span> : <span class="constructor">VERT</span>) <span class="keywordsign">-></span>
<span class="constructor">S</span> <span class="keyword">with</span> <span class="keyword">type</span> vertex_t = <span class="constructor">V</span>.t
<span class="keyword">and</span> <span class="keyword">type</span> extern_t = (<span class="constructor">V</span>.t * (<span class="constructor">V</span>.t * float) list) list
=
<span class="keyword">functor</span> (<span class="constructor">V</span> : <span class="constructor">VERT</span>) <span class="keywordsign">-></span> <span class="keyword">struct</span>
...
<span class="keyword">end</span>
<span class="keyword">end</span>
</code></pre>
As per the requirements of <code class="code">GRAPH</code> the module types <code class="code">VERT</code> and <code class="code">S</code> are provided as is the functor <code class="code">Make</code>. It is the code that is ellided by the <code class="code">...</code> above in the definition of <code class="code">Make</code> that is now the focus.
</p>
<p>Modules produced by applications of <code class="code">Make</code> satisfy <code class="code">S</code>. This requires suitable definitions of types <code class="code">vertext_t</code>, <code class="code">t</code> and <code class="code">extern_t</code>. The modules <code class="code">Map</code> and <code class="code">Set</code> are available due to modules of type <code class="code">VERT</code> being comparable in their type <code class="code">t</code>.
<pre><code class="code"> <span class="keyword">module</span> <span class="constructor">Map</span> = <span class="constructor">V</span>.<span class="constructor">Map</span>
<span class="keyword">module</span> <span class="constructor">Set</span> = <span class="constructor">V</span>.<span class="constructor">Set</span>
<span class="keyword">type</span> vertex_t = <span class="constructor">V</span>.t [@@deriving sexp]
<span class="keyword">type</span> t = (vertex_t * float) list <span class="constructor">Map</span>.t [@@deriving sexp]
<span class="keyword">type</span> extern_t = (vertex_t * (vertex_t * float) list) list
<span class="keyword">type</span> load_error = [ <span class="keywordsign">`</span><span class="constructor">Duplicate_vertex</span> <span class="keyword">of</span> vertex_t ] [@@deriving sexp]
<span class="keyword">exception</span> <span class="constructor">Load_error</span> <span class="keyword">of</span> load_error [@@deriving sexp]
</code></pre>
</p>
<p>While the external representation <code class="code">extern_t</code> of graphs is chosen to be an adjacency list representation in terms of association lists, the internal representation <code class="code">t</code> is a vertex map of adjacency lists providing logarithmic loookup complexity. The conversion functions between the two representations "come for free" via module <code class="code">Map</code>.
<pre><code class="code"> <span class="keyword">let</span> to_adjacency g = <span class="constructor">Map</span>.to_alist g
<span class="keyword">let</span> of_adjacency_exn l = <span class="keyword">match</span> <span class="constructor">Map</span>.of_alist l <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Ok</span> t <span class="keywordsign">-></span> t
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Duplicate_key</span> c <span class="keywordsign">-></span> raise (<span class="constructor">Load_error</span> (<span class="keywordsign">`</span><span class="constructor">Duplicate_vertex</span> c))
<span class="keyword">let</span> of_adjacency l =
<span class="keyword">try</span>
<span class="keywordsign">`</span><span class="constructor">Ok</span> (of_adjacency_exn l)
<span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="constructor">Load_error</span> err <span class="keywordsign">-></span> <span class="keywordsign">`</span><span class="constructor">Load_error</span> err
</code></pre>
</p>
<p>At this point the "scaffolding" for Dijkstra's algorithm, that part of <code class="code">GRAPH</code> dealing with the representation of graphs is implemented.</p>
<p>The interpretation of Dijkstra's algorithm we adopt is functional : the idea is we loop over vertices relaxing their edges until all shortest paths are known. What we know on any recursive iteration of the loop is a current "state" (of the computation) and each iteration produces a new state. This next definition is the formal definition of <code class="code">type state</code>.
<pre><code class="code"> <span class="keyword">module</span> <span class="constructor">Dijkstra</span> = <span class="keyword">struct</span>
<span class="keyword">type</span> state = {
src : vertex_t
; g : t
; d : float <span class="constructor">Map</span>.t
; pred : vertex_t <span class="constructor">Map</span>.t
; s : <span class="constructor">Set</span>.t
; v_s : (vertex_t * float) <span class="constructor">Heap</span>.t
}
</code></pre>
The fields of this record are:
<ul><li><code class="code">src : vertex_t</code>, the source vertex;</li>
<li><code class="code">g : t</code>, <i>G</i> the graph;</li>
<li><code class="code">d : float Map.t</code>, <i>d</i> the shortest path weight estimates;</li>
<li><code class="code">pre : vertex_t Map.t</code>, <i>π</i> the predecessor relation;</li>
<li><code class="code">s : Set.t</code>, the set <i>S</i> of nodes for which the lower bound shortest path weight is known;</li>
<li><code class="code">v_s : (vertex_t * float) Heap.t</code>, <i>V - {S}, </i> , the set of nodes of <code class="code">g</code> for which the lower bound of the shortest path weight is not yet known ordered on their estimates.</li>
</ul>
<p>Function invocation <code class="code">init src g</code> compuates an initial state for the graph <code class="code">g</code> containing the source node <code class="code">src</code>. In the initial state, <code class="code">d</code> is everywhere <i>∞</i> except for <code class="code">src</code> which is <i>0</i>. Set <i>S</i> (i.e. <code class="code">s</code>) and the predecessor relation <i>π</i> (i.e. <code class="code">pred</code>) are empty and the set <i>V - {S}</i> (i.e. <code class="code">v_s</code>) contains all nodes.
<pre><code class="code"> <span class="keyword">let</span> init src g =
<span class="keyword">let</span> init x = <span class="keyword">match</span> <span class="constructor">V</span>.equal src x <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="keyword">true</span> <span class="keywordsign">-></span> 0.0 <span class="keywordsign">|</span> <span class="keyword">false</span> <span class="keywordsign">-></span> <span class="constructor">Float</span>.infinity <span class="keyword">in</span>
<span class="keyword">let</span> d = <span class="constructor">List</span>.fold (<span class="constructor">Map</span>.keys g) ~init:<span class="constructor">Map</span>.empty
~f:(<span class="keyword">fun</span> acc x <span class="keywordsign">-></span> <span class="constructor">Map</span>.set acc ~key:x ~data:(init x)) <span class="keyword">in</span>
{
src
; g
; s = <span class="constructor">Set</span>.empty
; d
; pred = <span class="constructor">Map</span>.empty
; v_s = <span class="constructor">Heap</span>.of_list (<span class="constructor">Map</span>.to_alist d)
~cmp:(<span class="keyword">fun</span> (_, e1) (_, e2) <span class="keywordsign">-></span> <span class="constructor">Float</span>.compare e1 e2)
}
</code></pre>
</p>
<p>Relaxing an edge <i>(u, v)</i> with weight <i>w (u, v)</i> tests whether the shortest path to <i>v</i> so far can be improved by going through <i>u</i> and if so, updating <i>d (v)</i> and <i>π (v)</i> accordingly.
<pre><code class="code"> <span class="keyword">type</span> error = [
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Relax</span> <span class="keyword">of</span> vertex_t
] [@@deriving sexp]
<span class="keyword">exception</span> <span class="constructor">Error</span> <span class="keyword">of</span> error [@@deriving sexp]
<span class="keyword">let</span> relax state (u, v, w) =
<span class="keyword">let</span> {d; pred; v_s; _} = state <span class="keyword">in</span>
<span class="keyword">let</span> dv = <span class="keyword">match</span> <span class="constructor">Map</span>.find d v <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="constructor">Some</span> dv <span class="keywordsign">-></span> dv
<span class="keywordsign">|</span> <span class="constructor">None</span> <span class="keywordsign">-></span> raise (<span class="constructor">Error</span> (<span class="keywordsign">`</span><span class="constructor">Relax</span> v)) <span class="keyword">in</span>
<span class="keyword">let</span> du = <span class="keyword">match</span> <span class="constructor">Map</span>.find d u <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="constructor">Some</span> du <span class="keywordsign">-></span> du
<span class="keywordsign">|</span> <span class="constructor">None</span> <span class="keywordsign">-></span> raise (<span class="constructor">Error</span> (<span class="keywordsign">`</span><span class="constructor">Relax</span> u)) <span class="keyword">in</span>
<span class="keyword">if</span> dv > du +. w <span class="keyword">then</span>
<span class="keyword">let</span> dv = du +. w <span class="keyword">in</span>
(<span class="keyword">match</span> <span class="constructor">Heap</span>.find_elt v_s ~f:(<span class="keyword">fun</span> (n, _) <span class="keywordsign">-></span> <span class="constructor">V</span>.equal n v) <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="constructor">Some</span> tok <span class="keywordsign">-></span> ignore (<span class="constructor">Heap</span>.update v_s tok (v, dv))
<span class="keywordsign">|</span> <span class="constructor">None</span> <span class="keywordsign">-></span> raise (<span class="constructor">Error</span> (<span class="keywordsign">`</span><span class="constructor">Relax</span> v))
);
{ state <span class="keyword">with</span>
d = <span class="constructor">Map</span>.change d v
~f:(<span class="keyword">function</span>
<span class="keywordsign">|</span> <span class="constructor">Some</span> _ <span class="keywordsign">-></span> <span class="constructor">Some</span> dv
<span class="keywordsign">|</span> <span class="constructor">None</span> <span class="keywordsign">-></span> raise (<span class="constructor">Error</span> (<span class="keywordsign">`</span><span class="constructor">Relax</span> v))
)
; pred = <span class="constructor">Map</span>.set (<span class="constructor">Map</span>.remove pred v) ~key:v ~data:u
}
<span class="keyword">else</span> state
</code></pre>
Here, relaxation can result in a linear heap update operation. A better implementation might seek to avoid that.
</p>
<p>One iteration of the body of the loop of Dijkstra's algorithm consists of the node in <i>V - {S}</i> with the least shortest path weight estimate being moved to <i>S</i> and its edges relaxed.
<pre><code class="code"> <span class="keyword">let</span> dijkstra_exn src g =
<span class="keyword">let</span> <span class="keyword">rec</span> loop ({s; v_s; _} <span class="keyword">as</span> state) =
<span class="keyword">match</span> <span class="constructor">Heap</span>.is_empty v_s <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="keyword">true</span> <span class="keywordsign">-></span> state
<span class="keywordsign">|</span> <span class="keyword">false</span> <span class="keywordsign">-></span>
<span class="keyword">let</span> u = fst (<span class="constructor">Heap</span>.pop_exn v_s) <span class="keyword">in</span>
loop (
<span class="constructor">List</span>.fold (<span class="constructor">Map</span>.find_exn g u)
~init:{ state <span class="keyword">with</span> s = <span class="constructor">Set</span>.add s u }
~f:(<span class="keyword">fun</span> state (v, w) <span class="keywordsign">-></span> relax state (u, v, w))
)
<span class="keyword">in</span> loop (init src g)
<span class="keyword">let</span> dijkstra src g =
<span class="keyword">try</span>
<span class="keywordsign">`</span><span class="constructor">Ok</span> (dijkstra_exn src g)
<span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="constructor">Error</span> err <span class="keywordsign">-></span> <span class="keywordsign">`</span><span class="constructor">Error</span> err
</code></pre>
</p>
<p>The shortest path estimates contained by a value of <code class="code">state</code> is given by the projection <code class="code">d</code>.
<pre><code class="code"> <span class="keyword">let</span> d state = <span class="constructor">Map</span>.to_alist (state.d)
</code></pre>
</p>
<p>The shortest paths themselves are easily computed as,
<pre><code class="code"> <span class="keyword">let</span> path state n =
<span class="keyword">let</span> <span class="keyword">rec</span> loop acc x =
(<span class="keyword">match</span> <span class="constructor">V</span>.equal x state.src <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="keyword">true</span> <span class="keywordsign">-></span> x :: acc
<span class="keywordsign">|</span> <span class="keyword">false</span> <span class="keywordsign">-></span> loop (x :: acc) (<span class="constructor">Map</span>.find_exn state.pred x)
) <span class="keyword">in</span>
loop [] n
<span class="keyword">let</span> shortest_paths state =
<span class="constructor">List</span>.map (<span class="constructor">Map</span>.keys state.g) ~f:(<span class="keyword">fun</span> n <span class="keywordsign">-></span> (n, path state n))
<span class="keyword">end</span>
<span class="keyword">end</span>
</code></pre>
which completes the implementation of <code class="code">Make</code>.
<p>The following program produces a concrete instance of the shortest path problem (with some evaluation output from the top-level).
<pre><code class="code"><span class="keyword">module</span> <span class="constructor">G</span> : <span class="constructor">Graph</span>.<span class="constructor">S</span> <span class="keyword">with</span>
<span class="keyword">type</span> vertex_t = char <span class="keyword">and</span> <span class="keyword">type</span> extern_t = (char * (char * float) list) list
=
<span class="constructor">Graph</span>.<span class="constructor">Make</span> (<span class="constructor">Char</span>)
<span class="keyword">let</span> g : <span class="constructor">G</span>.t =
<span class="keyword">match</span> <span class="constructor">G</span>.of_adjacency
[ <span class="string">'s'</span>, [<span class="string">'u'</span>, 3.0; <span class="string">'x'</span>, 5.0]
; <span class="string">'u'</span>, [<span class="string">'x'</span>, 2.0; <span class="string">'v'</span>, 6.0]
; <span class="string">'x'</span>, [<span class="string">'v'</span>, 4.0; <span class="string">'y'</span>, 6.0; <span class="string">'u'</span>, 1.0]
; <span class="string">'v'</span>, [<span class="string">'y'</span>, 2.0]
; <span class="string">'y'</span>, [<span class="string">'v'</span>, 7.0]
]
<span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Ok</span> g <span class="keywordsign">-></span> g
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Load_error</span> e <span class="keywordsign">-></span> failwiths <span class="string">"Graph load error : %s"</span> e <span class="constructor">G</span>.sexp_of_load_error
;;
<span class="keyword">let</span> s = <span class="keyword">match</span> (<span class="constructor">G</span>.<span class="constructor">Dijkstra</span>.dijkstra <span class="string">'s'</span> g) <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Ok</span> s <span class="keywordsign">-></span> s
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Error</span> e <span class="keywordsign">-></span> failwiths <span class="string">"Error : %s"</span> e <span class="constructor">G</span>.<span class="constructor">Dijkstra</span>.sexp_of_error
;; <span class="constructor">G</span>.<span class="constructor">Dijkstra</span>.d s
- : (char * float) list =
[(<span class="string">'s'</span>, 0.); (<span class="string">'u'</span>, 3.); (<span class="string">'v'</span>, 9.); (<span class="string">'x'</span>, 5.); (<span class="string">'y'</span>, 11.)]
;; <span class="constructor">G</span>.<span class="constructor">Dijkstra</span>.shortest_paths s
- : (char * char list) list =
[(<span class="string">'s'</span>, [<span class="string">'s'</span>]); (<span class="string">'u'</span>, [<span class="string">'s'</span>; <span class="string">'u'</span>]); (<span class="string">'v'</span>, [<span class="string">'s'</span>; <span class="string">'u'</span>; <span class="string">'v'</span>]); (<span class="string">'x'</span>, [<span class="string">'s'</span>; <span class="string">'x'</span>]);
(<span class="string">'y'</span>, [<span class="string">'s'</span>; <span class="string">'x'</span>; <span class="string">'y'</span>])]
</code></pre>
</p>
</p>
<p>
<hr/>
<p>
References:<br/>
[1] "Introduction to Algorithms" Section 24.3:Dijkstra's algorithm -- Cormen et. al. (Second ed.) 2001.<br/>
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-40360747557964843392017-12-09T10:40:00.000-05:002017-12-09T10:42:37.996-05:00How to migrate your ppx to OCaml migrate parsetree<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.keyword { font-weight : bold ; color : Red }
.keywordsign { color : #C04600 }
.comment { color : Green }
.constructor { color : Blue }
.type { color : #5C6585 }
.string { color : Maroon }
.warning { color : Red ; font-weight : bold }
.info { margin-left : 3em; margin-right: 3em }
.param_info { margin-top: 4px; margin-left : 3em; margin-right : 3em }
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }</style>
<title>OCaml migrate parse tree</title>
</head>
<body>
<h2>OCaml migrate parse tree</h2>
<p>Earlier this
year, <a href="http://blog.shaynefletcher.org/2017/05/preprocessor-extensions-for-code.html">this
blog post</a> [2] explored the implementation of a small
preprocessor extension (ppx).
</p>
<p>The code of the above article worked well enough at the time but
as written, exhibits a problem : new releases of the OCaml compiler
are generally accompanied by evolutions of the OCaml parse tree. The
effect of this is, a ppx written against a specific version of the
compiler will "break" in the presence of later releases of the
compiler. As pointed out in [3], the use of ppx's in the OCaml
eco-system these days is ubiquitous. If each new release of the
OCaml compiler required sychronized updates of each and every ppx
in <a href="http://opam.ocaml.org/">opam</a>, getting new releases
of the compiler out would soon become a near impossibilty.
</p>
<p>Mitigation of the above problem is provided by
the <a href="http://opam.ocaml.org/packages/ocaml-migrate-parsetree/"><code>ocaml-migrate-parsetree</code>
</a> library. The library provides the means to convert parsetrees
from one OCaml version to another. This allows the ppx rewriter to
write against a specific version of the parsetree and lets the
library take care of rolling parsetrees backwards and forwards in
versions as necessary. In this way, the resulting ppx is "forward
compatible" with newer OCaml versions without requiring ppx code
updates.
</p>
<p>To get the <code>ppx_id_of</code> code from the earlier blog post
usable with <code>ocaml-migrate-parsetree</code> required a couple
of small tweaks to make it OCaml 4.02.0 compatible. The changes from
the original code were slight and not of significant enough interest
to be worth presenting here. What is worth looking at is what it
then took to switch the code to
use <code>ocaml-migrate-parsetree</code>. The answer is : very
little!
<pre><code class="code"><span class="keyword">open</span> <span class="constructor">Migrate_parsetree</span>
<span class="keyword">open</span> <span class="constructor">OCaml_402</span>.<span class="constructor">Ast</span>
<span class="keyword">open</span> <span class="constructor">Ast_mapper</span>
<span class="keyword">open</span> <span class="constructor">Ast_helper</span>
<span class="keyword">open</span> <span class="constructor">Asttypes</span>
<span class="keyword">open</span> <span class="constructor">Parsetree</span>
<span class="keyword">open</span> <span class="constructor">Longident</span>
<span class="comment">(* The original ppx as written before goes here!
. . .
. . .
. . .
*)</span>
<span class="keyword">let</span> () = <span class="constructor">Driver</span>.register ~name:<span class="string">"id_of"</span> (<span class="keyword">module</span> <span class="constructor">OCaml_402</span>) id_of_mapper
</code></pre> The complete code for this article is available
online <a href="https://github.com/shayne-fletcher/zen/tree/master/ocaml/ppx_id_of/v2">here</a>
and as a bonus, includes a
minimal <a href="https://jbuilder.readthedocs.io/en/latest/"><code>jbuilder</code></a>
build system demonstrating just how well the OCaml tool-chain comes
together these days.
</p>
<hr/>
<p>
References:<br/>
[1] <a href="https://whitequark.org/blog/2014/04/16/a-guide-to-extension-points-in-ocaml/">"A
Guide to Extension Points in OCaml" -- Whitequark (blog post
2014)</a><br/>
[2] <a href="http://blog.shaynefletcher.org/2017/05/preprocessor-extensions-for-code.html">"Preprocessor
extensions for code generation" -- Shayne Fletcher (blog post
2017)</a><br/>
[3] <a href="http://rgrinberg.com/posts/extension-points-3-years-later/">"Extension
Points - 3 Years Later" -- Rudi Grinberg (blog post 2017)</a><br/>
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-26887169036709876912017-11-11T09:32:00.001-05:002017-11-12T08:24:03.780-05:00Towers of Hanoi<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.keyword { font-weight : bold ; color : Red }
.keywordsign { color : #C04600 }
.comment { color : Green }
.constructor { color : Blue }
.type { color : #5C6585 }
.string { color : Maroon }
.warning { color : Red ; font-weight : bold }
.info { margin-left : 3em; margin-right: 3em }
.param_info { margin-top: 4px; margin-left : 3em; margin-right : 3em }
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Towers of Hanoi</title>
</head>
<body>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSew9voATWbN_zIwPt1FYceiaM186QwMadfz_ISaijn5TR0lCEKpGbnnNQrfYyNfQYngBoflhI2c0QZnBwEno2nEIpwllwQwNO4QcLtx1ta7AbmE42PtRU-_zPDLQbdeGEbgiaibHiHgsf/s1600/tower_of_hanoi_fig1_600.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSew9voATWbN_zIwPt1FYceiaM186QwMadfz_ISaijn5TR0lCEKpGbnnNQrfYyNfQYngBoflhI2c0QZnBwEno2nEIpwllwQwNO4QcLtx1ta7AbmE42PtRU-_zPDLQbdeGEbgiaibHiHgsf/s320/tower_of_hanoi_fig1_600.jpg" width="320" height="159" data-original-width="600" data-original-height="298" /></a></div>
<p>
The "towers of Hanoi" problem is stated like this. There are three
pegs labelled <i>a</i>, <i>b</i> and <i>c</i>. On peg <i>a</i> there
is a stack of <i>n</i> disks of increasing size, the largest at the
bottom, each with a hole in the middle to accomodate the peg. The
problem is to transfer the stack of disks to peg <i>c</i>, one disk at
a time, in such a way as to ensure that no disk is ever placed on top
of a smaller disk.
</p>
<p>The problem is amenable to a divide and conquer strategy : "Move
the top <i>n - 1</i> disks from peg <i>a</i> to peg <i>b</i>, move the
remaining largest disk from peg <i>a</i> to peg <i>c</i> then, move
the <i>n - 1</i> disks on peg <i>b</i> to peg <i>c</i>."
</p>
<p>
<pre><code class="code"><span class="keyword">let</span> <span class="keyword">rec</span> towers n from to_ spare =
<span class="keyword">if</span> n > 0 <span class="keyword">then</span>
<span class="keyword">begin</span>
towers (n - 1) from spare to_;
<span class="constructor">Printf</span>.printf <span class="string">
"Move the top disk from peg %c to peg %c\n"</span> from to_;
towers (n - 1) spare to_ from
<span class="keyword">end</span>
<span class="keyword">else</span>
()
;;
</code></pre>
For example, the
invocation <code class="code"><span class="keyword">let</span> () =
towers
3 <span class="string">'a'</span> <span class="string">'c'</span> <span class="string">'b'</span></code>
will generate the recipie
<pre>Move the top disk from peg a to peg c
Move the top disk from peg a to peg b
Move the top disk from peg c to peg b
Move the top disk from peg a to peg c
Move the top disk from peg b to peg a
Move the top disk from peg b to peg c
Move the top disk from peg a to peg c
</pre>
</p>
<p>Let <i>T(n)</i> be the time complexity of <code>towers (x, y,
z)</code>, when the characteristic operation is the moving of a disk
from one peg to another. The time complexity of <code>towers(n - 1, x,
y z)</code> is <i>T(n - 1)</i> by definition and no further
investigation is needed. <i>T(0) = 0</i> because the test <code>n >
0</code> fails and no disks are moved. For larger <code>n</code>, the
expression <code>towers (n - 1, from, spare, to_)</code> is evaluated
with cost <i>T(n - 1)</i> followed by <code><span class="constructor">Printf</span>.printf <span class="string">"Move the top disk from peg %c to peg %c\n"</span> from to_
</code> with cost <i>1</i> and finally, <code>towers(n - 1, spare,
to_, from)</code> again with cost <i>T(n - 1)</i>.
</p>
<p>
Summing these contributions gives the recurrence relation <i>T(n) =
2T(n - 1) + 1</i> where <i>T(0) = 0</i>.
</p>
<p>Repeated substituition can be used to arrive at a closed form
for <i>T(n)</i>, since, <i>T(n) = 2T(n - 1) + 1 = 2[2T(n - 2) + 1] + 1
= 2[2[2T(n - 3) +1] + 1] + 1 = 2<sup>3</sup>T(n - 3) + 2<sup>2</sup> +
2<sup>1</sup> + 2<sup>0</sup></i> (provided <i>n ≥</i> 3),
expanding the brackets in a way that elucidates the emerging
pattern. If this substitution is repeated <i>i</i> times then clearly
the result is <i>T(n) = 2<sup>i</sup>T(n - i) + 2<sup>i - 1</sup> +
2<sup>i - 2</sup> + ··· + 2<sup>0</sup></i> (<i>n
≥ i</i>). The largest possible value <i>i</i> can take is <i>n</i>
and if <i>i = n</i> then <i>T(n - i) = T(0) = 0</i> and so we arrive
at <i>T(n) = 2<sup>n</sup>0 + 2<sup>n - 1</sup> +
··· + 2<sup>0</sup></i>. This is the sum of a
geometric series with the well known solution <i>2<sup>n</sup> - 1</i>
(use induction to establish that last result or more directly, just
compute <i>2T(n) - T(n)</i>). And so, the time complexity (the number
of disk moves needed) for <i>n</i> disks is <i>T(n) = 2<sup>n</sup> -
1</i>.
</p>
<hr/>
<p>
References:<br/>
<cite>Algorithms and Data Structures Design, Correctness, Analysis</cite> by Jeffrey Kingston, 2nd ed. 1998
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-44683410788079178402017-10-27T20:53:00.001-04:002017-10-28T10:27:28.754-04:00Nesting quoted strings in OCaml<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.keyword { font-weight : bold ; color : Red }
.keywordsign { color : #C04600 }
.comment { color : Green }
.constructor { color : Blue }
.type { color : #5C6585 }
.string { color : Maroon }
.warning { color : Red ; font-weight : bold }
.info { margin-left : 3em; margin-right: 3em }
.param_info { margin-top: 4px; margin-left : 3em; margin-right : 3em }
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>Quoting</title>
</head>
<body>
<p>
According to the lexical conventions of OCaml, characters different from <code class="code">\</code> and <code class="code">"</code> can be enclosed in single quotes and appear in strings. The special characters <code class="code">\</code> and <code class="code">"</code> are represented in these contexts by their <em>escape sequences</em>. The
escape sequence <code class="code">\\</code> denotes the character <code class="code">\</code> and <code class="code">\"</code> denotes the character <code>"</code>.
</p>
<p>Here we print the string <code class="code">"Hello world!</code>". The quotes delimit the string and are not themselves part of the string.
<pre><code class="code">utop[0]> <span class="constructor">Caml</span>.<span class="constructor">Printf</span>.printf <span class="string">"Hello world!"</span>;;
<span class="string">Hello world!</span>- : unit = ()
</code></pre>
<p>
To capture the quotes we need to write them into the string by their escape sequence.
<pre><code class="code">utop[1]> <span class="constructor">Caml</span>.<span class="constructor">Printf</span>.printf <span class="string">"\"Hello world!\""</span>;;
<span class="string">"Hello world!"</span>- : unit = ()
</code></pre>
</p>
<p>
What now if we wish to quote a string within a string?
<pre><code class="code">utop[3]> <span class="constructor">Caml</span>.<span class="constructor">Printf</span>.printf
<span class="string">"\"A quoted string with \\\"a nested quoted string\\\"\""</span>;;
<span class="string">"A quoted string with \"a nested quoted
string\""</span>- : unit = ()
</code></pre>
<p>
We see that in rendering the above string, <code class="code">printf</code> has rendered the escape sequence <code class="code">\"</code> as <code class="code">"</code> and <code class="code">\\\"</code> as <code class="code">\"</code> as required. The pattern continues if we now wish to quote a string within a quoted string within a quoted string.
<pre><code class="code">utop[4]> <span class="constructor">Caml</span>.<span class="constructor">Printf</span>.printf
<span class="string">"\"A quoted string with \\\"a nested \\\\\\\"nested\\\\\\\"
quoted string\\\"\""</span>;;
<span class="string">"A quoted string with \"a nested \\\"nested\\\"
quoted string\""</span>- : unit = ()
</code></pre>
</p>
<p>As you can see, things get crazy pretty quickly and you can easily drive yourself mad working out the correct escape sequences to get the desired nesting!
</p>
<p>Here's a hack : If the string has <i>k</i> levels of quoting, then count how many occurences of <code class="code">\</code>s precede the <code class="code">"</code> at that level. Let that number be <i>n</i> say. To get the next level of quoting you need to concatenate a sequence of <i>n + 1</i> <code class="code">\</code>s to them to get a total of <i>2n + 1</i> <code class="code">\</code>s. To illustrate, look again at the last example:
<pre><code class="code">utop[4]> <span class="constructor">Caml</span>.<span class="constructor">Printf</span>.printf
<span class="string">"\"A quoted string with \\\"a nested \\\\\\\"nested\\\\\\\"
quoted string\\\"\""</span>;;
<span class="string">"A quoted string with \"a nested \\\"nested\\\"
quoted string\""</span>- : unit = ()
</code></pre>
That's three level of quoting. At the third level we have the sequence <code class="code">\\\\\\\"</code>. That's <i>7</i> <code class="code">\</code>s. To quote to the fourth level then we need <i>8 + 7 = 15</i> <code class="code">\</code>s:
<pre><code class="code">utop[5]> <span class="constructor">Caml</span>.<span class="constructor">Printf</span>.printf
<span class="string">"\"A quoted string with \\\"a nested \\\\\\\"nested
\\\\\\\\\\\\\\\"nested\\\\\\\\\\\\\\\" \\\\\\\" quoted string\\\"\""</span>;;
<span class="string">"A quoted string with \"a nested \\\"nested
\\\\\\\"nested\\\\\\\" \\\" quoted string\""</span>- : unit = ()
</code></pre>
</p>
<p>In general, the number of <code class="code">\</code>s required for <i>n</i> levels of quoting is <i>2<sup>n</sup> - 1</i> (that is, an exponential function). The solution follows from the recurrence relation <i>Q<sub>0</sub> = 0</i> and <i>Q<sub>n</sub> = 2Q<sub>n - 1</sub> + 1</i> which in fact establishes a connection to the "Towers of Hanoi" problem.
</p>
<hr/>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-12870932115570373142017-10-14T15:20:00.001-04:002017-10-14T15:34:18.415-04:00How to render trees like the Unix tree command<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<style>
.keyword { font-weight : bold ; color : Red }
.keywordsign { color : #C04600 }
.comment { color : Green }
.constructor { color : Blue }
.type { color : #5C6585 }
.string { color : Maroon }
.warning { color : Red ; font-weight : bold }
.info { margin-left : 3em; margin-right: 3em }
.param_info { margin-top: 4px; margin-left : 3em; margin-right : 3em }
.code {
color : #465F91 ;
background-color: #F5F5F5;
}
pre {
margin-bottom: 4px;
font-family: monospace;
background-color: #F5F5F5;
}
pre.verbatim, pre.codepre { }
</style>
<title>How to render trees like Unix 'tree'</title>
</head>
<body>
<p>The Unix <a href="https://en.wikipedia.org/wiki/Tree_(Unix)"><code>tree</code></a> utility produces a pretty rendering of a filesystem. Implementing an algorithm to produce output like <code>tree</code> is a little harder than one might expect! This short example program illustrates one way of doing it.
<pre><code class="code"><span class="comment">(* A type of non-empty trees of strings. *)</span>
<span class="keyword">type</span> tree = [
<span class="keywordsign">|</span><span class="keywordsign">`</span><span class="constructor">Node</span> <span class="keyword">of</span> string * tree list
]
;;
<span class="comment">(* [print_tree tree] prints a rendering of [tree]. *)</span>
<span class="keyword">let</span> <span class="keyword">rec</span> print_tree
?(pad : (string * string)= (<span class="string">""</span>, <span class="string">""</span>))
(tree : tree) : unit =
<span class="keyword">let</span> pd, pc = pad <span class="keyword">in</span>
<span class="keyword">match</span> tree <span class="keyword">with</span>
<span class="keywordsign">|</span> <span class="keywordsign">`</span><span class="constructor">Node</span> (tag, cs) <span class="keywordsign">-></span>
<span class="constructor">Printf</span>.printf <span class="string">"%s%s\n"</span> pd tag;
<span class="keyword">let</span> n = <span class="constructor">List</span>.length cs - 1 <span class="keyword">in</span>
<span class="constructor">List</span>.iteri (
<span class="keyword">fun</span> i c <span class="keywordsign">-></span>
<span class="keyword">let</span> pad =
(pc ^ (<span class="keyword">if</span> i = n <span class="keyword">then</span> <span class="string">"`-- "</span> <span class="keyword">else</span> <span class="string">"|-- "</span>),
pc ^ (<span class="keyword">if</span> i = n <span class="keyword">then</span> <span class="string">" "</span> <span class="keyword">else</span> <span class="string">"| "</span>)) <span class="keyword">in</span>
print_tree ~pad c
) cs
;;
<span class="comment">(* An example tree. *)</span>
<span class="keyword">let</span> tree =
<span class="keywordsign">`</span><span class="constructor">Node</span> (<span class="string">"."</span>
, [
<span class="keywordsign">`</span><span class="constructor">Node</span> (<span class="string">"S"</span>, [
<span class="keywordsign">`</span><span class="constructor">Node</span> (<span class="string">"T"</span>, [
<span class="keywordsign">`</span><span class="constructor">Node</span> (<span class="string">"U"</span>, [])]);
<span class="keywordsign">`</span><span class="constructor">Node</span> (<span class="string">"V"</span>, [])])
; <span class="keywordsign">`</span><span class="constructor">Node</span> (<span class="string">"W"</span>, [])
])
;;
<span class="comment">(* Print the example tree. *)</span>
<span class="keyword">let</span> () = print_tree tree
;;
</code></pre>
</p>
<p>The output of the above looks like this:
<pre>.
|-- S
| |-- T
| | `-- U
| `-- V
`-- W</pre>
</p>
<hr/>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.comtag:blogger.com,1999:blog-5012565255225108517.post-51275232550691293342017-08-12T15:38:00.000-04:002017-08-12T15:38:54.894-04:00Transpose<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<style>
.keyword { font-weight : bold ; color : Red } .keywordsign {
color : #C04600 } .comment { color : Green } .constructor {
color : Blue } .type { color : #5C6585 } .string { color :
Maroon } .warning { color : Red ; font-weight : bold } .info {
margin-left : 3em; margin-right: 3em } .param_info { margin-top:
4px; margin-left : 3em; margin-right : 3em } .code { color :
#465F91 ; } pre { margin-bottom: 4px; font-family: monospace; }
pre.verbatim, pre.codepre { }</style>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
<script type="text/x-mathjax-config">
MathJax.Hub.Config({ extensions:
["tex2jax.js","TeX/AMSmath.js","TeX/AMSsymbols.js"], jax:
["input/TeX", "output/HTML-CSS"], tex2jax: { inlineMath: [
['$','$'], ["\\(","\\)"] ], displayMath: [ ['$$','$$'],
["\\[","\\]"] ], }, "HTML-CSS": { availableFonts: ["TeX"] } });
</script>
<title>Transpose</title>
</head>
<body>
<p>
If we are to represent a row of a matrix as a list of numbers,
then a matrix can naturally be represented as a list of lists of
numbers.
</p>
<p>The transpose of a matrix $\mathbf{A}$ is a new matrix denoted
$\mathbf{A^{T}}$. The traditional mathematical definition of
$\mathbf{A^{T}}$ is expressed as saying the $i$ th row, $j$ th
column element of $\mathbf{A^{T}}$ is the $j$ th row, $i$ th
column element of $\mathbf{A}$:
<div align="center">
$\left[\mathbf{A}\right]_{ij} = \left[\mathbf{A^{T}}\right]_{ji}$.
</div>
</p>
<p>
As definitions go, this isn't terribly helpful in
explaining <em>how</em> to compute a transpose. A better
equivalent definition for the functional programmer is : the
matrix obtained by writing the columns of $\mathbf{A}$ as the
rows of $\mathbf{A^{T}}$.
</p>
<p>
An elegant program for computing a transpose follows from a
direct translation of that last definition.
<pre><code class="code"><span class="keyword">let</span> <span class="keyword">rec</span> transpose (ls : <span class="keywordsign">'</span>a list list) : <span class="keywordsign">'</span>a list list =
<span class="keyword">match</span> ls <span class="keyword">with</span>
<span class="keywordsign">|</span> [] <span class="keywordsign">|</span> [] :: _ <span class="keywordsign">-></span> []
<span class="keywordsign">|</span> ls <span class="keywordsign">-></span> <span class="constructor">List</span>.map (<span class="constructor">List</span>.hd) ls :: transpose (<span class="constructor">List</span>.map (<span class="constructor">List</span>.tl) ls)
</code></pre>
</p>
<p>
It is not at all hard to understand how the program works when
you've seen an example:
<pre><code class="code">transpose [[1; 2]; [3; 4;]; [5; 6]]
= [1; 3; 5] :: transpose [[2]; [4;]; [6]]
= [1; 3; 5] :: [2; 4; 6] :: transpose [[]; []; []]
= [1; 3; 5] :: [2; 4; 6] :: []
= [[1; 3; 5]; [2; 4; 6]]</code></pre>
</p>
<p>
Being as pretty as it is, one is inclined to leave things be but,
as a practical matter it should be rephrased to be tail-recursive.
<pre><code class="code"><span class="keyword">let</span> <span class="keyword">rec</span> transpose (ls : <span class="keywordsign">'</span>a list list) : <span class="keywordsign">'</span>a list list =
<span class="keyword">let</span> <span class="keyword">rec</span> transpose_rec acc = <span class="keyword">function</span>
<span class="keywordsign">|</span> [] <span class="keywordsign">|</span> [] :: _ <span class="keywordsign">-></span> <span class="constructor">List</span>.rev acc
<span class="keywordsign">|</span> ls <span class="keywordsign">-></span> transpose_rec (<span class="constructor">List</span>.map (<span class="constructor">List</span>.hd) ls :: acc) (<span class="constructor">List</span>.map (<span class="constructor">List</span>.tl) ls)
<span class="keyword">in</span> transpose_rec [] ls
</code></pre>
</p>
<hr/>
<p>
References:<br/>
"An Introduction to Functional Programming Systems Using Haskell" -- Davie A J T., 1992
</p>
</body>
</html>
Shayne Fletcherhttp://www.blogger.com/profile/08359015903308240887noreply@blogger.com