I recently stumbled upon a tweet by @jmaslak about how you can turn a Dromedary camel into a Bactrian camel using Perl 6. The following code:
my $c = '🐪';
$c++;
say $c;
produces the following output: “🐫”
The reason for that is that the Unicode characters 🐪 and 🐫 have the code points U+1F42A and U+1F42B respectively, so the ++ operator moves from one to the next. (While looking at that code I also learned that ++ is not the same as += 1: if you try this, rakudo complains that 🐪 is not a valid base-10 number.)
Since I am currently in the process of learning more about both Haskell and PureScript, I decided I wanted to try and replicate that code in both languages.
In Haskell, I managed quite quickly as follows:
Prelude> import Data.Char
Prelude Data.Char> putStrLn [(chr . (+1) . ord) '🐪']
🐫
While writing this blog post, I realized that Char has an Enum type class instance as well, so the code can be made even simpler:
Prelude> putStrLn [succ '🐪']
🐫
PureScript created a bit more of a headache for me. I first tried to work with toCharCode from Data.Char, but …
PSCi, version 0.12.0
Type :? for help
> import Prelude
> import Data.Char
> toCharCode '🐪'
(line 1, column 15):
unexpected astral code point in character literal; characters must be valid UTF-16 code units
What? That kinda reminds me of an 11-year-old rant about VBScript.
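The error message hints at the reason: a code point above U+FFFF does not fit into a single 16-bit UTF-16 code unit and has to be encoded as a pair of surrogates. As a rough sketch of that arithmetic (written in Haskell, since it is the other language in this post; the helper name toSurrogates is made up for illustration and is not a library function):

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Numeric (showHex)

-- Hypothetical helper: split a code point above U+FFFF into its
-- UTF-16 high and low surrogate. Code points that fit into a single
-- 16-bit code unit yield Nothing.
toSurrogates :: Char -> Maybe (Int, Int)
toSurrogates c
  | cp < 0x10000 = Nothing
  | otherwise    = Just (high, low)
  where
    cp     = ord c
    offset = cp - 0x10000
    high   = 0xD800 + (offset `shiftR` 10)  -- top 10 bits of the offset
    low    = 0xDC00 + (offset .&. 0x3FF)    -- bottom 10 bits of the offset

main :: IO ()
main =
  case toSurrogates '\x1F42A' of
    Just (hi, lo) -> putStrLn (showHex hi "" ++ " " ++ showHex lo "")  -- d83d dc2a
    Nothing       -> putStrLn "fits into one UTF-16 code unit"
```

So the camel needs two code units. That would explain why it is rejected as a character literal (assuming a PureScript Char holds exactly one UTF-16 code unit, as the error message suggests), while a string can hold both units.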
Oh well, luckily if one knows where to dig (or whines a bit on Twitter), the Data.String.CodePoints module comes to the rescue. Equipped with this, I arrived at the following solution:
import Data.String.CodePoints (singleton, codePointAt)
import Data.Enum (succ)
import Data.Maybe (maybe)
maybe "" singleton (codePointAt 0 "🐪" >>= succ)
Wow, that looks a bit more complicated than in Haskell. OTOH, it is also safer. Let me try and explain what is happening here:
Since we still can't use a Dromedary camel in a character literal, we have to put it into a string literal (I am still somewhat confused as to why that works in string literals but not in character literals …). We can then call the codePointAt function, which has the following type:
> :t codePointAt
Int -> String -> Maybe CodePoint
So we pass it an Int (the position in the string, 0 in our case) and a String, and we get back a Maybe CodePoint. Why Maybe? Because if we ask, for example, for the code point of the second character of “🐪”, it does not exist, so codePointAt returns Nothing to signal this.
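The same idea can be sketched in Haskell. Note that charAt below is a hypothetical helper I am defining just for illustration (it is not part of base), and it works on Haskell's Char rather than on a CodePoint type:

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical helper, not a library function: the character at a
-- zero-based position, or Nothing if the position is out of range.
charAt :: Int -> String -> Maybe Char
charAt i s
  | i < 0     = Nothing
  | otherwise = listToMaybe (drop i s)

main :: IO ()
main = do
  print (charAt 0 "ab")        -- Just 'a'
  print (charAt 2 "ab")        -- Nothing
  print (charAt 1 "\x1F42A")   -- Nothing: the camel is a single Char in Haskell
```

The caller is forced to deal with the Nothing case instead of getting a runtime error for an out-of-range position.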
As a second step, we want to get the next code point from here. Luckily, CodePoint has an Enum type class instance (at least in newer versions of Data.String.CodePoints; the above code unfortunately does not work on try.purescript.org, as Phil Freeman himself pointed out). This means we can use the succ function, which has the following type:
> :t succ
forall a. Enum a => a -> Maybe a
My first attempt was to say: “OK, then I will just (f)map succ over the Maybe CodePoint returned by codePointAt 0”. But then I ended up with a double Just construct:
> succ <$> codePointAt 0 "🐪"
(Just (Just (CodePoint 0x1F42B)))
Then I remembered that I had recently read in The Haskell Book (Haskell Programming From First Principles) that this is exactly the use case for monads and the bind operator (>>=). The bind operator makes sure that we get rid of one of the layers of Maybe and does what we want:
> codePointAt 0 "🐪" >>= succ
(Just (CodePoint 0x1F42B))
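For comparison, here is a minimal Haskell sketch of the same difference between fmap and bind. Haskell's own succ returns a plain value (and fails at the upper bound), so I am assuming a hypothetical succMay that mimics the Maybe-returning succ from PureScript:

```haskell
-- Hypothetical stand-in for PureScript's succ: Nothing at the upper
-- bound instead of a runtime error.
succMay :: Char -> Maybe Char
succMay c
  | c == maxBound = Nothing
  | otherwise     = Just (succ c)

main :: IO ()
main = do
  let camel = Just '\x1F42A'
  -- fmap applies succMay inside the outer Just, so the Maybes nest
  print (fmap succMay camel)   -- Just (Just '\128043')
  -- bind applies succMay and flattens the two layers into one
  print (camel >>= succMay)    -- Just '\128043'
```

Here 128043 is simply U+1F42B in decimal, which is how show escapes non-ASCII characters.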
We now have a Maybe CodePoint, which we want to turn into a String. For this, we combine the maybe function from Data.Maybe and singleton from Data.String.CodePoints. Here are their types:
> :t maybe
forall a b. b -> (a -> b) -> Maybe a -> b
> :t singleton
CodePoint -> String
Let's start with singleton: it takes a CodePoint and gives us a String of length 1 with the character represented by that code point. The maybe function takes a default value, a function that goes from a to b, and a Maybe a value, and gives us a b value (either the default one if the Maybe a is Nothing, or the result of applying the function to the value inside the Just in the other case).
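To see both branches of maybe in action, here is a small Haskell sketch. Haskell's maybe from the Prelude has the same type; render is a name I made up, and (: []) stands in for singleton because a Haskell String is just a list of Char:

```haskell
-- Hypothetical helper: turn an optional character into a String,
-- falling back to the empty string for Nothing.
render :: Maybe Char -> String
render = maybe "" (: [])

main :: IO ()
main = do
  -- Just case: the function is applied to the wrapped value
  putStrLn (render (Just '\x1F42B'))
  -- Nothing case: the default value (the empty string) is returned
  print (render Nothing)
```

The second call uses print so that the empty string shows up visibly as "".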
If we want to combine singleton with maybe, we can figure out what the types a and b are in our specific case. For this we can use typed holes, something I recently learned about at the very nice FP Unconference BusConf 2018:
> :t maybe ?b singleton ?ma
[...]
Hole 'b' has the inferred type
String
[...]
Hole 'ma' has the inferred type
Maybe CodePoint
So b is String and a is CodePoint. Great, we just need to choose the empty string as the default value and run it, and we end up with our camel!
> maybe "" singleton (codePointAt 0 "🐪" >>= succ)
"🐫"