This blog is about Simon, a young gifted mathematician and programmer, who had to move from Amsterdam to Antwerp to be able to study at the level that fits his talent, i.e. homeschool. Visit https://simontiger.com

This is Simon’s introductory video for the World Science Scholars program (initiative of The World Science Festival). In May this year, Simon has been chosen as one of the 30 young students worldwide, joining the 2019 cohort for exceptional talents in mathematics. Most of the other students are 14 to 17 years old, age was not a factor in the selection process. To help the students and their future mentors to get to know one another, every World Science Scholar was asked to record an introductory video, no longer than 3 minutes, answering a few questions such as what is the biggest misconception about math, what your favourite branches of math and science are and who among the living mathematicians you’d like to meet.

Throughout the program, the students are given access to over a dozen unique interdisciplinary online courses and have the option to complete an applied math project, alone or as a team, consulting real experts in the field of their project. Simon has already started the first course module, on Special Relativity by Professor Brian Greene. The course has been specifically recorded for the World Science Scholars and reflects the program’s ethos: it’s self-paced, no grades, it relies on beautiful animations and visualizations, it’s full of subtle humour, is dynamic, thought-provoking and quite advanced (exactly in The Goldilocks Zone for Simon, as far as I could judge), yet broken up into easy-to-digest pieces. It’s difficult to predict how Simon’s path as a World Science Scholar will unfold (I’m afraid of making any predictions as he is extremely autodidact), but so far we have been very pleased with the nature of this program and it seems to match our non-coercive, self-directed learning style. I have especially liked one of the course’s main postulates: “Simultaneity is in the eye of the beholder”.

This amazing sentence is generated by a Markov Text-Generation Algorithm. What is a Markov Algorithm? Simply put, it generates the rules from a source text, and it generates a new text that also follows those rules. The rules are often called the Markov Blanket, and the new text is also called the Markov Chain. OK, how does this all work?

Let’s take an example: let’s consider the source text to be “Hello, world!”. Then we pick a number called the order. The higher the number, the more sense the text makes. We’ll pick 1 for the first examples, we’ll examine what happens with higher numbers later.

Then we generate the Markov Blanket. This is a deterministic process. We start from the beginning: “H”. So we put H in our Markov Blanket. Then we come across “e”. So we put e in our Markov Blanket, but to specify that it’s next from H, we connect H to e by an arrow. Then we stumble on “l”. So we put l in our Markov Blanket, but again, to specify that it’s next from e, we connect e to l by an arrow.

Now, here’s where it gets interesting. What’s next? Well, it’s “l” again. So now we connect l to itself, by an arrow. This is interesting because it’s no longer a straight line!

And we keep going. Once we’re done, our Markov Blanket will look something like this:

Once we’ve created our Markov Blanket, then we start generating the Markov Chain from it. Unlike the Markov Blanket, generating the Markov Chain is a stochastic process.

This is just a process of wandering about the Markov Blanket, and noting down where we go. One way to do this, is just to start from the beginning, and follow the path. And whenever we come across some sort of fork, we just spin a wheel to see where we go next. For example, here are some possible Markov Chains:

That was an easy one, so let’s do something more complex with code.

First, just an interface to enter in the text, and the order:

text = "" # Variable to hold the text
print("Type your text here (type END to end it):")
while True:
line = input("") # Read a line of text from standard input
if line != "END": text += line + "\n" # If we didn't enter END, add that line to the text
else: break # If we entered END, signify that the text has ended
text = text[:len(text)-1] # Remove the last line break
order = int(input('Type the order (how much it makes sense) here: '))
input("Generate me a beautiful text") # Just to make it dramatic, print this message, and ask the user to hit ENTER to proceed

Next, the Markov Blanket. Here, we store it in a dictionary, and store every possible next letter in a list:

def markov_blanket(text, order):
result = {} # The Markov Blanket
for i in range(len(text) - order + 1): # For every n-gram
ngram = ""
for off in range(order):
ngram += text[i+off]
if not ngram in result: # If we didn't see it yet
result[ngram] = []
if i < len(text) - order: # If we didn't reach the end
result[ngram].append(text[i+order]) # Add the next letter as a possibility
return result # Give the result back

Huh? What is this code?

This is what happens when we pick a number >1. Then, instead of making the Markov Blanket for every character, you make it for every couple of characters.

For example, if we pick 2, then we make the Markov Blanket for the 1st and 2nd letter, the 2nd and 3rd, the 3rd and 4th, the 4th and 5th, and so on. When we generate the Markov Chain, we squash the ngrams that we visit together. So next, the Markov Chain:

def markov_chain(blanket):
keys = blanket.keys()
ngram = random.choice(list(keys)) # Starting Point
new_text = ngram
while True:
try:
nxt = random.choice(blanket[ngram]) # Choose a next letter
new_text += nxt # Add it to the text
ngram += nxt # Add it to the ngram and remove the 1st character
ngram = ngram[1:]
except IndexError: # If we can't choose a next letter, maybe because there is none
break
return new_text # Give the result back
# Now that we know how to do the whole thing, do the whole thing!
new_text = markov_chain(markov_blanket(text, order), num)
print(new_text) # Print the new text out

OK, now let’s run this:

Type your text here (type END to end it):
A rainbow is a meteorological phenomenon that is caused by reflection, refraction and dispersion of light in water droplets resulting in a spectrum of light appearing in the sky. It takes the form of a multicoloured circular arc. Rainbows caused by sunlight always appear in the section of sky directly opposite the sun.
Rainbows can be full circles. However, the observer normally sees only an arc formed by illuminated droplets above the ground, and centered on a line from the sun to the observer's eye.
In a primary rainbow, the arc shows red on the outer part and violet on the inner side. This rainbow is caused by light being refracted when entering a droplet of water, then reflected inside on the back of the droplet and refracted again when leaving it.
In a double rainbow, a second arc is seen outside the primary arc, and has the order of its colours reversed, with red on the inner side of the arc. This is caused by the light being reflectedtwice on the inside of the droplet before leaving it.
END
Type the order (how much it makes sense) here: 5
Generate me a beautiful text

And……..it..stops.

Why did it do that?

You see, this is not such a good method. What if our program generated a Markov Blanket that didn’t have an end? Well, our program wouldn’t even get to the end, and it would just wander around and around and around, and never give us a result! Or even if it did, it would be infinite!

So how do we avoid this?

Well, we set another much bigger number , let’s say 5000, to be a callout value. If we don’t get to the end within 5000 steps, we give up and output early. Let’s run this again…

And now, it doesn’t stop anymore! Snippets of example generated text:

It takes the sun to the ground, and violet on the observer’s eye.

This rainbow, a second arc formed by illuminated droplets resulting it. In a primary rainbow is a meteorological phenomenon the back of the ground, and has the sky. It takes the order of its coloured circles. However, the sun.

Rainbow, a second arc shows red on a line from the section of light in water droplet and has the sun.

In a double rainbow is caused by illuminated droplet on the outer part and refracted when leaving in a spectrum of a multicoloured circles. However, the droplet of water droplets resulting it. In a double rainbow is a meteorological phenomenon the droplets resulting in a spectrum of a multicoloured circular arc. Rainbow is caused by the inner side the observer’s eye

Simon solving the Ackermann function (a function that cannot be de-recursed). It’s computable but the computer’s soon runs out of its computing power (see the last line of code below):

Simon writes: I’ve built a giant project; a website / community project / platform for making algorithms! I’ve built in this video Bubble Sort, Selection Sort, Insertion Sort, Mergesort, Quicksort, Heapsort, Shell Sort and Radix Sort. So I’m done with the sorting part of the project. In the next video I’ll show you the making of the Pathfinding part of the project, and then, I’m going to put it on GitHub, and pass it on to the community, to put more algorithms on there, and even new types of algorithms!

Simon has started a huge new project: a series of video tutorials about sorting algorithms. In the videos, he codes on his RaspberryPi, but here is the link to the Python code available on his GitHub page (that he continuously updates): https://gist.github.com/simon-tiger/5be70247a066f69c2578be5bb8e41e59

Today, Simon has recorded the fifth part of the series, in which he explains and applies the Quicksort algorithm. [The coding part goes very smoothly and much quicker (hehe) than in the previous sorting videos we have made so far. Simon also came up with his own code, he didn’t look the code up].

And here come the previous parts of Simon’s sorting algorithms series, also available via this link to a playlist on his YouTube channel (there will be more videos coming):

Simon is also fascinated by more exotic sorting algorithms, such as a sorting network:

Simon has tried Matt Parker’s multiplicative persistence challenge on Numberphile: by multiplying all the digits in a large number, looking for the number of steps it takes to bring that large number to a single digit. Are there numbers that require 12 steps (have the multiplicative persistence of 12)?

Simon has worked on this for two days, creating an interface in Wolfram Mathematica. He wrote the code to make the beautiful floral shapes above, they are actually graphs of how many steps three digit numbers take to get to single digit numbers (each ”flower” has the end result at its center).

What about the numbers with many more digits than three? Simon has tried writing code to look for the multiplicative persistence of really large numbers and also came up with some efficiencies, i.e. shortcuts in the search process. He did manage to find the persistence for 2^233 (the persistence was 2):

However after he applied one of his efficiencies to the code to be able to search through many numbers at once, the code didn’t run anymore. You can read Simon’s page about this project and see his code here:

277777788888899 is the smallest number with a persistence of 11. The largest known is 77777733332222222222222222222:

This code works with a few efficiencies: 1. They’ve already checked up to 10^233, so we don’t have to check those again. 2. We can rearrange the digits, and the multiplication will be the same. So we don’t have to check any of the rearrangements of any of the numbers we’ve already checked. 3 3a. We should never put in a 0 (a digit of the number). Because then you would be multiplying by 0, which would result in 0 in 1 step! 3b. We should also never put in a 5 and an even number. Because, in the next step, the number would be divisible both by 5 and by 2, so it’s also divisible by 10. That would put a 0 in the answer, which we saw we should never do! 3c. With similar reasoning (assuming we want to find the smallest number of the type we want), we’ll see we should never put in: – Two 5s – A 5 and a 7 – When we put in a… (- means anything, the order doesn’t matter): 1,- , remove the 1 2,2, put 4 instead 2,3, put 6 instead 2,4, put 8 instead 3,3, put 9 instead So, we can reduce the search space and time collossaly, with just some logic!

Simon has been completely carried away by Wolfram Mathematica. He keeps starting new projects, just to try something out. After working on his Knot Theory book for days, and making beautiful illustrations in Wolfram, he switched over to domain coloring. The images below are some impressions of his experimenting with the color function. He hasn’t applied the complex function yet.

Another new project he started has been Poisson disc sampling.

“Wolfram is the most advanced language! It has most built-in stuff in it! At Wolfram, they are working so hard, that the knowledge base is changing every second!” Simon screams out as he pauses the Elementary Introduction to Wolfram Language book (he was reading it at first and now binge watches it as a series of video tutorials). Simon has been especially blown away by the free-form linguistic input: “From plane English it somehow computes the results and maybe even native Mathematica Syntax!” Wolfram also “has an entire section which is machine learning!”

Simon has started a free trial about a week ago, but I guess we’re buying a subscription.