Noah Brier | July 16, 2019

Why is this interesting? - The Machine Learning Edition

On recurrent neural networks, McLuhan, and experimentation

Recommended Products

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World

A book on machine learning techniques offering insight into how algorithms are shaping our understanding and interaction with the digital world.

This is a piece I wrote about two years ago but only sent out to a few people. I thought I’d share it here as it still feels relevant. - Noah (NRB)

The medium is the revolution in the mechanical and problems. The composition of the press are experience the most transformation of the decisions like the translating a great society, the press can be done for story of the poets and feeling of printing of the printed word. The complete point of the integration, the sense with the community that has to be a discontinuity and the printed word and inner lives. The fact that the telegraph and information to a printing and newspaper, the transforming popular structure in the computer. The movie was a kind of the hot change and in a means of explosive and print or much of the reversals of information.

If that doesn't make much sense, I've got two excuses: first, it was completely generated using machine learning, and second, the model that wrote it was trained on McLuhan's Understanding Media.

After a conversation about Claude Shannon, information theory, and John McPhee's weird word processing habits, friend of WITI Nick Parish sent over two links:

  1. The experiments he's been doing with Recurrent Neural Networks (RNNs)

  2. Robin Sloan's RNN-enabled sci-fi writing text editor

RNNs are a fairly common machine learning technique loosely modeled on the way neurons connect. As best I understand (and I'm still learning two years later), they work by building a big network of weighted connections that let the model make inferences, and those inferences can be translated into a whole bunch of different things. What's amazing is that the network starts knowing nothing and learns to do whatever you ask of it (in this case, write like McLuhan) by strengthening and weakening those connections. In other words, the computer has no idea it's writing like McLuhan, or even that it's writing words. All it knows is that, according to whatever text you've fed it, this letter tends to come after that letter, and punctuation most frequently follows this combination.
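To make that concrete, here's a minimal sketch of a character-level model in PyTorch. To be clear, this is an illustration rather than the exact setup behind the experiment above: the training file name, layer sizes, and loop length are all stand-ins, and it uses a GRU, a gated variant of the plain RNN.

```python
# Minimal character-level RNN sketch (illustrative names and sizes).
import torch
import torch.nn as nn

text = open("understanding_media.txt").read()  # stand-in for the training text
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}     # char -> integer id
itos = {i: c for c, i in stoi.items()}         # integer id -> char

class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, h=None):
        out, h = self.rnn(self.embed(x), h)
        return self.head(out), h  # logits over the next character, plus state

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=3e-3)
loss_fn = nn.CrossEntropyLoss()

seq_len = 100
data = torch.tensor([stoi[c] for c in text])
for step in range(1000):  # toy training loop
    i = torch.randint(0, len(data) - seq_len - 1, (1,)).item()
    x = data[i : i + seq_len].unsqueeze(0)          # a slice of text...
    y = data[i + 1 : i + seq_len + 1].unsqueeze(0)  # ...shifted one char right
    logits, _ = model(x)
    loss = loss_fn(logits.view(-1, len(chars)), y.view(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The whole trick is in the shifted target: the network is only ever asked "given the characters so far, which character comes next?" Style falls out of answering that question well.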

In the example at the top, I got the sample started by asking it to build off "The medium is" and the RNN did the rest (there's a sketch of that seeding step after the Wikipedia example below). If you really want to dig in with this technique, Andrej Karpathy's "The Unreasonable Effectiveness of Recurrent Neural Networks" seems to be a pretty good place to start. It offers a few fascinating examples, including this paragraph from a model trained on 100MB of Wikipedia:

Naturalism and decision for the majority of Arab countries' capitalide was grounded by the Irish language by [[John Clair]], [[An Imperial Japanese Revolt]], associated with Guangzham's sovereignty. His generals were the powerful ruler of the Portugal in the [[Protestant Immineners]], which could be said to be directly in Catonese Communication, which followed a ceremony and set inspired prison, training. The emperor travelled back to [[Antioch, Perth, October 25|21]] to note, the Kingdom of Costa Rica, unsuccessful fashioned the [[Thrales]], [[Cynth's Dajoard]], known in western [[Scotland]], near Italy to the conquest of India with the conflict. …

As he points out in the article, what's particularly cool about this is how it learns to open and close the double brackets of the wiki's markdown-like markup. Again, it doesn't know why it's doing it; it just knows that's a part of the pattern.
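And here's the seeding step mentioned above, again as a sketch assuming the toy model from the earlier snippet: feed the prompt through the network to build up its hidden state, then sample one character at a time, feeding each sample back in. The temperature and output length are illustrative choices.

```python
# Seeded sampling sketch: assumes model, stoi, and itos from the snippet above,
# and that every prompt character appeared in the training text.
def generate(model, prompt="The medium is", length=500, temperature=0.8):
    model.eval()
    h = None
    x = torch.tensor([[stoi[c] for c in prompt]])
    out = prompt
    with torch.no_grad():
        for _ in range(length):
            logits, h = model(x, h)
            probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
            nxt = torch.multinomial(probs, 1).item()  # sample next character
            out += itos[nxt]
            x = torch.tensor([[nxt]])  # feed the sampled character back in
    return out

print(generate(model))
```

Lower temperatures make the output more conservative and repetitive; higher ones make it stranger.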

I also tried training the network on my blog posts and didn't have quite the same success as with the McLuhan text. That said, it did spit out some fun misspellings (contextual quotes in parentheses): Prical ("transparently that makes a started prical products of people"), transumers ("internet in a transumers than started that the control post"), managelism ("city of the big managelism that all the post of the story"), and one from McLuhan, numerage ("that all of the written form of the press in the consumer means of power and numerage"). Seems like a perfect way to generate buzzwords.

My big takeaway is that the bar to get these things to okay is low, while the bar to get them to great is high. While I don't want to read too much into a few experiments, this does seem to match some of what's going on in AI/ML more broadly, where early wins with simple problems still haven't translated into the full-fledged intelligence that once seemed inevitable. (NRB)

Google Image Search of the Day:

From this excellent Twitter thread by physicist Sabine Hossenfelder: "Curious find: A Google image search for 'futuristic' returns almost exclusively images with blue/black color themes. How is that? Why isn't the future orange? Very puzzled about this." She follows that with the colors of tech (black/blue), history (sepia), and truth (black and white), among others. (NRB)

Quick Links:

Thanks for reading,

Noah (NRB) & Colin (CJN)

© WITI Industries, LLC.