Blog

Working with Environment Variables (Tech Note)

Here’s a quick cheatsheet on setting and reading environment variables across common operating systems and languages.

When developing software, it’s good practice to put anything you don’t wish to be public, as well as anything that’s “production-environment-dependent,” into environment variables. These stay with the local machine. This is especially important if you ever publish your code to public repositories like GitHub or Docker Hub.

Good candidates for environment variables are things like database connection strings, paths to files, and so on. Hosting platforms like Azure and AWS also let you easily set the value of variables on production and testing instances.

I switch back and forth between Windows, OSX and even Linux during development, so I wanted a quick cheatsheet on how to do this.

Writing Variables

Mac OSX (zsh)

The default shell for OSX is now zsh, not bash. If you’re still using bash, consider switching, and consider using the great utility “Oh My Zsh.”

Running printenv will print out your current environment variables.

To save new environment variables:

export BASEBALL_TEAM="Seattle Mariners"

To make these permanent, you’ll want to add them to:

~/.zshenv

So, in the terminal, you’d bring up the editor (for example, nano ~/.zshenv) and add export BASEBALL_TEAM="Seattle Mariners" to this file. Be sure to start a new terminal instance for this to take effect, because ~/.zshenv is only read when a new shell instance is created.

bash shell (Linux, older Macs, and even Windows for some users)

export BASEBALL_TEAM="Seattle Mariners"

echo $BASEBALL_TEAM
Seattle Mariners

printenv
< prints all environment variables >

# permanent setting
sudo nano ~/.bashrc
# place export BASEBALL_TEAM="Seattle Mariners" in this file
# restart a new bash shell

Windows

  • Right click the Windows icon and select System
  • In the settings window, under related settings, click Advanced System Settings
  • On the Advanced tab, click Environment variables
  • Click New to create a new variable and click OK
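
If you prefer the command line, Windows’ built-in setx command persists a per-user variable (it takes effect in newly opened terminals, not the current one):

setx BASEBALL_TEAM "Seattle Mariners"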

Dockerfile

ENV [variable-name]=[default-value]

ENV BASEBALL_TEAM="Seattle Mariners"
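
You can also override the value at container runtime with docker run’s -e flag (the image name here is just a placeholder):

docker run -e BASEBALL_TEAM="Seattle Mariners" my-image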

Reading Variables

Python

import os

print(os.environ.get("BASEBALL_TEAM"))

Typescript / Javascript / Node

const db: string = process.env.BASEBALL_TEAM ?? ''

C#

var bestBaseballTeam = Environment.GetEnvironmentVariable("BASEBALL_TEAM");

Enchanted Rose 2.0 with Falling Petals

Revisiting the Enchanted Rose project four years later, this time with an open-source build based on Raspberry Pi.

Four years ago, my daughter was in a school production of Beauty and the Beast, and I volunteered to build a stage prop “Enchanted Rose,” which could drop its petals on cue. The initial project was fun, and the play was a total success. They did about ten productions in the spring of 2019.

I documented the build in a two-part blog series.

Those articles and their accompanying YouTube video made their way around the search engines. And in the years since, I’ve gotten emails from more than a dozen stage managers around the world (Colorado, Texas, California, London, NY, Florida, Hawaii, Scotland and more.) Stage crews, parents and students are looking to build the working prop.

Some even inquire about rental. Well, rental of the version 1.0 prop is out of the question. Not only was the original design, based on servo motors, springs and fishing line, far too delicate to ship, but the version 1.0 prop isn’t even in my possession anymore. I gifted it to the director, who is also the school’s computer teacher and really wanted the prop.

As people trying to re-make it have let me know, the original instructions are now fairly out of date in at least two major ways. First, the Bluetooth board that the prop relied upon has since been discontinued. And second, the controller was a native iOS app written in Swift 3.0. Apple has updated the Swift language so much that the old native app no longer compiles. Though I could get the app working again, Apple has made it fairly difficult to distribute apps ad hoc, and I have no interest in publishing the app in the App Store, as it is far too specialized.

But with every new inquiry from a hopeful production manager, I’ve felt it would be a nice thing to rebuild the prop and better explain how to build one.

And so, here it is. Enchanted Rose 2.0 is a web-based build, with a Raspberry Pi and an open-source approach. The basic mechanics are now working:

Enchanted Rose 2.0

Features

  • Control via private wifi network and mobile web browser, not native app
  • Total cost no more than $200
  • Durable enough to ship
  • Does NOT require a connection to the venue’s local network or the internet
  • Easy-to-use interface that an elementary school stage crew could operate
  • Easy on-site setup. Ideally, just turn it on and it works.
  • Battery operated — most productions like to keep the illusion (and stage safety) by eschewing cords.

The main functions of the prop are to remotely:

  • Turn stem light on and off
  • Turn accent light on and off, and set colors and pattern
  • Drop 4 rose petals, one by one, on cue

Basic Approach

Lighting: The lighting circuitry is the same as in the v1.0 design. That is, for the “stem lights,” a Raspberry Pi uses a GPIO pin to send a signal to a transistor’s “base,” allowing a 4.5V circuit to complete.

For the accent lights, there are NeoPixels; I found the 60-LED strand to be the right size. There’s a very good article on Adafruit explaining how to use a Raspberry Pi and CircuitPython to power NeoPixels. Just as they recommend, I used a level converter chip (the 74AHCT125) to convert the Pi’s 3.3V output up to 5V without limiting the power drawn by the NeoPixels.

Both of the lights are powered by 4.5V, delivered by 3 AA batteries in a battery pack.

Drop Mechanism: For the drop mechanism, I made major changes from v1.0 to v2.0. Version 1.0’s magnet-spring-fishing-line-and-servo mechanism works just fine for local productions and has some advantages (reliability once calibrated), but it is far too delicate to ship. I experimented with electric solenoids pulling on magnets, but the smallest ones were too weak and would have required an enormous “bulb” to house them.

I hit upon the idea of using pneumatic pressure! That is, pumps create puffs of air in tubes, which jettison the petal.

Software: Control of the prop is via a Python-based API, with a front-end control app (link to repo below.) That is, the prop itself hosts a web server on the Raspberry Pi and presents its own private “ad-hoc” network. The stage manager connects to the prop’s ad-hoc network and accesses the controller software. This general pattern can be used to build all kinds of “remote control” devices.

There are three main components to this stage prop:

  1. The front-end control software, written in React/Next.js. You’ll find that code in the enchantedrose repository. From a technical standpoint, all this repo does is present a nice front-end layer onto the API.
  2. The hardware API, in this repo: Enchanted Rose Python API. This is the essential Python-based API software: it controls the GPIO pins on the Raspberry Pi and presents an API on port 5000 for turning lights on and off, dropping petals, and so on (a sketch of what one such endpoint might look like appears just after this list).
  3. The prop hardware itself.
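
To make that concrete, here’s a minimal sketch of what one such endpoint might look like, using Flask. The route name, GPIO pin number and framework choice are illustrative assumptions, not necessarily what the actual repository uses:

# Minimal, illustrative sketch of a GPIO-controlling API endpoint.
# Pin number and route are placeholders; see the actual repo for the real code.
from flask import Flask
import RPi.GPIO as GPIO

STEM_LIGHT_PIN = 17  # hypothetical GPIO pin wired to the stem-light transistor

app = Flask(__name__)
GPIO.setmode(GPIO.BCM)
GPIO.setup(STEM_LIGHT_PIN, GPIO.OUT)

@app.route("/stemlight/<state>", methods=["POST"])
def stem_light(state):
    # Drive the transistor that completes the stem-light circuit
    GPIO.output(STEM_LIGHT_PIN, GPIO.HIGH if state == "on" else GPIO.LOW)
    return {"stem_light": state}

if __name__ == "__main__":
    # Listen on all interfaces, port 5000, so devices on the prop's network can reach it
    app.run(host="0.0.0.0", port=5000)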

Software

The two key pieces of software, other than the Raspbian operating system itself, are the front-end control app and the Enchanted Rose Python API repositories described above.

You’ll also need to install the NeoPixel driver libraries, which are well documented over at the Adafruit website.

Hardware

Each pump motor needs a circuit like the one below. Since the RPi cannot output 9V, we’ll make use of a transistor to switch a supplied 9V current on and off. (Note that in production, I ended up boosting the power supply to 8 AA batteries, or 12V. The motors can handle this higher voltage for short durations, even though they’re rated for 6-9V.)

Since these motors generate current spikes when they are powering up and powering down, it’s very important to protect your RPi with a flyback diode, as follows (the directionality matters! pay attention to the line on the diode and the positive lead from the motor):

Wiring diagram for pump motors

There are four such circuits, with the power source and RPi being shared between them.
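
On the software side, firing one of these circuits amounts to pulsing the corresponding GPIO pin for a fraction of a second. Here’s a rough sketch; the pin number and pulse duration are illustrative placeholders, not the values from my build:

# Illustrative sketch: briefly pulse a GPIO pin to fire one air pump.
import time
import RPi.GPIO as GPIO

PUMP_PIN = 23  # hypothetical GPIO pin wired to this pump's transistor

GPIO.setmode(GPIO.BCM)
GPIO.setup(PUMP_PIN, GPIO.OUT)

def drop_petal(duration_s=0.25):
    GPIO.output(PUMP_PIN, GPIO.HIGH)  # transistor switches the pump circuit on
    time.sleep(duration_s)            # a short puff of air pushes the petal off
    GPIO.output(PUMP_PIN, GPIO.LOW)   # pump off

drop_petal()
GPIO.cleanup()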

Accent Lights

The circuit and software to control Neopixels is well documented at Adafruit’s website.

Configuring Raspberry Pi for “Ad Hoc Network”

I was excited to hit upon the idea of having the prop create its own “Ad Hoc Network” (i.e., its own wifi SSID that you can join.) Not only did this eliminate the need for the stage manager at each playhouse to somehow connect the prop to their own local network, but it also removed any requirement to ship a display or keyboard with the device for configuration purposes.

All they have to do is plug in the device, wait for boot-up, and it presents its own wifi network (“EnchantedRose”.) They then simply enter the password, bring up a web browser on their phone, and visit a pre-determined IP address which I’ve configured: http://192.168.11.1. It took a while to figure out how to configure the Raspberry Pi to do this, since the instructions have changed seemingly with every release of Raspbian, the RPi operating system. But I figured it out by adapting existing instructions, and blogged it here.

What’s most exciting about this is that it’s a pattern with many applications in the world of “Internet of Things”: run Apache2 on your Raspberry Pi to host a website, and make that website visible via the RPi’s own password-secured wifi network.

Current Status

The new version 2.0 prop is fully functional. I’m now in the “arts and crafts” stage of making it look much more realistic. There’s more work to be done there, as well as documenting a setup guide for stagehands.

I’ve gotten four more inquiries since November 2022, and will be selling this prop to one of them. I had considered renting it out, but it’s a bit of a logistical hassle to ship and receive. If you’ve got specific questions on assembly or the software, drop me a note. On with the show!

Turn a Raspberry Pi into a Web Server with Its Own Wifi Network (Tech Note)

The Raspberry Pi microcomputer is great for building “Internet of Things” devices. This technical post describes the steps to get your Raspberry Pi to broadcast its own local network and bridge to an Ethernet network if available. There are other guides showing how to do this on the Internet, but they are often woefully out of date, leading people down real rabbit holes.

The Raspberry Pi (RPi) microcomputer is great for building cheap “Internet of Things” devices. In this tech note, I outline how to set up a Raspberry Pi device so that it broadcasts its own local wifi network, so you can join it from a mobile phone. From there, you could launch a web browser and point it at the RPi’s website, which in turn controls the board’s I/O devices. This means you could control a Raspberry Pi device from, say, a mobile phone, without any larger Internet connection.

The steps below also essentially turn the RPi into a “wifi access point,” which basically can take a hardline Internet network (LAN) and present a wifi front-end (WLAN.)

Turn your RPi into a wifi access point, or also its own ad-hoc network

So how do you get a Raspberry Pi to create its own wifi network? There are several “How To’s” on the web, but nearly all of the ones I followed are out of date. You need to install a few services, set up some configuration to define your network, and then reboot the Pi once everything is set.

One great use-case is to build a standalone device which can be controlled via a web interface from, say, a mobile device. RPi’s can easily run web servers like Apache, so once you set up an RPi to broadcast its own private wifi network, you can easily interact with the Raspberry Pi’s devices and sensors via your website.

You’ve probably seen this kind of behavior if you’ve set up a wifi printer, smart switch, or wifi security camera. These devices often have modes where they broadcast their own local wifi network; you use a browser or a configuration app to “join” it and run through a setup process, and the device then uses what you’ve entered to restart and join your home network.

Enchanted Rose Prop 2.0

I had a use-case for this “ad hoc networking” with a stage prop I’m building.

A few years ago, I built an Enchanted Rose prop for my daughter’s school production of “Beauty and the Beast.” It let the stage manager drop petals and turn on lights on cue. It was based on Arduino and Bluetooth, and I blogged the instructions in a two-part series. As a result, thanks to Google, every couple of months or so I get an inquiry from a production manager somewhere in the world who wants to know more about how to build such a prop. Invariably, they follow these instructions and then hit a dead-end (which I’ve now noted in the posts.) The problem is, the version 1.0 device that I built is based upon an Arduino microcontroller (fine) with a Bluetooth add-on board which is now discontinued (not fine.) Worse, its controller was a proprietary Swift app which I wrote in a many-years-out-of-date dialect of Swift, and which had to be installed straight from Xcode onto a device, as I opted not to publish it in the App Store. Apple has made many changes to Swift since then. The app as originally written no longer compiles, and Apple itself makes it very difficult to distribute “one off” apps to people. (You can’t just post the app download somewhere, or email a link; Apple requires that the developer add each person to an ad-hoc distribution list.)

So I began to think about how to rebuild it in a more open-source way.

Motivation for Ad-Hoc Network

At the heart of the version 2.0 of this prop is a Raspberry Pi Zero 2W, which runs its own little Apache web server to control lights and the falling petals. Ideally, a stage manager would simply need a mobile phone or iPad or some kind of web browser on a device equipped with wifi to communicate with the prop.

I’ve heard interest in this prop from stage managers in the UK, California, Colorado, Texas and more, and I wanted to re-build this prop in a more robust, even shippable way. So on the mechanical front, instead of using delicate springs and fishing line, it now uses pumps and air to push the petals off the rose. I’ve got the motors and circuits all working fine. Before moving on to the final aesthetics, I’m now working through matters related to the deployment setting(s) this prop will run in.

There’s a high school in California putting on this play in March. I don’t know what the network environment is at that school, nor should I particularly care. I certainly don’t want to request and configure wifi passwords for the prop to “join” before shipping the device out. Rather than equip the prop with some kind of user interface to allow it to “join” the local wifi network (a needlessly wasteful screen, keyboard and instructions), it’s far better for it to broadcast its own tiny network and be controlled from backstage on its own. You’ve probably seen this type of “ad hoc” network when setting up, say, a printer, security camera, smart speaker, or smart switch.

But wow, getting ad-hoc networking up and going in Raspberry Pi is complicated!

THAR BE DRAGONS. It has taken a full day to get this up and running. Not only is Linux itself pretty arcane, but a much bigger problem is that so many of the instructions on ad-hoc networking for Raspberry Pi’s are wildly out of date. The Raspberry Pi Foundation changes these methods seemingly with every release of the OS.

So, to save my future self (and anyone Googling who lands here) many headaches… After much tinkering, the following instructions successfully configured a Raspberry Pi Zero 2W device to create and broadcast its own wifi network with a password. It allows clients (such as mobile browsers) to join it, and visit its control webpage hosted by the device on its Apache server. In short, the below just plain worked for me. Many, many other “how-to’s” on the web did not. A day wasted; I don’t want to do that again.

After you complete the below, you should be able to use a wifi device and “join” the network that the Raspberry Pi broadcasts. I successfully did so from both a Mac desktop and an iPhone. I could also then visit the apache page hosted on the Pi from these devices. This opens up a world of possibilities for “build and deploy anywhere” devices.

The notes which follow are adapted from: documentation/access-point-bridged.adoc at develop · raspberrypi/documentation (github.com)

There are a lot of steps here, but once you’re finished configuring your RPi, it should have its own private wifi network, and yet still be connectable via Ethernet. This is called an “Access Point” configuration.

Prerequisite: Use Raspbian “Buster,” a prior version of Raspbian OS

Make sure you use Raspbian “Buster” for the instructions. “Buster” is not, in fact, the most current version of the OS.

For the instructions below to work, you must use the older “Buster” version of Raspbian. Let me repeat, because some of you will be very tempted to use the latest version of Raspbian OS, and then will wonder why the network isn’t showing up. The ad-hoc networking steps described below have only been tested to work on the “Buster” version. In fact, I tried and failed to get them to work on the latest (“Bullseye”) version of Raspbian OS. Perhaps I did something wrong; perhaps by now it’s all resolved. But I tried twice, and only “Buster” worked perfectly.

The Raspberry Pi Foundation is great, but it’s pretty frustrating that they’re constantly tweaking the network setup code and drivers. There are so many web pages which refer to old versions and drivers. (One clue you’re looking at an outdated how-to: if you see mention of “/etc/network” or the “interfaces” file. That method is discontinued, and you’ll see no mention of it below.)

If you are dedicating your Raspberry Pi to its own ad-hoc network, I highly recommend you start over with it: back up whatever files you have on the SD card, and then use the Raspberry Pi Imager to create a bootable card with a fresh copy of Raspbian OS “Buster” on it. So, grab an archived copy of Buster. Unzip it. Then, launch the Raspberry Pi Imager, and click the GEAR icon to:

  1. ENABLE SSH with a password, and
  2. Set it up with your home wifi’s network name (SSID) and password.

After your card is written, you should be able to pop it out and put it in your RPi, and turn on the device. After a few moments, you should be able to ssh into it (Mac) or use PuTTY (Windows) to connect. To find the device’s IP address, go into your wifi’s router and look for a recently joined Raspberry Pi device. (Generally, the IP address will be the same from session to session.)

For Mac OSX, this looks something like:

ssh pi@192.168.x.y

where x and y are numbers. Every Internet device connected to your wifi is assigned a different address; this will vary from device to device. The default password for “pi” is “raspberry”.

Later in these setup instructions, we’re going to have it create its own wifi network with its own subnet, and connect on the backend with Ethernet. This is called a “Routed Wireless Access Point” or “Wifi Access Point” configuration, in which the RPi will basically have two IP addresses: one over the wire, and the other wireless.

1. Set up “root” user

Instead of the default user “pi”, some operations will need root access. For instance, the NeoPixel libraries need Raspberry Pi “root” permission to run. So it’s best to set up a root user password first thing:
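
sudo passwd root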

Then, log out of ssh and log back in with ssh root@<ip address>

From here on, you’ll want to sign in as root.

Enable remote login for the “root” user

  1. sudo nano /etc/ssh/sshd_config
  2. Find this line: PermitRootLogin without-password
  3. Change to: PermitRootLogin yes
  4. Close and save file.
  5. Reboot, or restart the sshd service (e.g., sudo systemctl restart ssh)

2. Get web server running

Install Apache2 Web Server with one line of code:

sudo apt install apache2 -y

This will create a folder at /var/www/html, from which the Apache web server will serve content. The install will also ensure that Apache starts at boot time.
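
To confirm the server is up, you can check the service status from the Pi (or simply browse to the Pi’s IP address from another machine on the same network):

sudo systemctl status apache2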

3. Set up Ad Hoc Networking

OK this is the crucial part. HAVE PATIENCE, and follow this closely. (Full notes are at this Github location.)

Setting up a Routed Wireless Access Point

We will want to configure our Raspberry Pi as a Wireless Access Point, so that it broadcasts a wifi ssid, and so mobile devices can connect to it. Ideally, the RPi would also remain connectable to an Ethernet network for development purposes, to install new libraries, etc.

So a “Routed Wireless Access Point” is perfect for our needs.

AGAIN, BE SURE YOU ARE WORKING ON RASPBIAN “BUSTER”, NOT “BULLSEYE” OR LATER.

You can find out what OS version you’re running with the following command:
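
cat /etc/os-release

(Look for the VERSION_CODENAME line, which should read “buster”.)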

A Raspberry Pi within an Ethernet network can be used as a wireless access point, creating a secondary network. The resulting new wireless network (called “Enchanted Rose” in my case) is entirely managed by the Raspberry Pi.

A routed wireless access point can be created using the inbuilt wireless features of the Raspberry Pi 4, Raspberry Pi 3 or Raspberry Pi Zero W, or by using a suitable USB wireless dongle that supports access point mode. It is possible that some USB dongles may need slight changes to their settings. If you are having trouble with a USB wireless dongle, please check the forums.

This documentation was tested on a Raspberry Pi Zero 2 W running a fresh installation of Raspberry Pi OS Buster.

Before you Begin

  1. Ensure you have root access to your Raspberry Pi. The network setup will be modified as part of the installation: local access, with screen and keyboard connected to your Raspberry Pi, is recommended.
  2. Connect your Raspberry Pi to the Ethernet network and boot the Raspberry Pi OS.
  3. Ensure the Raspberry Pi OS on your Raspberry Pi is up-to-date and reboot if packages were installed in the process.
  4. Take note of the IP configuration of the Ethernet network the Raspberry Pi is connected to:
    • In this document, we assume IP network 192.168.4.0/24 is configured for the Ethernet LAN, and the Raspberry Pi is going to manage IP network 192.168.11.0/24 for wireless clients.
    • Please select another IP network for wireless, e.g. 192.168.10.0/24, if IP network 192.168.11.0/24 is already in use by your Ethernet LAN.
  5. Have a wireless client (laptop, smartphone, …) ready to test your new access point.

Install Access Point and Management Software

In order to work as an access point, the Raspberry Pi needs to have the hostapd access point software package installed:
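
sudo apt install hostapd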

Enable the wireless access point service and set it to start when your Raspberry Pi boots:

sudo systemctl unmask hostapd
sudo systemctl enable hostapd

In order to provide network management services (DNS, DHCP) to wireless clients, the Raspberry Pi needs to have the dnsmasq software package installed:
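
sudo apt install dnsmasq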

Finally, install netfilter-persistent and its plugin iptables-persistent. This utility helps by saving firewall rules and restoring them when the Raspberry Pi boots:

sudo DEBIAN_FRONTEND=noninteractive apt install -y netfilter-persistent iptables-persistent

Software installation is complete. We will configure the software packages later on.

Set up the Network Router

The Raspberry Pi will run and manage a standalone wireless network. It will also route between the wireless and Ethernet networks, providing internet access to wireless clients. If you prefer, you can choose to skip the routing by skipping the section “Enable routing and IP masquerading” below, and run the wireless network in complete isolation.

Define the Wireless Interface IP Configuration

The Raspberry Pi runs a DHCP server for the wireless network; this requires static IP configuration for the wireless interface (wlan0) in the Raspberry Pi. The Raspberry Pi also acts as the router on the wireless network, and as is customary, we will give it the first IP address in the network: 192.168.11.1.

To configure the static IP address, edit the configuration file for dhcpcd with:

sudo nano /etc/dhcpcd.conf

Go to the end of the file and add the following:

interface wlan0
    static ip_address=192.168.11.1/24
    nohook wpa_supplicant

Enable Routing and IP Masquerading

This section configures the Raspberry Pi to let wireless clients access computers on the main (Ethernet) network, and from there the internet.

NOTE: If you wish to block wireless clients from accessing the Ethernet network and the internet, skip this section.

To enable routing, i.e. to allow traffic to flow from one network to the other in the Raspberry Pi, create a file using the following command, with the contents below:

sudo nano /etc/sysctl.d/routed-ap.conf

File contents:

# Enable IPv4 routing
net.ipv4.ip_forward=1

Enabling routing will allow hosts from network 192.168.11.0/24 to reach the LAN and the main router towards the internet. In order to allow traffic between clients on this foreign wireless network and the internet without changing the configuration of the main router, the Raspberry Pi can substitute the IP address of wireless clients with its own IP address on the LAN using a “masquerade” firewall rule.

  • The main router will see all outgoing traffic from wireless clients as coming from the Raspberry Pi, allowing communication with the internet.
  • The Raspberry Pi will receive all incoming traffic, substitute the IP addresses back, and forward traffic to the original wireless client.

This process is configured by adding a single firewall rule in the Raspberry Pi:

sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

Now save the current firewall rules for IPv4 (including the rule above) and IPv6 to be loaded at boot by the netfilter-persistent service:

sudo netfilter-persistent save

Filtering rules are saved to the directory /etc/iptables/. If in the future you change the configuration of your firewall, make sure to save the configuration before rebooting.

Configure the DHCP and DNS services for the wireless network

The DHCP and DNS services are provided by dnsmasq. The default configuration file serves as a template for all possible configuration options, whereas we only need a few. It is easier to start from an empty file.

Rename the default configuration file and edit a new one:

sudo mv /etc/dnsmasq.conf /etc/dnsmasq.conf.orig
sudo nano /etc/dnsmasq.conf

Add the following to the file and save it:

interface=wlan0 # Listening interface
dhcp-range=192.168.11.2,192.168.11.20,255.255.255.0,24h
                # Pool of IP addresses served via DHCP
domain=wlan     # Local wireless DNS domain
address=/gw.wlan/192.168.11.1
                # Alias for this router

The Raspberry Pi will deliver IP addresses between 192.168.11.2 and 192.168.11.20, with a lease time of 24 hours, to wireless DHCP clients. You should be able to reach the Raspberry Pi under the name gw.wlan from wireless clients.

NOTE: There are three IP address blocks set aside for private networks. There is a Class A block from 10.0.0.0 to 10.255.255.255, a Class B block from 172.16.0.0 to 172.31.255.255, and probably the most frequently used, a Class C block from 192.168.0.0 to 192.168.255.255.

There are many more options for dnsmasq; see the default configuration file (/etc/dnsmasq.conf) or the online documentation for details.

Ensure Wireless Operation

Note, I successfully skipped this subsection

Countries around the world regulate the use of telecommunication radio frequency bands to ensure interference-free operation. The Linux OS helps users comply with these rules by allowing applications to be configured with a two-letter “WiFi country code”, e.g. US for a computer used in the United States.

In the Raspberry Pi OS, 5 GHz wireless networking is disabled until a WiFi country code has been configured by the user, usually as part of the initial installation process (see wireless configuration pages in this section for details.)

To ensure WiFi radio is not blocked on your Raspberry Pi, execute the following command:
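
sudo rfkill unblock wlan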

This setting will be automatically restored at boot time. We will define an appropriate country code in the access point software configuration, next.

Configure the AP Software

Create the hostapd configuration file, located at /etc/hostapd/hostapd.conf, to add the various parameters for your new wireless network.

sudo nano /etc/hostapd/hostapd.conf

Add the information below to the configuration file. This configuration assumes we are using channel 7, with a network name of EnchantedRose, and a password AardvarkBadgerHedgehog. Note that the name and password should not have quotes around them, and the passphrase should be between 8 and 64 characters in length, or else “hostapd” will fail to start.

country_code=US
interface=wlan0
ssid=EnchantedRose
hw_mode=g
channel=7
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wpa=2
wpa_passphrase=AardvarkBadgerHedgehog
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP
rsn_pairwise=CCMP

Note the line country_code=US: it configures the computer to use the correct wireless frequencies in the United States. Adapt this line and specify the two-letter ISO code of your country. See Wikipedia for a list of two-letter ISO 3166-1 country codes.

To use the 5 GHz band, you can change the operations mode from hw_mode=g to hw_mode=a. Possible values for hw_mode are:

  • a = IEEE 802.11a (5 GHz) (Raspberry Pi 3B+ onwards)
  • b = IEEE 802.11b (2.4 GHz)
  • g = IEEE 802.11g (2.4 GHz)

Note that when changing the hw_mode, you may need to also change the channel – see Wikipedia for a list of allowed combinations.

Troubleshooting hostapd: If for some reason your access point is not coming up, try running hostapd manually from the command line: sudo hostapd /etc/hostapd/hostapd.conf

You’ll likely get some kind of error message back.

Running the new Wireless AP

Now restart your Raspberry Pi and verify that the wireless access point becomes automatically available.

Once your Raspberry Pi has restarted, search for wireless networks with your wireless client. The network SSID you specified in file /etc/hostapd/hostapd.conf should now be present, and it should be accessible with the specified password.

If SSH is enabled on the Raspberry Pi, it should be possible to connect to it from your wireless client as follows, assuming the pi account is present: ssh pi@192.168.11.1.

If your wireless client has access to your Raspberry Pi (and the internet, if you set up routing), congratulations on setting up your new access point!

If you encounter difficulties, contact the forums for assistance. Please refer to this page in your message.

Once this is done, the network should be “EnchantedRose”, and the main web address should be 192.168.11.1.

Sifting Through the FTX Rubble

The sudden collapse of the world’s second biggest cryptocurrency exchange in November 2022 shocked the crypto world and left more than a million creditors hanging, with the 50 biggest being owed a staggering $3.1 billion.

Update, December 12 2022: Bankman-Fried Arrested

2022 has been quite a year for the co-founder of crypto exchange FTX, Sam Bankman-Fried.

In February, Bankman-Fried and 99 million other Americans watched Curb Your Enthusiasm and Seinfeld comedian Larry David stump for FTX during the Super Bowl. During the spring and summer, Bankman-Fried deployed approximately $5 billion in a series of buyouts: crypto player Liquid Global in February. The maker of the video game Storybook Brawl in March. Canadian crypto exchange Bitvo in June. Crypto exchange Blockfolio in August. Alameda even deployed about $11 million into a tiny rural bank here in Washington State, with aims to help it bootstrap a crypto bank on American soil.

By August 2022, SBF was being hailed by Bloomberg and CNBC’s Jim Cramer as “the JP Morgan of this generation,” a reference to when JP Morgan helped stabilize America’s economy during the panics of 1893 and 1907.

The NFL’s Tom Brady, modeling’s Gisele Bündchen and Shark Tank’s Kevin O’Leary were all singing his praises. The FTX brand was everywhere. It was emblazoned on the enormous Miami Heat arena, after FTX secured 19-year naming rights in 2021. Major League Baseball umpires even wore an FTX patch on their uniforms (two patches, actually) all season long.

A Fortune Magazine piece likened SBF to value investor Warren Buffett, a comparison that Buffett, a famous crypto skeptic, would no doubt dispute.

And Bankman-Fried himself was popular for another reason: social change. He was an evangelist for the philosophy of “effective altruism,” which posits that the most effective way to do good is to spend one’s productive years amassing a huge sum of wealth and then give as much of it away as possible. The cargo-shorts-wearing, Toyota Corolla-driving Bankman-Fried played the part well.

Video blogger Nas Daily flew to the Bahamas to hail him as the “World’s Most Generous Billionaire”:

Bankman-Fried wasn’t about to wait until his retirement years to start spreading the millions around. He doled out $42 million to Democrats during the 2022 midterms, making him the party’s second largest donor.

Sam Bankman-Fried. Image via Inside Bitcoins

Entering into the fourth quarter of 2022, SBF was riding high. He was worth more than $10 billion on paper, and the exchange he created was valued at more than $32 billion. FTX had over 5 million active users, and on average, its daily trading volume in 2021 exceeded $12.5 billion. According to Bankless, it was on track to reach $1.1 billion in revenue for 2022.

It all collapsed in less than one week. The sudden collapse of the world’s second biggest cryptocurrency exchange in November 2022 shocked the crypto world, and left more than one million creditors reeling. According to bankruptcy filings, the 50 biggest creditors alone are owed a staggering $3.1 billion.

As Bankman-Fried put it to the crowd of movers and shakers gathered in NYC at November’s NYT “Dealbook” conference, he has “had a bad month.” FTX and its sister company Alameda Research declared bankruptcy on November 11, 2022, and SBF is at serious risk of federal prosecution that could send him behind bars for a very long time.

Now worth $0, Bankman-Fried is going before any audience he can find, ignoring his attorneys’ advice to the contrary, because, well, he wants us to know that he is sorry. That he “fucked up.” That he didn’t pay nearly enough attention to proper accounting or risk management. But even though he messed up, he will tell any audience who will listen, “I want to work to make this right,” and “I didn’t ever try to commit fraud.”

Bankman-Fried’s implicit message at the moment has been, more or less, that he did not possess mens rea (a “guilty mind”.) To him, he didn’t knowingly co-mingle customer funds with those of his own hedge fund. He didn’t intentionally mislead investors about where their money was going. He didn’t deliberately cause more than $30 billion of paper wealth (and more than $3 billion of actual creditor dollars) to evaporate.

Whether Bankman-Fried committed fraud in one of the decade’s biggest corporate collapses so far should be the subject of fierce federal investigation. And while that may well be occurring, there aren’t many visible signs that the feds are on this collapse with the fervor they had for, say, Bernie Madoff or Enron. Bankman-Fried was politely invited to testify before Rep. Maxine Waters’ House Financial Services Committee, and at first demurred.

One can hope this is underway, but it’s been a month, and not much word from federal lawmakers yet.

Tomorrow Sam Bankman-Fried will appear before the House Financial Services Committee. Joining him will be John J. Ray III, whom FTX’s board appointed as CEO to oversee the post-bankruptcy process.

Cynics speculate that the questioning from Representative Maxine Waters (D-CA) might be fairly light-handed. Waters posed for photos with him just a few months ago, and appeared to blow kisses his way at their last appearance in Washington DC.

Happier times: Bankman-Fried and Rep. Maxine Waters in Washington DC (Twitter)

As mentioned earlier, Bankman-Fried was the second largest donor to the Democratic National Committee for the 2022 midterms, at more than $40 million donated. And another senior FTX executive, co-founder Ryan Salame, donated $24 million to the Republicans. They may have bought themselves a bit more time.

Ray’s prepared remarks to the Committee are brutal: “Never in my career have I seen such an utter failure of corporate controls at every level of an organization, from the lack of financial statements to a complete failure of any internal controls or governance whatsoever.”

UPDATE: Sam Bankman-Fried has been arrested in the Bahamas after US Prosecutors filed charges.

FTX and Alameda Research

The world of cryptocurrency is awash in buzzwords which can complicate understanding.

So here is the collapse in its simplest terms. The allegation is that a handful of FTX executives knowingly co-mingled billions of dollars of FTX end-customer funds with those of its closely-held hedge fund, the sister company Alameda Research. Compounding matters, Alameda made a staggeringly bad set of leveraged bets with those funds, and the downside was greatly magnified by the crypto crash in the spring of 2022.

Through a series of transactions between FTX and Alameda, Alameda amassed a gigantic position in FTX’s own token (called “FTT”), a cryptocurrency which was highly correlated with FTX’s own market value. (In the non-crypto world, you might liken this to shares of its own stock, since it moved in a very correlated fashion to FTX’s own perceived value.) This “worked” for a short while, as FTX’s private market gain and apparent momentum appeared to convey some value in FTT.

But FTT was highly illiquid. Only a little bit of it traded every day. FTT was risky, far riskier than its price chart suggested at the time: because of that illiquidity, the price could be sent rapidly downward by a big seller dumping it.

On November 2nd 2022, journalists at crypto trade publication Coindesk published a blockbuster piece: Divisions in Sam Bankman-Fried’s Crypto Empire Blur on His Trading Titan Alameda’s Balance Sheet. Somehow, Coindesk had come across internal documents of Alameda and FTX which detailed Alameda holdings, and what these documents revealed sent shockwaves through the crypto market.

Coindesk reporters noted that of the nominal $14.6 billion that Alameda had amassed on its balance sheet, more than half of it was in FTT/FTX-related currency.

Why is that bad? Not only is a concentrated position in one asset a large risk factor for any hedge fund, but the asset Alameda owned in gobs and gobs was also highly correlated to FTX’s own company value. While FTT traded between $25 and $52 throughout most of 2022, it had pretty low trading volume; not many people wanted to buy it up. A massive unloading of this currency would therefore send its value plummeting. And if that happened, Alameda would get “margin called” by lenders on its substantial loans and have to liquidate assets to pay them off.

Any sizable drop in value of FTT would put enormous financial pressure on FTX’s solvency.

The Coindesk report revealed the extent to which FTX and Alameda were intertwined. It caught the attention of Changpeng Zhao (CZ), who owned a very large position of FTT. On November 6th, CZ tweeted “As part of Binance’s exit from FTX equity last year, Binance received roughly $2.1 billion USD equivalent in cash (BUSD and FTT). Due to recent revelations that have came to light, we have decided to liquidate any remaining FTT on our books.”

He signaled his intent to sell, and proceeded to dump a large volume of FTT into a fairly illiquid market, far more than Alameda or FTX could attempt to buy back.

FTX traders and FTT holders alike noticed this, which triggered a “run on the bank”: FTX.com customers, fearing FTX’s bankruptcy, said “I want my money back!” FTX was at first able to process the first billion dollars or so of redemption requests, but the downward spiral accelerated, taking down the whole house of cards within a 72-hour period.

By November 11th 2022, FTX was filing for bankruptcy protection.

The entire company lasted just five years. Alameda Research was Bankman-Fried’s first venture; he founded it in November 2017. It began with a fairly simple (and legal) business model, making arbitrage trades of Bitcoin between domestic and Asian markets. Bankman-Fried had noticed that the price of Bitcoin would generally be cheaper in the US than in, say, South Korea. So Alameda made a tidy profit for a while automatically buying Bitcoin on the cheap and then reselling those same coins on Asian markets.

By May 2019, Bankman-Fried’s ambitions grew larger, and he decided to create a crypto-trading exchange. He hired a new CEO, Caroline Ellison, a former colleague from his brief time at Wall Street quant firm Jane Street, to run Alameda Research.

He also convinced Changpeng Zhao (CZ), the founder and CEO of the world’s largest crypto exchange (Binance) to invest in his new venture. SBF had come to know CZ through his arbitrage trades with Alameda.

What’s a Crypto Exchange? A centrally-controlled crypto exchange like FTX is a place where end-users can go to buy and sell crypto currency, and do trades between “fiat currencies” (like the US dollar or British pound) and various crypto coins. If you were an FTX customer, you’d wire in funds to your account, and then trade those funds with other buyers or sellers of cryptocurrency. FTX would benefit from trading fees.

That’s how an exchange is supposed to work.

But it appears that FTX and Alameda co-mingled customer funds, a fact confirmed by current CEO Ray in his prepared remarks to Congress. For the end customer, their balance might display as, say, $100 worth of US currency or $100 worth of some crypto coin (minus trading fees), but the big allegation here is that Alameda was taking some or all of those funds and betting them elsewhere at various times.

Making Alameda’s own Jenga tower shakier, Alameda appears to have had significant positions in Terra/Luna, a “stablecoin” ecosystem which utterly collapsed between May 7-12, 2022. There’s speculation on Crypto Twitter that, needing to cover losses somehow, Alameda became quite tempted to dip into customer funds.

It appears as though Bankman-Fried’s two business entities blurred several lines, treating customer funds as their own to bet with. Bankman-Fried often points to a second customer agreement allowing for margin trading between accounts, but it’s quite unclear what fraction of customers opted into this type of agreement. End-user deposits which many customers reasonably thought were isolated were instead deployed for risky bets unrelated to what end-users wanted to do with their own funds.

This not only violates their Terms of Service with customers, it would be a pretty clear violation of traditional securities and exchange laws. As FTX’s own terms of service describe:

  • “You control the Digital Assets held in your Account,” says Section 8.2 of the terms.
  • “Title to your Digital Assets shall at all times remain with you and shall not transfer to FTX Trading.”
  • “None of the Digital Assets in your Account are the property of, or shall or may be loaned to, FTX Trading; FTX Trading does not represent or treat Digital Assets in User’s Accounts as belonging to FTX Trading.”

So, Bankman-Fried’s life will be pretty interesting in 2023, just not in the same way that 2022 was. He remains ensconced in the Bahamas, not yet indicted, on a virtual press tour of all press tours. He’s spoken with the New York Times, Good Morning America, George Stephanopoulos, and numerous Twitter Spaces and podcasts. Anywhere there’s a microphone, he’s out telling his story.

Tomorrow, he’ll be telling his story (or taking the Fifth) before the House Financial Services Committee. And we’re sure to hear that whatever he did, he didn’t mean it.

Engines of Wow, Part III: Opportunities and Pitfalls

This is the third in a three-part series introducing revolutionary changes in AI-generated art. In Part I: AI Art Comes of Age, we traced back through some of the winding path that brought us to this point. Part II: Deep Learning and The Diffusion Revolution, 2014-present, introduced three basic methods for generating art via deep-learning networks: GANs, VAEs and Diffusion models.

But what does it all mean? What’s at stake? In this final installment, let’s discuss some of the opportunities, as well as the legal and ethical questions, presented by these new Engines of Wow.

Opportunities and Disruptions

We suddenly have robots which can turn text prompts into relevant, engaging, surprising images for pennies in a matter of seconds. They can compete with custom-created art that takes illustrators and designers days or weeks to create.

Anywhere an image is needed, a robot can now help. We might even see side-by-side image creation with spoken words or written text, in near real-time.

  • Videogame designers have an amazing new tool to envision worlds.
  • Bloggers, web streamers and video producers can instantly and automatically create background graphics to describe their stories.
  • Graphic design firms can quickly make logos or background imagery for presentations, and accelerate their work. Authors can bring their stories to life.
  • Architects and storytellers can get inspired by worlds which don’t exist.
  • Entire graphic novels can now be generated from a text script which describes the scenes without any human intervention. (The stories themselves can even be created by new Chat models from OpenAI and other players.)
  • Storyboards for movies, which once cost hundreds of thousands of dollars to assemble, can soon be generated quickly, just by ingesting a script.

It’s already happening. In the Midjourney chat room, user Marcio84 writes: “I wrote a story 10 years ago, and finally have a tool to represent its characters.” With a few text prompts, the Midjourney Diffusion Engine created these images for him for just a few pennies:

Industrial designers, too, have a magical new tool. Inspiration for new vehicles can appear by the dozens and be voted up or down by a community:

Motorcycle concept generated by Midjourney, 2022

These engines are capable of competing with humans. In some surveys, as many as 87% of respondents incorrectly felt an AI-generated image was that of a real person. Think you can do better? Take the quiz.

I bet you could sell the art below, generated by Midjourney from a “street scene in the city, snow” prompt, in an art gallery or Etsy shop online. If I spotted it framed on a wall somewhere, or on a book cover or movie poster, I’d have no clue it was computer-generated:

Midjourney render from the prompt “street scene in the city, snow”

A group of images stitched together becomes a video. One Midjourney user has tried to envision the aging destruction of a room, via successive video frames of ever-increasing decay:

These are just a few of the things we can now do with these new AI art generation tools. Anywhere an image is useful, AI tools will have an impact, by lowering cost, blending concepts and styles, and envisioning many more options.

Where do images have a role? Well, that’s pretty much every field: architecture, graphic design, music, movies, industrial design, journalism, advertising, photography, painting, illustration, logo design, training, software, medicine, marketing, education and more.

Disruption

The first obvious impact is that many millions of employed or employable people may soon have far fewer opportunities.

Looking just at graphic design, there are approximately half a million designers employed globally, about 265,000 of whom are in the United States. (Source: Matt Moran of Colorlib.) The total market size for graphic design is about $43 billion per year. Ninety percent of graphic designers work freelance, and the Fortune 500 accounts for nearly one-fifth of graphic design employment.

That’s just the graphic design segment. Then, there are photographers, painters, landscape architects and more.

But don’t count out the designers yet. These are merely tools, just as cameras are for photographers. And, while the Internet disrupted (or “disintermediated”) brokers in certain markets in the ’90s and ’00s (particularly in travel, in-person retail, financial services and real estate), I don’t expect these AI-generation tools to make such experts obsolete.

But the AI revolution is very likely to reduce the dollars available and reshape what their roles are. For instance, travel agents and financial advisers very much do still exist, though their numbers are far lower. The ones who have survived — even thrived — have used the new tools to imagine new businesses and have moved up the value-creation chain.

Who Owns the Ingested Data? Who Compensates the Artists?

Is this all plagiarism of sorts? There are sure to be lawsuits.

These algorithms rely upon massive image training sets. And there isn’t much dotting of i’s and crossing of t’s to secure digital rights. Recently, an artist found her own private medical records in one publicly available training dataset on the web which has been used by Stability AI. You can check to see if your own images have been part of the training datasets at www.haveibeentrained.com.

But unlike most plagiarism and “derivative work” lawsuits up until about 2020, these lawsuits will need to contend with not being able to firmly identify just how the works are directly derivative. Current caselaw around derivative works generally requires some degree of sameness or likeness from input to final result. But the body of imagery which goes into training the models is vast. A given creative end-product might be associated with hundreds of thousands or even millions of inputs. So how do the original artists get compensated, and how much, if at all?

No matter the algorithm, all generative AI models rely upon enormous datasets, as discussed in Part II. That’s their food. They go nowhere without creative input. And these datasets are the collective work of millions of artists and photographers. While some AI researchers go to great lengths to ensure that the images are copyright-free, many (most?) do not. Web scraping is often used to fetch and assemble images, and then a lot of human effort is put into data-cleansing and labeling.

The sheer scale of “original art” that goes into these engines is vast. It’s not unusual for a model to be trained on 5 million images. So these generative models learn patterns in art from millions of samples, not just by staring at one or two paintings. Are they “clones”? No. Are they even “derivative”? Probably, but not in the same way that George Harrison’s “My Sweet Lord” was derivative of Ronnie Mack’s “He’s So Fine.”

In the art world, American artist Jeff Koons created a collection called Banality, which featured sculptures from pop culture: mostly three dimensional representations of advertisements and kitsch. Fait d’Hiver (Fact of Winter) was one such work, which sold for approximately $4.3 million in a Christie’s auction in 2007:

Koons’ sculpture was both inspired by and derived from an advertisement created by French art director Franck Davidovici:

The original advertisement

It’s plain to the eye the work is derivative.

And in fact, that was the whole point: Koons brought to three dimensions some of the banality of everyday kitsch. In a legal battle spanning four years, Koons’ lawyers argued unsuccessfully that such derivative work was still unique, on several grounds. For instance, he had turned it into three dimensions, added a penguin and goggles on the woman, applied color, changed her jacket and the material representing snow, changed the scale, and much more.

While derivative, with all these new attributes, was the work not then brand new? The French court said non. Koons was found guilty in 2018. And it’s not the first time he was found guilty: of the five lawsuits which sprang from the Banality collection, Koons lost three, and another settled out of court.

Unlike other “derivative works” lawsuits of the past, generative models in AI are relying not upon one work of a given artist, but an entire body of millions of images from hundreds of thousands of creators. Photographs are often lumped in with artistic sketches, oil paintings, graphic novel art and more to fashion new styles.

And, while it’s possible to look into the latent layers of AI models and see vectors of numbers, it’s impossible to translate that into something akin to “this new image is 2% based on image A64929, and 1.3% based on image B3929, etc.” An AI model learns patterns from enormous datasets, and those patterns are not well articulated.

Potential Approaches

It would be possible, it seems to me, to pass laws requiring that AI generative models use properly licensed (i.e., copyright-free or royalty-paid) images, and then divvy up that royalty amongst the creators. Each artist has a different value for their work, so presumably they’d set the prices, and AI model trainers would either pay for those or not.

Compliance with this is another matter entirely; perhaps certification technologies would offer valid tokens once verifying ownership. Similar to the blockchain concept, perhaps all images would have to be traceable to some payment or royalty agreement license. Or perhaps the new technique of Non Fungible Tokens (NFTs) can be used to license out ownership for ingestion during the training phase. Obviously this would have to be scalable, so it suggests automation and a de-facto standard must emerge.

Or will we see new kinds of art comparison or “plagiarism” tools, letting artists compare similarity and influence between generated works and their own creation? Perhaps if a generated piece of art is found to be more than 95% similar (or some such threshold) to an existing work, it will not retain copyright and/or require licensing of the underlying work. It’s possible to build such comparative tools today.

In the meantime, it’s a Wild West of sorts. As has often happened in the past, technology’s rapid pace of advancement has gotten ahead of legislation and of how the money flows.

What’s Ahead

If you’ve come with me on this journey into AI-generated art in 2022, or have seen these tools up close, you’re like the viewer who’s seen the world wide web in 1994. You’re on the early end of a major wave. This is a revolution in its nascent stage. We don’t know all that’s ahead, and the incredible capabilities of these tools are known only to a tiny fraction of society at the moment. It is hard to predict all the ramifications.

But if prior disintermediation moments are any guide, I’d expect change to happen along a few axes.

First, advancement and adoption will spread from horizontal tools to many more specialized verticals. Right now, there’s great advantage to being a skilled “image prompter.” I suspect that, like photography, which initially required real expertise to produce even passable results, the engines will get better at delivering remarkable images on the first pass. Time and again in technology, generalized “horizontal” applications have concentrated into an oligopoly market of a few tools (e.g., spreadsheets), yet also launched a thousand flowers in much more specialized “vertical” ones (accounting systems, domain-specific applications, etc.) I expect the same pattern here. These tools have come of age, but only a tiny fraction of people know about them; we’re still in the generalist period, and meanwhile they’re getting better and better. Right now, these horizontal applications stun with their potential and are applicable to a wide variety of uses. But I’d expect thousands of specialty, domain-specific applications and brand names to emerge (e.g., book illustration, logo design, storyboard design, etc.) One set of players might generate sketches for book covers. Another for graphic novels. Another might specialize in video streaming backgrounds. Not only will this make the training datasets much more specific and the outputs even more relevant, but it will allow each engine’s brand to better penetrate a specific customer base and respond to its needs. Applications will emerge for specific domains (e.g., automated graphic design for, say, blog posts.)

Second, many artists will demand compensation and seek to restrict rights to their work. Perhaps new guilds will emerge. A new technology and payments system might likely emerge to allow this to scale. Content generally has many ancillary rights, and one of those rights will likely be “ingestion rights” or “training model rights.” I would expect micropayment solutions or perhaps some form of blockchain-based technology to allow photographers, illustrators and artists to protect their work from being ingested into models without compensation. This might emerge as some kind of paywall approach to individual imagery. As is happening in the world of music, the most powerful and influential creative artists may initiate this trend, by cordoning off their entire collective body of work. For instance, the Ansel Adams estate might decide to disallow all ingestion into training models; right now however it’s very difficult to prove whether or not those images were used in the training of any datasets.

Third, regulation might be necessary to protect vital creative ecosystems. If an AI generative engine can create works that auction at Christie’s for $5 million, and it may well soon, what does this do to the ecosystem of creators? Regulators may need to protect the creator ecosystem that feeds these derivative engines, restricting makers of generative AI models from simply fetching and ingesting any image they please.

Fourth, in the near term, skilled “image prompters” are like skilled photographers, web graphic designers, or painters. Today, there is a noticeable difference between those who know how to get the most out of these new tools and those who do not. For the short term, this is likely to “gatekeep” the technology and validate the expertise of designers. I do not expect this to be especially durable, however; the quality of output from very unskilled prompters (e.g., yours truly) already meets or exceeds a lot of the royalty-free art available from the likes of Envato or Shutterstock.

Conclusion

Machines now seem capable of visual creativity. While their output is often stunning, under the covers, they’re just learning patterns from data and semi-randomly assembling results. The shocking advancements since just 2015 suggest much more change is on the way: human realism, more styles, video, music, dialogue… we are likely to see these engines pass the artistic “Turing test” across more dimensions and domains.

For now, you need to be plugged into geeky circles of Reddit and Discord to try them out. And skill in crafting just the right prompts separates talented jockeys from the pack. But it’s likely that the power will fan out to the masses, with engines of wow built directly into several consumer end-user products and apps over the next three to five years.

We’re in a new era, where it costs pennies to envision dozens of new images visualizing anything from text. Expect some landmark lawsuits to arrive soon on what is and is not derivative work, and whether such machine-learning output can even be copyrighted. For now, especially if you’re in a creative field, it’s good advice to get acquainted with these new tools, because they’re here to stay.

Engines of Wow: Part II: Deep Learning and The Diffusion Revolution, 2014-present

A revolutionary insight in 2015, plus AI work on natural language, unleashed a new wave of generative AI models.

In Part I of this series on AI-generated art, we introduced how deep learning systems can be used to “learn” from a well-labeled dataset. In other words, algorithmic tools can “learn” patterns from data to reliably predict or label things. Now on their way to being “solved” via better and better tweaks and rework, these predictive engines are magical power-tools with intriguing applications in pretty much every field.

Here, we’re focused on media generation, specifically images, but it bears a note that many of the same basic techniques described below can apply to songwriting, video, text (e.g., customer service chatbots, poetry and story-creation), financial trading strategies, personal counseling and advice, text summarization, computer coding and more.

Generative AI in Art: GANs, VAEs and Diffusion Models

From Part I of this series, we know at a high level how we can use deep-learning neural networks to predict things or add meaning to data (e.g., translate text, or recognize what’s in a photo.) But we can also use deep learning techniques to generate new things. This type of neural network system, often comprised of multiple neural networks, is called a Generative Model. Rather than just interpreting things passively or searching through existing data, AI engines can now generate highly relevant and engaging new media.

How? The three most common types of Generative Models in AI are Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs) and Diffusion Models. Sometimes these techniques are combined. They aren’t the only approaches, but they are currently the most popular. Today’s star products in art-generating AI are Midjourney by Midjourney.com (Diffusion-based), DALL-E by OpenAI (VAE-based), and Stable Diffusion (Diffusion-based) by Stability AI. It’s important to understand that each of these algorithmic techniques was conceived just in the past six years or so.

My goal is to describe these three methods at a cocktail-party chat level. The intuitions behind them are incredibly clever ways of thinking about the problem. There are lots of resources on the Internet which go much further into each methodology, listed at the end of each section.

Generative Adversarial Networks

The first strand of generative-AI models, Generative Adversarial Networks (GANs), have been very fruitful for single-domain image generation. For instance, visit thispersondoesnotexist.com. Refresh the page a few times.

Each time, you’ll see highly* convincing images like this, but never the same one twice:

As the domain name suggests, these people do not exist. This is the computer creating a convincing image, using a Generative Adversarial Network (GAN) trained to construct a human-like photograph.

*Note that for the adult male, it only rendered half his glasses. This GAN doesn’t really understand the concept of “glasses,” simply a series of pixels that need to be adjacent to one another.

Generative Adversarial Networks were introduced in a 2014 paper by Ian Goodfellow et al. That was just eight years ago! The basic idea is that you have two deep-learning neural networks: a Generator and a Discriminator. You can think of them as a Counterfeiter and a Detective, respectively. One deep learning model, serving as the “Discriminator” (Detective), learns to distinguish between genuine articles and counterfeits, penalizing the Generator for producing implausible results. Meanwhile, the Generator model learns to “generate” plausible data; when it “fools” the Discriminator, that output becomes negative training data for the Discriminator. They play a zero-sum game against each other (thus “adversarial”) thousands and thousands of times, and with each adjustment to the Generator’s and Discriminator’s weights, the Generator gets better and better at constructing something to fool the Discriminator, and the Discriminator gets better and better at detecting fakes.
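To make the loop concrete, here is a toy sketch of adversarial training in PyTorch (my choice of library for illustration, not necessarily what any production system uses), with a Generator learning to mimic a simple one-dimensional distribution rather than images:

# Toy GAN: the Generator (Counterfeiter) learns to mimic samples from N(3, 0.5),
# while the Discriminator (Detective) learns to tell real samples from fakes.

import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0      # "genuine articles"
    noise = torch.randn(64, 8)
    fake = generator(noise)                    # the Counterfeiter's attempt

    # Detective's turn: label real as 1, fake as 0, get better at telling them apart.
    d_opt.zero_grad()
    d_loss = (loss_fn(discriminator(real), torch.ones(64, 1)) +
              loss_fn(discriminator(fake.detach()), torch.zeros(64, 1)))
    d_loss.backward()
    d_opt.step()

    # Counterfeiter's turn: adjust weights so the Detective labels fakes as real.
    g_opt.zero_grad()
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_loss.backward()
    g_opt.step()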

The whole system looks like this:

Generative Adversarial Network, source: Google

GANs have delivered pretty spectacular results, but in fairly narrow domains. For instance, GANs have been pretty good at mimicking artistic styles (called “Neural Style Transfer“) and Colorizing Black and White Images.

GANs are cool and a major area of generative AI research.

More reading on GANs:

Variational Autoencoders (VAE)

An encoder can be thought of as a compressor of data, and a decoder as something which does the opposite. You’ve probably compressed an image down to a smaller size without losing recognizability. It turns out you can use AI models to compress an image; data scientists call this reducing its dimensionality.

What if you built two neural network models, an Encoder and a Decoder? It might look like this, going from x, the original image, to x’, the “compressed and then decompressed” image:

Variational Autoencoder, high-level diagram. Images go in on the left, and come out on the right. If you train the networks to minimize the difference between output and input, you get a compression algorithm of sorts. What’s left in red is a lower-dimensional representation of the images.

So conceptually, you could train an Encoder neural network to “compress” images into vectors, and then a Decoder neural network to “decompress” the image back into something close to the original.

Then, you could consider the red “latent space” in the middle as basically the Rosetta Stone for what a given image means. Run that algorithm numerous times over multiple images, encoding each with the text of its labels, and you end up with a condensed encoding of how to render various images. If you did this across many, many images and subjects, these numerous red vectors would overlap in n-dimensional space, and could be sampled and mixed and then run through the decoder to generate images.

With some mathematical tricks (specifically, forcing the latent variables in red to conform to a normal distribution), you can build a system which can generate images that never existed before, but which have some very similar properties to the dataset which was used to train the encoder.
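Here is a minimal sketch of that encoder/decoder idea in PyTorch (again, an illustrative assumption, not any product’s actual code). The “mathematical trick” shows up as the KL term in the loss, which nudges the latent codes toward a normal distribution so that brand-new ones can be sampled later:

# A tiny VAE for flattened 784-pixel images compressed to a 16-number latent code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    def __init__(self, image_dim=784, latent_dim=16):
        super().__init__()
        self.encoder = nn.Linear(image_dim, 128)
        self.to_mu = nn.Linear(128, latent_dim)       # mean of the latent code
        self.to_logvar = nn.Linear(128, latent_dim)   # (log) spread of the latent code
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, image_dim), nn.Sigmoid())

    def forward(self, x):
        h = F.relu(self.encoder(x))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # sample a latent code
        return self.decoder(z), mu, logvar

def vae_loss(x, x_recon, mu, logvar):
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")   # how close is x' to x?
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # keep latents ~normal
    return recon + kl

# To generate a brand-new image, decode a random latent vector:
# new_image = TinyVAE().decoder(torch.randn(1, 16))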

More reading on VAEs:

2015: “Diffusion” Arrives

Is there another method entirely? What else could you do with a deep learning system which can “learn” how to predict things?

In March 2015, a revolutionary paper came out from researchers Sohl-Dickstein, Weiss, Maheswaranathan and Ganguli. It was inspired by the physics of non-equilibrium systems: for instance, dropping a drop of food coloring into a glass of water. Imagine you filmed that process of “destruction” and could step through it frame by frame. Could you build a neural network to reliably predict what the reverse might look like?

Let’s think about a massive training set of animal images. Imagine you take an image from your training dataset and create multiple copies of it, each time systematically adding graphic “noise.” Step by step, more noise is added to your image (x), via what mathematicians call a Markov chain (incremental steps.) The distortion you apply is normally distributed: Gaussian noise.

In a forward direction, from left to right, it might look something like this. At each step from left to right, you’re going from data (the image) to pure noise:

Adding noise to an image, left to right. Credit: image from “AI Summer”: How diffusion models work: the math from scratch | AI Summer (theaisummer.com)

But here’s the magical insight behind Diffusion models. Once you’ve done this, what if you trained a deep learning model to predict frames in the reverse direction? Could you predict a “de-noised” image x(t) from its noisier version, x(t+1)? Could you read each step backward, from right to left, and try to predict the best way to remove noise at each step?

This was the insight in the 2015 paper, albeit with much more mathematics behind it. It turns out you can train a deep learning system to learn how to “undo” noise in an image, with pretty good results. For instance, if you feed in the pure-noise image from the last step, x(T), and train a deep learning network so that its output is the previous step, x(T-1), and do this over and over again with many images, you “train” the network to subtract noise from an image, all the way back to an original image.
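Here is a highly simplified sketch of that training recipe in PyTorch, on toy vectors rather than real images. One hedge: modern systems usually train the network to predict the added noise (with carefully tuned noise schedules) rather than the previous frame directly, but the principle is the same.

# Forward direction: mix a bit of Gaussian noise into the image at each of T steps.
# Training: show the network a noisy image plus its step number, and teach it to
# predict the noise that was mixed in, so it can later "undo" it.

import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)              # how much noise to add at each step
alphas_cum = torch.cumprod(1.0 - betas, dim=0)     # lets us jump straight to step t

def add_noise(x0, t):
    """Forward process: return the noisy image x_t and the noise used to make it."""
    noise = torch.randn_like(x0)
    a = alphas_cum[t].sqrt().view(-1, 1)
    b = (1 - alphas_cum[t]).sqrt().view(-1, 1)
    return a * x0 + b * noise, noise

# Stand-in "denoiser"; real systems use large U-Nets, not a tiny MLP.
denoiser = nn.Sequential(nn.Linear(64 + 1, 256), nn.ReLU(), nn.Linear(256, 64))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

for step in range(1000):
    x0 = torch.rand(32, 64)                        # a batch of (toy) training images
    t = torch.randint(0, T, (32,))
    noisy, noise = add_noise(x0, t)
    pred = denoiser(torch.cat([noisy, t.float().unsqueeze(1) / T], dim=1))
    loss = ((pred - noise) ** 2).mean()            # learn to "undo" the noise at step t
    opt.zero_grad()
    loss.backward()
    opt.step()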

Do this enough times, with enough terrier images, say. And then, ask your trained model to divine a “terrier” from random noise. Gradually, step by step, it removes noise from an image to synthesize a “terrier”, like this:

Screen captured video of using the Midjourney chatroom (on Discord) to generate: “terrier, looking up, cute, white background”

Images generated from the current Midjourney model:

“terrier looking up, cute, white background” entered into Midjourney. Unretouched, first-pass output with v3 model.

Wow! Just slap “No One Hates a Terrier” on any of these images above, print 100 t-shirts, and sell it on Amazon. Profit! I’ll touch on some of the legal and ethical controversies and ramifications in the final post in this series.

Training the Text Prompts: Embeddings

How did Midjourney know to produce a “terrier”, and not some other object or scene or animal?

This relied upon another major parallel track in deep learning: natural language processing. In particular, word “embeddings” can be used to get from keywords to meanings. And during the image-model training, these embeddings were applied by Midjourney to enrich each noisy image with meaning.

An “embedding” is a mapping of a chunk of text into a vector of continuous numbers. Think of a word as a list of numbers. A textual variable could be a word, a node in a graph, or a relation between nodes in a graph. By ingesting massive amounts of text, you can train a deep learning network to understand relationships between words and entities, and to represent numerically how closely associated some words and phrases are with others. Embeddings can capture the sentiment of an expression in mathematical terms a computer can appear to understand. For instance, embedding models can now represent semantic relationships between words, like “king – man + woman ≈ queen.”
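You can see this arithmetic yourself with a small pretrained model. The sketch below uses gensim’s downloadable GloVe vectors, my choice for illustration rather than anything a particular image engine uses; the first call downloads a model of roughly 65 MB.

# Word-vector arithmetic with pretrained GloVe embeddings via gensim (assumed installed).

import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")   # 50-dimensional vectors for ~400k words

# "king" - "man" + "woman" lands closest to "queen" in the embedding space.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# Closeness in meaning shows up as a high cosine similarity between vectors.
print(vectors.similarity("terrier", "dog"))          # high
print(vectors.similarity("terrier", "carburetor"))   # low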

An example on Google Colab took a vocabulary of 50,000 words in a collection of movie reviews, and learned over 100 different attributes from words used with them, based on their adjacency to one another:


Source: Movie Sentiment Word Embeddings

So, if you simultaneously injected into the “de-noising” diffusion-based learning process the information that this is about a “dog, looking up, on white background, terrier, smiling, cute,” you can get a deep learning network to “learn” how to go from random noise (x(T)) to a very faint outline of a terrier (x(T-1)), to even less faint (x(T-2)) and so on, all the way back to x(0). If you do this over thousands of images, and thousands of keyword embeddings, you end up with a neural network that can construct an image from some keywords.

Incidentally, researchers have found that T=1000 steps is about all you need in this process, but millions of input images and enormous amounts of computing power are needed to learn how to “undo” noise at high resolution.
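Putting the two halves together, generation is just the reverse walk from pure noise, with the prompt’s embedding fed to the denoiser at every step. The sketch below continues the toy PyTorch setup from above with stand-in, untrained networks; the names and shapes are illustrative, not any particular product’s API.

# Reverse ("sampling") direction: start from pure noise and walk backward T steps,
# removing a little predicted noise at each step, conditioned on a text embedding.

import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_cum = torch.cumprod(alphas, dim=0)

# Stand-ins: a real system would use a trained U-Net and a trained text encoder.
denoiser = nn.Sequential(nn.Linear(64 + 1 + 16, 256), nn.ReLU(), nn.Linear(256, 64))
prompt_embedding = torch.randn(1, 16)       # e.g., an embedding of "terrier, looking up, cute"

x = torch.randn(1, 64)                      # start from pure noise: x(T)
for t in reversed(range(T)):                # walk backward: x(T) -> x(T-1) -> ... -> x(0)
    t_feat = torch.full((1, 1), t / T)
    pred_noise = denoiser(torch.cat([x, t_feat, prompt_embedding], dim=1))
    # Remove this step's predicted noise (a simplified form of the DDPM update rule).
    x = (x - betas[t] / (1 - alphas_cum[t]).sqrt() * pred_noise) / alphas[t].sqrt()
    if t > 0:
        x = x + betas[t].sqrt() * torch.randn_like(x)   # keep a little randomness per step

print(x.shape)                              # the synthesized (toy) "image": a 64-vector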

Let’s step back a moment to note that this revelation about Diffusion Models was only really put forward in 2015, and improved upon in 2018 and 2020. So we are just at the very beginning of understanding what might be possible here.

In 2021, Dhariwal and Nichol convincingly showed that diffusion models can achieve image quality superior to the existing state-of-the-art GAN models.

Up next, Part III: Ramifications and Questions

That’s it for now. In the final Part III of Engines of Wow, we’ll explore some of the ramifications, controversies and make some predictions about where this goes next.

Engines of Wow: AI Art Comes of Age

Advancements in AI-generated art test our understanding of human creativity and laws around derivative art.

While most of us were focused on Ukraine, the midterm elections, or simply returning to normal as best we can, Artificial Intelligence (AI) took a gigantic leap forward in 2022. Seemingly all of a sudden, computers are now eerily capable of human-level creativity. Natural language agents like GPT-3 are able to carry on an intelligent conversation. GitHub CoPilot is able to write major blocks of software code. And new AI-assisted art engines with names like Midjourney, DALL-E and Stable Diffusion delight our eyes, but threaten to disrupt entire creative professions. They raise important questions about artistic ownership, derivative work and compensation.

In this three-part blog series, I’m going to dive in to the brave new world of AI-generated art. How did we get here? How do these engines work? What are some of the ramifications?

This series is divided into three parts:

  • Part I: The Artists in the Machine, 1950-2015+
  • Part II: Deep Learning and The Diffusion Revolution, 2014-present
  • Part III: Ramifications and Questions

[featured image above: “God of Storm Clouds” created by Midjourney AI algorithm]

But first, why should we care? What kind of output are we talking about?

Let’s try one of the big players, the Midjourney algorithm. Midjourney lets you play around in their sandbox for free for about 25 queries. You can register for free at Midjourney.com; they’ll invite you to a Discord chat server. After reading a “Getting Started” agreement and accepting some terms, you can type in a prompt. You might go with: “/imagine portrait of a cute leopard, beautiful happy, Gryffindor outfit, super detailed, hyper realism.”

Wait about 60 seconds, choose one of the four samples generated for you, click the “upscale” button for a bigger image, and voila:

image created by the Midjourney image generation engine, version 4.0. Full prompt used to create it was “portrait of a cute leopard, Beautiful happy, Gryffindor Outfit, white background, biomechanical intricate details, super detailed, hyper realism, heavenly, unreal engine, rtx, magical lighting, HD 8k, 4k”

The Leopard of Gryffindor was created without any human retouching. This is final Midjourney output. The algorithm took the text prompt, and then did all the work.

I look at this image, and I think: Stunning.

Looking at it, I get the kind of “this changes everything” feeling, like the first time I browsed the world-wide web, spoke to Siri or Alexa, rode in an electric vehicle, or did a live video chat with friends across the country for pennies. It’s the kind of revolutionary step-function that causes you to think “this will cause a huge wave of changes and opportunities,” though it’s not even clear what they all are.

Are artists, graphic designers and illustrators doomed? Will these engines ultimately help artists or hurt them? How will the creative ecosystem change when it becomes nearly free to go from idea to visual image?

Once focused mainly on processing existing images, computers are now extremely capable of generating brand-new things. Before diving into a high-level overview of these new generative AI art algorithms, let me emphasize a few things. First, no artist has ever created exactly the image above, nor is it likely to be generated again. That is, Midjourney and its competitors (notably DALL-E and Stable Diffusion) aren’t search engines: they are media creation engines.

In fact, if you type this same exact prompt into Midjourney again, you’ll get an entirely different image, yet one which is also likely to deliver on the prompt fairly well.

There is an old joke within Computer Science circles that “Artificial Intelligence is what we call things that aren’t working yet.” That’s now sounding quaint. AI is all around us, making better and better recommendations, completing sentences, curating our media feeds, “optimizing” the prices of what we buy, helping us with driving assistance on the road, defending our computer networks and detecting spam.

Part I: The Artists in the Machine, 1950-2015+

How did this revolutionary achievement come about? Two ways, just as bankruptcy came about for Mike Campbell in Hemingway’s The Sun Also Rises: First gradually. Then suddenly.

Computer scientists have spent more than fifty years trying to perfect art generation algorithms. These five decades can be roughly divided into two distinct eras, each with entirely different approaches: “Procedural” and “Deep Learning.” And, as we’ll see in Part II, the Deep-Learning era had three parallel but critical deep learning efforts which all converged to make it the clear winner: Natural Language, Image Classifiers, and Diffusion Models.

But first, let’s rewind the videotape. How did we get here?

Procedural Era: 1970’s-1990’s

If you asked most computer users how they would generate computer art, the naive approach would be to encode various “rules of painting” into software, via the very “if this, then that” kind of logic that computers excel at. And that’s precisely how it began.

In 1973, American computer scientist Harold Cohen, resident at Stanford University’s Artificial Intelligence Laboratory (SAIL), created AARON, the first computer program dedicated to generating art. Cohen was both an accomplished, talented artist and a computer scientist. He thought it would be intriguing to try to “teach” a computer how to draw and paint.

His thinking was to encode various “rules about drawing” into software components and then have them work together to compose a complete piece of art. Cohen relied upon his skill as an exceptional artist, and coded his own “style” into his software.

AARON was an artificial intelligence program first written in the C programming language (a low level language compiled for speed), and later LISP (a language designed for symbolic manipulation.) AARON knew about various rules of drawing, such as how to “draw a wavy blue line, intersecting with a black line.” Later, constructs were added to combine these primitives together to “draw an adult human face, smiling.” By 1995, Cohen added rules for painting color within the drawn lines.

Though there were aspects of AARON which were artificially intelligent, by and large computer scientists call his a procedural approach. Do this, then that. Pick up a brush, pick an ink color, and draw from point A to B. Construct an image from its components. Join the lines. And you know what? After a few decades of work, Cohen created some really nice pieces, worthy of hanging on a wall. You can see some of them at the Computer History Museum in Mountain View, California.
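To get a feel for the procedural mindset, here is a toy illustration in Python with matplotlib (this is not AARON’s actual code, just a sketch of the idea): every “rule of drawing” has to be spelled out by hand.

# Procedural "art": explicit, hand-written rules, nothing learned from data.

import numpy as np
import matplotlib.pyplot as plt

def draw_wavy_line(ax, y_offset, color):
    """Rule: a wavy line is a sine curve with a little random jitter."""
    x = np.linspace(0, 10, 200)
    y = y_offset + np.sin(x * 1.5) + np.random.normal(0, 0.05, x.shape)
    ax.plot(x, y, color=color, linewidth=2)

def draw_crossing_line(ax, color):
    """Rule: a crossing line runs diagonally through the composition."""
    ax.plot([0, 10], [-2, 4], color=color, linewidth=2)

fig, ax = plt.subplots()
draw_wavy_line(ax, y_offset=1.0, color="blue")   # "draw a wavy blue line..."
draw_crossing_line(ax, color="black")            # "...intersecting with a black line"
ax.axis("off")
plt.show()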

In 1980, AARON was able to generate this:

Detail from an untitled AARON drawing, ca. 1980, via Computer History Museum

By 1995, Cohen had encoded rules of color, and AARON was generating images like this:

The first color image created by AARON, 1995. via Computer Museum, Boston, MA

Just a few years ago, other attempts at AI-generated art were flat-looking and derivative, like this image from 2019:


Twenty-seven years after AARON’s first AI-generated color painting, algorithms like Midjourney would be quickly rendering photorealistic images from text prompts. But to accomplish this, the primary method is completely different.

Deep Learning Era (1986-Present)

Algorithms which can create photorealistic images-on-demand are the culmination of multiple parallel academic research threads in learning systems dating back several decades.

We’ll get to the generative models which are key to this new wave of “engines of wow” in the next post, but first, it’s helpful to understand a bit about their central component: neural networks.

Since about 2000, you have probably noticed everyday computer services making massive leaps in predictive capabilities; that’s because of neural networks. Turn on Netflix or YouTube, and these services will serve up ever-better recommendations for you. Or, literally speak to Siri, and she will largely understand what you’re saying. Tap on your iPhone’s keyboard, and it’ll automatically suggest which letters or words might follow.

Each of these systems relies upon trained prediction models built by neural networks. And to envision them, computer scientists and mathematicians had to radically shift their thinking away from the procedural approach. A branch of them did so first in the 1950’s and 60’s, and then again in a machine-learning renaissance which began in earnest in the mid-1980’s.

The key insight: these researchers speculated that instead of procedural coding, perhaps something akin to “intelligence” could be fashioned from general purpose software models, which would algorithmically “learn” patterns from a massive body of well-labeled training data. This is the field of “machine learning,” specifically supervised machine learning, because it’s using accurately pre-labeled data to train a system. That is, rather than “Computer, do this step first, then this step, then that step”, it became “Computer: learn patterns from this well-labeled training dataset; don’t expect me to tell you step-by-step which sequence of operations to do.”

The first big step began in 1958. Frank Rosenblatt, a researcher at Cornell University, created a simplistic precursor to neural networks, the “Perceptron,” basically a one-layer network consisting of visual sensor inputs and software outputs. The Perceptron system was fed a series of punchcards. After 50 trials, the computer “taught” itself to distinguish those cards which were marked on the left from cards marked on the right. The computer which ran this program was a five-ton IBM 704, the size of a room. By today’s standards, it was an extremely simple task, but it worked.

A single-layer perceptron is the basic component of a neural network. A perceptron consists of input values, weights and a bias, a weighted sum and activation function:

Frank Rosenblatt and the Perceptron system, 1958

Rosenblatt described it as the “first machine capable of having an original idea.” But the Perceptron was extremely simplistic; it merely added up the optical signals it detected to “perceive” dark marks on one side of the punchcard versus the other.
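A perceptron is simple enough to sketch in a few lines of Python. This is an illustration in NumPy, not a reconstruction of Rosenblatt’s hardware; the “punchcards” here are tiny four-pixel arrays.

# A single perceptron: weighted sum of inputs plus a bias, passed through a hard
# threshold. It learns to tell "marked on the left" cards from "marked on the right."

import numpy as np

weights = np.zeros(4)
bias = 0.0

def predict(pixels):
    return 1 if np.dot(weights, pixels) + bias > 0 else 0   # weighted sum + threshold

# Toy "punchcards": first two pixels = left half, last two = right half.
cards = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]])
labels = np.array([1, 1, 0, 0])                             # 1 = marked on the left

for trial in range(50):                                      # Rosenblatt ran ~50 trials
    for pixels, label in zip(cards, labels):
        error = label - predict(pixels)
        weights += 0.1 * error * pixels                      # nudge weights toward the answer
        bias += 0.1 * error

print([predict(c) for c in cards])                           # -> [1, 1, 0, 0]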

In 1969, MIT’s Marvin Minsky, whose father was an eye surgeon, wrote convincingly that neural networks needed multiple layers (like the optical neuron fabric in our eyes) to really do complex things. But his book Perceptrons, though well-respected in hindsight, got little traction at the time. That’s partially because during the intervening decades, the computing power required to “learn” more complex things via multi-layer networks was out of reach. But time marched on, and over the next three decades, computing power, storage, languages and networks all improved dramatically.

From the 1950’s through the early 1980’s, many researchers doubted that computing power would be sufficient for intelligent learning systems via a neural network style approach. Skeptics also wondered if models could ever get to a level of specificity to be worthwhile. Early experiments often “overfit” the training data and simply output the input data. Some would get stuck on local maxima or minima from a training set. There were reasons to doubt this would work.

And then, in 1986, Carnegie Mellon Professor Geoffrey Hinton, whom many consider the “Godfather of Deep Learning” (go Tartans!), demonstrated that “neural networks” could learn to predict shapes and words by statistically “learning” from a large, labeled dataset. Hinton’s revolutionary 1986 breakthrough was the concept of “backpropagation.” This adds multiple hidden layers to the model, and iterates through the network, using the output of a loss function to adjust the weights and minimize the “loss,” or distance from the expected output.

This is rather like the golfer who adjusts each successive golf swing, having observed how far off their last shots were. Eventually, with enough adjustments, they calculate the optimal way to hit the ball to minimize its resting distance from the hole. (This is where terms like “loss function” and “gradient descent” come in.)
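The golfer’s adjustment is exactly what gradient descent does numerically. Here is a tiny, purely illustrative example with a single parameter:

# Gradient descent on a one-parameter "loss": squared distance from the hole.
# Each swing is adjusted based on how far off the last one was.

hole = 7.0                # where we want the ball to land
swing = 0.0               # our parameter, adjusted after each observation
learning_rate = 0.1

for attempt in range(25):
    loss = (swing - hole) ** 2          # "how far off was that shot?"
    gradient = 2 * (swing - hole)       # direction and size of the error
    swing -= learning_rate * gradient   # adjust the next swing accordingly

print(round(swing, 3))   # approaches 7.0; the loss has been minimized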

In 1986-87, around the time of the 1986 Hinton-Rumelhart-Williams paper on backpropagation, the whole AI field was in flux between these procedural and learning approaches, and I was earning a Masters in Computer Science at Stanford, concentrating in “Symbolic and Heuristic Computation.” I had classes which dove into AARON-style symbolic, procedural AI, and a few classes touching on neural networks and learning systems. (My master’s thesis was on getting a neural network to “learn” how to win the Tower of Hanoi game, which requires apparent backtracking to win.)

In essence, you can think of a neural network as a fabric of software-represented units (neurons) waiting to soak up patterns in data. The methodology to train them is: “here is some input data and the output I expect, learn it. Here’s some more input and its expected output, adjust your weights and assumptions. Got it? Keep updating your priors. OK, let’s keep doing that.” Like a dog learning what “sit” means (do this, get a treat / don’t do this, don’t get a treat), neural networks are able to “learn” over iterations, by adjusting the software model’s weights and thresholds.

Do this enough times, and what you end up with is a trained model that’s able to “recognize” patterns in the input data, outputting predictions, or labels, or anything you’d like classified.

A neural network, and in particular the special type of multi-layered network called a deep learning system, is “trained” on a very large, well-labeled dataset (i.e., with inputs and correct labels.) The training process uses Hinton’s “backpropagation” idea to adjust the weights of the various neuron thresholds in the statistical model, getting closer and closer to “learning” the underlying pattern in the data.

For much more detail on Deep Learning and the mathematics involved, see this excellent overview:

Deep Learning Revolutionizes AI Art

We’ll rely heavily upon this background of neural networks and deep learning for Part II: The Diffusion Revolution. This new wave of AI uses deep learning networks to interpret natural language (text to meaning), classify images, and learn how to synthetically build an image from random noise.

Gallery

Before leaving, here are a few more images created from text prompts on Midjourney:

You get the idea. We’ll check in on how deep learning enabled new kinds of generative approaches to AI art, called Generative Adversarial Networks, Variational Autoencoders and Diffusion, in Part II: Engines of Wow: Deep Learning and The Diffusion Revolution, 2014-present.

I’m Winding Down HipHip.app

After much thought, I’ve decided to wind down the video celebration app I created, HipHip.app.


All servers will be going offline shortly.

Fun Project, Lots of Learning

I started HipHip as a “give back” project during COVID. I noticed that several people were lamenting online that they were going to miss big milestones in-person: celebrations, graduations, birthdays, anniversaries, and memorials. I had been learning a bunch about user-video upload and creation, and I wanted to put those skills to use.

I built HipHip.app, a celebration video creator. I didn’t actually know at the time that there were such services — and it turns out, it’s a pretty crowded marketplace!

While HipHip delivered hundreds of great videos for people in its roughly two years on the market, it struggled to be anything more than a hobby/lifestyle project. It began under the unique circumstances of lockdown, helping people celebrate. That purpose was well served!

Now that the lockdown/remote phase of COVID is over, the economics of the business showed that it’s unlikely to turn into a self-sustaining business any time soon. There are some category leaders that have really strong search engine presence which is pretty expensive to dislodge.

I want to turn my energies to other projects, and free up time and budget for other things. COVID lockdown is over, and a lot of people want a respite from recording and Zoom-like interactions, including me.

It was a terrific, educational project. It kept me busy, learning, and productive. HipHip delivered hundreds of celebration videos for people around the world.

I’ve learned a ton about programmatic video creation, technology stacks like Next.js, Azure and React, and likely will apply these learnings to new projects, or perhaps share them with others via e-learning courses.

Among the videos people created were graduation videos, and videos to celebrate new babies, engagements, birthdays, anniversaries and the attainment of US Citizenship.

In the end, the CPU processing and storage required for online video creation meant that it could not be offered for free forever, and after testing a few price points, there seems to be only so much willingness to pay in a crowded market.

Thanks everyone for your great feedback and ideas!

Financial Systems Come for Your Free Expression: Don’t let them.

I’ve been a PayPal customer for more than a decade, but closed my account last week. 2022 has shown glimpses of what a social credit system might look like in America. Decentralized, yet singular in ideology.

My local bagel store, barbershop and dry cleaner now only accept cashless transactions. Ubiquitous touchscreen displays and tap-to-pay checkouts now happily whisk customers through the line. This long-awaited arrival of a cashless society has been a boon for customer and retailer alike. Mostly, I love it.

But with it has come an unprecedented flow of information about who we are, and increased temptation for platform providers to start monitoring who can be in their club and who cannot. While there isn’t yet any grand design for a social credit system along the lines of the Chinese Communist Party’s, I cannot help but worry that we are assembling the ideal toolset for ideological enforcement, monitoring and control, should someone, some day, wish to build that out.

Does that sound alarmist? Consider the overall trajectory of these recent stories:

October, 2022: PayPal Attempts to Fine Customers for What it Deems “Harmful” Ideas

On October 7th 2022, PayPal published amendments to its Acceptable Use Policy (AUP), which would have granted the payment provider legal authority to seize $2,500 from customers’ bank accounts for every single violation of what it deemed the spreading of “harmful” or “objectionable” information. The determination of whether something is “harmful”, misleading or “objectionable” would come at PayPal’s sole discretion. These changes were set to go into effect on November 3rd, 2022, but were quietly retracted. PayPal only explained their policy reversal via emails to a few news outlets on October 8th; remarkably, you still cannot find any commentary about this episode on their Twitter feed.

What is harmful misinformation? Well, that’s subjective. We might all agree that businesses that explicitly promote murder shouldn’t be on the platform. But then it starts to get trickier. What if you believe the path of least overall harm was to reopen schools sooner? Or let vaccination be an informed choice, and not a mandate? Is being pro-choice or pro-life more “harmful?” Depends on who is answering the question.

Anything deemed harmful or objectionable by PayPal would be subject to such a fine.

Let’s review several recent statements which were authoritatively deemed “harmful misinformation”:

  • “Prolonged school closure is a mistake. Learning loss will happen, suicide rates might increase. We need to reopen schools urgently.” (Misinformation in 2020, True today.)
  • “Vaccination does not in fact significantly slow spread of COVID-19 to others.” (Misinformation in 2020, True today.)
  • “Naturally-acquired immunity is as strong as immunity acquired through vaccination, if not stronger.” (Misinformation in 2020, True today.)
  • “COVID-19, the most significant public health crisis in our lifetime, might well have emerged from a lab accident.” (Misinformation in 2020, officially declared at least equally plausible by the US government today.)
  • “Hunter Biden’s laptop contained clear and troubling signs of influence peddling.” (Declared misinformation in 2020, yet now verified by New York Times, Washington Post and others.)
  • “For younger males, the risk of myocarditis from vaccination may actually exceed the hospitalization risk of COVID itself.” (Declared misinformation in 2020, yet backed by empirical evidence today.)

Within a brief span of just thirty months, each of these statements has gone from “misinformation” or “harmful-information” as vehemently declared by authorities and name-brand “fact checkers” to now-majority American and empirically-validated viewpoints.

Further, who is paying attention to these stealth changes to terms and conditions? It wasn’t the New York Times, The Verge, nor the Washington Post that brought this major policy change of PayPal’s to America’s attention. It came from the right, who have become the most vocal critics of a creeping state-corporate symbiosis which they call the “Blue Stack.” The Blue Stack includes progressive technocrats, corporate media, and ostensibly independent big tech firms which work to enforce an ideology that inevitably tilts leftward.

The blue stack presents America’s elite with something they’ve always craved but has been out of reach in a liberal democracy: the power to swiftly crush ideological opponents by silencing them and destroying their livelihoods. Typically, American cultural, business, and communication systems have been too decentralized and too diffuse to allow one ideological faction to express power in that way. American elites, unlike their Chinese counterparts, have never had the ability to imprison people for wrong-think or derank undesirables in a social credit system.

Zaid Jilani, The Blue Stack Strikes Back, Tablet

Were it not for Ben Zeisloft, writer for the right-wing website Daily Wire, the public would likely not have known about PayPal’s major policy shift. But once Zeisloft’s piece hit (New PayPal Policy Lets Company Pull $2,500 From Users’ Accounts If They Promote ‘Misinformation’ | The Daily Wire), it caught fire on social media. And PayPal was forced into crisis-response mode, as the unwelcome press and cancellations started pouring in.

This wasn’t misinformation. PayPal’s new policy stated precisely what Zeisloft had reported. #CancelPayPal quickly started trending on Twitter, TikTok and Instagram. The proposed AUP changes are now gone from PayPal’s website, but you can check the web archive:

Archived copy of PayPal’s proposed Acceptable Use Policy: https://web.archive.org/web/20220927223312/https://www.paypalobjects.com/marketing/ua/pdf/US/en/acceptableuse-full-110322.pdf

The company’s former president, David Marcus, blasted PayPal on Twitter on Saturday: “It’s hard for me to openly criticize a company I used to love and gave so much to. But @PayPal’s new AUP goes against everything I believe in. A private company now gets to decide to take your money if you say something they disagree with.”

PayPal handled this PR crisis very poorly. While they walked the policy back, they did so only via private emails to publications like Snopes, which dutifully penned “No, PayPal Isn’t Planning to Fine Users $2.5K for Posting Misinfo.” Snopes fails to clearly state that yes, PayPal had indeed planned exactly that.

And PayPal executives have yet to clearly explain to customers how this AUP change even arose. It all gives one the impression that the only “error” with this policy rollout is that someone skeptical noticed it. Their main Twitter handle, @paypal, was and still is silent on the rollout and the stealthy walk-back. They reached out one-on-one to a few media organizations to state that it was an error, but they never apprised the public, and they never explained how such an “error” could make it onto their corporate website.

October, 2022: JP Morgan Chase Summarily Closes Kanye West’s Accounts

I’ve never been a fan of Kanye “Ye” West’s music, his erratic persona, or many of his MAGA-political views. And his recent, clearly antisemitic statements deserve condemnation. I think they’re abhorrent.

Yet I’m also unsettled by JP Morgan Chase, Inc. summarily indicating they are closing his bank accounts based on his recent speech.

On the plus side, they’ve given him thirty days’ notice.

I admit I’m conflicted on this — I’m fine with this specific decision, but not with what it says about the age we are now in. It is, in effect, a major bank saying you need to not only be a customer in good standing, but also not stray from what its executives think are the norms of good speech. Are they saying it’s not just the bank’s services that set their brand, but the collective words and deeds of its customers?

Something feels new here.

Have corporate banking giants been arbiters of what we can and cannot say in our private lives? Do we want them to be? They’re private companies, after all, but who expected investment banks of all entities to be the enforcers of what they perceive to be social acceptability?

It feels absurd for bankers, of all people, to be America’s moral compass. Do you consider bankers to be America’s new home for ethicists, who will be able to determine what is and is not societally righteous?

February, 2022: GoFundMe Deplatforms “My Body, My Choice” Truckers

GoFundMe is the #1 marketplace and payment processor for fundraisers. As you may recall, Canadian truckers who objected to that nation’s vaccine mandate headed en masse to Ottawa to protest it, via what they termed a “Freedom Convoy.”

After raising over $10 million through GoFundMe, from people around Canada and the rest of the world, on February 4th 2022, executives at GoFundMe unilaterally decided to lock the truckers’ fundraising account. Further, in their initial statement, GoFundMe signaled they would distribute those funds to charities of their own choosing. “Given how this situation has evolved, no further funds will be distributed directly to the organizers of Freedom Convoy,” GoFundMe wrote about the decision. “We will work with the organizers to send all remaining funds to trusted and established charities verified by GoFundMe.”

After massive outcry, GoFundMe provided an update and said that they would instead refund donations. Many noticed their initial action and found it indicative of who they are. #BoycottGoFundMe made the rounds on social media for weeks.

Critics are right to point out that GoFundMe has hosted numerous fundraisers for Antifa, CHOP/CHAZ and other protest groups — even ones around which violence has occasionally happened — without cancellation or threats of unilateral fund seizure. You can see just a few of them by searching GoFundMe’s site.

[Editorial note: I have stated my own views on vaccination — I’m in favor of it personally, especially for most older demographics, but believe it to be a personal choice. Since it is now clear that vaccination does not measurably nor durably reduce spread (one such study here; others corroborate), I think vaccination should be an informed choice. I am firmly opposed to COVID vax mandates.]

PayPal, GoFundMe and JP Morgan Chase are each private companies, and have every legal right to set their own terms and conditions of use. But look also what’s happening at the governmental level.

August, 2022: Massive Increase to IRS budget, Considering Lowering Reporting Threshold to $600

In 1970, the Bank Secrecy Act began requiring banks to report deposits of $10,000 or more (in 1970 dollars.) Together with adjustments made by the Patriot Act in 2001, banks must report to federal authorities all deposits or withdrawals of $10,000 or more. (Even multiple transactions broken up into smaller pieces are tracked; this is known as “structuring,” and it is itself illegal.)

More recently, in 2021, Treasury Secretary Janet Yellen and others started advocating for lowering the threshold to $600. This hasn’t yet been adopted, but it’s being strongly advocated. With inflation, $600 is the new $500, so essentially most critical expenditures, like rent, furniture, car purchases, healthcare, travel and more are on the radar.

We often hear about “87,000 IRS Agents” authorized by the so-called Inflation Reduction Act, but really what the Act includes is a massive $79 billion budgetary expansion of the Internal Revenue Service. The IRS has every incentive and desire to start wiring in end-to-end tracking of cashless transactions.

Should the US want to pursue a social credit system along the lines of the Chinese state’s, all that will really be needed is the “right” lawmakers to authorize it.

Republican Rep. Jefferson Van Drew has introduced HR 5475, known as the Banking Privacy Act, to stop the Biden Administration’s proposal; it has been referred to the House Financial Services Committee. Should the Republicans win control of the House, it’s possible this will be taken up.

Of course, as the old saw goes, if you’re not doing anything illegal, you have nothing to worry about. After seeing the way COVID was and is handled, and the creeping power of what writer Zaid Jilani calls the Blue Stack, do you still have that confidence?

Themes

Fast-forward the videotape, and it is plain that without new regulation, we are headed fast toward ideological groupthink enforced by the financial world, which is of course also susceptible to the whims of government leaders. Consolidation and a pandemic-accelerated move to a cashless society are making a social credit system much easier to snap in some day.

True, these actions are mostly the work of free enterprise. Companies aren’t state-controlled in America the way they are in China, and they are free to devise their own legal terms and conditions. We consumers are free to opt in or opt out. No one has to use PayPal, GoFundMe, JP Morgan Chase, Twitter or Facebook for that matter. I’m not aware of any of these activities being illegal.

But we need to pause a moment and recognize that a cashless society, with its higher concentrations of information flow and money, is an extremely tempting tool for regulators and authoritarians on America’s political flanks. It’s a far cry from the local savings account that was largely agnostic to your speech.

Increasingly, outside groups, state and local governments and employees from within are pressuring banks, big technology companies and other corporations to take manifestly political/ideological stances, and expel people for wrongthink. Our massive migration toward a cashless society makes this easier.

That’s all fine, you might say: “I support Canada’s vax mandate for truckers; I think Kanye West is reprehensible and JP Morgan has every right to de-bank him from their system; and I think the IRS’s investment in tracking every $600 will bring much greater revenue to US coffers.”

But recognize that these monitoring tools and platforms are themselves entirely agnostic; their application merely depends upon who is in power. And that can change at any election.

So it’s an important exercise to take a moment to imagine the power which you may now applaud in the hands of your worst ideological foe. Are you still comfortable with how this is all trending?

For me, though I love technology and the convenience it offers, these trends toward making speech and behavior a condition of participating in a platform fill me with unease, especially when the conditions are phrased so subjectively. After more than two decades as a customer, I closed my PayPal account last week. If you’d like to close yours, you can do it in a few clicks, as I explain on Twitter.

And while I’m still fine with debit and credit cards, I’m beginning to suspect this sense of ease might not last forever. All it takes is for VISA or Mastercard to say “we will fine our customers for harmful misinformation.” For these and other reasons, this long-time advocate of technology is becoming reacquainted with check-writing. I’m not ready to switch back to paper just yet, but it can’t hurt to re-learn how things worked in the 70’s.

Medicine Should Be About Care, Not Self-Righteousness

On the unwillingness of University of Michigan medical school students to hear views that might conflict with their own.

I am not, nor ever have been, a medical professional. I am also among the 61% of Americans that do not consider ourselves “pro-life.”

But there was something profoundly unsettling about the walk-out of incoming med school students at University of Michigan’s medical school convocation last week:

Dr. Collier is one of the most popular professors at University of Michigan Medical School. That’s why she was selected, by a combination of students and faculty, to be the keynote speaker for the “white coat” ceremony, in which incoming med school students get their symbolic professional jacket.

Why the walkout? It’s because Dr. Collier also happens to be among the 39% of Americans who define themselves to be “pro-life.” OK. That’s a rational, quite common viewpoint on a complex issue.

Now, she didn’t speak about abortion at all. Her keynote address was far more general, and inspiring. It was that physicians should do everything possible to keep from becoming a machine. That they should not perceive themselves as “task-completers,” but rather as physicians, humans who care. That they should be grateful. Appreciate a team.

She chose not to delve into abortion as a topic at all. But what if she had? What if she had decided to mention (gasp) her own perspective as a pro-life medical professional? Is that so appalling that it must be shunned? Is there no learning that’s possible by hearing that viewpoint out?

As it happened, dozens of students didn’t hear that message, because they preemptively walked out, before her address. Call me old-fashioned, but I believe that medical care should begin with empathy, and empathy begins with listening. We can, and must, tolerate and listen to perspectives other than our own.

More than any generation I can remember, far too many of the young adults we are raising seem uninterested in hearing out viewpoints other than their own. They even think it’s noble to shut out those views.

39% of Americans — more than one out of every three — declare that they are “pro-life.”