POW NFT Generative Music

Tech stack for the future

A number of considerations went into choosing the tech stack for this project. I'm a strong believer that if you're going to make a generative art NFT based on on-chain data, you need to show your work. It would be very easy to create or curate artwork behind a curtain and then just claim it was generative. So whatever process we landed on, it was important that it be transparent (and therefore repeatable). In a doomsday scenario, even if every copy of every Atom is destroyed and the metadata servers melt, the artwork should be re-creatable with just the ruleset and the token's hash.

Web tech for the future

Okay, so we want to build using only web tech so our NFTs can live forever, but that raises another important question… how do you make music with just a browser?

AudioContext

This is all great in theory, but how do you make music play in a browser without loading sound files? The answer lies with the Web's relatively unknown AudioContext API. I'll admit I had no idea this existed until I needed to make an alarm for the POW NFT miner, and from looking around the web, very few people actually use it. On the rare occasion someone does want to create music using just a browser, there are a couple of decent-looking frameworks that wrap AudioContext, but I think I've made my thoughts on frameworks pretty clear. As it turns out, AudioContext has all the core components of a modern digital synthesizer; you just need to know how to use them. So we were going to build this from scratch.
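To give a sense of how bare-bones it is, here's a minimal sketch (not our actual instrument code) of playing a single note with nothing but AudioContext:

// Create a context, an oscillator, and a gain node, and play one note.
const ctx = new AudioContext();
const osc = ctx.createOscillator();
const gain = ctx.createGain();
osc.type = 'sawtooth';
osc.frequency.value = 440;       // A4
gain.gain.value = 0.2;           // keep the volume sensible
osc.connect(gain).connect(ctx.destination);
osc.start();
osc.stop(ctx.currentTime + 1);   // let it ring for one second

Everything our synth does is built out of nodes like these, wired together and automated over time.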

The Synthesizer

Going into this, I knew very little about how synths work. A background in mechanical engineering and a decade and a half of writing code for a vibration analysis company gave me a pretty decent understanding of working with sine waves, but the knowledge required to make those waves pleasing to the ear was far outside my wheelhouse. Luckily, my collaborator, Skwid, had this kind of knowledge in spades, having worked with both digital and analog synths for as long as I've been throwing 1s and 0s together. This rig that he built illustrates his skill level far better than I ever could:

The giant synth that Skwid made
Blueprint for a synthesizer

Oscillator nodes

Our synth is actually a dual-oscillator synth, so the Osc node in the diagram represents two oscillators that can be configured differently but play the same note. AudioContext's OscillatorNodes do a huge amount of work here and are relatively cheap (from a processor standpoint). They basically output a repeating wave which you can configure for all sorts of things. Some of our synth's terminology is borrowed from the AudioContext API, so whenever I'm referring to a component from the latter, I'll write it like this to prevent confusion.
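As a simplified sketch (the real voice code handles far more configuration), a dual-oscillator voice is just two OscillatorNodes with different settings mixed into one output:

// Two differently-configured oscillators that always play the same note.
function createVoice(ctx, destination) {
  const oscA = ctx.createOscillator();
  const oscB = ctx.createOscillator();
  oscA.type = 'sawtooth';
  oscB.type = 'square';
  oscB.detune.value = 7;           // slight detune for a fatter sound
  const mix = ctx.createGain();
  mix.gain.value = 0.5;
  oscA.connect(mix);
  oscB.connect(mix);
  mix.connect(destination);
  oscA.start();
  oscB.start();
  return {
    output: mix,
    setNote(freq, time) {          // both oscillators track the same note
      oscA.frequency.setValueAtTime(freq, time);
      oscB.frequency.setValueAtTime(freq, time);
    },
  };
}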

LFOs (Low-frequency Oscillators)

LFOs are basically just oscillator nodes, but they’re used in places where it makes sense for them to have low-frequency settings. In our synth, each instrument has its own LFO which can modulate the OSC frequency, and there’s also one in each filter, and one in each panner.
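In AudioContext terms, an LFO is just another OscillatorNode whose output is scaled by a GainNode and fed into an AudioParam. A rough sketch (assuming the ctx and osc from the earlier snippet):

// A slow oscillator wobbling the main oscillator's frequency (vibrato).
const lfo = ctx.createOscillator();
const lfoDepth = ctx.createGain();
lfo.frequency.value = 2;          // 2 Hz wobble
lfoDepth.gain.value = 10;         // +/- 10 Hz of modulation
lfo.connect(lfoDepth);
lfoDepth.connect(osc.frequency);  // modulate the frequency AudioParam
lfo.start();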

ADSR

A huge part of making a synth that sounds like anything other than 8-bit chiptunes is ADSR envelopes. If you haven't heard of these (as I hadn't at the start of this project), it stands for Attack-Decay-Sustain-Release and is the cornerstone of all synthesizer sounds. We actually have 3 ADSRs in the synth, but for the newbies, I'll first explain what it does with regard to Osc amplitude.

An ADSR envelope, x-axis is time
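In practice the envelope is applied by automating a GainNode's gain AudioParam over time. The sketch below is illustrative rather than our actual envelope code:

// Apply an ADSR shape to any AudioParam; returns a note-off function.
function applyADSR(param, time, { attack, decay, sustain, release }, peak = 1) {
  param.cancelScheduledValues(time);
  param.setValueAtTime(0, time);
  param.linearRampToValueAtTime(peak, time + attack);                   // Attack
  param.linearRampToValueAtTime(sustain * peak, time + attack + decay); // Decay to Sustain level
  return (releaseTime) => {
    param.cancelScheduledValues(releaseTime);
    param.setValueAtTime(sustain * peak, releaseTime);
    param.linearRampToValueAtTime(0, releaseTime + release);            // Release
  };
}

// e.g. const noteOff = applyADSR(gain.gain, ctx.currentTime,
//        { attack: 0.02, decay: 0.1, sustain: 0.6, release: 0.4 });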

Oscillator reassignment

Because our synth is doing a lot of work with limited resources, we had to be smart about how many oscillator pairs each instrument can have. If an instrument has a long ADSR envelope and is playing a lot of different notes in rapid succession, you can end up with hundreds of oscillators. I made the decision to allow each instrument a maximum of five oscillator pairs. At the start of the track, the synth looks at the score and works out which notes will be assigned to which OSC. If it's got five notes ringing and wants to play another, it will re-assign the least-recently-played OSC, cut its ADSR envelope short, and play the new note.
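The reassignment amounts to a simple form of voice stealing, along the lines of this hypothetical sketch (the play and cutEnvelope methods stand in for the real instrument code):

// A fixed pool of voices; steal the one that has gone longest without playing.
class VoicePool {
  constructor(voices) {
    this.slots = voices.map(v => ({ voice: v, lastPlayed: -Infinity }));
  }
  noteOn(freq, time) {
    // Unused voices have lastPlayed = -Infinity, so they get picked first.
    const slot = this.slots.reduce((a, b) => (a.lastPlayed <= b.lastPlayed ? a : b));
    slot.voice.cutEnvelope(time);  // cut the old note's ADSR short
    slot.voice.play(freq, time);   // reuse the oscillator pair for the new note
    slot.lastPlayed = time;
  }
}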

Filters

Filters are another ingredient that 100x'ed the power of our Synth. For the non-synth-savvy, they amplify or remove different frequencies from the incoming signal, which is important for creating certain sounds. AudioContext's BiquadFilterNode is doing all the heavy lifting here; our synth just wraps around it and then adds some ADSR functionality or an LFO on the cutoff frequency. A key feature of the filter ADSR envelope is that it can be configured with negative values, meaning it sweeps the cutoff frequency down from a baseline rather than up from a 0Hz floor.
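A stripped-down sketch of the idea (the real module also runs an ADSR over the cutoff):

// A lowpass BiquadFilterNode with an LFO sweeping the cutoff frequency.
const filter = ctx.createBiquadFilter();
filter.type = 'lowpass';
filter.frequency.value = 800;     // baseline cutoff in Hz
filter.Q.value = 4;               // resonance

const cutoffLfo = ctx.createOscillator();
const cutoffDepth = ctx.createGain();
cutoffLfo.frequency.value = 0.5;  // slow sweep
cutoffDepth.gain.value = 400;     // +/- 400 Hz around the baseline
cutoffLfo.connect(cutoffDepth);
cutoffDepth.connect(filter.frequency);
cutoffLfo.start();

// instrument output -> filter -> rest of the chain
// voice.output.connect(filter); filter.connect(ctx.destination);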

Other components

Every instrument also has the option of delay, panner, and reverb modules. The delay is just AudioContext's DelayNode on a configurable feedback loop, and the panner wraps the StereoPannerNode with an LFO. Our reverb module is a little more complex, doing some magic with a BiquadFilterNode and a ConvolverNode with a custom impulse response function.
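For a rough idea of what those pieces look like (our actual impulse response function is more involved than this decaying noise burst):

// Delay on a feedback loop.
const delay = ctx.createDelay(2);
const feedback = ctx.createGain();
delay.delayTime.value = 0.375;    // delay time in seconds
feedback.gain.value = 0.4;        // how much signal is fed back
delay.connect(feedback);
feedback.connect(delay);          // the feedback loop

// Convolver reverb with a generated impulse response.
const reverb = ctx.createConvolver();
const length = ctx.sampleRate * 2;               // 2-second tail
const impulse = ctx.createBuffer(2, length, ctx.sampleRate);
for (let ch = 0; ch < 2; ch++) {
  const data = impulse.getChannelData(ch);
  for (let i = 0; i < length; i++) {
    data[i] = (Math.random() * 2 - 1) * Math.pow(1 - i / length, 3);  // decaying noise
  }
}
reverb.buffer = impulse;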

Generalising synth controls

When making the synth, it was important to generalise its capabilities so the composer had everything it needed to work with. For this reason, all properties with a time-based unit (i.e., seconds or Hz) have the option of being configured in terms of beats. For example, an LFO can be set to oscillate once per beat, rather than a set number of times per second (Hz). On top of this, most instrument properties can also be reconfigured by the composer mid-track if needed; the synth effectively exposes an API to the composer, giving it the ability to adjust the sound of any instrument at any point in the track.
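Under the hood, the beat-based settings just get converted into the seconds and Hz that AudioContext actually works in, based on the track's tempo. A small illustration (assuming the lfo and delay nodes from the earlier sketches):

const bpm = 120;                               // chosen by the composer
const secondsPerBeat = 60 / bpm;               // 0.5 s at 120bpm
lfo.frequency.value = 1 / secondsPerBeat;      // "once per beat" becomes 2 Hz
delay.delayTime.value = 0.75 * secondsPerBeat; // a dotted-eighth-style delay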

The Composer

Now that we had a fully loaded Synth, it was time to write the code that makes it sing. This was a mammoth task that required finding a delicate balance between being overly constrained and sounding too crazy and non-musical. Ironically, given that it was code writing this music, we found it needed to meet a higher bar for matching what the regular listener would consider “okay music” than if it were created by a human. If a human does something unusual, it can be received as experimental, but if a machine that’s supposed to make music makes something that doesn’t sound like your idea of music, you’ll probably just think it sucks.

Randomness from hash

Before going into the details of the composer, it’s important to cover how its decisions are made based on the token’s hash. In POW NFT’s visual layer, individual bytes of the hash are assigned to different characteristics of the Atom (a few for atomic number, a few for colour, etc). When it came to designing the composer, however, it became clear that there would be more decisions to make than there are bytes in the hash. And even then, we would want more than 256 degrees of variety in some of them. The solution was to repeatedly re-hash the token’s id and hash with an FNV-1a function, producing a deterministic stream of pseudo-random numbers between 0 and 1:

// Hash the arguments to a 32-bit unsigned integer (FNV-1a).
function fnv32a() {
  let str = JSON.stringify(arguments);
  let hval = 0x811c9dc5;
  for (let i = 0; i < str.length; ++i) {
    hval ^= str.charCodeAt(i);
    hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
  }
  return hval >>> 0;
}

// Deterministic pseudo-random number in [0,1). last_rehash carries state
// between calls, so the same tokenId and hash always give the same sequence.
function random() {
  const limit = 1000000000;
  last_rehash = fnv32a(tokenId, hash, last_rehash);
  return (last_rehash % limit) / limit;
}

Track structure

An early decision we made was that our non-fungible tunes needed to adhere to some kind of internally-logical structure. That’s not to say that they should all be structured in the same way, but that a track should have different sections with different moods/tones, and that the order and possible repetition of those sections should not be jarring to someone who’s paying attention.

Track properties

Once the Composer has mapped out the track’s overall structure, it must decide on a number of properties including tempo, root key, key changes and scale. It also has an algorithm for determining the length of the different sections, which provides additional variety between tracks.
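Purely as a hypothetical sketch of the shape of those decisions (the real algorithm and value ranges are different), each one is just another draw from random():

// Hypothetical mapping of random() draws to track properties.
const ROOTS = ['C','C#','D','D#','E','F','F#','G','G#','A','A#','B'];
const SCALES = ['major', 'minor', 'dorian', 'mixolydian'];

const tempo = 70 + Math.floor(random() * 70);          // e.g. 70-139 bpm
const rootKey = ROOTS[Math.floor(random() * ROOTS.length)];
const scale = SCALES[Math.floor(random() * SCALES.length)];
const sectionBars = 4 + 2 * Math.floor(random() * 5);  // 4, 6, 8, 10 or 12 bars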

Instruments

For each track, the Composer effectively assembles a band of instrument definitions and then creates them on the Synth. The instrument definitions were predefined by Skwid, a creative decision designed to ensure that all the Synth’s sounds were audibly pleasing. It’s important to remember that all generative art has a level of curation. Whether it’s the colour palette, the shape algorithm, or in this case the instrument library, generative art is the result of rules laid out by an artist who understood the many possibilities but didn’t know what the final outcome would be.

Melody and rhythm elements

The crunchiest part of the Composer module, and the thing that gives the tracks the most flavour besides the instrument definitions, is the melodies and rhythms. Skwid defined several matrices which determined the possible re-use of these elements in different sections of a track, as well as how they should be generated.

Bringing it all together

Once the Composer has created all the pieces, it must actually write the score and send it to the Synth. It steps through each section of the track and composes the relevant bars for each instrument based on whatever melody/rhythm it should be playing, staying mindful of key changes and mixer settings, and making a few extra decisions about which instruments will play at what time.
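At the Synth end, the score ultimately becomes note events scheduled against AudioContext's clock. Something along these lines, where the score shape and the noteOn/noteOff calls are hypothetical stand-ins for the real interface:

// Turn a composed score into notes scheduled on the audio clock.
function playScore(score, synth, ctx, bpm) {
  const secondsPerBeat = 60 / bpm;
  const startTime = ctx.currentTime + 0.1;      // small lead-in
  for (const note of score) {                   // { instrument, beat, pitch, duration }
    const when = startTime + note.beat * secondsPerBeat;
    synth.noteOn(note.instrument, note.pitch, when);
    synth.noteOff(note.instrument, note.pitch, when + note.duration * secondsPerBeat);
  }
}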

Music for eternity

All these rules, built in this future-facing, open format, meant we were able to add a unique track to all 5200+ existing POW NFT Atoms. It also maximises the chance that if someone mints an Atom a decade or two from now, they’ll be able to hear their own unique track, one that has never been heard before, played live by their browser based on rules written decades earlier.
