So how do I know this?
OK, you got me. I don't. I know a bit about electronics and microprocessors, and I read through the Yamaha patents for the FS1R (the patent numbers are printed on its back). If you do the same, you get a pretty good idea of how the FS1R works. But a lot of what I write here is just speculation, so not every detail may be right. I hope it's interesting at least ;-)
About sound spectra, formants and operators
The sound that you hear is picked up by your eardrum and transferred to your cochlea. The sound travels through the cochlea, which gets narrower towards the end. At the point where the sound's frequency produces a resonance, the nerves are stimulated, so that your ear knows the frequency (i.e. pitch) of the sound.
But most sounds contain more than just one frequency, so if you hear a piano tone, several nerves are stimulated at once. The lowest frequency you hear defines the signal's pitch; the other (higher) frequencies (overtones) give the tone its character, so you can tell a piano from a flute even if they play the same note. Overtones typically have a lower amplitude than the basic tone, and the amplitude gets lower the higher the overtone is.
So by mixing a basic tone with several overtones, you can build the sound of any instrument you like (i.e. you construct its spectrum). You do so by mixing sine waves of the proper frequencies. You take sine waves because a sine wave is the only signal that doesn't have overtones itself. So you take a number of operators, as Yamaha calls them, each operator generating a sine wave with adjustable frequency and amplitude. That's it.
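As a rough illustration of this mixing idea, here is a minimal Python sketch that adds up a few sine waves. The frequencies, amplitudes and the name `additive_tone` are made up for the example; this is plain additive synthesis, not FS1R code.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed for this sketch)

def additive_tone(partials, n_samples, sample_rate=SAMPLE_RATE):
    """Mix sine waves. `partials` is a list of (frequency_hz, amplitude) pairs."""
    return [
        sum(amp * math.sin(2 * math.pi * freq * i / sample_rate)
            for freq, amp in partials)
        for i in range(n_samples)
    ]

# A 220 Hz basic tone plus three overtones with falling amplitude
# (illustrative values, not measured from a real instrument).
partials = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.25), (880.0, 0.125)]
samples = additive_tone(partials, n_samples=441)  # 10 ms of sound
```

Each `(frequency, amplitude)` pair plays the role of one operator. Changing the amplitudes of the overtones changes the character of the tone while the pitch stays at 220 Hz.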
But there's a problem: an infinite number of possible frequencies would have to be mixed to achieve the sound of any given instrument. Even if you divided the spectrum between 16 Hz and 20 kHz (which is roughly what your ear can hear) into a finite number of frequencies that are so close together that you can't tell a frequency from its neighbour, you would need thousands of operators to construct a sound. Imagine editing this with the few buttons on your box...
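To put a rough number on "thousands", the sketch below counts how many sine operators you would need if neighbouring frequencies were a fixed musical interval apart. The 5-cent spacing is purely an assumption for illustration (it is not a figure from the text above), and `operators_needed` is a made-up helper name.

```python
import math

def operators_needed(low_hz, high_hz, step_cents):
    """Count sine operators spaced `step_cents` apart between low_hz and high_hz."""
    span_cents = 1200 * math.log2(high_hz / low_hz)  # 1200 cents per octave
    return math.ceil(span_cents / step_cents) + 1

# 16 Hz to 20 kHz with an assumed spacing of 5 cents between neighbours.
count = operators_needed(16.0, 20000.0, step_cents=5.0)
print(count)
```

With this assumed spacing the count comes out in the low thousands, which fits the estimate in the text. A finer spacing would push it higher still.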
The solution is simple. When you look at the spectrum of a given instrument using a spectrum analyzer, you find that the frequencies are clustered. There are only a few areas in the spectrum with high amplitude, and each of those consists of an infinite number of frequencies (e.g. from 1700 Hz to 2300 Hz). The middle frequency (at 2000 Hz) will have the highest amplitude, whereas the side frequencies are quieter. Such a cluster of frequencies is called a "formant". The 8 operators of any voice in the FS1R can each produce a formant with adjustable middle frequency, amplitude and "skirt", which is how fast the amplitude falls off at the sides of the formant. There are two kinds of formants: formants with fixed frequency don't move with the pitch of the sound, formants with relative frequency do. Most sounds contain a mix of both types.
So how do you get the sound inside your head into the FS1R?
That's the difficult part. You have to learn to think in spectra and formants, to come up with the desired sound. Musicians coming from traditional synthesizers or even samplers aren't used to this. That's why a lot of people find it difficult to program the FS1R, and some even give up. All you can do is use your FS1R a lot, and after a while you will learn it. If you manage to do so, you will love the FS1R, as most people that I have talked to do.
What about FM?
Ah yes, FM. I'm not going to talk a lot about this. You can arrange the 8 operators of your voice in one of 88 algorithms. An algorithm determines which operator's output is fed into which operator's input. When you feed the output of the "source" operator into the "target" operator, the target is frequency modulated by the source.
Well, that's a lie. In fact, the FS1R doesn't use frequency modulation (FM), even if Yamaha says so. Instead it uses phase modulation (PM). This has a few advantages over FM (the average pitch of the tone isn't affected, for one) but sounds the same as FM, so we won't bother with the differences here. It's just an implementation detail.
What happens when a signal is frequency modulated? Its frequency changes according to the momentary value (the deflection) of the controlling source. You won't hear the pitch wobble, though, because the source isn't an LFO. The source has a very high frequency, so the target changes frequency so fast that the result is new overtones. It's very hard to predict how the modulated signal will sound; you have to try it out and play with FM. It's not very intuitive.
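To make the "new overtones" a bit more tangible, here is a small Python sketch of one source operator phase-modulating one target operator. It is a textbook two-operator PM pair, not a model of the FS1R's internals. The function names `pm_pair` and `level_at`, the frequencies and the modulation index are all assumptions chosen for the example.

```python
import cmath
import math

SAMPLE_RATE = 44100
N = 4410  # 0.1 s, so every multiple of 10 Hz fits a whole number of cycles

def pm_pair(target_hz, source_hz, index, n=N, sample_rate=SAMPLE_RATE):
    """Target sine whose phase is pushed around by a source sine."""
    out = []
    for i in range(n):
        t = i / sample_rate
        source = math.sin(2 * math.pi * source_hz * t)
        out.append(math.sin(2 * math.pi * target_hz * t + index * source))
    return out

def level_at(samples, freq_hz, sample_rate=SAMPLE_RATE):
    """Amplitude of the sine component at freq_hz (single-bin Fourier sum)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
              for i, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

plain = pm_pair(1000.0, 200.0, index=0.0)      # no modulation: a pure sine
modulated = pm_pair(1000.0, 200.0, index=1.0)  # source at 200 Hz, far above LFO rates

for freq in (600.0, 800.0, 1000.0, 1200.0, 1400.0):
    print(freq, round(level_at(plain, freq), 3), round(level_at(modulated, freq), 3))
```

Without modulation, all the energy sits at 1000 Hz. With modulation, part of it moves into new components spaced 200 Hz apart on both sides of the target frequency, and raising `index` spreads it further out.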
Details about formant generation inside the FS1R
So here we are: a sound is constructed out of up to eight operators (plus eight unvoiced operators, which I won't talk about here). Each operator generates a formant (or just a sine wave or some other signal). These may be FM'ed using algorithms. Formant parameters are controlled over time using envelope generators, LFOs, FSEQs and controllers. Then you use filters and effects to further change the sound. But how does the operator generate the formant?
That's a detail about the implementation of the FS1R. You don't have to know this if you just want to make music. If you're not interested in the details of formant generation, you don't have to read along.
Each voice has up to eight formants. If you don't use filters, you have 32-voice polyphony, i.e. up to 256 formants can sound in parallel (plus 256 unvoiced operators, for a total of up to 512 operators). There are two FM chips, so each chip contains 128 formant generation circuits. Right?
Wrong. One interesting thing about the FS1R is that it uses time multiplexing to produce this incredible number of formants. Each FM chip has only one formant generation circuit, but uses a very high clock frequency. At a sampling rate of 44.1 kHz, the FM chip calculates one sample for each operator in every sample period, so it has to calculate several million samples per second. Formants are calculated one after another, not all at the same time. You won't hear the resulting phase shift, since all samples are added up before they are output to the D/A converter.
Generation of a formant is done using a "window function" and a sine wave that is stored in a lookup table inside the FM chip. Yes, that's right: there are samples stored inside the FS1R. But only for the basic waveforms, like the sine that is needed for formant generation.
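A quick back-of-the-envelope check of "several million samples per second", using the figures from the text. The even split of the 512 operators across the two chips is an assumption.

```python
SAMPLE_RATE_HZ = 44_100
TOTAL_OPERATORS = 512   # 256 formant operators + 256 unvoiced operators
CHIPS = 2               # assumed to share the load evenly

operators_per_chip = TOTAL_OPERATORS // CHIPS
samples_per_second = operators_per_chip * SAMPLE_RATE_HZ  # per chip
slot_ns = 1e9 / samples_per_second                        # time per operator sample

print(operators_per_chip, samples_per_second, round(slot_ns, 1))
```

Under these assumptions each chip has to deliver about 11.3 million operator samples per second, which leaves a little under 90 nanoseconds for each one.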
The principle is simple. The sine wave is played and amplitude-modulated by the window function. The window function looks like a sawtooth wave, except that it is reset to the beginning at regular intervals. That is, if the reset frequency is higher than the saw frequency, the saw goes up and is reset to zero before it reaches its maximum. If the reset frequency is lower, the sawtooth stays at its maximum until the next reset.
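Here is a minimal Python sketch of the window-times-sine idea as described above. It is my reading of the description, not the circuit from the patents. In particular, letting the sine run freely (instead of restarting it on each reset) is an assumption, and all names and numbers are made up for the example.

```python
import math

SAMPLE_RATE = 44100

def window_over_period(period_samples, saw_hz, sample_rate=SAMPLE_RATE):
    """One reset period of the window: a ramp from 0 that is held at 1."""
    return [min(1.0, (i / sample_rate) * saw_hz) for i in range(period_samples)]

def formant_samples(n_samples, period_samples, saw_hz, sine_hz,
                    sample_rate=SAMPLE_RATE):
    """Sine wave amplitude-modulated by the periodically reset window."""
    window = window_over_period(period_samples, saw_hz, sample_rate)
    return [
        window[i % period_samples]                       # reset every period
        * math.sin(2 * math.pi * sine_hz * i / sample_rate)
        for i in range(n_samples)
    ]

# Reset every 100 samples (441 Hz).
fast_reset = window_over_period(100, saw_hz=220.5)  # reset comes before the maximum
slow_reset = window_over_period(100, saw_hz=882.0)  # ramp tops out, then is held
signal = formant_samples(1000, 100, saw_hz=882.0, sine_hz=2000.0)
```

`fast_reset` never gets past about half of its maximum, while `slow_reset` reaches the maximum halfway through the period and stays there, matching the two cases in the text.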
That's it (for the most part). I find it stunning that this generates a spectrum that can be used as a formant. The parameters of the formant, like center frequency, width and skirt, can be controlled by changing parameters of the generation algorithm: reset interval, sawtooth frequency, sine wave frequency, and a bit shifter in the signal path of the window function. This signal is left-shifted by zero to seven bits, which explains why "skirt" is the only parameter that has just eight possible settings. The algorithm used makes finer control over the formant skirt impossible.
But: the generated spectrum is not perfect. It should consist of an infinite number of sine waves to give a spectrum without gaps. In fact, only some peaks in the spectrum are generated. Depending on the formant parameters, there are as many as a few hundred peaks or as few as 15 or 20. Yamaha is lucky that the ear is not able to hear the gaps when the peaks are very close to each other. But in cases where there are only a few spectrum peaks in the formant, the "proper" formant would sound somewhat different from the one generated by the FS1R. It's a concession to the simple algorithm, which on the other hand is simple to realize in hardware.
Another interesting fact is that the formant generation circuit operates on logarithms of the signal values rather than on the signal values themselves. Even the sine wave sample is stored in logarithmic form. By using this trick, Yamaha can use simple additions for amplitude modulation. This way, not a single time-consuming multiplication is necessary. Logarithmic values are converted to real values using an antilogarithm table at the output of the operator.
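The gaps can be seen in a small experiment. The Python sketch below builds a window-times-sine signal along the lines described earlier and measures its level at two frequencies: one that is exactly one reset frequency above the sine, and one in between. The free-running sine, the parameter values and the helper names are my assumptions, so treat it as an illustration of the principle rather than a simulation of the chip.

```python
import cmath
import math

SAMPLE_RATE = 44100
RESET_PERIOD = 100          # samples, i.e. a reset frequency of 441 Hz
N = 49 * RESET_PERIOD       # 4900 samples, so analysis bins are 9 Hz apart
SINE_HZ = 1998.0            # a multiple of 9 Hz, so it sits exactly on a bin
SAW_HZ = 882.0

def formant(n=N):
    """Free-running sine times a ramp window that is reset every RESET_PERIOD."""
    out = []
    for i in range(n):
        t_since_reset = (i % RESET_PERIOD) / SAMPLE_RATE
        window = min(1.0, t_since_reset * SAW_HZ)
        out.append(window * math.sin(2 * math.pi * SINE_HZ * i / SAMPLE_RATE))
    return out

def level_at(samples, freq_hz):
    """Amplitude of the sine component at freq_hz (single-bin Fourier sum)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE)
              for i, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

signal = formant()
reset_hz = SAMPLE_RATE / RESET_PERIOD             # 441.0
center = level_at(signal, SINE_HZ)                # middle of the formant
on_peak = level_at(signal, SINE_HZ + reset_hz)    # one reset frequency higher
between = level_at(signal, SINE_HZ + 225.0)       # in the gap between two peaks
print(round(center, 3), round(on_peak, 3), round(between, 6))
```

In this sketch the energy shows up only at the sine frequency plus or minus whole multiples of the reset frequency, and the level in between is practically zero. A low reset frequency puts the peaks close together, a high one leaves wide gaps.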
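The logarithm trick can be sketched in a few lines of Python. This uses floating-point numbers and base-2 logarithms for readability, whereas real hardware would use small integer tables. The table size, the separate handling of the sign and all names are assumptions made for the example, not details from the patents.

```python
import math

TABLE_SIZE = 1024  # entries for one half wave of the sine (assumed size)

# Log-sine table: stores -log2(sin(x)) for the positive half wave only,
# because the logarithm of a negative number is undefined.
LOG_SINE = [-math.log2(math.sin(math.pi * (i + 0.5) / TABLE_SIZE))
            for i in range(TABLE_SIZE)]

def antilog(log_attenuation):
    """Stand-in for the antilogarithm table: back from log to linear values."""
    return 2.0 ** (-log_attenuation)

def operator_sample(phase_index, log_attenuation):
    """One output sample; phase_index runs from 0 to 2 * TABLE_SIZE - 1."""
    half = phase_index % TABLE_SIZE
    sign = 1.0 if (phase_index // TABLE_SIZE) % 2 == 0 else -1.0
    total = LOG_SINE[half] + log_attenuation  # addition replaces multiplication
    return sign * antilog(total)

# An attenuation of 1.0 in the log2 domain equals a linear gain of 0.5.
quiet = [operator_sample(p, 1.0) for p in range(2 * TABLE_SIZE)]
```

Adding `1.0` in the logarithmic domain gives the same result as multiplying the sine by 0.5, which is the whole point: scaling by a window or an envelope costs one addition per sample.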
Why aren't there any explanatory drawings on this page?
Yes, that would be nice, of course. And it would make it a lot more understandable. It's just a lot of work doing this. I just wanted to sit down and talk a bit about what I found out about the FS1R in case anybody is interested. There are some nice drawings contained in the Yamaha patents, but I don't know if I may publish them on my web site. You will find drawings of the formant generation circuitry and the resulting spectra there.