The discrete Fourier transform is fundamental to many questions in science and technology and nature, we want to be able to compute the discrete Fourier transform fast the classical way to do that is the so-called FFT algorithm. So the FFT algorithm for Fast Fourier Transform algorithm was first designed in 1965 by Cooley and Tuckey, so it’s also known as the Cooley-Tuckey algorithm, and it has led to a revolution in digital signal processing This is Michael Kapralov an Assistant Professor of the IC School at EPFL and indeed the Fourier transform is central to our information age.
As a result, performing the Fourier transform in a fast and efficient manner, But why is the Fourier transform so central to our modern world? The transform is fundamental to numerous areas of science and technology. You can think of signal processing, data analysis, medical imaging and perhaps the very well-known example as image and video compression schemes so think of jpg. You do know what jpg is, don’t you? What about mp3? These are digital formats for images and sounds and there are plenty of other formats for videos as well, like h.264, which is the one I’m using for this video such audiovisual files are opened by every one of us, every day.
And every time you’re looking at a digital picture every time you’re listening to digital music and every time you’re playing a digital video some FFT algorithm will be running in the background. As Richard Baraniuk a professor of Rice university puts it: “FFTs of billions of times a day.” but how does the Fourier transform help with audiovisual compression? in many applications, we acquire signals in the continuous domain. But these signals are usually sampled at discrete intervals we have analog-to-digital converters so once the signal is digitized, or sampled, the question becomes:
Can we compute the discrete Fourier transform of that signal? This is in opposition to the classical continuous Fourier transform that was studied by French mathematician Joseph Fourier and others in the early 1800s these days, we rather focus on digital data, and therefore, we rather care about the discrete Fourier transform now, if we are trying to compress an image using the jpeg method, what we would like to do is just say I take an image, I take the Fourier transform of that image and then it turns out that for most images because these images are smooth in the time domain.
the Fourier domain representation is fairly compact so we just truncate to the top few coefficients and magnitudes zero out the rest and hence we achieve compression let’s explain a bit more. Signals like images and sounds can be regarded as one way of representing some fundamental mathematical objects. The free transform allows us to look at these fundamental mathematical objects through a different angle thereby, objects that look complicated in the signal domain can suddenly appear amazing the simple, and easy to describe in the Fourier domain this will be the case when the signal in the Fourier domain is mostly zeros in which case, we don’t even need to record the zero values.
We can just store in memory the nonzero values, which means that we need to let memory space record our signal but why would the Fourier representation of images and sound be so nice and compact in the Fourier domain if we think of a k-sparse signal in the Fourier domain, this is a signal that is a sum of a small number of pure harmonics, plus maybe some amount of noise. And in music, it’s pretty clear why most music is made of only a handful of pure harmony after all that’s how music partitions are written height of the musical note imposes a particular pitch that corresponds to pure harmonics.
There’s even more than that as depending on instruments each note is accompanied with a tone which is usually itself a combination of a handful of pure harmonics however, it might be less clear why images are also made of a handful of pure harmonics. The way jpg works are by first decomposing an overall image into tiny squares now, if the resolution of the large image is quite high each tiny square will likely be smooth and nearly homogeneous, which means that the frequency representation of such tiny squares will mostly be made of low frequencies.
In both cases, it turns out that most of the signals are actually a composition of the handful of frequencies now the other frequencies are usually not exactly zero because there’s always some noise. But as long as this noise, these amplitudes, of the other frequencies are low and negligible compared to the main signal you can get rid of them just cancel them out without really affecting the quality of the image or the sound the underlying idea of jpg is that we take a time-domain representation,
PCA or SVD
which is the image itself we take the Fourier to transform and then it turns out that if we just keep the top few coefficients and throw away the rest the image almost doesn’t change and hence we achieve compression. This approach has a strong resemblance with machine learning algorithms like PCA or SVD or like the word vector representation we discussed with Martin Jaggi.
This is one reason why the discrete Fourier transform is important: Because for natural signals that arise in applications they sort of concentrate their energy in a small number of coefficients and hence just focusing on these coefficients results in saving in storage, saving in time, etc… and the Fourier basis turns out to also be convenient for other reasons. One of these reasons is the fact that convolution the operation of convolving two signals becomes very easy to perform in the Fourier domain wait. What’s a convolution? And why does it matter? So here’s one reason for computing convolution.
Suppose that I have a signal X, and some template signal Y and I know that a signal X is a shift of the signal Y in the time domain, plus perhaps with some noise added and what I want is to find out what is the right shift that aligns X&Y optimally. One way to do it is exactly to compute a convolution so what is a convolution? Convolution is the following operation: We take a signal X then we consider all the possible shifts of the other signal Y. We do the dot product of X and the shifted version of Y we overlay X with a shifted version of Y and check how similar they are. But why would we ever want to search for the best translation of X that best aligns with Y? this is the operation that GPS devices do.
When they’re receiving a GPS signal, think about this as X perhaps some noisy version of some template signal Y yes because to locate yourself using GPS, you need to measure very precisely the communication delay from the satellite to your phone to do so the satellite will send you a signal Y and you know that it will look like some template X. Measuring the arrival time of Y will then boil down to determining the time-translate of X that best matches the signal Y now the question is:
Transform of X
Can you find the right shift? if you think of doing this in the direct brute-force way, if my signal X is of length n, and my signal Y is of length n then I need to try all then shifts, on the other hand, if I can switch to the Fourier domain, it turns out that all I need to do is to take the Fourier transform of X, the Fourier transform of Y and multiply them pointwise and then do the Fourier transform back in the resulting signal if I look at the peak the maximum:
this is the right shift so multiplication is very easy solution convolution is very important, so if we have a way of quickly switching from time-domain representation to frequency-domain representation and we can take X and Y compute the Fourier transform, multiply them, compute the Fourier transform back and we have the convolution of X and Y so this is a fast algorithm for performing convolution So the Fourier transform is a key algorithm to determine optimal time translation.
Exactly, so this is a very efficient way of resolving optimal translation of one signal that makes it similar to another signal more generally, whenever signals have a lot of time or spatial translation symmetries then convolution and the Fourier transform will be powerful tools to efficiently deal with such symmetries. Inspired by the architecture of the brain, researchers in deep learning, have exploited convolution to improve the image processing of their neural network.
This has led to so-called convolution neural networks, as well as key breakthroughs in object detection and face recognition. So yes, the Fourier transform is a big deal. But in fact, there are also other reasons let me mention one to you and this is something that comes up in medical imaging. For example, when one is performing an MRI scan, one can view the process as following so there’s this object being scanned, the patient, and essentially the process of performing MRI scan is this:
you can think of the patient as a signal in the time domain so the MRI machine measures Fourier coefficients of the signal X, at various locations now the question becomes: if I want to scan a signal X, a patient, then this is the question of reconstructing X, from a small number of Fourier measurements. So this is yet another application where the Fourier transform comes up for completely different reasons, given by the physics of the process, and in fact, this is not so different from what our ears do at any point in time our ears are made of tiny hair each of which can only vibrate at a very specific frequency.
Now, because of a physical phenomenon called resonance, our hair will vibrate if and only if it “hears” the pure harmonics associated with its intrinsic frequency And when you think about it, this means that our brains are only told about the frequency that our ears hear. Yet, our brains can combine these frequencies to reconstitute the sound that our ears hear our brains apply the Fourier transform all the time. So, in a nutshell, the Fourier transform has a lot, a lot, a lot of applications we’ve been scratched the surface in this video.
Estimate the Functions
And because of it, the Fast Fourier transform by Cooley and Tuckey has conquered the world. But… what if I were to tell you that Cooley and Tuckey’s algorithm has recently been outperformed? well, this is what we shall discuss next time… that’s where the Fourier transforms comes in it comes up with a way to estimate the functions on the whole real line using complex exponentials this algorithm run in time about log squared n so it doesn’t even look at the entire signal.