Speech synthesis is a core component of many modern products - turn-by-turn navigation, personal assistants like Siri and Alexa, and even the voices of AI characters, robots, and people in computer games and movies are (in some cases) generated by speech synthesis systems. But how are these systems built? How can we make computers reply to our requests (and demands) using speech?
HDF5 is a direct, easy path to "big" (or just annoyingly larger than RAM) data in scientific python. Using the HDF5 file format and careful coding of your algorithms, it is quite possible to process "big-ish" data on modest hardware, and really push your resources to the limit before moving to larger, fancier distributed platforms.
There are two primary options for working with HDF5 files in python: h5py and PyTables. While many people enjoy h5py, I am much more familiar with PyTables and prefer it for most tasks. This post will show a variety of PyTables examples for several potential applications, including standard numpy ndarray replacement, variable length (ragged) arrays, iterator rewriting for simpler retrieval of data, and on-the-fly compression/decompression.
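To give a flavor of the "bigger than RAM" workflow, here is a minimal sketch (the filename, shapes, and compression settings are arbitrary choices for illustration): an extendable, compressed PyTables `EArray` is appended to in chunks, then a slice is read back without loading the whole array into memory.

```python
import numpy as np
import tables

# Write a chunked, zlib-compressed earray that can grow beyond RAM.
with tables.open_file("demo.h5", mode="w") as f:
    filters = tables.Filters(complib="zlib", complevel=5)
    arr = f.create_earray(f.root, "data",
                          atom=tables.Float64Atom(),
                          shape=(0, 100),          # first axis is extendable
                          filters=filters)
    for _ in range(10):                            # append in chunks
        arr.append(np.random.randn(50, 100))

# Read back only a slice - the rest of the array stays on disk.
with tables.open_file("demo.h5", mode="r") as f:
    chunk = f.root.data[100:200]
    print(chunk.shape)                             # (100, 100)
```

The key point is that both writing and reading happen in pieces, so peak memory use is set by the chunk size, not the total array size.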
Gaussian processes (GPs) are a cornerstone of modern machine learning. They are often used for non-parametric regression and classification, and extend the theory behind Gaussian distributions and Gaussian mixture models (GMMs), with strong and interesting theoretical ties to kernels and neural networks. While Gaussian mixture models represent a distribution over values, Gaussian processes represent a distribution over functions. This is easy enough to say, but what does it really mean? Let's take a look.
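One way to make "a distribution over functions" concrete is to draw samples from a GP prior - each sample is an entire function evaluated on a grid. A minimal sketch, using a squared-exponential (RBF) kernel (the lengthscale here is an arbitrary choice for illustration):

```python
import numpy as np

def rbf_kernel(xa, xb, lengthscale=1.0):
    # k(x, x') = exp(-(x - x')^2 / (2 * lengthscale^2))
    d2 = (xa[:, None] - xb[None, :]) ** 2
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 100)
K = rbf_kernel(x, x)                # covariance between every pair of inputs
K += 1e-8 * np.eye(len(x))          # jitter for numerical stability
samples = rng.multivariate_normal(np.zeros(len(x)), K, size=5)
print(samples.shape)                # (5, 100): 5 sampled functions, 100 points each
```

Because nearby inputs have high covariance under the RBF kernel, each sampled "function" comes out smooth - the kernel choice is what encodes our assumptions about the functions being modeled.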
Hidden Markov Models (HMMs) are powerful, flexible methods for representing and classifying data with trends over time, and have been a key component in speech recognition systems for many years.
I found it very difficult to find a good example (with code) of a simple speech recognition system, so I decided to create this post. Though this implementation won't win any awards for "Best Speech Recognizer", I hope it will provide some insight into how HMMs can be used for speech recognition and other tasks.
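At the heart of HMM-based recognition is scoring: given several trained models (say, one per word), classify an observation sequence by asking which model assigns it the highest likelihood. That likelihood comes from the forward algorithm, sketched below on a made-up 2-state, 2-symbol toy model (the numbers in `A`, `B`, and `pi` are purely illustrative):

```python
import numpy as np

def forward(obs, A, B, pi):
    # alpha[t, i] = P(obs[0..t], state_t = i | model)
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # propagate through transitions, then weight by emission probability
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha[-1].sum()          # P(obs | model)

A = np.array([[0.7, 0.3], [0.4, 0.6]])   # state transition probabilities
B = np.array([[0.9, 0.1], [0.2, 0.8]])   # emission probabilities
pi = np.array([0.6, 0.4])                # initial state distribution
print(forward([0, 0, 1], A, B, pi))
```

A real recognizer would use log probabilities to avoid underflow on long sequences, and continuous (e.g. Gaussian) emissions over acoustic features rather than discrete symbols, but the recursion is the same.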
Wavelets are a fundamental part of modern signal processing and feature engineering. By utilizing well-developed basis functions with certain mathematical properties, rather than the sines and cosines used in the DFT (discrete Fourier transform) and DCT (discrete cosine transform), wavelet analysis enables many interesting applications.
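The simplest wavelet basis is the Haar wavelet, and a single level of its transform makes the contrast with sines and cosines concrete: the signal is split into coarse local averages and fine local differences, and the split is perfectly invertible. A minimal sketch in numpy:

```python
import numpy as np

def haar_step(x):
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # lowpass: local averages
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # highpass: local differences
    return approx, detail

def haar_inverse(approx, detail):
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 14.0, 14.0, 16.0, 18.0])
a, d = haar_step(x)
print(np.allclose(haar_inverse(a, d), x))       # True: perfect reconstruction
```

Recursing `haar_step` on the approximation coefficients gives the full multi-level wavelet decomposition; unlike a Fourier basis, each coefficient stays localized in time as well as in scale.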
Around 25 minutes into this lecture, there is some good discussion of the PageRank algorithm. I have always wanted to code up a basic version of this algorithm, so this is a great excuse. This algorithm is probably one of the cleanest examples of Markov Chains that I have seen, and obviously its application was quite successful.
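As a Markov chain, PageRank is just power iteration on a damped link-transition matrix. A minimal sketch on a made-up 4-page link graph (the damping factor 0.85 is the value from the original paper; the graph itself is purely illustrative):

```python
import numpy as np

def pagerank(links, d=0.85, tol=1e-10):
    n = len(links)
    # column-stochastic matrix: M[j, i] = 1/outdegree(i) if page i links to page j
    M = np.zeros((n, n))
    for i, outs in links.items():
        for j in outs:
            M[j, i] = 1.0 / len(outs)
    r = np.full(n, 1.0 / n)                  # start from a uniform distribution
    while True:
        # random surfer: teleport with prob (1 - d), follow a link with prob d
        r_new = (1 - d) / n + d * M @ r
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new

links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
r = pagerank(links)
print(r.round(3))
```

The result is the stationary distribution of the chain: page 3, which nothing links to, ends up with the minimum rank, while heavily linked-to pages accumulate the most.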
Matrix factorization is a very interesting area of machine learning research. Formulating a problem as a 2D matrix $X$ to be decomposed into multiple matrices, which combine to return an approximation of $X$, can lead to state of the art results for many interesting problems. This core concept is the focus of compressive sensing, matrix completion, sparse coding, robust PCA, dictionary learning, and many other algorithms. One major website that catalogs many different types of matrix decomposition algorithms is the Matrix Factorization Jungle, run by Igor Carron. There has been a heavy focus on random projections in recent algorithms, which can often lead to increased stability and computationally efficient solutions.
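The canonical instance of "matrices that combine to approximate $X$" is the truncated SVD, which gives the best rank-$k$ approximation in the least-squares sense (the Eckart-Young theorem). A minimal sketch on a synthetic low-rank matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# build a 50 x 40 matrix with true rank 6
X = rng.standard_normal((50, 6)) @ rng.standard_normal((6, 40))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 6
# combine the top-k factors back into an approximation of X
X_hat = (U[:, :k] * s[:k]) @ Vt[:k]
print(np.allclose(X, X_hat))        # True: rank 6 recovers X exactly
```

With `k` below the true rank, `X_hat` becomes a lossy compression of `X` - the same factor-and-recombine pattern that matrix completion, sparse coding, and robust PCA build on with different constraints.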
The general technique of breaking a large-bandwidth signal into many smaller streams is often referred to as channelization, which is an absolutely vital operation in RF processing and communications. There are many different approaches to performing this operation, with the polyphase filterbank being perhaps the most popular. This notebook should show all the steps necessary to construct a polyphase filterbank from the basic ideas of filtering, the Discrete Fourier transform (DFT, using the np.fft module) and decimation.
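As a preview of where those steps lead, here is a sketch of a critically sampled polyphase analysis filterbank: delay-and-decimate the input into M branches, filter each branch with one polyphase component of a prototype lowpass, then apply an inverse DFT across the branches. The windowed-sinc prototype and all parameter values are illustrative choices, not a tuned design.

```python
import numpy as np

def channelize(x, M=8, P=8):
    # prototype lowpass: windowed sinc with cutoff at fs / (2 * M)
    taps = M * P
    n = np.arange(taps) - (taps - 1) / 2.0
    h = np.sinc(n / M) * np.hamming(taps)
    h /= h.sum()

    nblocks = len(x) // M
    v = np.zeros((nblocks, M), dtype=complex)
    for m in range(M):
        # branch m: delay by m samples, decimate by M, filter with h[m::M]
        xm = np.concatenate([np.zeros(m), x])[::M][:nblocks]
        v[:, m] = np.convolve(xm, h[m::M])[:nblocks]
    # inverse DFT across branches recombines into M channel outputs per block
    return np.fft.ifft(v, axis=1) * M

# a complex tone centered on channel 3 of an 8-channel split
l = np.arange(1024)
x = np.exp(2j * np.pi * 3 * l / 8)
Y = channelize(x)
print(np.argmax(np.abs(Y[20])))     # 3: the tone lands in channel 3
```

Each row of `Y` holds one output sample for all M channels, produced at 1/M of the input rate - the filtering work is shared across channels, which is the whole appeal of the polyphase structure.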
When presented with an unknown dataset, it is very common to attempt to find trends or patterns. The most basic form of this is visual inspection - how is the data trending? Does it repeat in cycles? Can we predict future data given some past events? The mathematical approach to this "trend finding" is called regression.
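The simplest version of this is fitting a line to noisy data with ordinary least squares. A minimal sketch on synthetic data (the true slope and intercept here are made up so we can check the fit recovers them):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
# true trend y = 2.5 * x + 1.0, plus Gaussian noise
y = 2.5 * x + 1.0 + rng.normal(scale=0.5, size=x.shape)

# least-squares fit of a degree-1 polynomial (a line)
slope, intercept = np.polyfit(x, y, deg=1)
print(round(slope, 2), round(intercept, 2))   # close to 2.5 and 1.0
```

Given enough data relative to the noise, the fitted coefficients land near the true trend - and the same least-squares machinery extends to higher-degree polynomials and richer basis functions.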