Type text, pick a model and voice, and generate speech locally on CPU.
1.0 = normal speed; lower is slower, higher is faster.