Alan Parker — Linear Predictive Coding with Multi-Pulse Excitation

§ 1 Introduction

A 1984 thesis, scanned and brought back to life

In May 1984, Alan Parker (1959–2025) published his doctoral thesis “Linear Predictive Coding with Multi-Pulse Excitation” in Electrical Engineering at North Carolina State University. Forty-two years later, in May 2026, Dr. Parker’s thesis was digitally scanned and remastered in modern LaTeX, restoring all 108 original pages, 197 equations, 30 pen-plotted figures, 15 tables, 28 references, and 140 lines of C that still compile to this day.

This page is an interactive memorial that celebrates Dr. Parker’s significant contributions to the field of digital speech synthesis. Every digital voice call, whether mobile, internet, or video, is built on pulse-excited linear prediction. Dr. Parker’s thesis helped turn that technique into the codecs running in every phone, app, and video call of the past three decades.

§ 2 Listen

Dr. Parker's voice, digitized through the eras

To understand how digital speech quality has improved over the years, there’s no better way than to listen to it. Use the LPC Analyzer below to play back audio with your choice of four codecs:

LPC: Linear Predictive Coding. The state of the art in low-bit-rate speech synthesis from the 1970s to the early 1980s. It has the distinctive robotic, scratchy sound of that era.
MP-LPC: Multi-Pulse Linear Predictive Coding. The subject of Dr. Parker’s thesis, in which multiple excitation pulses per block improve the quality of LPC.
AMR: Adaptive Multi-Rate (ACELP). The narrowband cellular codec that carried nearly every mobile phone call from the late 1990s through the 2000s. It is a direct descendant of the multi-pulse approach Dr. Parker worked on: ACELP excites its filter from an algebraic codebook whose entries are just a handful of signed unit pulses on a fixed grid.
Opus-FB: Opus Full-band. The modern internet/HD codec (2012), built on LPC-based SILK and MDCT-based CELT. It delivers the highest-quality, richest-sounding audio of the four.

Dr. Parker used multi-pulse excitation to improve the quality of LPC in the 1980s and helped pave the way for modern codecs like ACELP in the 1990s and Opus in the 2010s. Use the LPC Analyzer to listen to Dr. Parker himself through each codec. The MP-LPC codec is the 140-line C algorithm Dr. Parker wrote in 1984, applied to a wedding speech he gave 38 years later, in 2022. Toggle the codec between LPC and MP-LPC to hear how Dr. Parker improved upon the algorithm. Then step forward through the eras: AMR is the ACELP codec that carried the cellular calls of the 1990s and 2000s (production-grade proof of the multi-pulse principle), and Opus-FB is the full-band, rich audio of today.

You can also apply the codecs to the voices of Richard Feynman giving a physics lecture on semiconductors (1964) or Neil Armstrong’s famous moon landing speech (1969).

[Interactive: AUDIO / LPC ANALYZER · MODEL 1130. Controls: transport (play), source (default: Parker, author's voice), and codec (default: MP-LPC). Readouts: scope, formants F1–F3, RMS level (dB), pitch F0 (Hz), bandwidth (3.4 kHz), and elapsed time.]

§ 3 Background

How a voice fits into a computer

Dr. Parker entered NC State’s doctoral program in January 1982 and defended his thesis in May 1984, a time frame in which modern speech coding was rapidly evolving. As Dr. Parker shows in Figure 1.1 of his thesis, speech encoding was done in one of two ways:

Waveform coding: encode the voice waveform directly, sample by sample. This requires a large number of bits per second (7.2–200 kbps) and covers communications, toll, and broadcast quality.
Source coding: encode the voice in compressed form by describing its shape with the parameters of a speech model, using far fewer bits per second (0.5–7.2 kbps). Required whenever bandwidth is precious, as in wireless communications.

FIG. 1.1 Bit Rate Required for Different Qualities of Speech. Fidelity trades against bandwidth along one axis. Waveform coders reproduce the signal sample-by-sample at high rates (broadcast and toll quality); source coders instead transmit the parameters of a speech-production model at far lower rates. Linear predictive coding lives at this low-rate end.

Below is a demonstration of the more expensive waveform coding. Use the controls to modify both:

Sample rate: the rate at which points on the waveform are sampled. If the sample rate is too low (below the Nyquist rate, twice the highest frequency in the signal), the higher frequencies alias and the voice will not encode well.
Bit depth: the precision of each encoded amplitude. At 2 bits there are only four levels: a sign (positive or negative) and a magnitude of 1 or 2, which fails to capture the precise shape of the voice.

Use the sliders to see how the sampling changes. To get decent quality, you need to retain the higher frequencies by raising the sample rate to 8 kHz and the precision to 8 bits. That costs 8,000 samples per second × 8 bits per sample = 64,000 bits per second, which is far too high for effective wireless applications.
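
The arithmetic behind those figures can be sketched in a few lines of C. This is an illustrative sketch, not code from the thesis: the function names `pcm_bit_rate` and `quant_snr_db` are made up here, and the SNR line uses the standard rule of thumb for an ideal uniform quantizer driven by a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB), which is only an assumption about what the demo's "quant SNR" readout reports.

```c
#include <assert.h>

/* Uncompressed PCM bit rate in bits per second:
   samples per second times bits per sample. */
long pcm_bit_rate(long sample_rate_hz, int bits_per_sample)
{
    assert(sample_rate_hz > 0 && bits_per_sample > 0);
    return sample_rate_hz * bits_per_sample;
}

/* Rule-of-thumb SNR in dB for an ideal uniform quantizer and a
   full-scale sine input (assumption; real speech will differ). */
double quant_snr_db(int bits_per_sample)
{
    assert(bits_per_sample > 0);
    return 6.02 * bits_per_sample + 1.76;
}
```

With these definitions, `pcm_bit_rate(8000, 8)` gives 64,000 bits per second, and `quant_snr_db(8)` comes out near 49.9 dB.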

[Interactive: HP 7470A plotter demo. Controls: sample rate fs (default 5,500 Hz) and bit depth B (default 8 bits). Readouts at the defaults: fs × B = 5,500 Hz × 8 bits = 44.0 kbit/s (69% of the 64 kbit/s reference), quantization SNR ≈ 49.9 dB, and fs/2 = 2,750 Hz covers all signal content, so there is no aliasing. Plots: amplitude vs. time (ms); magnitude (dB) vs. frequency (Hz).]

Because waveform coding cannot digitize voice at low bit rates, research turned to more efficient digital representations of voice. In 1971, Bishnu Atal and Suzanne Hanauer developed Linear Predictive Coding (LPC) at Bell Laboratories. This kicked off 15 years of LPC development that Dr. Parker extended in his 1984 doctoral thesis by formalizing multi-pulse LPC. Dr. Parker shows a diagram of how ordinary LPC works in Figure 2.3:

Speech analysis encodes voice into: (1) the pitch period for voiced segments, (2) a single bit for voiced/unvoiced, (3) gain, and (4) reflection coefficients. This compressed representation of voice is then transmitted to a receiver.
The receiver decodes the encoded signal to produce synthetic speech. This speech is a best-effort reconstruction of the original speech, but some information is invariably lost, resulting in the robotic-sounding voice on the receiver’s end.
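
As a rough illustration of the four parameters listed above, one frame of classic LPC could be held in a structure like the following. The names (`struct lpc_frame`, `lpc_frame_bits`, `lpc_bit_rate`), the `MAX_ORDER` bound, and the choice to pass the bit allocation in as arguments are all assumptions made for this sketch; the quantization scheme actually used in the thesis is not reproduced here.

```c
#include <assert.h>

#define MAX_ORDER 20  /* hypothetical upper bound on the predictor order */

/* One frame of classic LPC parameters, as listed in the text. */
struct lpc_frame {
    int   voiced;                 /* 1 = voiced, 0 = unvoiced (one bit) */
    int   pitch_period;           /* in samples; meaningful when voiced */
    float gain;                   /* excitation gain                    */
    float reflection[MAX_ORDER];  /* reflection coefficients            */
};

/* Bits needed for one frame under a caller-supplied allocation:
   1 voicing bit + pitch bits + gain bits + bits for each coefficient. */
int lpc_frame_bits(int order, int pitch_bits, int gain_bits, int bits_per_coeff)
{
    assert(order > 0 && order <= MAX_ORDER);
    assert(pitch_bits >= 0 && gain_bits >= 0 && bits_per_coeff >= 0);
    return 1 + pitch_bits + gain_bits + order * bits_per_coeff;
}

/* Bit rate in bits per second for a given number of frames per second. */
long lpc_bit_rate(int frame_bits, int frames_per_second)
{
    assert(frame_bits >= 0 && frames_per_second >= 0);
    return (long)frame_bits * frames_per_second;
}
```

For example, a purely hypothetical allocation of 6 pitch bits, 5 gain bits, and 10 coefficients at 4 bits each comes to 52 bits per frame including the voicing bit; at 50 frames per second that is 2,600 bits per second, inside the source-coding range quoted earlier.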

FIG. 2.3 LPC Speech Production. A transmitter encodes speech into component bits that represent a compressed version of the voice signal. A receiver decodes these bits to construct an output speech that closely resembles the input speech.

After LPC was formally introduced in 1971, much effort went into improving the quality of LPC output while retaining its low-bit-rate advantage. The 15-year timeline below highlights how Dr. Parker’s multi-pulse LPC work fits into that broader effort.

1971

Linear prediction

Atal & Hanauer formalize speech LPC, modeling the vocal tract as an all-pole filter driven by two-state (voiced/unvoiced) excitation, the foundation everything else rests on, and the crude buzzer everyone would spend the next decade trying to fix.

1976

Physiology

Holmes argues, on physiological grounds, that the tract is excited before and after glottal closure, not just once per period. Dr. Parker cites this as the physical licence for using multiple pulses.

1979

Perceptual weighting

Atal & Schroeder: don’t minimize raw error, minimize it where the ear can hear it. The weighting filter born here reappears in Dr. Parker’s Chapter 3 and in every CELP codec since.

1982

Multi-pulse

Atal & Remde introduce multi-pulse excitation: “speech of any desired quality” from enough pulses, found one at a time, greedily. This is the paper Dr. Parker’s thesis extends.

1983

Joint re-estimation

Singhal & Atal: once the excitation model changes, the old filter coefficients are no longer optimal and should be re-estimated jointly with the pulses. This is the open problem Dr. Parker formalizes and solves.

May 1984

Dr. Parker’s thesis

Dr. Parker defends a provably optimal, provably convergent joint solution for pulses and filter together, plus an efficient scheme to transmit the pulse locations.

1985

CELP

Ten months later the field pivots again: pulses become codebook entries. The paradigm Dr. Parker worked in is generalized into CELP and goes on to carry telephony for thirty years.

Dr. Parker opens his thesis by naming the defect he set out to fix, citing Atal:

621.382
P241

Author: Parker, A.
Title: Linear Predictive Coding w/ Multi-Pulse Excitation
Call No.: NCSU / EE / 1984

“It is known that by increasing the bit rate for LPC that dramatic improvement of speech quality is not obtained. Atal has suggested that the reason for this is the highly inflexible way present day LPC systems are excited.”

Alan Parker § 1.1 · Background

§ 4 Methods

A mathematical model for multi-pulse LPC

Figure 1.2 of Dr. Parker’s thesis shows a diagram for the multi-pulse LPC model that would improve upon the standard LPC model.

FIG. 1.2 A Multi-Pulse Model for LPC. A pulse train excites the LPC synthesizer (the vocal-tract filter set by the reflection coefficients), and a low-pass filter yields synthetic speech. Where classic LPC uses a single pulse per pitch period or noise (the source of its robotic buzz), multi-pulse excitation fits several pulses per block to the residual for far more natural output.

The idea of using multiple pulses was not new: Atal & Remde introduced it in 1982, but they used a greedy algorithm that places the pulses one at a time. Dr. Parker, who was highly adept at applied mathematics, formalized the approach by solving for the optimal pulse positions jointly rather than sequentially. The following quote from Chapter 3 explains why the greedy algorithm does not guarantee an optimal set of pulse positions.

“As mentioned in the first chapter one of the reasons that present day LPC systems sound so mechanical may be because of the lack of a more sophisticated excitation set. One promising approach to the enhancement of low bit rate speech is reformulating the excitation in LPC as multiple impulses per block rather than one impulse per pitch period. Previous researchers have examined the multiple-pulse excitation problem by sequential impulse extraction. That is, over a fixed block of speech the best single impulse excitation location (and amplitude) is found, then the best second impulse excitation is found conditioned on the first ,etc... until the final Nth excitation conditioned on the N – 1 others is found. However, this method does not necessarily result in the optimal set of N impulses (i.e., that which produces the lowest mean square error between the actual and the modelled speech) since the introduction of succeeding excitations could alter the optimal location of the previously calculated impulses.”

Alan Parker § 3.1 · Derivation of Multi-Pulse LPC

The thesis breaks new ground by solving for both the pulses and the filter, where earlier work fixed the filter and searched only for pulses. Dr. Parker alternates between the two: he finds the optimal pulses for a fixed filter, then the optimal filter (its prediction coefficients uncoupled from the gain) for those pulses. He proves that each step lowers the error and that the alternation converges to a stationary point.

Stripped to its essentials, the method is a short chain of equations, reproduced here exactly as they appear in Chapters 2 and 3 of the thesis. It begins by writing down what synthetic speech is: each new sample is predicted from the samples just before it (the vocal-tract filter) plus an excitation that supplies whatever the filter alone cannot:

EQ. 2.2

\hat{s}(n) = \sum_{i=1}^{p} \alpha_i s(n-i) + \hat{x}(n)

The model. A synthetic speech sample is a weighted blend of the previous p samples plus an excitation that drives it.
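
A minimal sketch of this recursion in C follows. It is an illustration under stated assumptions, not the thesis code: the function name `lpc_synthesize` is invented here, arrays are 0-based (so `alpha[i - 1]` holds $\alpha_i$), samples before the start of the block are taken as zero, and the filter feeds back its own previous outputs, as a decoder without access to the original speech would have to, whereas EQ. 2.2 as printed predicts from $s(n-i)$.

```c
#include <assert.h>
#include <stddef.h>

/* All-pole synthesis sketch:
   out[n] = sum over i = 1..p of alpha[i-1] * out[n-i], plus x[n].
   Outputs before the start of the block are treated as zero (assumption). */
void lpc_synthesize(const float *alpha, int p,
                    const float *x, float *out, size_t n_samples)
{
    size_t n;
    int i;

    assert(alpha && x && out && p >= 0);
    for (n = 0; n < n_samples; n++) {
        double acc = x[n];                       /* excitation term   */
        for (i = 1; i <= p && (size_t)i <= n; i++)
            acc += alpha[i - 1] * out[n - i];    /* weighted history  */
        out[n] = (float)acc;
    }
}
```

Driving a one-coefficient filter with `alpha = {0.5}` by a single unit impulse yields 1, 0.5, 0.25, 0.125, and so on: the filter "rings" after each pulse, which is what lets a few pulses reconstruct many samples.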

The classic LPC of the 1970s drove that filter with a blunt excitation: a single impulse per pitch period, or noise. That is the source of its robotic buzz. Multi-pulse LPC replaces it with a small, carefully placed set of pulses, each an amplitude $b_i$ placed at a position $p_i$ (a pulse position, not to be confused with the filter order $p$ in EQ. 2.2):

EQ. 2.5

\hat{x}(n) = \sum_{i=1}^{I} b_i \delta(n - p_i) \qquad \begin{cases} 1 \le n \le N \\ 1 \le p_i \le N \end{cases}

Multi-pulse excitation. The excitation is just I pulses across a block of N samples, each a spike of height b at a time position p. Choosing those heights and positions well is the entire problem.
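
EQ. 2.5 translates almost directly into code. The sketch below is illustrative only: `build_excitation` is a name invented here, the output array is 0-based, and positions are kept 1-based ($1 \le p_i \le N$) to match the equation.

```c
#include <assert.h>

/* Build the excitation of EQ. 2.5: zero everywhere except at the pulse
   positions. pos[i] is 1-based (1 <= pos[i] <= n); x[] is 0-based. */
void build_excitation(const float *b, const int *pos, int num_pulses,
                      float *x, int n)
{
    int i;

    assert(b && pos && x && n > 0 && num_pulses >= 0);
    for (i = 0; i < n; i++)
        x[i] = 0.0f;
    for (i = 0; i < num_pulses; i++) {
        assert(pos[i] >= 1 && pos[i] <= n);
        x[pos[i] - 1] += b[i];   /* a spike of height b[i] at position pos[i] */
    }
}
```

With amplitudes `{2, -1}` at positions `{1, 4}` in a block of five samples, the excitation is `{2, 0, 0, -1, 0}`.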

Dr. Parker scores any choice of filter and pulses by how far the resulting synthetic speech strays from the real recording, squared and summed across the whole block.

EQ. 2.4

E = \sum_{k=1}^{N} e^2(k)

The objective. Add up the squared error between the real and the synthetic speech at every sample. Driving this one number E to its minimum is the goal that fixes both the filter and the pulses at once.
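
As a small sketch of the objective, assuming $e(k) = s(k) - \hat{s}(k)$ as the description above suggests, the block error can be computed as follows. The function name `block_error` is hypothetical and the arrays are 0-based.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of squared differences between real speech s[] and synthetic
   speech s_hat[] over a block of n samples (EQ. 2.4, with
   e(k) = s(k) - s_hat(k) assumed). */
double block_error(const float *s, const float *s_hat, size_t n)
{
    double e, sum = 0.0;
    size_t k;

    assert(s && s_hat);
    for (k = 0; k < n; k++) {
        e = (double)s[k] - (double)s_hat[k];
        sum += e * e;
    }
    return sum;
}
```

For `s = {1, 2, 3}` and `s_hat = {1, 0, 5}` the per-sample errors are 0, 2, and -2, so the result is 8.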

Atal & Remde placed their pulses one at a time, greedily, whereas Dr. Parker proved that the optimum has a closed form. Define the residual $f(n)$ as the part of the speech the filter fails to predict. Dr. Parker shows that the best pulse positions are simply the points where that residual is largest in magnitude, with each pulse’s height equal to the residual itself, $b_j = f(p_j)$.

EQ. 3.17

|f(p_1)| \;\geq\; |f(p_1+\ell)|

The answer. A pulse is optimally placed only when the residual there is at least as large as anywhere it might be moved. In plain terms: spend the pulses on the biggest peaks of whatever the filter left behind.
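
One way to read that condition in code, assuming the residual $f(n)$ has already been computed for the current filter, is a selection of the $I$ largest-magnitude samples. The sketch below is loosely modeled on the selection loop in the thesis listing in Appendix B but is not a transcription of it: the names `pick_pulses` and `magnitude` are invented, arrays are 0-based with 1-based positions returned, and the residual is assumed to have at least `num_pulses` nonzero samples.

```c
#include <assert.h>

static double magnitude(double v) { return v < 0.0 ? -v : v; }

/* Pick the num_pulses positions where |f| is largest.
   f[0..n-1] is the residual; each chosen entry is zeroed so it is not
   picked twice (so f is modified). Writes 1-based positions to pos[]
   and amplitudes b[j] = f(pos[j]) to b[].
   Assumes f has at least num_pulses nonzero samples. */
void pick_pulses(float *f, int n, int num_pulses, int *pos, float *b)
{
    int j, k, best;

    assert(f && pos && b && n > 0);
    assert(num_pulses >= 0 && num_pulses <= n);
    for (j = 0; j < num_pulses; j++) {
        best = 0;
        for (k = 1; k < n; k++)
            if (magnitude(f[k]) > magnitude(f[best]))
                best = k;
        pos[j] = best + 1;   /* report 1-based position       */
        b[j] = f[best];      /* pulse height is the residual  */
        f[best] = 0.0f;      /* exclude from later passes     */
    }
}
```

For a residual of `{0.5, -3, 1, 2, -0.1}` and two pulses, this picks position 2 with height -3 and then position 4 with height 2.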

§ 5 Results

What multi-pulse buys, measured

The demo below runs the multi-pulse algorithm from Dr. Parker’s C code on one real 20 ms frame of voiced speech. At the far left of the pulse-count rail, the demo shows ordinary LPC: the vocal-tract filter is driven by a single impulse per pitch period, and its reconstruction strays visibly from the original. Drag the slider right and multi-pulse instead spends its pulses on the peaks of the residual: the amber reconstruction tightens onto the original waveform as the pulses multiply.

The bottom plot shows the normalized error of the speech reconstruction as a function of pulse count. Multi-pulse LPC outperforms standard LPC at all pulse counts $I>2$. If you go back to the LPC Analyzer and listen to Dr. Parker’s speech again, the difference in clarity between LPC and MP-LPC comes from this lower error.

[Interactive: HP 7475A 6-pen plotter demo, excitation × filter. Control: pulse count I, starting at standard LPC. Plots: excitation amplitude vs. sample n; original s(n), reconstruction ŝ(n), and residual f(n) vs. sample n (20 ms at 8 kHz); normalized error vs. pulse count I.]

§ 6 Conclusion

A method that outlived its decade

Dr. Parker summarizes the four things that are new in his dissertation:

“New contributions by this dissertation include: solution of a set of equations for multi-pulse to determine the optimal prediction coefficients for a given set of pulse positions … uncoupled from the gain; a mathematical derivation of the optimal pulse locations; a theoretical proof of convergence for the algorithm; and an algorithm to transmit the pulse locations.”

Alan Parker § 7.1 · Summary

He was just as candid about the cost. Multi-pulse’s quality came from spending more bits, not from a free lunch:

“Multi-pulse performs better than LPC when allowed to run at 10.6 kBits. As discussed earlier, when the bit rates are the same there is not really a significant improvement over LPC.”

Alan Parker § 7.1 · Summary

The next decade of work would close that gap (multi-pulse quality without the multi-pulse bit rate) by replacing the explicitly transmitted pulses with indices into a shared, pre-trained codebook. The analysis-by-synthesis paradigm Dr. Parker worked in would, within a year, generalize into CELP and go on to carry essentially every cell-phone call on Earth for the next thirty years. The excitation question, “what drives the filter?”, organizes forty years of codecs into a single family tree, displayed below. The amber line traces the multi-pulse algorithm out to ACELP, LPCNet, and the neural codecs of today.

Through his doctoral work, Dr. Parker’s math, engineering, and influence live on.

§ A Appendix

The original dissertation, page for page

Dr. Parker’s original 1984 thesis, digitally scanned in full, is reproduced below. The figure shows a side-by-side comparison of a representative page: the original scan on the left against the modern LaTeX rewrite on the right.

Page 7 of the original 1984 thesis, scanned

The complete scan of Dr. Parker’s original 1984 manuscript:

Original 1984 Thesis Scan Open PDF · 23 MB

And the modern, remastered LaTeX edition:

Remastered LaTeX Edition Open PDF · 708 KB

§ B Appendix

140 lines of C, still compiling

The original C implementation of the multi-pulse algorithm is reproduced here verbatim from the 1984 thesis. It is pre-ANSI K&R C, so the build suppresses the implicit-int, missing-prototype, bare-return, and implicit-function-declaration warnings. The command below compiles the file to an object file, multip.o:

BUILD MULTIP.C cc(1) · 06·21·26

$ cc -std=c89 -c -Wno-implicit-int -Wno-deprecated-non-prototype -Wno-return-type -Wno-implicit-function-declaration multip.c -o multip.o

C COMPILER FILE: MULTIP.C 05·11·84

/****************************************************************
**** This subroutine calculates the multi-pulse
**** parameters for a given set of data. The multi-pulse
**** coefficients and pulse positions will be passed back to
**** the caller.
*****************************************************************/

#include <stdio.h>
#include <math.h>

multip(n,i20,iter,p,s,nn,dec,pos,alpha,code)

float s[ ],alpha[ ];
int n,i20,iter,p,nn,dec,code,pos[ ];

/****************************************************************
**** Meaning of Variables Passed to Subroutine.

**** n: the number of data samples in the speech block
**** i20: the number of impulses to be used.
**** iter: the number of iterations for the algorithm.
**** p: the number of predictor coefficients.
**** s: the array containing the speech. if s[x] is
        passed to the algorithm, the algorithm will
        look for the initial data in samples s[x],s[x+1],
        ...,s[x+p-1] and the speech data in s[x+p],s[x+p+1],
        ...,s[x+n+p-1].
**** nn: the number of initial pulse positions.
**** dec: the decrement for the number of pulse positions used in
        for(i=nn;i>=i20;i-=dec) where i denotes the number of
        pulse positions the algorithm is using at that particular
        iteration.
**** pos: the array containing the positions of the pulses. if
        pos[y] is passed to the algorithm it will return pulse
        positions pos[y+1],...,pos[y+i20].
**** alpha the array containing the alphas. If alpha[j] is passed
        to the algorithm it will return alpha(1) in location
        alpha[j+1],...,alpha(p) will be in location alpha[j+p].
**** code: if code = 0 then the algorithm calculates the optimal
        coefficients for the fixed locations passed to it.
*****************************************************************/
{
#define begin {
#define end }
#define then
    float kp[1000],phi[20][20],sum,psi[20],sum1,sum2,sum3;
    float d[20],v[20][20],y[20],e[1000],b[1000];
    int i,j,j3,j4,k,k2;
    for(i=nn;i>=i20;i-=dec) /*start pulse position loop */
    begin
        for(k2=1;k2<=n;k2++)
            kp[k2]=1.0; /*initialize correlation select function */
        if(code==1)
        begin
            for(k2=1;k2<=i20;k2++) /* if code =1 then use pulse positions */
                kp[pos[k2]]=0.0; /* passed to the algorithm to define */
        end /* new correlation select function */
        for(j4=1;j4<=iter;j4++) /* begin iteration loop */
        begin
            for(j3=1;j3<=p;j3++)
            begin /* begin loop to calculate the multi-pulse */
                for(k=1;k<=j3;k++) /* phi matrix. (identical to standard lpc */
                begin /* but with kp multiplied in with the sum */
                    sum=0.0;
                    for(j=1;j<=n;j++)
                        sum+=s[j-j3+p-1]*s[j-k+p-1]*kp[j];
                    phi[j3][k]=sum;
                end
            end
            for(j3=1;j3<=p;j3++)
            begin
                sum=0.0; /* calculate the multi-pulse psi matrix */
                for(j=1;j<=n;j++) /* similar to lpc */
                    sum+=s[j+p-1]*s[j+p-j3-1]*kp[j];
                psi[j3]=sum;
            end
            d[1]=phi[1][1];
            for(j=2;j<=p;j++)
                v[j][1]=phi[j][1]/d[1]; /*calculate the multi-pulse v matrix */
            for(j3=2;j3<=p-1;j3++)
            begin
                sum1=0.0;
                for(k=1;k<=j3-1;k++)
                    sum1+=v[j3][k]*v[j3][k]*d[k];
                d[j3]=phi[j3][j3]-sum1; /*calculate the multi-pulse d matrix */
                for(j=j3+1;j<=p;j++)
                begin
                    sum2=0.0;
                    for(k=1;k<=j3-1;k++)
                        sum2+=v[j][k]*d[k]*v[j3][k];
                    v[j][j3]=(phi[j][j3]-sum2)/d[j3];
                end
            end
            sum3=0.0;
            for(j=1;j<=p-1;j++)
                sum3+=v[p][j]*v[p][j]*d[j];
            d[p]=phi[p][p]-sum3;
            y[1]=psi[1];
            for(j3=2;j3<=p;j3++)
            begin
                sum1=0.0;
                for(j=1;j<=j3-1;j++) /*calculate the multi-pulse y matrix */
                    sum1+=v[j3][j]*y[j];
                y[j3]=psi[j3]-sum1;
            end
            alpha[p]=y[p]/d[p];
            for(j3=p-1;j3>=1;j3--)
            begin
                sum2=0.0;
                for(j=j3+1;j<=p;j++)
                    sum2+=v[j][j3]*alpha[j]; /*alphas for multi-pulse are finally */
                alpha[j3]=y[j3]/d[j3]-sum2; /*defined in this loop */
            end
            if(code==1) return; /* if positions were specified you are done */
            for(j=1;j<=n;j++)
            begin
                sum1=0.0;
                for(j3=1;j3<=p;j3++) /* loop to calculate error function */
                    sum1+=alpha[j3]*s[j+p-j3-1];
                e[j]=s[j+p-1]-sum1;
            end
            for(k2=1;k2<=n;k2++)
                kp[k2]=1.0;
            for(j=1;j<=i;j++)
            begin
                b[j]=0.0;
                for(k=1;k<=n;k++)
                begin
                    if(fabs(*(e+k))>fabs(*(b+j))) then
                    begin /* loop to calculate the largest i values of */
                        b[j]=e[k]; /* the error function which define the positions */
                        pos[j]=k; /* to be used in the next iteration */
                    end
                end
                e[pos[j]]=0.0;
                kp[pos[j]]=0.0;
            end
        end
    end
}
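
For readers following the middle of the listing, the stretch between the comments "calculate the multi-pulse v matrix" and "alphas for multi-pulse are finally defined in this loop" appears to solve the normal equations $\Phi\alpha = \psi$ by factoring the symmetric matrix as $\Phi = V D V^{T}$ (with $V$ unit lower triangular and $D$ diagonal), followed by forward and back substitution. The sketch below restates that idea in 0-based ANSI C. It is a reading of the listing, not a replacement for it: the name `ldl_solve`, the use of `double`, the `MAX_P` bound, and the positive-pivot check are all assumptions added here.

```c
#include <assert.h>

#define MAX_P 20  /* same bound as the listing's phi[20][20] */

/* Solve phi * alpha = psi for a symmetric positive-definite phi of
   order p via phi = V * D * V^T. Only the lower triangle of phi is read.
   Returns 0 on success, -1 if a pivot is not positive. */
int ldl_solve(double phi[MAX_P][MAX_P], const double *psi, int p,
              double *alpha)
{
    double v[MAX_P][MAX_P], d[MAX_P], y[MAX_P], sum;
    int i, j, k;

    assert(psi && alpha && p >= 1 && p <= MAX_P);

    /* Factorization: column j of V and entry j of D. */
    for (j = 0; j < p; j++) {
        sum = phi[j][j];
        for (k = 0; k < j; k++)
            sum -= v[j][k] * v[j][k] * d[k];
        if (sum <= 0.0)
            return -1;
        d[j] = sum;
        for (i = j + 1; i < p; i++) {
            sum = phi[i][j];
            for (k = 0; k < j; k++)
                sum -= v[i][k] * d[k] * v[j][k];
            v[i][j] = sum / d[j];
        }
    }

    /* Forward substitution: V y = psi. */
    for (i = 0; i < p; i++) {
        sum = psi[i];
        for (k = 0; k < i; k++)
            sum -= v[i][k] * y[k];
        y[i] = sum;
    }

    /* Back substitution: V^T alpha = D^-1 y. */
    for (i = p - 1; i >= 0; i--) {
        sum = y[i] / d[i];
        for (k = i + 1; k < p; k++)
            sum -= v[k][i] * alpha[k];
        alpha[i] = sum;
    }
    return 0;
}
```

As a quick check, the 2×2 system with $\Phi = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}$ and $\psi = (2, 1)$ has the solution $\alpha = (0.5, 0)$, which this routine reproduces.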