GSM Signal Processing

Encryption
Source and Channel Coding
Interleaving
Voice Activity Detection
Convolutional Coder
Viterbi

Description of the GSM Signal Processing Chain

Subscriber Identity Module and Cyphering

An innovation from the C-Netz made it into the GSM standard. The GSM phone was separated from the subscriber identity. This was done with a chip card that could be inserted in a GSM phone. In the GSM standard, this was called a SIM card, a Subscriber Identity Module. However, the chip card was not only intended as a simple storage for data, but also contained a processor for complex operations.

The processor protected the data stored on the card from misuse. The operation of the SIM card had first to be activated by entering a 4-digit “PIN” code. (PIN = Personal Identification Number). The SIM card did not simply contain the subscriber’s phone number or a similar identification. The SIM card also contained a secret key for encryption. This key was not directly accessible and could not be read, even by force. Beside encryption, this key was also used to authentify the subscriber.

At the beginning of the GSM era, the SIM card was still the size of a check card. Later also smaller SIM cards became available for handheld devices, which only contained the SIM card’s reading contacts and the chip.

Encryption

One of the key advantages of digital transmission is the possibility of encryption. As a matter of fact, in WWII, the encryption of speech signals was the driving force behind digitizing speech because it could easily be encrypted in digital form. But back in the forties, without the usage of transistorized electronics, this was not achievable.

Encrypting digital data is easy with a so-called “secret” digital key and a simple XOR function. An XOR compares two binary values. If both values are different, the result is a logical ONE or TRUE otherwise a ZERO or FALSE.

XOR01
001
110
Function of an XOR

If you have e.g. an 8-bit key, you can encrypt digital information as shown in the following table. If you don’t know the key, it is practically impossible to get the original text from the encrypted text.

Cyphering
Input Data00110101
Key11100011
 XOR
Encrypted Data11010110
Decypering
Encrypted Data11010110
Key11100011
 XOR
Restored Data00110101
Description of encryption with XOR

In the GSM system, the key of a subscriber is only stored at two places, a central, inaccessible storage facility called the Authentication Center and on the SIM card. The keys do not leave those places. The key is called Ki and is 128 bit long.

In order to be able to authenticate a mobile phone subscriber, a random word RAND is generated. This is sent to the GSM Authentication Center and to the SIM card. An encryption algorithm now runs on the SIM card, which generates a 32-bit number SRES from the key Ki and RAND. The same thing happens in the Authentication Center. The SRES are compared in the Mobile Switching Center. If the results are the same, it is ensured that the same keys were used and the participant is authenticated without revealing their key.

Authentication Process

The key is also used to encrypt messages, but only indirectly. The encryption must be done practically “publicly” in the signal processor (DSP) of the mobile station or the base station. But the it is important that the key does not leave its place. Therfore another algorithm, A8, is used which generates a temporary key Kc in the SIM card from a random number and the key Ki. This can be used to encrypt the stream of information bits using an algorithm called A5.

Encryption of Data in GSM

Source and Channel Coding

As discussed before, PCM Coding of the speech signal would lead to a bit rate of 64 kbit/s. This would correspond to a channel bandwidth of 50 kHz. This would be more bandwidth than analog transmission with FM. To get more capacity with a digital system it is necessary to „compress“ speech. Such „speech coding“ is discussed under „digital signal processing“.

For the GSM system, speech frames of 160 samples corresponding to 20 ms are used for coding.  An LPC (Linear Predictive Coding) analysis is carried out to determine vocal tract parameters. These parameter are most critical for the understandability of speech. Only 36 bits are sufficient to encode these parameters. The second set of parameters is the periodicity. In practice, a filter is estimated here, which predicts the current data from old data that existed a period before. This data can also be encoded with only 36 bits. A so called residual signal remains. 

In the first GSM speech encoder, only every third sample value was encoded. Nevertheless, the audible error generated was relatively small. Also the complexity to encode the residual signal was still easy enough to allow the use of available digital signal processors. The remaining „residual signal“ was encoded with 188 bits. Overall, the speech codec had a bit rate of 260 bits per 20 ms, i.e. 13 kbit/s. This was a reduction of almost five times compared to PCM’s 64 kbit/s.

Speech Coding and Channel Coding of a GSM speech frame

The “raw bits” have to be protected from transmission errors by channel encoders. Especially bits describing the vocal tract must be properly protected to avoid severe disruption. Therefore, the information bits from the speech encoder are divided into three classes. The first class (Ia) with 50 bits are especially protected. They not only contained information bits, but also three check bits which indicate that errors still exist despite error correction. If this would be the case, it is better to repeat old parameters (such as vocal tract parameters) than to allow errors. 132 bits are simply protected by the following channel encoder. This is a convolution encoder that generates 378 bits from 189 bits.

Convolutional encoders are described below. They are relatively easy to implement. 78 bits of the remaining signal are not encoded at all, but are transmitted unprotected. The result is finally 456 bits, that is distributed on four time slots or bursts.

Interleaving

If transmission errors occure, they are not randomly distributed over a time frame. Very often a transmission error destroys several bits in a row. In this case, also a convolutional channel coder would not help. For this reason, neighboring bits are distributed over eight time slots. The first of the 465 bits is placed in time slot 1, the second in time slot 2, the third in time slot 3 until finally the eighth bit is placed in time slot 8. After that, the 9th bit is placed back into time slot 1 and so on. This process is called interleaving or nesting. When receiving, everything must be deinterleaved before processing. If, for example, up to eight bits were disturbed next to each other, this would result in 8 individual errors that are likely to be corrected with the channel decoder.

GSM signal processing for the transmission of speech

Voice Activity Detection

In a normal conversation, one person speaks while the other person is listening. It would therefore be an advantage to stop transmission of a signal if no voice is present. GSM provides for such a voice activity recognition. Certain characteristics in speech analysis are used to determine whether somebody is speaking or just noice or silence is audible. If „silence“ is detected, this is communicated to the network and no transmission is done on the corresponding traffic channel. The channel is not released to other users, but no energy is transmitted in the corresponding time slot. This reduces possible interference with neighboring channels.

However, it is very annoying when the channel is essentially “muted” during this time. The speaking participant would immediately notice that the “counterpart” is no longer present. This is because there is always background noise that would be muted. For this reason, the background noise is constantly estimated by a signal processing algorithm and encoded with a few bits. If a connection is interrupted due to a lack of voice activity, the noise code is briefly transmitted. On the receive side the noice is artificially generated by a noise generator, simulating an existing connection.

Convolutional Coder

In GSM, convolution encoders are used as shown above. A convolution encoder uses “convolution” to create an output sequence from an information sequence that has more bits than the original sequence and therefore contains redundancy. Below we describe an example of how two output values are generated using an algorithm from three input values.

Simple Convolutional Endoder

As shown in the figure, two values are generated per incoming input value. So the number of bits is doubled. At the beginning the “memory values” are zero. Thus, the first pair of x is either 11 (if u(1) = 1 or 00 if u(1) = 0. The values are combined with an AND. The convolution encoder can have 4 different states: 00,01, 10 and 11. But it cannot change from one arbitrary state to another arbitrary state. There are always only two transitions.

Trellis Diagram

The figure above shows the states and the possible transitions. The result is a diagram that is reminiscent of a trellis and is therefore called a trellis diagram. If a sequence of information bits is encoded, this corresponds to a path through such a trellis diagram. Such a path always begins in the state 00 and ends with 00 by adding two zeros to the information bits.

The following figure shows such a path for a given bit sequence
1 1 0 1 0

Trellis Diagram of the sequence 11010

In the above figure only the real trellis path is shown. But there are many more possible paths through a trellis diagram. However, the possible paths must be compared with the received bits. In our case 11 01 01 00 10 11 00. Each step through the path now corresponds to a pair of bits. Those can be compared with the bit pair received. To do this the hemming distance is used. If all the bits are the same, the hamming distance is zero. If one bit differs, the distance is 1 and if two bits are different, the distance is 2.

Theoretical all possible paths through the trellis diagram can be taken into account and the corresponding hamming distances are calculated. The path with the lowest hamming distance is considered to be the „correct path“. If no transmission error occurs, the hamming distance of this lowest path is exactly zero But what happens if a transmission error occurs?

Trellis paths with a transition error

The figure above shows possible paths if an error occurs on the second transmitted pair. Now a distance of one also occurs in the “correct path”. So the entire path has a distance of one. Potentially there could be one (or more) other paths. A path is shown that also has a distance of one in the second step. However, as the process progresses, errors or Hamming distances are added, so that the entire path has a distance of four and is therefore eliminated.

Trellis paths with two transition errors

This figure shows the correct and an alternative path for two transmission errors. The correct path has a Hamming distance of 2, a possible alternative path in this example has a Hamming distance of 4.

In the examples shown, only a few bits are encoded. In realistic applications such as GSM, the information sequences are 189 bits long. Of course, this creates a huge number of paths that cannot simply be tried one after the other. That’s why a special algorithm, the Viterbi algorithm, is used for decoding.

Viterbi Algorithm

Viterbi Algorithm.

The Viterbi algorithm is shown in the figure above. Each state transition together with the received bit pair leads to a Hamming distance. In the example, this is 2 and 0 at the first transition. The corresponding states at this point in time receive the accumulated Hamming distance. In this case 2 and 0. Now the next possible transitions begin with the corresponding Hamming distances. The accumulated distances are now 2, 4, 1 and 1 (because there was an error in this transition. At the next transition, where the trellis has settled, there are two paths for each state. The state now gets the Hamming distance of both paths which is smaller.

For each state and time you store the direction of the path with the slower hamming distance. The next distances are 2, 2, 1, 3. This continues through the trellis. At the end the Hamming distance of the optimal path is in state 00. Now you do “backtracking.” You have remembered where the optimal path came from and can now trace the path backwards and thereby obtain the error-corrected information bit sequence.