Information Theory

Explore the fundamental limits of data compression and transmission through interactive simulations of Entropy, Source Coding, and Channel Capacity.

Laboratory Objectives

Upon completion of this lab, students will be able to:

Quantify Information

Calculate the self-information and entropy of a discrete memoryless source based on symbol probabilities.

Source Coding

Understand the Shannon Source Coding Theorem and design efficient codes using Huffman coding principles.

Channel Capacity

Analyze the effect of noise on binary transmission and calculate the theoretical channel capacity limit.

Theoretical Background

1. Information Content & Entropy

Information theory, founded by Claude Shannon, provides a mathematical framework for quantifying information. The information content I(x) of an event x with probability P(x) is defined as:

I(x) = -log₂(P(x))

The unit is bits when the logarithm is taken to base 2.

Entropy (H) is the average information content of a source. It represents the uncertainty or randomness.

H(X) = - Σ P(xᵢ) log₂(P(xᵢ))
  • If one symbol has P=1, entropy is 0 (no uncertainty).
  • For a binary source with equal probability (P=0.5), entropy is maximum at 1 bit/symbol.
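Both definitions are easy to verify numerically. The sketch below (symbol probabilities are illustrative) computes self-information and entropy directly from the formulas above; note that terms with P(x) = 0 contribute nothing to the sum:

```python
import math

def self_information(p):
    """Information content I(x) = -log2(P(x)), in bits."""
    return -math.log2(p)

def entropy(probs):
    """Entropy H(X) = -sum P(x) log2 P(x); terms with P = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(self_information(0.5))       # a fair coin flip carries 1 bit
print(entropy([1.0, 0.0]))         # deterministic source: 0 bits
print(entropy([0.5, 0.5]))         # fair binary source: 1 bit/symbol (maximum)
```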

2. Source Coding Theorem

The theorem states that for lossless compression, the average code length L cannot be less than the entropy H(X); an optimal symbol code satisfies:

H(X) ≤ L < H(X) + 1

Huffman coding is a greedy algorithm used to construct an optimal prefix code that approaches this entropy limit.
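The greedy construction can be sketched in a few lines: repeatedly merge the two least probable nodes until a single tree remains. This is a minimal illustration using Python's `heapq` (the symbol set and probabilities are examples, not the lab's fixed values):

```python
import heapq
import itertools

def huffman_codes(probs):
    """Build a Huffman prefix code by repeatedly merging the two
    least probable nodes until one tree remains."""
    counter = itertools.count()  # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

codes = huffman_codes({"A": 0.7, "B": 0.2, "C": 0.1})
print(codes)  # the most probable symbol receives the shortest codeword
```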

3. Channel Capacity

The Shannon-Hartley theorem defines the maximum rate at which information can be transmitted over a noisy channel with arbitrarily low error probability.

C = B log₂(1 + SNR)

Where C is capacity in bits/sec, B is bandwidth, and SNR is Signal-to-Noise Ratio. For a Binary Symmetric Channel (BSC) with error probability p:

C = 1 - H(p)

where H(p) is the binary entropy function of the error probability.
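Both capacity formulas can be evaluated directly. A minimal sketch (the bandwidth and SNR values are illustrative, not tied to the simulation):

```python
import math

def shannon_hartley(bandwidth_hz, snr_linear):
    """C = B log2(1 + SNR), in bits per second (SNR as a linear ratio)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

def binary_entropy(p):
    """H(p) = -p log2(p) - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel, in bits per channel use."""
    return 1 - binary_entropy(p)

print(shannon_hartley(3000, 1000))  # 3 kHz channel at SNR 1000 (30 dB)
print(bsc_capacity(0.0))            # noiseless: 1 bit/use
print(bsc_capacity(0.5))            # pure noise: 0 bits/use
```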

Laboratory Procedure

Step 1: Source Entropy Analysis

Navigate to Experiment 1.

  • Set the probability of Symbol A to 1.0 and others to 0. Observe the Entropy (H=0).
  • Adjust the source to a uniform distribution (A=0.25, B=0.25, C=0.25, D=0.25). Record the maximum Entropy.
  • Create a skewed distribution (e.g., A=0.7, B=0.2, C=0.1, D=0.0). Compare the Entropy to the uniform case.
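The values you record in this step can be cross-checked by computing the entropy of each of the three distributions directly (a sketch; the probabilities are exactly those listed above):

```python
import math

def entropy(probs):
    """H(X) in bits; terms with P = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

deterministic = [1.0, 0.0, 0.0, 0.0]      # expect H = 0
uniform = [0.25, 0.25, 0.25, 0.25]        # expect H = 2 bits (maximum for 4 symbols)
skewed = [0.7, 0.2, 0.1, 0.0]             # expect 0 < H < 2

for name, dist in [("deterministic", deterministic),
                   ("uniform", uniform),
                   ("skewed", skewed)]:
    print(f"{name}: H = {entropy(dist):.4f} bits")
```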
Step 2: Huffman Coding Efficiency

Observe the generated codes.

  • Note the Huffman codes generated for the skewed distribution. Which symbol gets the shortest code? Why?
  • Calculate the Average Code Length (L) manually and verify it against the simulation.
  • Calculate Coding Efficiency (η = H/L) and verify it is close to 100%.
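The manual calculation follows directly from the definitions. The sketch below uses the skewed distribution from Step 1; the codeword lengths shown are typical Huffman lengths for that distribution (the zero-probability symbol D receives no code), but your simulation's codes may differ:

```python
import math

# Skewed source from Step 1 (D has probability 0 and is excluded).
probs = {"A": 0.7, "B": 0.2, "C": 0.1}
lengths = {"A": 1, "B": 2, "C": 2}  # assumed Huffman codeword lengths

H = -sum(p * math.log2(p) for p in probs.values())   # source entropy, bits
L = sum(probs[s] * lengths[s] for s in probs)        # average code length, bits
eta = H / L                                           # coding efficiency

print(f"H = {H:.4f} bits, L = {L:.2f} bits, efficiency = {eta:.1%}")
```

Note that the source-coding bound H ≤ L < H + 1 holds, and that a more skewed distribution generally leaves efficiency further from 100% for single-symbol codes.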
Step 3: Noisy Channel Simulation

Navigate to Experiment 2.

  • Set Error Probability (p) to 0.0. Send 20 bits. Verify 0% error.
  • Set p to 0.5. Send bits. Observe that the output is statistically independent of the input. Check the Channel Capacity graph (it should read 0).
  • Set p to 0.1. Send 100 bits. Record the experimental error rate. Does it match p?
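The transmission experiment can also be reproduced in software. A minimal BSC sketch (the seed and bit pattern are arbitrary choices for reproducibility):

```python
import random

def simulate_bsc(bits, p, seed=0):
    """Pass bits through a binary symmetric channel: each bit is
    flipped independently with probability p."""
    rng = random.Random(seed)
    received = [b ^ (rng.random() < p) for b in bits]
    errors = sum(tx != rx for tx, rx in zip(bits, received))
    return received, errors / len(bits)

tx = [0, 1] * 50  # 100 bits, matching the procedure
_, rate = simulate_bsc(tx, p=0.1)
print(f"experimental error rate: {rate:.2f}")  # should be near p = 0.1
```

For small bit counts the experimental rate fluctuates around p; it converges as the number of transmitted bits grows, which is worth noting in your report.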

Experiment 1: Source Entropy & Coding

Visualize how probability distribution affects entropy and code efficiency.

Source Configuration

Adjust probabilities for symbols A, B, C, D. (Auto-normalized)

[Interactive panel: live readouts for Calculated Entropy (H) in bits, Average Code Length (L) in bits, and Coding Efficiency, alongside a Probability Distribution chart and a Huffman Coding Tree visualization.]

Experiment 2: Binary Symmetric Channel (BSC)

Simulate data transmission over a noisy channel and observe the relationship between Error Probability (p) and Mutual Information.

Channel Transmission

[Interactive panel: source bits pass through a Noisy Channel with adjustable error probability p to the destination; live counters show Bits Sent, Errors, and Error Rate.]

Channel Capacity

C = 1 - H(p)

[Live readout of the current capacity in bits/use for the selected p.]

Guidelines for Lab Report Writing

A well-structured lab report is essential for documenting your experimental work and demonstrating your understanding of information theory concepts. Follow these guidelines to prepare your report:

1. Title Page

  • Course name and code
  • Experiment title: "Introduction to Information Theory"
  • Student name and ID
  • Date of experiment
  • Instructor name

2. Abstract (Summary)

Provide a brief overview (150-200 words) summarizing the objectives, key methods used (entropy calculation, Huffman coding, BSC simulation), and main findings regarding source coding efficiency and channel capacity.

3. Introduction

  • State the purpose of the experiment
  • Explain the significance of information theory in communication systems
  • Define key terms: Entropy, Source Coding, Channel Capacity
  • State the theoretical background relevant to the experiments performed

4. Experimental Procedure

Describe the step-by-step methodology:

  • Details of the source entropy experiment (probability settings used)
  • Huffman coding generation process
  • BSC simulation parameters (error probabilities tested)
  • Number of bits transmitted in channel tests

5. Results and Discussion

Present your findings with appropriate tables, graphs, and analysis:

  • Entropy Analysis: Tabulate entropy values for different probability distributions (uniform, skewed, deterministic)
  • Huffman Coding: Show the generated codes, calculate average code length (L) and coding efficiency (η = H/L × 100%)
  • Channel Capacity: Plot theoretical capacity vs. error probability curve; compare with experimental error rates
  • Discuss why entropy is maximum for uniform distribution
  • Explain the significance of H(X) ≤ L
  • Analyze why channel capacity becomes zero when p = 0.5

6. Conclusion

  • Summarize key findings regarding entropy maximization
  • Confirm the validity of Shannon's Source Coding Theorem through your Huffman coding results
  • State the practical implications of channel capacity limits
  • Mention any sources of error or limitations in the simulation

7. References

List all textbooks, online resources, and research papers consulted. Use standard citation format (IEEE or APA).

8. Appendices (if needed)

Include screenshots of simulation results, sample calculations, or additional data tables.

Grading Criteria

Content Accuracy (40%): Correct calculations, proper use of formulas, accurate data recording

Analysis & Discussion (30%): Depth of understanding, proper interpretation of results, connection to theory

Organization & Presentation (20%): Logical structure, clear graphs/tables, proper formatting

Conclusion Quality (10%): Summary of findings, insight into practical applications