Data representation lies at the core of computing. Every piece of media, every calculation, and every file is ultimately expressed in binary form—streams of 1s and 0s. Understanding how data is represented, transformed, and interpreted is essential for mastering computer science.
To manage large data sizes, prefixes are used to denote magnitude. Decimal prefixes are powers of 10 (kilo = 10³ = 1,000), while binary prefixes are powers of 2 (kibi = 2¹⁰ = 1,024). The names sound similar, but the values differ:
Understanding these distinctions prevents confusion when working with memory sizes and file capacities.
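The gap between the two prefix families can be checked with a few lines of Python. This is a minimal sketch: the dictionary and function names are illustrative, not taken from any standard library.

```python
# Decimal (SI) prefixes are powers of 10; binary (IEC) prefixes are powers of 2.
DECIMAL_PREFIXES = {"kilo": 10**3, "mega": 10**6, "giga": 10**9, "tera": 10**12}
BINARY_PREFIXES = {"kibi": 2**10, "mebi": 2**20, "gibi": 2**30, "tebi": 2**40}


def prefix_gap_percent(decimal_name, binary_name):
    """Return how much larger the binary unit is than its decimal look-alike, in percent."""
    decimal_value = DECIMAL_PREFIXES[decimal_name]
    binary_value = BINARY_PREFIXES[binary_name]
    return (binary_value - decimal_value) / decimal_value * 100


def bytes_to_gib(byte_count):
    """Convert a byte count to gibibytes (GiB)."""
    return byte_count / BINARY_PREFIXES["gibi"]


if __name__ == "__main__":
    print(round(prefix_gap_percent("kilo", "kibi"), 1))  # 2.4
    print(round(prefix_gap_percent("giga", "gibi"), 1))  # 7.4
    # A drive sold as 500 GB (decimal) holds about 465.66 GiB (binary).
    print(round(bytes_to_gib(500 * 10**9), 2))           # 465.66
```

The difference grows with each step up the scale, which is why a drive's advertised capacity can look smaller when an operating system reports it in binary units.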
While humans commonly use decimal (base-10), computers use binary (base-2). Other systems bridge the gap:
BCD is a special encoding system where each decimal digit is represented by its own 4-bit binary number.
A bit pattern is not valid BCD if any nibble (4 bits) represents a denary value greater than 9. For example, 1010₂ (10) through 1111₂ (15) are invalid BCD digits.
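The encoding and the validity rule can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes non-negative integers and bit strings containing only 0s, 1s, and optional spaces.

```python
def to_bcd(number):
    """Encode a non-negative integer as BCD: one 4-bit group per decimal digit."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return " ".join(format(int(digit), "04b") for digit in str(number))


def is_valid_bcd(bits):
    """Return True if every nibble in the bit string encodes a digit from 0 to 9."""
    bits = bits.replace(" ", "")
    if len(bits) == 0 or len(bits) % 4 != 0:
        return False
    return all(int(bits[i:i + 4], 2) <= 9 for i in range(0, len(bits), 4))


def from_bcd(bits):
    """Decode a valid BCD bit string back into an integer."""
    if not is_valid_bcd(bits):
        raise ValueError("not a valid BCD bit string")
    bits = bits.replace(" ", "")
    digits = [str(int(bits[i:i + 4], 2)) for i in range(0, len(bits), 4)]
    return int("".join(digits))


if __name__ == "__main__":
    print(to_bcd(59))                 # 0101 1001
    print(is_valid_bcd("0101 1001"))  # True
    print(is_valid_bcd("1010 0001"))  # False: 1010 is 10, not a decimal digit
    print(from_bcd("0101 1001"))      # 59
```

Note that 59 in BCD (0101 1001) differs from 59 in pure binary (111011), because each decimal digit is encoded separately.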
Conversions are crucial for understanding how data moves between human-readable forms and machine-readable forms:
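As an illustration, one common way to convert between denary, binary, and hexadecimal is repeated division for one direction and positional weighting for the other. The sketch below assumes non-negative integers and bases from 2 to 16; the helper names are illustrative. Python's built-in `int(text, base)` is used only as a cross-check.

```python
DIGITS = "0123456789ABCDEF"


def to_base(number, base):
    """Convert a non-negative integer to a digit string by repeated division."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)
        remainders.append(DIGITS[remainder])
    # Remainders come out least significant first, so reverse them.
    return "".join(reversed(remainders))


def from_base(text, base):
    """Convert a digit string in the given base back to an integer."""
    value = 0
    for char in text.upper():
        digit = DIGITS.index(char)  # raises ValueError for unknown characters
        if digit >= base:
            raise ValueError(f"digit {char!r} is not valid in base {base}")
        value = value * base + digit
    return value


if __name__ == "__main__":
    print(to_base(202, 2))          # 11001010
    print(to_base(202, 16))         # CA
    print(from_base("CA", 16))      # 202
    print(int("11001010", 2))       # 202, using the built-in for comparison
```

Each hexadecimal digit corresponds to exactly one nibble (C = 1100, A = 1010), which is why hex is a convenient shorthand for binary.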
Binary addition and subtraction use the same principles as decimal but with base-2:
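A minimal sketch of the column-by-column method, working on bit strings so the carries and borrows stay visible. The function names are hypothetical, and the subtraction assumes the first operand is at least as large as the second (negative results are outside the scope of this sketch).

```python
def add_binary(a, b):
    """Add two binary strings column by column, carrying into the next column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))  # the bit written in this column
        carry = total // 2             # 1 if the column overflowed
    if carry:
        result.append("1")
    return "".join(reversed(result))


def subtract_binary(a, b):
    """Subtract b from a (assumes a >= b), borrowing from the next column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        diff = int(bit_a) - int(bit_b) - borrow
        if diff < 0:
            diff += 2   # borrow one "two" from the next column
            borrow = 1
        else:
            borrow = 0
        result.append(str(diff))
    if borrow:
        raise ValueError("result would be negative; not handled in this sketch")
    return "".join(reversed(result)).lstrip("0") or "0"


if __name__ == "__main__":
    print(add_binary("1011", "0110"))       # 10001  (11 + 6 = 17)
    print(subtract_binary("1011", "0110"))  # 101    (11 - 6 = 5)
```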
Text is stored as numeric codes to represent letters, digits, punctuation, and symbols:
Unicode ensures global communication and consistent text representation across platforms.
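The idea of text as numeric codes can be seen directly in Python, whose strings are sequences of Unicode code points. The sketch below uses the built-ins `ord`, `chr`, and `str.encode`; the choice of UTF-8 as the byte encoding is an assumption for illustration, since the text above does not name one.

```python
def char_codes(text):
    """Return the Unicode code point of each character in the text."""
    return [ord(char) for char in text]


def utf8_byte_count(char):
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


if __name__ == "__main__":
    print(char_codes("Hi"))           # [72, 105]
    print(chr(65))                    # A
    print(format(ord("A"), "07b"))    # 1000001, which fits in 7 bits
    print(utf8_byte_count("A"))       # 1
    print(utf8_byte_count("é"))       # 2
    print(utf8_byte_count("€"))       # 3
```

Characters in the basic Latin range take one byte in UTF-8, while characters from other scripts and symbol sets take more.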
Hexadecimal is used for memory addresses, MAC addresses, and web color codes (#RRGGBB). BCD is handy in digital clocks, calculators, and other devices that display decimal digits directly.
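To illustrate the #RRGGBB format, the following sketch splits a color code into its red, green, and blue components, each a two-digit hexadecimal value from 00 to FF (0 to 255). The function names are hypothetical.

```python
def parse_hex_color(code):
    """Split a '#RRGGBB' color code into (red, green, blue) integers."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Each pair of hex digits is one 8-bit channel.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def to_hex_color(red, green, blue):
    """Format three 0-255 channel values as a '#RRGGBB' color code."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return f"#{red:02X}{green:02X}{blue:02X}"


if __name__ == "__main__":
    print(parse_hex_color("#FF8000"))  # (255, 128, 0)
    print(to_hex_color(255, 128, 0))   # #FF8000
```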
Graphics turn binary data into vibrant visuals. Two main approaches are used: bitmaps and vectors.
A bitmap image is a grid of pixels, each with its own color. The level of detail and color depends on resolution and color depth.
Approximate file size in bytes = (width × height × bits per pixel) ÷ 8 + header size in bytes, where width and height are measured in pixels.
Increasing resolution or color depth improves quality but increases file size. Reducing them saves space but may lower image quality.
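The file size formula above can be turned into a small calculator. This is a sketch: the function names are illustrative, the header size is left as a parameter because it depends on the file format, and the pixel data is rounded up to whole bytes.

```python
def bitmap_size_bytes(width, height, bits_per_pixel, header_bytes=0):
    """Approximate bitmap size: (width x height x bits per pixel) / 8 + header."""
    pixel_bits = width * height * bits_per_pixel
    pixel_bytes = (pixel_bits + 7) // 8  # round up to a whole number of bytes
    return pixel_bytes + header_bytes


def colors_available(bits_per_pixel):
    """Number of distinct colors a given color depth can represent."""
    return 2 ** bits_per_pixel


if __name__ == "__main__":
    # 1920 x 1080 pixels at 24 bits per pixel, ignoring the header.
    print(bitmap_size_bytes(1920, 1080, 24))     # 6220800
    # Same resolution at 8 bits per pixel: one third of the pixel data.
    print(bitmap_size_bytes(1920, 1080, 8))      # 2073600
    # 800 x 600 at 8 bits per pixel with an illustrative 54-byte header.
    print(bitmap_size_bytes(800, 600, 8, 54))    # 480054
    print(colors_available(8))                   # 256
    print(colors_available(24))                  # 16777216
```

Halving the color depth halves the pixel data, while halving both width and height cuts it to a quarter.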
Vector images use mathematical descriptions to draw shapes and lines:
Vectors are ideal for logos, icons, and diagrams that must scale to different sizes.
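A rough sketch of why vectors scale cleanly: a shape is stored as a few numbers, and resizing just multiplies those numbers. The `Circle` class below is hypothetical; its `to_svg` method emits an SVG `<circle>` element as one example of a vector format.

```python
from dataclasses import dataclass


@dataclass
class Circle:
    """A circle described mathematically by its center and radius."""
    cx: float
    cy: float
    r: float

    def scaled(self, factor):
        """Return a resized copy; no pixel data is stretched or lost."""
        return Circle(self.cx * factor, self.cy * factor, self.r * factor)

    def to_svg(self):
        """Describe the circle as an SVG element."""
        return f'<circle cx="{self.cx}" cy="{self.cy}" r="{self.r}" />'


if __name__ == "__main__":
    logo = Circle(50, 50, 40)
    print(logo.to_svg())             # <circle cx="50" cy="50" r="40" />
    print(logo.scaled(10).to_svg())  # <circle cx="500" cy="500" r="400" />
```

The description stays three numbers long at any size, whereas a bitmap of the same circle would need ten times the width and height in pixels.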
Use bitmaps for complex, photo-realistic images and vectors for images that must be resized frequently without losing clarity (like company logos).
Sound waves are analog, but computers store them as digital data:
Adjusting sampling rate or resolution balances audio quality with file size. For speech, lower settings might be acceptable; for music, higher quality is often preferred.
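The trade-off can be estimated with the usual formula for uncompressed audio: sampling rate × sampling resolution × duration, divided by 8 to get bytes. The sketch below adds a channel count (1 for mono, 2 for stereo) as an assumption not stated above, and the function name is illustrative.

```python
def audio_size_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Approximate size of uncompressed audio, ignoring any file header."""
    total_bits = sample_rate_hz * bits_per_sample * channels * seconds
    return total_bits // 8


if __name__ == "__main__":
    # One minute of speech-quality audio: 8,000 samples per second, 8 bits, mono.
    print(audio_size_bytes(8000, 8, 60))                 # 480000
    # One minute of music-quality audio: 44,100 samples per second, 16 bits, stereo.
    print(audio_size_bytes(44100, 16, 60, channels=2))   # 10584000
```

In this example the music-quality settings produce a file roughly 22 times larger than the speech-quality settings for the same duration.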
Compression makes files smaller, speeding up transfers and saving storage. It’s essential for web media, streaming services, and cloud storage.
Definition (lossy compression): Permanently removes some data to achieve smaller file sizes. Typically used for images, audio, and video where slight quality loss is acceptable.
Benefits:
Definition (lossless compression): Compresses data without any loss of the original information. Well suited to text, code, and any scenario where accuracy is paramount.
A common technique is Run-Length Encoding (RLE):
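A minimal sketch of RLE on text, assuming runs are stored as (character, count) pairs. Real formats pack runs in their own ways, so this representation and the function names are illustrative only.

```python
def rle_encode(data):
    """Collapse each run of repeated characters into a (character, count) pair."""
    runs = []
    for char in data:
        if runs and runs[-1][0] == char:
            # Same character as the current run: extend it.
            runs[-1] = (char, runs[-1][1] + 1)
        else:
            # Different character: start a new run.
            runs.append((char, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original string exactly from the list of runs."""
    return "".join(char * count for char, count in runs)


if __name__ == "__main__":
    encoded = rle_encode("AAAABBBCC")
    print(encoded)               # [('A', 4), ('B', 3), ('C', 2)]
    print(rle_decode(encoded))   # AAAABBBCC
    # Data with no repeats gains nothing: every character becomes its own run.
    print(rle_encode("ABC"))     # [('A', 1), ('B', 1), ('C', 1)]
```

Decoding returns the input unchanged, which is what makes the technique lossless. It only saves space when the data contains long runs, such as large areas of a single color in an image.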
Advantages: The original data is perfectly restorable. Ideal for financial records, legal documents, or any data where accuracy matters.
In practice, formats like ZIP or PNG (for images) use lossless techniques. JPEG (for images) and MP3 (for audio) typically use lossy methods.