What is Information?
For the rest of this series, "information" means data with enough structure and context that a reader can retrieve it, understand it, and act on it. The operative definition has three layers. Each corresponds to a distinct technical literature, and each points at something that can go wrong.
Structure: the mathematical view
The structural minimum for information comes from Claude Shannon's 1948 formulation [1]. Information is the reduction of uncertainty in a signal, measured in bits. One bit is the uncertainty of a single choice between two equally likely alternatives; sequences of bits encode larger alphabets. Shannon's theory is deliberately syntactic. It describes the structure of a transmission, not its meaning. A perfectly random bit sequence and an equally long piece of English prose occupy the same number of bits, and by Shannon's measure the random sequence carries more information per symbol, because prose is redundant. Neither fact tells the mathematical theory anything about which is more useful.
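The point that the measure depends only on symbol statistics can be made concrete with a minimal sketch. It assumes a simple per-byte frequency estimate of entropy, which ignores ordering between symbols and is not Shannon's full treatment of a source; the function name and the sample text are illustrative choices, not part of the cited paper.

```python
import math
import os
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical entropy H = -sum(p * log2(p)) over byte frequencies."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Two buffers of identical length: repeated English prose and random bytes.
prose = ("A bit stream that cannot be parsed is not information. " * 64).encode("ascii")
noise = os.urandom(len(prose))

# The estimate sees only frequencies; it has no notion of which buffer is useful.
print(f"prose: {entropy_bits_per_byte(prose):.2f} bits/byte")
print(f"noise: {entropy_bits_per_byte(noise):.2f} bits/byte")
```

Both buffers take up the same storage, yet the estimate for the random bytes comes out higher than the one for the prose. Nothing in the calculation asks what either buffer says.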
Context: the semantic view
A bit stream that cannot be parsed is not information to a reader. Luciano Floridi's general definition fills the gap Shannon left: information is well-formed, meaningful data [2]. Well-formed means the data follows the syntax of a known format. Meaningful means a reader can decode the semantics of that format. A JPEG file is well-formed and meaningful to an image viewer. The same bytes opened in a text editor are well-formed as bytes but not meaningful, because the text editor has no decoder.
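The JPEG example can be sketched in a few lines. This is an illustration under stated assumptions, not a real format detector: each reader is modeled as a hypothetical set of formats it can decode, `sniff_format` and `is_meaningful_to` are made-up helper names, and the check looks only at the two-byte start-of-image marker that opens a JPEG stream rather than validating the whole file.

```python
from typing import Set

JPEG_SOI = b"\xff\xd8"  # start-of-image marker at the front of a JPEG stream


def sniff_format(data: bytes) -> str:
    """Guess a format label from the bytes alone (a heuristic, not a validator)."""
    if data.startswith(JPEG_SOI):
        return "jpeg"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "text"


def is_meaningful_to(decoders: Set[str], data: bytes) -> bool:
    """True if the reader owns a decoder for the sniffed format."""
    return sniff_format(data) in decoders


# Hypothetical readers, modeled as the set of formats each can decode.
image_viewer = {"jpeg"}
text_editor = {"text"}

# Stand-in for the first bytes of a JPEG file, not a complete image.
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"

print(is_meaningful_to(image_viewer, jpeg_header))
print(is_meaningful_to(text_editor, jpeg_header))
```

The bytes are the same in both calls. Only the reader changes, which is the sense in which meaningfulness depends on who is decoding.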
Purpose: the layered view
Even well-formed, meaningful data can fail to help. The DIKW hierarchy, popularized by Russell Ackoff in 1989, tries to capture this by distinguishing data, information, knowledge, and wisdom [3]. The hierarchy has drawn serious criticism: a 2009 review in the Journal of Information Science identified a central logical error in it and called its philosophical foundations dated [4]. The vocabulary survives the critique, even if the metaphor does not. A reader asking a specific question needs a saved record that is shaped for that question, not merely one that is syntactically and semantically valid in the abstract.
In practice
Any file that can be opened, parsed, and usefully queried is information. A file that can be opened but not understood is data. A file that cannot be opened is loss. The three layers above correspond, one for one, to three failure modes. When structure fails, the bytes cannot be retrieved. When context fails, the bytes can be read but not understood. When purpose fails, the record can be understood but does not answer the question being asked. Each needs a different preservation response, which the rest of the series makes concrete.
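One way to read the three failure modes is as an ordered check, sketched below. The mapping is an assumption made for illustration: an unreadable file counts as a structure failure, an unparseable one as a context failure, and a parsed record that lacks the requested field as a purpose failure. JSON stands in for "a known format", and `failed_layer` and `wanted_key` are hypothetical names.

```python
import json
import os
import tempfile
from typing import Optional


def failed_layer(path: str, wanted_key: str) -> Optional[str]:
    """Return the first layer that fails for this file and question, or None."""
    try:
        with open(path, "rb") as fh:
            raw = fh.read()
    except OSError:
        return "structure"  # the bytes cannot be retrieved at all
    try:
        record = json.loads(raw)
    except ValueError:
        return "context"  # bytes retrieved, but they do not parse as the format
    if not isinstance(record, dict) or wanted_key not in record:
        return "purpose"  # parsed, but does not answer the question asked
    return None


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.json")
    with open(good, "w", encoding="utf-8") as fh:
        json.dump({"author": "Shannon", "year": 1948}, fh)

    garbled = os.path.join(tmp, "garbled.json")
    with open(garbled, "w", encoding="utf-8") as fh:
        fh.write("not json {")

    missing = os.path.join(tmp, "missing.json")  # never created

    for path, key in [(good, "year"), (good, "publisher"), (garbled, "year"), (missing, "year")]:
        print(os.path.basename(path), key, failed_layer(path, key))
```

The same file can pass for one question and fail for another, which is why the purpose check takes the question as an argument rather than being a property of the file alone.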
References
- [1] Shannon, C. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27(3), 379-423. doi:10.1002/j.1538-7305.1948.tb01338.x
- [2] Floridi, L. (2010). Information: A Very Short Introduction. Oxford University Press. global.oup.com
- [3] Ackoff, R. (1989). From Data to Wisdom. Journal of Applied Systems Analysis, 16, 3-9. faculty.ung.edu
- [4] Frické, M. (2009). The knowledge pyramid: a critique of the DIKW hierarchy. Journal of Information Science, 35(2), 131-142. doi:10.1177/0165551508094050