海纳百川

Topic: I just took a careful look at the article 启明 reposted, and only a little way in I was already lost.
In reply to: I just took a careful look at the article 启明 reposted, and only a little way in I was already lost. -- Anonymous - (907 Byte) 2005-1-14 Fri, 11:39 AM (455 reads)
冬冬
Post title: One more question: (116 reads)      Time: 2005-1-14 Fri, 12:09 PM

Author: Anonymous, posted in 罕见奇谈, from http://www.hjclub.org

----------------------------------
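As the number of Chinese characters grows, the increase in information entropy slows; once the count reaches 12,370, adding more characters no longer increases the information entropy appreciably. Calculated with Zipf's law (ZIPF'S LAW), the well-known law of mathematical linguistics, the capacity limit of Chinese characters is 12,366 characters, and the static average information entropy of Chinese characters is 9.65 bits; in other words, the average information content of a Chinese character is 9.65 bits (see the "law of the capacity limit of Chinese characters" proposed by 冯志伟). This is the writing system with the largest information content in the world today. Below is a comparison of the information entropy of the five working languages of the United Nations: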
-------------------------------------------

Zipf's law
(definition)

Definition: The probability of occurrence of words or other items starts high and tapers off. Thus, a few occur very often while many others occur rarely.

Formal Definition: $P_n \sim 1/n^{a}$, where $P_n$ is the frequency of occurrence of the $n$th ranked item and $a$ is close to 1.
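
To make the formal definition concrete, here is a minimal sketch that turns "proportional to 1/n^a" into actual probabilities for a finite list of ranked items. The function name `zipf_probabilities` and the finite cutoff `num_items` are assumptions made for this illustration; the definition itself only states the proportionality.

```python
def zipf_probabilities(num_items, a=1.0):
    """Return probabilities P_n proportional to 1 / n**a for ranks 1..num_items.

    Hypothetical helper for illustration: the finite cutoff and the
    normalization are assumptions, not part of the quoted definition.
    """
    if num_items < 1:
        raise ValueError("num_items must be at least 1")
    weights = [1.0 / (rank ** a) for rank in range(1, num_items + 1)]
    total = sum(weights)  # normalizing constant so the probabilities sum to 1
    return [w / total for w in weights]


if __name__ == "__main__":
    # Print the probability assigned to each of the first 10 ranks.
    for rank, p in enumerate(zipf_probabilities(10), start=1):
        print(f"rank {rank:2d}: {p:.4f}")
```

With a = 1, the item at rank 1 is twice as likely as the item at rank 2 and three times as likely as the item at rank 3, which is the "starts high and tapers off" behaviour the definition describes.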

See also Zipfian distribution, Lotka's law, Benford's law, Bradford's law.

Note: In the English language, words like "and," "the," "to," and "of" occur often while words like "undeniable" are rare. This law applies to words in human or computer languages, operating system calls, colors in images, etc., and is the basis of many (if not all!) compression approaches.
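
The observation in the note can be checked on any piece of text by counting words and sorting them by frequency. The sketch below is only an illustration: the helper name `rank_frequency` and the simple tokenizing rule (lowercase runs of letters and apostrophes) are assumptions, and a short sample is far too small to show a clean Zipfian curve.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper: the tokenizer is a deliberate simplification.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    for rank, word, count in rank_frequency(sample):
        print(rank, word, count)
```

Run on a large English corpus, a table like this should put words such as "the" and "of" at the top ranks, with counts falling off roughly in proportion to 1/rank.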

Named for George Kingsley Zipf.

Zipf's law is an experimental law, not a theoretical one. Zipfian distributions are commonly observed in many kinds of phenomena. The causes of Zipfian distributions in real life are a matter of some controversy, however.


---------------------------------------------------------
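Going by the definition of Zipf's law, how can one calculate that the capacity limit of Chinese characters is 12,366 characters? What does "capacity limit" mean? What is the "capacity limit of Chinese characters"?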

Aren't there only a few thousand commonly used Chinese characters in total?
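
One way to explore the question is to assume, purely for illustration, that character frequencies follow an ideal Zipf distribution over N characters and to compute the Shannon entropy as N grows. This is a sketch under that assumption and is not 冯志伟's actual calculation, which the quoted passage does not spell out; the function name `zipf_entropy_bits` and the chosen values of N are hypothetical.

```python
import math


def zipf_entropy_bits(num_symbols, a=1.0):
    """Shannon entropy (in bits) of an ideal Zipf distribution over num_symbols ranks.

    Assumption: P_n is exactly proportional to 1 / n**a. Real character
    frequencies only follow this approximately.
    """
    weights = [1.0 / (rank ** a) for rank in range(1, num_symbols + 1)]
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights)


if __name__ == "__main__":
    previous = None
    for n in (1000, 2000, 4000, 8000, 16000):
        h = zipf_entropy_bits(n)
        gain = "" if previous is None else f" (gain {h - previous:.3f})"
        print(f"N = {n:6d}: H = {h:.3f} bits{gain}")
        previous = h
```

Under this idealized model each additional character adds less entropy than the one before it, which matches the "increase slows" part of the quoted claim. With a = 1 the entropy still keeps creeping upward and never reaches a hard ceiling, so this sketch alone does not yield a cutoff such as 12,366; some further criterion for when the gain stops being "appreciable" would be needed.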


All times are Beijing time.


 