Why does a stupid forum need a search option?

  Max Petrov June 2015

      On the first page of your story the lice are crawling
in large numbers. But it is possible to avoid the insects.*

From Maxim Gorky's reply to an up-and-coming writer

      I took 500 kilobytes of messages from a Russian forum (a fairly smart one, since it was about law). For comparison, an average book (Gogol's "Dead Souls", for example) comes to just about 500 kilobytes. Having analyzed this volume of forum talk, I found that only about a thousand words are used in it. To be exact, one algorithm gave a result of ~800 words and another ~1200 words. But accuracy is not important here, since 1200 words as a measure of the richness of a lexicon is no less terrible than 800.
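      The two algorithms are not described here, so the following is only a minimal sketch of how two reasonable ways of counting can disagree on the same text. One function counts every distinct word form; the other crudely merges forms that share a prefix. The names `distinct_forms` and `distinct_prefixes` and the prefix length of 5 are assumptions made for illustration, not the method actually used.

```python
import re

# Runs of letters only (Unicode-aware, so Cyrillic text works too).
WORD_RE = re.compile(r"[^\W\d_]+")


def distinct_forms(text):
    """Count every distinct lowercase word form ("law" and "laws" differ)."""
    return set(WORD_RE.findall(text.lower()))


def distinct_prefixes(text, prefix_len=5):
    """Crudely merge word forms by keeping only the first prefix_len letters.

    This is a naive stand-in for real stemming, assumed here for illustration.
    """
    return {word[:prefix_len] for word in WORD_RE.findall(text.lower())}


if __name__ == "__main__":
    sample = "The forum and the forums. Law, laws, LAW!"
    print(len(distinct_forms(sample)), len(distinct_prefixes(sample)))
```

      Whichever rule is chosen, the count shifts by tens of percent but not by orders of magnitude, which is why the difference between 800 and 1200 does not change the conclusion.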

      The figure shows how the number of distinct words used across all messages of the forum depends on the volume of text posted there. As you can see, from about 200 kilobytes of text onward (200 messages of one kilobyte each), the curve practically holds at the same level; that is, adding new messages no longer expands the forum's lexicon significantly.
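      A curve of this kind can be computed by feeding messages in one at a time and recording the vocabulary size after each. The sketch below assumes the messages are available as a list of strings in posting order; the function name `vocabulary_growth` is hypothetical.

```python
import re

# Runs of letters only (Unicode-aware).
WORD_RE = re.compile(r"[^\W\d_]+")


def vocabulary_growth(messages):
    """Return (cumulative_bytes, distinct_words_so_far) after each message."""
    seen = set()
    total_bytes = 0
    points = []
    for text in messages:
        total_bytes += len(text.encode("utf-8"))
        seen.update(WORD_RE.findall(text.lower()))
        points.append((total_bytes, len(seen)))
    return points


if __name__ == "__main__":
    for size, vocab in vocabulary_growth(["a b c", "a b", "c d"]):
        print(size, vocab)
```

      The plateau described above would show up as a long run of points where the first number keeps growing while the second barely moves.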

      Forums usually have a narrow thematic focus, and the people who write on them do not bother to hunt for lice in their text (to use Maxim Gorky's example); they string words together as they please and talk about sublime and complex things in kitchen-table language. Meanwhile, a Russian-speaking Homo sapiens knows about 150 thousand words, sometimes more.

      Now imagine that only a thousand words are used on the forum, yet the forum has a small box labeled “Search” where the user can enter anything at all (there are no restrictions) from the stock of words at his disposal. Mathematically, the probability of a lucky guess is 1000/150 000 = 1/150; that is, on average there will be one hit for every 150 attempts.
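      The arithmetic can be checked in a few lines. The sketch assumes the simplest possible model, namely that the user picks each query word uniformly at random from his whole vocabulary and that attempts are independent; real users do better than that, which is exactly the objection taken up next. The function names are hypothetical.

```python
from fractions import Fraction


def hit_probability(forum_words, known_words):
    """Chance that one randomly chosen known word occurs on the forum."""
    return Fraction(forum_words, known_words)


def expected_attempts(p):
    """Mean number of independent attempts until the first hit."""
    return 1 / p


def at_least_one_hit(p, attempts):
    """Probability of one or more hits in the given number of attempts."""
    return 1 - float(1 - p) ** attempts


if __name__ == "__main__":
    p = hit_probability(1000, 150_000)
    print(p, expected_attempts(p), at_least_one_hit(p, 150))
```

      Under this model, even 150 attempts do not guarantee a hit; they only make one more likely than not.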

      Someone will object: “Any visitor can see perfectly well which forum he has come to, so he is bound to enter the right words in Search.”

      Yes, of course. But to get a complete picture of the lexicon and subject matter of any site, one must first look through the site. It turns out that to use the Search engine with confidence, one has to know the content in advance. But in that case the Search engine is no longer needed. And if the content, and the lexicon in which the text is written, are not known, then the Search engine is hard to use, because the set of right words, suitable for typing into the box, is two orders of magnitude smaller than the set an average person holds in memory.

      I think that a Search engine on a website or forum, especially a small one, is a misconception. In the mathematical sense.

      A search index is a structure that links keywords to specific fragments of the text. Broadly speaking, an index can be presented in two ways: as a Search bar or in the form of a Menu (tag cloud, drop-down list, etc.). The fundamental difference between the Menu (in all its forms) and the Search bar is that the Menu needs room on the page, but it visibly conveys information about all the content at once; the Search bar is compact, but it is a black box, and nobody knows what is inside it. As a means of navigating the content, the Menu and the Search bar are not comparable in efficiency. Imagine that the usual table of contents in a book suddenly turned into a search box. Such a "table of contents" would be uninformative and hardly convenient for the reader.
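      The point that the Menu and the Search bar are two faces of the same index can be illustrated with a minimal sketch. It assumes messages are plain strings identified by their position in a list; `build_index`, `search`, `menu` and the stop-word set are hypothetical names chosen for this example.

```python
import re
from collections import defaultdict

# Runs of letters only (Unicode-aware).
WORD_RE = re.compile(r"[^\W\d_]+")


def build_index(messages, stop_words=frozenset()):
    """Map each keyword to the sorted ids of the messages that contain it."""
    index = defaultdict(set)
    for msg_id, text in enumerate(messages):
        for word in WORD_RE.findall(text.lower()):
            if word not in stop_words:
                index[word].add(msg_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Search bar: the user must guess a keyword; a miss returns nothing."""
    return index.get(query.lower(), [])


def menu(index):
    """Menu: list every keyword with its message count, most frequent first."""
    return sorted(((word, len(ids)) for word, ids in index.items()),
                  key=lambda entry: (-entry[1], entry[0]))


if __name__ == "__main__":
    posts = ["Contract law basics", "Tax law and contract disputes"]
    idx = build_index(posts, stop_words={"and"})
    print(search(idx, "lawsuit"))  # a plausible guess that is not in the index
    print(menu(idx))               # everything the index actually contains
```

      Both functions read the same structure; the only difference is whether the keys are shown to the user or have to be guessed.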

      Yandex or Google is one thing. They have hundreds of thousands or even millions of keywords (Google, after all, is multilingual). Showing and transmitting such a volume to a user as an interactive list is difficult even technically. A small forum is quite another thing: its lexicon is a thousand words, and the number of keywords (those that carry any thematic meaning) is smaller still.

      Thus, a Menu of keywords in place of the Search bar is feasible on a forum or site precisely because the lexicon of a forum or website is so obviously narrow. Compared with the Search bar, the Menu should be preferred for its greater informational capacity and usability.

      * There is a Russian suffix that exactly coincides with the Russian word for "lice".

