Hi Laurent,
I agree with everything you wrote in the first part of your contribution. But I must point out that you covered only one part of the necessary decision process. You should take into account that the columns in the WN history table affect each other. In fact only one of the columns is really important in your decision-making process, regardless of which one it is. Because even a very accurate estimate in one column (difference +/- 1) leaves a relatively large rest of the package, we should use the other columns to make the selection more accurate.
Thus the steps applied in the decision-making process should be as follows:
1) Decide for which columns you can most easily and reliably estimate the future orientation (negative or positive direction) and the future absolute value. To do this you can use
→ History differences table; this screen contains a simulator (the simulation button in the bottom right corner), so you can simulate up to 100000 draws with a random generator. As you can see, some columns have a small range; for example, the highest difference is smaller than 25 and the numbers are mainly in the top part of the table. In such a case the probable orientation of the future deviation should be positive.
→ History charts (Sum charts and Deviation charts); here you can also estimate the orientation and absolute value of the future deviation (to get a better picture you can use the „Draw simulation button“ editing fields).
2) Once you have decided on the numbers for one column, the values are predetermined for the other columns too. The smaller the range of margins, the smaller the count of remaining tickets in the package. When you run Statistics on this rest of the package, you can identify which numbers and/or which couples are the most frequent. (But be aware that the results are usually of the either/or type, except when the rest is really very small, meaning that not all of the most frequent numbers are actually winning ones.)
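To make the Statistics step concrete, here is a minimal sketch in Python of counting the most frequent numbers and couples in a small rest of the package. It is not how the program itself works internally; the function name `frequency_stats` and the four sample tickets are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def frequency_stats(tickets):
    """Count how often each number and each couple occurs in the tickets."""
    number_counts = Counter()
    couple_counts = Counter()
    for ticket in tickets:
        numbers = sorted(ticket)
        number_counts.update(numbers)
        # Every unordered pair of numbers on the ticket counts as one couple.
        couple_counts.update(combinations(numbers, 2))
    return number_counts, couple_counts


# Invented sample data: a "rest of the package" of four tickets.
rest = [
    (3, 11, 17, 24, 38, 45),
    (3, 11, 19, 24, 40, 45),
    (5, 11, 17, 24, 38, 49),
    (3, 12, 17, 30, 38, 45),
]

number_counts, couple_counts = frequency_stats(rest)
print(number_counts.most_common(3))
print(couple_counts.most_common(3))
```

With a rest this small the top entries tie with each other, which is the either/or situation described above.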
3) As you wrote, the number of columns with a difference between +5 and –5 is usually from 1 to 5. It should be noted that the most frequent situation we have seen is a difference of –10 to +10 in 3 to 5 columns. If we take this as a general rule, we have the first criterion for selecting from the full package. When we say „in any 3 to 5“ columns, we are left with about half of the initial contents of the package. However, using the history differences tables we can try to estimate which columns should have this small difference. Please note the shape of each individual column. Usually the majority of the numbers are in its top part. Given that in each draw the maximum possible positive deviation is +43 (smaller if the first line of the column in question does not contain at least 6 numbers), you can estimate what a +5/-5 deviation requires (either six numbers close to a deviation value of 8, or a prevailing count of numbers from the top part (4 to 5) with 1 to 2 numbers from the bottom part with a bigger deviation, etc.). For example, if such a column contains more than 30 numbers in its top part, the likelihood of a positive deviation is higher.
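The „in any 3 to 5“ columns criterion can be sketched as a simple count. The sketch assumes that the per-column deviation of each ticket is already known (how the program computes it is not shown here), and the names `columns_within` and `keep_by_small_columns`, the ten columns and the sample values are all hypothetical.

```python
def columns_within(deviations, margin):
    """Number of columns whose deviation lies in [-margin, +margin]."""
    return sum(1 for d in deviations if -margin <= d <= margin)


def keep_by_small_columns(package, margin=10, low=3, high=5):
    """Keep tickets that have between `low` and `high` small-deviation columns."""
    return [
        ticket
        for ticket, deviations in package.items()
        if low <= columns_within(deviations, margin) <= high
    ]


# Invented sample data: ticket label -> assumed deviation in each column.
package = {
    "A": [4, -9, 12, 25, -3, 0, 18, -22, 7, 30],      # 5 columns within +/-10
    "B": [15, -12, 12, 25, -3, 0, 18, -22, 17, 30],   # 2 columns within +/-10
    "C": [1, 2, -3, 4, 5, -6, 7, 8, -9, 10],          # 10 columns within +/-10
}

kept = keep_by_small_columns(package)
print(kept)
```

Only ticket "A" survives: "B" has too few small columns and "C" has too many.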
4) So if you decide (according to your letter) that 1 to 3 columns are to have differences between –30 and –21 or between 21 and 30, you can verify this decision against the History differences table. You can identify whether it is possible at all and what conditions are necessary to fulfil your estimate. (In principle, when you are able to say which columns are those with the +5/-5 difference, then with four such columns the rest of the package should be small enough.) The same applies to large future deviations. Particularly for large negative deviations, the applicable count of numbers is very small. Thus in certain cases you can estimate values for 5 columns only, and when the margins are close to each other (for example +5/-4) you need not estimate the remaining values at all. In certain cases, when you use very small margins for the +/- deviations while trying to estimate the correct values in each column, no ticket even from the full package will match such a selection.
5) Applying the +10/–10 interval to each particular deviation is the condition for obtaining a rest of the package in the range from tens to hundreds of tickets. Then you should apply some other selection mechanism (for example, tickets containing the first of the most frequent couples, completed by tickets containing only the second of the most frequent couples; as you remember, this selection usually corresponds to either/or logic).
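The couple-based selection from step 5 could look roughly like this. I read "the second couple only" as "contains the second couple but not the first"; that reading, the function name `select_by_couples` and the sample tickets are my assumptions for the sake of the example.

```python
def select_by_couples(tickets, first, second):
    """Split tickets into those with the first couple and those with only the second."""
    first_set, second_set = set(first), set(second)
    with_first = [t for t in tickets if first_set <= set(t)]
    second_only = [
        t for t in tickets
        if second_set <= set(t) and not first_set <= set(t)
    ]
    return with_first, second_only


# Invented sample data: a small rest of the package.
rest = [
    (3, 11, 17, 24, 38, 45),
    (3, 11, 19, 24, 40, 45),
    (5, 11, 17, 24, 38, 49),
    (3, 12, 17, 30, 38, 45),
]

# Suppose (3, 45) and (17, 38) came out as the two most frequent couples.
with_first, second_only = select_by_couples(rest, (3, 45), (17, 38))
print(with_first)
print(second_only)
```

The two groups never overlap, which keeps the either/or character of the selection visible.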
So, based on the above, I think the better way to perform a correct selection is as follows:
→ Estimate for each column the conditions (maybe with their likelihood) under which the deviation could be strongly positive (more than +5), neutral (+5/-5), negative (less than –5 down to, for example, –15), or strongly negative (less than –15).
→ Perform the selection for the so-called 'sure' columns (for example, I decide that in columns 0, 3, 4 and 8 the future deviation will be from –5 to +5 from the existing value, and that in column 6 the deviation will fall by 45 in comparison with the existing value).
In the remaining columns you can leave the margins without restriction (wide enough) in this first step.
Now you can analyze the rest of the package (usually it is small enough to allow fast processing) with respect to the different possibilities for each column still undecided, possibly selecting representative tickets at each step.
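The whole procedure can be sketched as follows, again assuming the per-column deviation of each ticket is already known. The thresholds in `classify` are the ones named above, and the margins reuse my example of columns 0, 3, 4, 8 and 6; the names `classify` and `matches`, the dictionary layout and the sample tickets are hypothetical and not taken from the program.

```python
def classify(deviation):
    """Put a deviation into one of the four classes named above."""
    if deviation > 5:
        return "strongly positive"
    if deviation >= -5:
        return "neutral"
    if deviation >= -15:
        return "negative"
    return "strongly negative"


def matches(deviations, margins):
    """True if every restricted column lies inside its (low, high) margin."""
    return all(low <= deviations[col] <= high
               for col, (low, high) in margins.items())


# 'Sure' columns only; columns missing from the dict stay unrestricted.
margins = {0: (-5, 5), 3: (-5, 5), 4: (-5, 5), 8: (-5, 5), 6: (-45, -45)}

# Invented sample data: ticket label -> assumed deviation in each column.
package = {
    "A": [2, 20, -14, -1, 5, 9, -45, 3, 0, 11],
    "B": [2, 20, -14, -1, 6, 9, -45, 3, 0, 11],   # column 4 is outside +5/-5
    "C": [0, 0, 0, 0, 0, 0, -30, 0, 0, 0],        # column 6 did not fall by 45
}

rest = [t for t, devs in package.items() if matches(devs, margins)]
print(rest)

# Second step: look at one column that is still undecided, here column 2.
for ticket in rest:
    print(ticket, classify(package[ticket][2]))
```

Leaving the undecided columns out of `margins` corresponds to the wide margins of the first step; they can be added one at a time when the rest is analyzed.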
Btw, as for your filter: verifying such a filter takes a lot of time, so this work has not been done yet. However, I promise that I shall try it later and follow up with a report containing my final opinion.
At first sight I would go as far as to say that some combinations (OR in particular) will even lead to the full package.
Josef