User:松/Drafts/Extension:AbuseFilter/Rules format: Difference between revisions

Line 593: Line 593:
=== Performance ===


As noted in the table above, some of these variables can be very slow. While writing filters, remember that the condition limit is '''not''' a good metric of how heavy filters are. For instance, variables like <code>*_recent_contributors</code> or <code>*_links</code> always need a DB query to be computed, while <code>*_pst</code> variables have to parse the text, which is also a heavy operation; all these variables should be used very carefully.
As noted in the table above, some of these variables can be very slow. While writing filters, remember that the condition limit is '''not''' a good metric of how heavy filters are. For instance, variables like <code>*_recent_contributors</code> or <code>*_links</code> always need a DB query to be computed, while <code>*_pst</code> variables have to parse the text, which is also a heavy operation; all these variables should be used very carefully. For example, on Italian Wikipedia it was observed that, with 135 active filters and an average of 450 used conditions, filter execution time was around 500ms, with peaks reaching 15 seconds. Removing the <code>added_links</code> variable from a single filter, and halving the number of cases in which another filter used <code>added_lines_pst</code>, brought the average execution time down to 50ms. More specifically:
<translate>
<!--T:279-->
For example, on Italian Wikipedia it was observed that, with 135 active filters and an average of 450 used conditions, filter execution time was around 500ms, with peaks reaching 15 seconds.</translate>
<translate>
<!--T:280-->
Removing the <tvar|1><code>added_links</code></> variable from a single filter, and halving the number of cases in which another filter used <tvar|2><code>added_lines_pst</code></>, brought the average execution time down to 50ms.</translate>
<translate>
<!--T:281-->
More specifically:


*Use <code>_links</code> variables only when you need high accuracy and checking for "http://..." in other variables (for instance, <code>added_lines</code>) could lead to serious malfunctions;
<!--T:282-->
* Use <tvar|1><code>_links</code></> variables only when you need high accuracy and checking for "<tvar|2>http://...</>" in other variables (for instance, <tvar|3><code>added_lines</code></>) could lead to serious malfunctions;</translate>
*Use <code>_pst</code> variables only when you're really sure that non-PST variables aren't enough. You may also conditionally decide which one to check: if, for instance, you want to examine a signature, first check whether <code>added_lines</code> contains <code><nowiki>~~~</nowiki></code>;
<translate>
<!--T:283-->
* Use <tvar|1><code>_pst</code></> variables only when you're really sure that non-PST variables aren't enough.</translate> <translate><!--T:284--> You may also conditionally decide which one to check: if, for instance, you want to examine a signature, first check whether <tvar|1><code>added_lines</code></> contains <tvar|2><code><nowiki>~~~</nowiki></code></>;</translate>
*In general, when dealing with these variables, it's always much better to consume more conditions and avoid computing heavy variables. To achieve this, always place heavy variables in the last conditions.
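
The signature case above could be sketched as the following rule. It is only an illustration, not a recommended filter: the <code>font-size</code> pattern is a hypothetical example of something to look for in an expanded signature, and the ordering relies on the filter not evaluating the right-hand side of <code>&</code> once the left-hand side is false.

<pre>
/* Cheap check first: added_lines needs no parsing. */
added_lines contains "~~~" &
/* Heavy check last: added_lines_pst is only computed when the line above matched.
   The "font-size" pattern is a hypothetical example, not part of the original text. */
added_lines_pst irlike "font-size"
</pre>

Written this way the filter consumes one more condition on edits that contain a signature, but skips the pre-save transform entirely on all other edits.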