Last modified: 2014-06-04 19:31:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the display of bug reports and their history, links might be broken. See T68032, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 66032 - AbuseFilter should allow storing regex matches into variables


Summary:	AbuseFilter should allow storing regex matches into variables

Status:	NEW

Product:	MediaWiki extensions
Classification:	Unclassified
Component:	AbuseFilter
Version:	unspecified
Hardware:	All All

Importance:	Low enhancement
Target Milestone:	---
Assigned To:	Nobody - You can work on this!


Reported:	2014-06-02 14:05 UTC by Huji
Modified:	2014-06-04 19:31 UTC (History)
CC List:	4 users (show)

See Also:	47512
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description Huji 2014-06-02 14:05:52 UTC

The motivation for this request is the following task:

A user creates a named reference (e.g. <ref name="something">...</ref>) and uses it throughout a page (e.g. <ref name="something"/> ...). Later, another user edits the page, and removes the first definition of that named reference. This causes all subsequent uses to return an error.

The idea is to create an abuse filter that can (a) detect when a named reference is removed, (b) detect the value of the "name" parameter of that reference, and (c) check whether any other reference tags using the same name parameter exist elsewhere on the page.

Parts a and c are very easy. Part b is only possible if you can store the output of a regular expression match into a variable. Currently, the only regex function allowed by AbuseFilter is rcount, which doesn't serve this purpose.
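
To make the request concrete, here is a minimal Python sketch of the logic in parts a to c, assuming a regex capture could be stored in a variable. It is not AbuseFilter syntax; the function name `orphaned_ref_names` and its two string parameters (standing in for the removed lines and the new page text) are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical helper: illustrates the requested behaviour, not an AbuseFilter API.
def orphaned_ref_names(removed_lines: str, new_wikitext: str) -> list:
    # (a) + (b): find removed named-reference definitions and
    # store each captured "name" value in a variable.
    removed_names = re.findall(r'<ref name="([^"]+)">', removed_lines)

    orphaned = []
    for name in removed_names:
        # (c): check whether the page still reuses that name.
        reuse_tag = '<ref name="' + name + '"/>'
        if reuse_tag in new_wikitext:
            orphaned.append(name)
    return orphaned
```

Called with the removed lines `<ref name="a">x</ref>` and a page that still contains `<ref name="a"/>`, the sketch reports the name `a` as orphaned. The key step is the one AbuseFilter lacks: the captured name lives in a variable and can be reused in a later, separate check.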

Comment 1 Liangent 2014-06-02 15:31:29 UTC

In some of our (zhwiki) filters, I use something like:

(removed_lines + 'UNIQUE_STRING' + new_wikitext) rlike '<ref name="([^"]+)">.*UNIQUE_STRING.*<ref name="\1"/>'
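
The same trick can be sketched outside AbuseFilter. The Python below is only an illustration under stated assumptions: the function name `orphans_named_ref` is hypothetical, the sentinel is assumed never to occur in real wikitext, and `re.DOTALL` is an added choice so that `.` can span line breaks in the sample strings.

```python
import re

SENTINEL = "UNIQUE_STRING"  # assumed never to appear in real page text

# Same shape as the filter expression: the definition must appear before
# the sentinel (in the removed lines), and a reuse of the same name,
# matched via the back-reference \1, must appear after it (in the new text).
PATTERN = re.compile(
    r'<ref name="([^"]+)">.*' + SENTINEL + r'.*<ref name="\1"/>',
    re.DOTALL,
)

# Hypothetical helper wrapping the concatenate-and-match step.
def orphans_named_ref(removed_lines: str, new_wikitext: str) -> bool:
    haystack = removed_lines + SENTINEL + new_wikitext
    return PATTERN.search(haystack) is not None
```

Because both inputs are joined into one string, the back-reference can compare text from the removed lines against text from the new page within a single match, which is what lets the filter work without storing the capture.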

Comment 2 Huji 2014-06-02 18:41:02 UTC

Thanks; while I am trying to get that to work, I still want to point out that this method can only pass the matches to a later part of the same regex (i.e. \1 only works within the context of the same rlike function). Ideally, one should be able to store \1 in a variable and then use that variable in a subsequent line of code.

Comment 3 Liangent 2014-06-02 18:46:57 UTC

(In reply to Huji from comment #2)
> Thanks; while I am trying to get that to work, I still want to point out
> that this method can only pass the matches to the next part of the same
> regex (i.e. \1 only works within the context of the same rlike function).
> Ideally, one should be able to store \1 in a variable and then use that
> variable in a subsequent line of code.

Yeah, but normally this trick works wherever the variable would be used in a subsequent regex match, because you can always concatenate the other variables together and use one big regex to match that long string, as is done above.
