Last modified: 2014-11-06 21:24:11 UTC
<Betacommand> I've got an interesting idea for the externallinks table. What about having a column including the timestamp that a link was added? Like what happens with cl_timestamp?
<legoktm> what's your usecase
<Reedy> it's doable, but ^
<Betacommand> legoktm: tracking when links are added, so batch requests for archival (IE a partnership with IA) can be done
<Betacommand> or tracking how long a link has been in an article without having to check every diff
<Betacommand> tracking overall external link volume over time
<Betacommand> or within a given time span
<legoktm> sounds useful
<legoktm> file a bug?
<Betacommand> legoktm: I was thinking about it but wanted a sounding board first

This issue came up while I was thinking about external link recovery (preventing link rot). Right now there is no way to find external links that were added in the last X amount of time. That means any attempt at proactive archiving of URLs has to be done via database dumps, by diffing the externallinks table between two dumps. While that may be feasible for smaller wikis, any kind of diffing on a large scale quickly becomes unmanageable. Being able to do a SELECT based on a given time range would enable this, and would allow nightly incremental dumps that could then be passed to archival sites so they can take proactive steps to avoid link rot.
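To make the request concrete, here is a minimal sketch of the kind of time-based SELECT described above, using Python's built-in sqlite3 module. The column name el_timestamp and the index name el_timestamp_idx are hypothetical (modeled on cl_timestamp); the table is a simplified stand-in for externallinks, and the 14-digit YYYYMMDDHHMMSS string format is assumed to match how MediaWiki stores other timestamps.

```python
import sqlite3


def make_db():
    """Create an in-memory, simplified stand-in for the externallinks table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE externallinks (
            el_from      INTEGER NOT NULL,  -- page ID the link appears on
            el_to        TEXT    NOT NULL,  -- the external URL
            el_timestamp TEXT    NOT NULL   -- HYPOTHETICAL: when the link was added
        );
        -- HYPOTHETICAL index, so range queries need not scan the whole table
        CREATE INDEX el_timestamp_idx ON externallinks (el_timestamp);
        """
    )
    return conn


def links_added_since(conn, since):
    """Return URLs added at or after `since` (YYYYMMDDHHMMSS), oldest first."""
    rows = conn.execute(
        "SELECT el_to FROM externallinks "
        "WHERE el_timestamp >= ? ORDER BY el_timestamp",
        (since,),
    )
    return [row[0] for row in rows]


conn = make_db()
conn.executemany(
    "INSERT INTO externallinks VALUES (?, ?, ?)",
    [
        (1, "http://example.org/old", "20140101000000"),
        (2, "http://example.org/new", "20141105120000"),
        (2, "http://example.com/newer", "20141106090000"),
    ],
)

# Everything added since the start of 2014-11-05, e.g. for a nightly batch
recent = links_added_since(conn, "20141105000000")
print(recent)
```

Because fixed-width YYYYMMDDHHMMSS strings sort in chronological order, a plain string comparison is enough for the range filter. A nightly job could pass the previous run's cutoff as `since` and hand the resulting URLs to an archival service.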
Sounds sane; the actual cost of adding a timestamp should be essentially nil, and I can think of a couple use cases when patrolling for spam links that make it easier than trawling the RC. That said, the column would be nearly useless without an index and I know there's a cost for /that/, so someone more versed in performance will need to chime in.
(In reply to Marc A. Pelletier from comment #1)
> Sounds sane; the actual cost of adding a timestamp should be essentially
> nil, and I can think of a couple use cases when patrolling for spam links
> that make it easier than trawling the RC.
>
> That said, the column would be nearly useless without an index and I know
> there's a cost for /that/, so someone more versed in performance will need
> to chime in.

I think it should be alright. Any indexing has some cost, and we regularly index many tables/columns with timestamps. The cost of doing so should be fine, as there's a reasonable use case rather than a generic "this might be useful".
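As a rough illustration of why the index matters, the sketch below asks SQLite for its query plan before and after adding an index on the timestamp column. It uses SQLite only as a convenient stand-in; the production database would plan queries differently, and the names el_timestamp and el_timestamp_idx are hypothetical.

```python
import sqlite3

QUERY = (
    "SELECT el_to FROM externallinks "
    "WHERE el_timestamp >= ? ORDER BY el_timestamp"
)


def plan_details(conn, since="20141101000000"):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (since,)).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE externallinks ("
    " el_from INTEGER NOT NULL,"
    " el_to TEXT NOT NULL,"
    " el_timestamp TEXT NOT NULL)"  # HYPOTHETICAL column
)

# Without an index the planner has to scan every row of the table.
plan_without_index = plan_details(conn)

# HYPOTHETICAL index on the new column.
conn.execute("CREATE INDEX el_timestamp_idx ON externallinks (el_timestamp)")

# With the index the planner can seek directly to the requested range.
plan_with_index = plan_details(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact plan wording varies between SQLite versions, so no output is shown here; the point is only that the second plan refers to the index while the first one does not.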