Last modified: 2014-06-07 18:10:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T64566, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 62566 - Segregate and document configuration variables
Status: NEW
Product: Analytics
Classification: Unclassified
Component: Wikistats
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2014-03-12 10:52 UTC by Nemo
Modified: 2014-06-07 18:10 UTC (History)

Description Nemo 2014-03-12 10:52:31 UTC
One problem with using wikistats is that it has a lot of configuration variables.
A simple grep -Er "^[A-Za-z]+=" . finds:

----

./bash/collect_edits.sh:wikistats=/a/wikistats_git
./bash/collect_edits.sh:dumps=$wikistats/dumps
./bash/collect_edits.sh:perl=$dumps/perl
./bash/collect_edits.sh:csv=$dumps/csv
./bash/collect_edits.sh:input=/mnt/data/xmldatadumps/public/nlwikinews/20121115/nlwikinews-20121115-stub-meta-history.xml.gz
./bash/report_en.sh:wikistats=/a/wikistats_git
./bash/report_en.sh:dumps=$wikistats/dumps
./bash/report_en.sh:perl=$dumps/perl
./bash/report_en.sh:bash=$dumps/bash
./bash/report_en.sh:logs=$dumps/logs
./bash/report_en.sh:csv=$dumps/csv
./bash/report_en.sh:out=$dumps/out
./bash/progress_wikistats.sh:wikistats=/a/wikistats_git
./bash/progress_wikistats.sh:dumps=$wikistats/dumps
./bash/progress_wikistats.sh:perl=$dumps/perl
./bash/progress_wikistats.sh:out=$dumps/out
./bash/progress_wikistats.sh:dammit=/a/dammit.lt
./bash/progress_wikistats.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs
./bash/zip_all.sh:wikistats=/a/wikistats_git
./bash/backup_monthly.sh:wikistats=/a/wikistats_git
./bash/backup_monthly.sh:backup=$wikistats/backup
./bash/backup_monthly.sh:dumps=$wikistats/dumps
./bash/backup_monthly.sh:csv=$dumps/csv
./bash/backup_monthly.sh:dt=$(date +[%Y-%m-%d][%H:%M])
./bash/report.sh:wikistats=/a/wikistats_git
./bash/report.sh:dumps=$wikistats/dumps
./bash/report.sh:perl=$dumps/perl
./bash/report.sh:bash=$dumps/bash
./bash/report.sh:csv=$dumps/csv
./bash/report.sh:out=$dumps/out
./bash/report.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/report.sh:log=$dumps/logs/log_report_sh.txt
./bash/report.sh:interval=0  # only update non-English reports once per 'interval' days 
./bash/report.sh:projectcode="$1"
./bash/count_commons_images_wlm.sh:wikistats=/a/wikistats_git
./bash/count_commons_images_wlm.sh:dumps=$wikistats/dumps      
./bash/count_commons_images_wlm.sh:perl=$dumps/perl
./bash/count_commons_images_wlm.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_commons_images_wlm.sh:csv=$dumps/csv
./bash/count_commons_images_wlm.sh:countrycodes=/a/wikistats_git/squids/csv/meta/CountryCodes.csv
./bash/count_editors.sh:wikistats=/a/wikistats_git
./bash/count_editors.sh:dumps=$wikistats/dumps
./bash/count_editors.sh:perl=$dumps/perl
./bash/count_editors.sh:csv=$dumps/csv
./bash/count_editors.sh:out=$dumps/out
./bash/count_editors.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/count_editors.sh:bashpath="${PWD}"
./bash/backup_weekly.sh:wikistats=/a/wikistats_git
./bash/backup_weekly.sh:backup=$wikistats/backup
./bash/backup_weekly.sh:analytics=$wikistats/analytics
./bash/backup_weekly.sh:dammit=$wikistats/dammit.lt
./bash/backup_weekly.sh:dumps=$wikistats/dumps
./bash/backup_weekly.sh:perl=$dumps/perl
./bash/backup_weekly.sh:bash=$dumps/bash
./bash/backup_weekly.sh:csv=$dumps/csv
./bash/backup_weekly.sh:out=$dumps/out
./bash/backup_weekly.sh:projectcounts=$dammit/projectcounts
./bash/backup_weekly.sh:dt=$(date +[%Y-%m-%d][%H:%M])
./bash/report_all_editors.sh:wikistats=/a/wikistats_git
./bash/report_all_editors.sh:dumps=$wikistats/dumps
./bash/report_all_editors.sh:perl=$dumps/perl
./bash/report_all_editors.sh:csv=$dumps/csv
./bash/report_all_editors.sh:out=$dumps/out
./bash/report_all_editors.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/zip_csv.sh:wikistats=/a/wikistats_git
./bash/zip_csv.sh:csv=$wikistats/dumps/csv
./bash/count_report_publish_wmf.sh:wikistats=/a/wikistats_git
./bash/count_report_publish_wmf.sh:dumps=$wikistats/dumps
./bash/count_report_publish_wmf.sh:perl=$dumps/perl
./bash/count_report_publish_wmf.sh:csv=$dumps/csv
./bash/count_report_publish_wmf.sh:out=$dumps/out
./bash/count_report_publish_wmf.sh:php=/a/mediawiki/core/languages
./bash/count_report_publish_wmf.sh:force=-f
./bash/count_report_publish_wmf.sh:date=today
./bash/archived_used_once_or_obsolete/regusers.sh:dumps=/mnt/data/xmldatadumps
./bash/archived_used_once_or_obsolete/titles.sh:m=wp
./bash/archived_used_once_or_obsolete/titles.sh:p=afwiki
./bash/archived_used_once_or_obsolete/titles.sh:dumps=/mnt/data/xmldatadumps
./bash/archived_used_once_or_obsolete/extract_reg_user.sh:wiki=enwiki
./bash/archived_used_once_or_obsolete/extract_reg_user.sh:date=20091103
./bash/archived_used_once_or_obsolete/extract_reg_user.sh:dumps=/mnt/data/xmldatadumps
./bash/archived_used_once_or_obsolete/publish_scripts.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/archived_used_once_or_obsolete/publish_scripts.sh:perl=/a/wikistats/scripts/perl
./bash/archived_used_once_or_obsolete/publish.sh:now=`date +%s`
./bash/archived_used_once_or_obsolete/publish.sh:htdocs="stat1001.wikimedia.org::a/srv/stats.wikimedia.org/$dir/csv"
./bash/archived_used_once_or_obsolete/publish.sh:csv="/a/wikistats/csv_$1"
./bash/archived_used_once_or_obsolete/publish.sh:archive="/mnt/data/xmldatadumps/public/other/pagecounts-ez/wikistats" # odd name, temp location
./bash/archived_used_once_or_obsolete/publish.sh:publish="#publish.txt"
./bash/archived_used_once_or_obsolete/publish_regions.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/report_one_only.sh:wikistats=/a/wikistats_git
./bash/report_one_only.sh:dumps=$wikistats/dumps
./bash/report_one_only.sh:perl=$dumps/perl
./bash/report_one_only.sh:bash=$dumps/bash
./bash/report_one_only.sh:logs=$dumsp/logs
./bash/report_one_only.sh:csv=$dumps/csv
./bash/report_one_only.sh:out=$dumps/out
./bash/report_one_only.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/report_one_only.sh:mode=wp
./bash/report_one_only.sh:lang=en
./bash/count_prep_animations.sh:wikistats=/a/wikistats_git
./bash/count_prep_animations.sh:dumps=$wikistats/dumps       
./bash/count_prep_animations.sh:perl=$dumps/perl
./bash/count_prep_animations.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_prep_animations.sh:csv=$dumps/csv
./bash/count_prep_animations.sh:out=$wikistats/animations/growth
./bash/count_prep_animations.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/count_report_publish_non_wp.sh:wikistats=/a/wikistats_git
./bash/count_report_publish_non_wp.sh:dumps=$wikistats/dumps
./bash/count_report_publish_non_wp.sh:bash=$dumps/bash
./bash/count_report_publish_non_wp.sh:log=$dumps/logs/log_count_report_publish_non_wp.txt
./bash/report_all.sh:wikistats=/a/wikistats_git
./bash/list_newest_dumps.sh:wikistats=/a/wikistats_git
./bash/list_newest_dumps.sh:dumps=$wikistats/dumps
./bash/list_newest_dumps.sh:perl=$dumps/perl
./bash/list_newest_dumps.sh:csv=$dumps/csv
./bash/list_newest_dumps.sh:dblists=$dumps/dblists
./bash/collect_countable_namespaces.sh:wikistats=/a/wikistats_git
./bash/collect_countable_namespaces.sh:perl=$wikistats/dumps/perl
./bash/collect_countable_namespaces.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/collect_countable_namespaces.sh:csv=$wikistats/dumps/csv
./bash/collect_countable_namespaces.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/report_regions.sh:wikistats=/a/wikistats_git
./bash/report_regions.sh:dumps=$wikistats/dumps
./bash/report_regions.sh:perl=$dumps/perl
./bash/report_regions.sh:bash=$dumps/bash
./bash/report_regions.sh:csv=$dumps/csv
./bash/report_regions.sh:out=$dumps/out
./bash/report_regions.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/report_regions.sh:log=$dumps/logs/log_report_regions.txt
./bash/sort_dblists.sh:wikistats=/a/wikistats_git
./bash/sort_dblists.sh:dumps=$wikistats/dumps
./bash/sort_dblists.sh:perl=$dumps/perl
./bash/sort_dblists.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/sort_dblists.sh:csv=$dumps/csv
./bash/sort_dblists.sh:dblists=$dumps/dblists
./bash/report_test.sh:wikistats=/a/wikistats_git
./bash/report_test.sh:dumps=$wikistats/dumps
./bash/report_test.sh:perl=$dumps/perl
./bash/report_test.sh:perl=/home/ezachte/wikistats/dumps/perl # test
./bash/report_test.sh:csv=$dumps/csv
./bash/report_test.sh:out=$dumps/out
./bash/pageviews_monthly_sp.sh:wikistats=/a/wikistats_git
./bash/pageviews_monthly_sp.sh:dumps=$wikistats/dumps
./bash/pageviews_monthly_sp.sh:perl=$dumps/perl
./bash/pageviews_monthly_sp.sh:csv=$dumps/csv
./bash/pageviews_monthly_sp.sh:out=$dumps/out
./bash/pageviews_monthly_sp.sh:report=$dumps/logs/log_pageviews_monthly_sp.txt
./bash/pageviews_monthly_sp.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/count_report_publish_wp.sh:wikistats=/a/wikistats_git
./bash/count_report_publish_wp.sh:log=$wikistats/dumps/logs/log_count_report_publish_concise_wp.txt
./bash/report_publish_some.sh:wikistats=/a/wikistats_git
./bash/report_publish_some.sh:dumps=$wikistats/dumps
./bash/report_publish_some.sh:bash=$dumps/bash
./bash/merge_editors.sh:wikistats=/a/wikistats_git
./bash/merge_editors.sh:dumps=$wikistats/dumps
./bash/merge_editors.sh:perl=$dumps/perl
./bash/merge_editors.sh:csv=$dumps/csv
./bash/merge_editors.sh:log=$dumps/logs/log_merge_editors.txt
./bash/count.sh:project=$1
./bash/count.sh:wikistats=/a/wikistats_git
./bash/count.sh:dumps=$wikistats/dumps                     # folder for scripts and output
./bash/count.sh:perl=$dumps/perl
./bash/count.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count.sh:csv=$dumps/csv
./bash/count.sh:bash=$dumps/bash
./bash/count.sh:dblists=$dumps/dblists
./bash/count.sh:php=/a/mediawiki/core/languages
./bash/count.sh:trace=-r # trace resources
./bash/pageviews_monthly.sh:wikistats=/a/wikistats_git
./bash/pageviews_monthly.sh:dumps=$wikistats/dumps
./bash/pageviews_monthly.sh:perl=$dumps/perl
./bash/pageviews_monthly.sh:csv=$dumps/csv
./bash/pageviews_monthly.sh:out=$dumps/out
./bash/pageviews_monthly.sh:report=$dumps/logs/log_pageviews_monthly.txt
./bash/pageviews_monthly.sh:projectcounts=/a/dammit.lt/projectcounts
./bash/pageviews_monthly.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/pageviews_monthly.sh:list=WhiteListWikis.csv
./bash/count_state_of_the_wiki.sh:wikistats=/a/wikistats_git
./bash/count_state_of_the_wiki.sh:dumps=$wikistats/dumps
./bash/count_state_of_the_wiki.sh:perl=$dumps/perl
./bash/count_state_of_the_wiki.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_state_of_the_wiki.sh:csv=$dumps/csv
./bash/count_state_of_the_wiki.sh:log=$dumps/logs/count_wikis_by_size_by_growth.log
./bash/count_state_of_the_wiki.sh:htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
./bash/publish_all.sh:wikistats=/a/wikistats_git
./bash/publish_all.sh:dumps=$wikistats/dumps
./bash/publish_all.sh:bash=$dumps/bash
./bash/publish_all.sh:bash=/home/ezachte/wikistats/dumps/bash # tests
./bash/sync_language_files.sh:wikistats=/a/wikistats_git
./bash/sync_language_files.sh:dumps=$wikistats/dumps
./bash/sync_language_files.sh:csv=$dumps/csv
./bash/tar_data_reportcard.sh:wikistats=/a/wikistats_git
./bash/tar_data_reportcard.sh:csv=$wikistats/dumps/csv
./bash/count_merge_editors.sh:wikistats=/a/wikistats_git
./bash/count_merge_editors.sh:dumps=$wikistats/dumps
./bash/count_merge_editors.sh:perl=$dumps/perl
./bash/count_merge_editors.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_merge_editors.sh:csv=$dumps/csv
./bash/count_merge_editors.sh:log=$dumps/logs/count_merge_editors.log
./bash/zip_out.sh:wikistats=/a/wikistats_git
./bash/zip_out.sh:out=$wikistats/dumps/out
./bash/count_words.sh:x=1
./bash/count_wp_one.sh:wikistats=/a/wikistats_git
./bash/count_wp_one.sh:dumps=$wikistats/dumps
./bash/count_wp_one.sh:perl=$dumps/perl
./bash/count_wp_one.sh:perl=/home/ezachte/wikistats/dumps/perl # tests 
./bash/count_wp_one.sh:csv=$dumps/csv
./bash/count_wp_one.sh:php=/a/mediawiki/core/languages
./bash/count_wp_one.sh:date=auto # 20101231 # auto
./bash/count_wp_one.sh:x=fywiki
./bash/count_wp_one.sh:project=wp
./perl/WikimediaDownload.pl:EXs=screen;EXw=EXs.width;navigator.appName!="Netscape"?
./perl/WikimediaDownload.pl:EXb=EXs.colorDepth:EXb=EXs.pixelDepth;
./perl/WikimediaDownload.pl:EXd=document;EXw?"":EXw="na";EXb?"":EXb="na";
./perl/WikimediaDownload.pl:src="http://nht-2.extreme-dm.com/n3.g?login=infodis&url=nojs&j=n&jv=n&pv=" />
./perl/WikiReportsScripts.pm:border=0 width=1 alt=''></a>
./perl/WikiReportsScripts.pm:EXs=screen;EXw=EXs.width;navigator.appName!='Netscape'?
./perl/WikiReportsScripts.pm:EXb=EXs.colorDepth:EXb=EXs.pixelDepth;
./perl/WikiReportsScripts.pm:EXd=document;
./perl/WikiReportsScriptsHtml.pm:border=0 width=1 alt=''></a>
./perl/WikiReportsScriptsHtml.pm:EXs=screen;EXw=EXs.width;navigator.appName!='Netscape'?
./perl/WikiReportsScriptsHtml.pm:EXb=EXs.colorDepth:EXb=EXs.pixelDepth;
./perl/WikiReportsScriptsHtml.pm:EXd=document;

----

Is it possible to concentrate all the configuration in a single file? I don't know anything about multiple-file bash/shell scripts. It would be nice to have only one file to edit.
If you tell me what's an acceptable path, I'd gladly submit patches.
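
One possible shape for this, as a minimal sketch only: a single shared file that every script sources instead of repeating its own assignments. The file name wikistats.conf.sh and the temporary demo directory are assumptions for illustration; the values are copied from the grep output above.

```shell
#!/bin/bash
# Sketch only: wikistats.conf.sh is a hypothetical file name, and the
# temporary directory just makes the example runnable anywhere.
demo_dir=$(mktemp -d)

# The single file to edit. The quoted 'EOF' keeps $wikistats and $dumps
# unexpanded until the file is sourced.
cat > "$demo_dir/wikistats.conf.sh" <<'EOF'
wikistats=/a/wikistats_git
dumps=$wikistats/dumps
perl=$dumps/perl
csv=$dumps/csv
out=$dumps/out
htdocs=stat1001.wikimedia.org::a/srv/stats.wikimedia.org/htdocs/
EOF

# Each script would start with this one line in place of its own
# wikistats=..., dumps=..., perl=... assignments.
. "$demo_dir/wikistats.conf.sh"

echo "perl scripts: $perl"
echo "csv files:    $csv"
```

In the real scripts the file would presumably sit next to them and be sourced with something like . "$(dirname "$0")/wikistats.conf.sh"; script-specific values (log file names, projectcode="$1" and so on) would stay in the individual scripts.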
Comment 1 Nemo 2014-03-12 11:36:42 UTC
Hm, also:

$ grep -r "/home/ezachte" .
./bash/report.sh:#perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_commons_images_wlm.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_editors.sh:  perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/backup_weekly.sh:cd /home/ezachte
./bash/report_all_editors.sh:# perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/archived_used_once_or_obsolete/titles.sh:perl WikiStatsCollectArticleNames.pl -p $p -i $dumps/public/$p -o /home/ezachte/wikistats/titles
./bash/archived_used_once_or_obsolete/publish_all.sh:cd /home/ezachte/wikistats
./bash/count_prep_animations.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_report_publish_non_wp.sh:# bash=/home/ezachte/wikistats/dumps/bash # tests
./bash/collect_countable_namespaces.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/sort_dblists.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/report_test.sh:perl=/home/ezachte/wikistats/dumps/perl # test
./bash/pageviews_monthly_sp.sh:#perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/pageviews_monthly.sh:# perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/pageviews_monthly.sh:# projectcounts=/home/ezachte/test/projectcounts  # tests
./bash/count_state_of_the_wiki.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/publish_all.sh:bash=/home/ezachte/wikistats/dumps/bash # tests
./bash/count_merge_editors.sh:perl=/home/ezachte/wikistats/dumps/perl # tests
./bash/count_wp_one.sh:perl=/home/ezachte/wikistats/dumps/perl # tests 
./perl/TestEzLib.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiCountsArguments.pm:  use lib "/home/ezachte/lib" ;
./perl/WikiReportsSelectTimelines.pl:  use lib "/home/ezachte/lib" ;
./perl/QD_tools/TestSortHash.pl:use lib "/home/ezachte/lib" ;
./perl/QD_tools/WikiStatsWikipediaWeekly.pl:    require "/home/ezachte/wikistats/WikiReportsDate.pl" ;
./perl/QD_tools/WikiCountsRegUsers.pl:  use lib "/home/ezachte/lib" ;
./perl/QD_tools/WikiCountsRegUsers.pl:if (-e "/home/ezachte/")
./perl/QD_tools/WikiCountsRegUsers.pl:if (-e "/home/ezachte/")
./perl/QD_tools/WikiStatsScanCategories.pl:  { $dir = "/home/ezachte/wikistats/viewspercat" ; }
./perl/QD_tools/WikiStatsScanCategories.pl:  require "/home/ezachte/wikistats/WikiReportsDate.pl" ;
./perl/QD_tools/WikiStatsPageViewsPerPagePerCategory.pl:  use lib "/home/ezachte/lib" ;
./perl/QD_tools/WikiStatsPageViewsPerPagePerCategory.pl:    $dir_out = "/home/ezachte/wikistats/viewspercat" ;
./perl/QD_tools/WikiExtractRegUsers.pl:if (-e "/home/ezachte/")
./perl/QD_tools/WikiExtractRegUsers.pl:#if (-e "/home/ezachte/")
./perl/WikiReportsProcessReverts.pm:  use lib "/home/ezachte/lib" ;
./perl/WikiCountWords.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiReportsOutputTables.pm:  use lib "/home/ezachte/lib" ;
./perl/EzLib.pm:use lib "/home/ezachte/lib" ;
./perl/EzLib.pm:if ($os_linux) #  && (-d "/home/ezachte")) # runs on server, to be refined
./perl/EzLib.pm:  $path_home = "/home/ezachte" ;
./perl/EzLib.pm:  $path_pm = "/home/ezachte/lib/$file_pm" ;
./perl/WikiCountsTimeDistribution.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiReports.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiReportsSampledVisitorsLog.pl:    $dir_root = "/home/ezachte" ;
./perl/WikiCountsScanNamespacesWithContent.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiCountsSummarizeProjectCounts.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiCountsRankPageHistory.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiCountsJobProgress.pl:  use lib "/home/ezachte/lib" ;
./perl/WikiCounts.pl:  use lib "/home/ezachte/lib" ;

though there is some configurability already:

$ grep -r cfg_liblocation .
./squids/conf-editors/SquidReportArchiveConfig-editors.pm:#$cfg_liblocation       = "/a/wikistats_git/squids/perl" ;
./squids/conf-editors/SquidReportArchiveConfig-editors.pm:$cfg_liblocation       = "/home/spetrea/wikistats/wikistats/squids/perl" ;
./squids/conf-editors/SquidCountArchiveConfig-editors.pm:#$cfg_liblocation = "$squids/perl" ;
./squids/conf-editors/SquidCountArchiveConfig-editors.pm:#$cfg_liblocation       = "/a/wikistats_git/squids/perl" ;
./squids/conf-editors/SquidCountArchiveConfig-editors.pm:$cfg_liblocation       = "/home/spetrea/wikistats/wikistats/squids/perl" ;
./squids/testdata/regression-countries-count-arithmetic/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/regression-countries-count-arithmetic/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/testdata/regression-mingle-356-bugzilla-46269/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/regression-mingle-356-bugzilla-46269/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/testdata/regression-mismatch-world-north-south-unknown/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/regression-mismatch-world-north-south-unknown/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/testdata/merge-australia-into-oceania/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/merge-australia-into-oceania/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/testdata/regression-sample/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/regression-sample/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/testdata/regression-test-ipv6-wrong-external-domain/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/regression-test-ipv6-wrong-external-domain/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/testdata/regression-tablets-discrepancy_for_config_editors/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/regression-tablets-discrepancy_for_config_editors/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/testdata/regression-totals-fixes-for-squidreportclients/SquidReportArchiveConfig.pm:$cfg_liblocation       = "$__CODE_BASE/perl" ;
./squids/testdata/regression-totals-fixes-for-squidreportclients/SquidCountArchiveConfig.pm:$cfg_liblocation          = "$__CODE_BASE/perl" ;
./squids/conf-mobile/SquidCountArchiveConfig-mobile.pm:#$cfg_liblocation = "$squids/perl" ;
./squids/conf-mobile/SquidCountArchiveConfig-mobile.pm:#$cfg_liblocation       = "/a/wikistats_git/squids/perl" ;
./squids/conf-mobile/SquidCountArchiveConfig-mobile.pm:$cfg_liblocation       = "/home/spetrea/wikistats/wikistats/squids/perl" ;
./squids/conf-mobile/SquidReportArchiveConfig-mobile.pm:#$cfg_liblocation       = "/a/wikistats_git/squids/perl" ;
./squids/conf-mobile/SquidReportArchiveConfig-mobile.pm:$cfg_liblocation       = "/home/spetrea/wikistats/wikistats/squids/perl" ;
./squids/perl/SquidReportArchiveConfig.pm:  $cfg_liblocation       = "$squids/perl" ;
./squids/perl/SquidCountArchive.pl:  croak "Expected \$cfg_liblocation to be defined inside config   .pm file" if !defined $cfg_liblocation;
./squids/perl/SquidCountArchive.pl:  unshift(@INC,$cfg_liblocation); 
./squids/perl/SquidCountArchiveWriteOutput.pm:use lib $cfg_liblocation ;
./squids/perl/SquidCountryScanConfig.pm:$cfg_liblocation   = "$squids/perl/" ;
./squids/perl/SquidCountArchiveConfig.pm:$cfg_liblocation = "$squids/perl" ;
./squids/perl/SquidCountryScan.pl:  use lib $cfg_liblocation ;
./squids/perl/SquidReportArchive.pl:  croak "Expected \$cfg_liblocation to be defined inside config   .pm file" if !defined $cfg_liblocation;
./squids/perl/SquidReportArchive.pl:  unshift(@INC,$cfg_liblocation); 
./dumps/perl/SquidReportArchive.pl:  use lib $cfg_liblocation ;
Comment 2 Nemo 2014-03-12 14:13:36 UTC
Pasting here what I'm doing for one wiki so that I don't forget. Some manual actions are also needed, but those might be due to an error in my configuration, so I'm keeping everything together.

0) Checkouts
1) Replace the path (almost) everywhere:
    sed -i "s/\/a\/wikistats_git/MYESCAPEDPATH/g" dumps/bash/*sh dumps/perl/*pl dumps/perl/*pm
2) Comment out the path tests (https://gerrit.wikimedia.org/r/118261) and adjust the other settings, including dumps_public, php, x (I put x=translatewiki) and project (I put wx); set the date variable to an 8-digit timestamp like 20140101 and add /$date to the -i argument.
3) Create the directory $dumps_public/$project/$date and place your bz2 dump in it.
4) Create the directories $csv, $csv/csv_$project/ , $csv/temp and set write permissions
5) Manually create in the $csv directory a csv file (csv_mw/StatisticsContentNamespaces.csv) like http://stats.wikimedia.org/wikimedia/misc/StatisticsContentNamespaces.csv; in my case (https://translatewiki.net/w/api.php?action=query&meta=siteinfo&siprop=namespaces):

project code,language code,content namespaces
wx,translate,0|1102|1256|1214|1242|1250|1236|1202|1218|1254|1228|1240|1244|1210|1258|8|1230|1246|1212|1204|1220|1234|1216|1222|1238|1226|1208|1252|1200|1232|1206|1224

Run the script! Now it's been running for a few seconds without errors, maybe I'll get some output. :D
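
Steps 3 to 5 above could be scripted roughly as follows. This is a sketch under assumptions: the temporary root directory is only there so the example runs anywhere, and the variable names mirror the ones mentioned in the steps rather than any confirmed wikistats setting.

```shell
#!/bin/bash
set -e

# Assumption: a throwaway root; in a real setup these would be the
# values configured in the count script.
root=$(mktemp -d)
dumps_public=$root/dumps_public
csv=$root/csv
project=wx
date=20140101

# Step 3: directory for the dump; the bz2 dump itself is placed by hand.
mkdir -p "$dumps_public/$project/$date"

# Step 4: csv directories, writable by the user running the scripts.
mkdir -p "$csv/csv_$project" "$csv/temp"
chmod -R u+w "$csv"

# Step 5: content namespaces file, with the translatewiki values quoted above.
mkdir -p "$csv/csv_mw"
cat > "$csv/csv_mw/StatisticsContentNamespaces.csv" <<'EOF'
project code,language code,content namespaces
wx,translate,0|1102|1256|1214|1242|1250|1236|1202|1218|1254|1228|1240|1244|1210|1258|8|1230|1246|1212|1204|1220|1234|1216|1222|1238|1226|1208|1252|1200|1232|1206|1224
EOF

echo "prepared $csv and $dumps_public/$project/$date"
```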
Comment 3 Nemo 2014-03-12 20:20:46 UTC
Got something: http://p.defau.lt/?1WfXUN56rhc3uwBiU7oXvw

Then report_one_only.sh
1) Set mode to wx, language to whatever you used before *wiki in x (translate in my case)
2) Create out/out_wx/EN/
3) No config to skip views? fake them, e.g.: $ wget http://dumps.wikimedia.org/other/pagecounts-ez/wikistats/csv_wx.zip ; unzip csv_wx.zip PageViewsPerMonthAll.csv ; mv PageViewsPerMonthAll.csv csv/csv_wx/
4) Add your wiki to the list in SetLanguageInfo and GetProjectBaseUrl in WikiReportsLiterals.pm as well as (I guess) dblists/master\ copy/special.dblist 
5) Adjust $threshold_articles and/or $threshold_edits in WhiteListLanguages of WikiReportsInput.pm or you may see in csv/csv_wx/WikiReportsLog.txt something like
- No dump processed:
- < 10 articles: translate

Result so far (the report takes a few minutes): http://koti.kapsi.fi/~federico/twnstats/
There are clearly some more hidden switches to enable the full reports.

So, there is something more to do. No idea how to handle the configurations like *dblist or the whole WhiteListLanguages function (split to a file used only if requested by a command line argument?); variables in the middle of functions like $threshold* also need to be listed.
Comment 4 Nemo 2014-03-12 23:13:03 UTC
Add to comment 2:
6) For plots, set $weekly_plotdata = $true in WikiCounts.pl
Comment 5 Nemo 2014-03-13 09:29:28 UTC
7) Apply https://gerrit.wikimedia.org/r/#/c/118436/
Comment 6 Erik Zachte 2014-03-13 11:14:51 UTC
The Wikistats code base is notoriously difficult to maintain and full of quirks. I have never made a secret of the fact that I chose adding functionality over producing neat code or restructuring. It was mostly an after-hours project before I joined WMF, and at WMF a heap of new information requests got priority. Doing this with state-of-the-art software engineering techniques could have kept a small department busy.

See also https://www.mediawiki.org/wiki/Wikistats 

and for a wider perspective https://meta.wikimedia.org/wiki/Data_analysis/mining_of_Wikimedia_wikis#Wikistats_scripts

We're rather late in the game now, and I'm not sure how long Wikistats will be used, but even so I appreciate any effort to dig into the inner workings, if only to better understand possible discrepancies in a migration scenario, and who knows, some non-Wikimedia wiki may still find a use for parts of it.

So I'll be happy to provide assistance in understanding the code and getting it to work on, say, translatewiki.net. I won't have much time to restructure the code.

Some Q&A outside Bugzilla might be expedient: I'm happy to answer mail on ezachte.wikimedia.org, or we could do one or more Skype sessions, id ezachte (please mail me first).
Comment 7 Nemo 2014-03-13 11:22:58 UTC
I'm not blaming anyone and I'm sure you don't have time for a refactor; that's why I'm offering to do one myself if you tell me the general direction/method you'd be OK with. :) Thank you very much for your offer to help; I'll contact you when I get stuck and am ready to devote some time to coding.

I don't think we're late to the party: for instance, at no point in history have we had so many dumps to analyse (thanks to WikiTeam)!
Comment 8 Erik Zachte 2014-03-13 11:59:17 UTC
Sure, let's see how to get this ball rolling.

I suggest we focus on runtime arguments and general setup first, as you already started to do. Everything becomes easier once you have a working setup, as you can step into the code with a debugger. (I used OptiPerl on Windows during development; now I mostly edit code on the WMF server directly.)

About comment 1/2: it looks like more than it is, as I split out nested paths into several statements. Having said that, I certainly agree this could be coded better and more flexibly, and I could use some advice here. Right now I run most jobs from my home folder and push to git mainly for archive purposes. If that could be done without editing bash files, that would be a big win (environment variable?). BTW, getting push to work from /a/wikistats_git somehow proved a bottleneck: some authorization issues.
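
The environment-variable idea could look roughly like this. It is a sketch, not how the scripts work today: WIKISTATS_ROOT is a hypothetical variable name, and set_paths is a helper introduced only so the example can show both cases.

```shell
#!/bin/bash
# Sketch only: WIKISTATS_ROOT is a hypothetical name; nothing in wikistats
# reads it at present.
set_paths() {
  # Fall back to the production checkout when the variable is unset or empty.
  wikistats="${WIKISTATS_ROOT:-/a/wikistats_git}"
  dumps=$wikistats/dumps
  perl=$dumps/perl
}

# Production run: no variable set, the default path is used.
unset WIKISTATS_ROOT
set_paths
echo "default: $perl"

# Test run from a home folder: no bash file needs editing.
WIKISTATS_ROOT=/home/ezachte/wikistats
set_paths
echo "test:    $perl"
```

With the assignment placed directly at the top of a script, a test run would be a matter of calling it as WIKISTATS_ROOT=/home/ezachte/wikistats ./report.sh, and the commented-out "# tests" overrides would no longer be needed.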

About the code base in general: as you know, WikiCounts.pl produces csv files, mostly from xml files (and some API results, php files, translatewiki output, etc). Some parts of WikiCounts are ugly or hard to comprehend even for me. Some parts are more or less self-contained and could maybe be made into standalone scripts (to further modularisation). WikiReports.pl is probably the hardest to read (but some parts, once modularized, may find reuse even in a Hadoop environment, e.g. page view reports).

As for coding conventions: especially in WikiReports there are lots of one-letter variables and very maintenance-unfriendly function names. I tried to make the names of complex functions reasonably understandable, but chose really cryptic names for many small one-line functions used for html mark-up (a kind of C-style function naming approach). There is only a half-hearted system in those cryptic names, all in the name of getting code done. WikiReportsHtml.pm is by far the worst; I have to look up names there all the time. My dilemma is that giving all those mini-functions self-explanatory names would make the code using them much less readable (you couldn't see the wood for the trees). Now the html presentation details are kind of obfuscated, but that doesn't distract from the main presentation logic.
Comment 9 Bingle 2014-03-24 16:51:22 UTC
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1485
Comment 10 Nemo 2014-06-01 10:05:52 UTC
Erik, did the addition of -F and the refactor land in the repo(s)? Are they ready to receive patches?
Comment 11 Erik Zachte 2014-06-07 17:57:46 UTC
Nemo, I finally got all wikistats files in the repo. I think you can add patches now. Sorry for delay.
Comment 12 Nemo 2014-06-07 18:10:45 UTC
(In reply to Erik Zachte from comment #11)
> Nemo, I finally got all wikistats files in the repo. I think you can add
> patches now. Sorry for delay.

Wow, great, I'll try to start rebasing some of my patches next week.
