Last modified: 2011-03-13 18:05:03 UTC
Some API links, such as the following:
http://he.wikisource.org/w/api.php?action=parse&text=%7B%7B%F6%E9%E8%E5%E8%E9%ED+%F9%EC+%EE%F7%F8%E0%E5%FA+%E2%E3%E5%EC%E5%FA+%F2%EC+%F4%F1%E5%F7%7C%E0%E9%E5%E1%7C%EE%E1%7C-%7C%E9%7C-%7C%7D%7D&redirects=1&format=xml
work from a browser, but when I try to fetch them from a PHP script, I get the error "HTTP/1.0 403 Forbidden", for example with this program:
<?php
print file_get_contents("http://he.wikisource.org/w/api.php?action=parse&text=%7B%7B%F6%E9%E8%E5%E8%E9%ED+%F9%EC+%EE%F7%F8%E0%E5%FA+%E2%E3%E5%EC%E5%FA+%F2%EC+%F4%F1%E5%F7%7C%E0%E9%E5%E1%7C%EE%E1%7C-%7C%E9%7C-%7C%7D%7D&redirects=1&format=xml");
?>
It worked until about a week or two ago.
You need to send a User-Agent header with your requests. Read: http://lists.wikimedia.org/pipermail/wikitech-l/2010-February/046777.html
Try: ini_set( 'user_agent', 'Erel\'s bot' );
Thank you. That works only on servers where ini_set is enabled. I found a solution that works on all servers, including shared hosting, where ini_set is usually disabled:

function get_url_with_agent($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPGET, true);         // plain GET request
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_USERAGENT, 'Erel Bot'); // send a User-Agent header
    $result = curl_exec($ch);                        // false on failure
    curl_close($ch);
    return $result;
}

Hope it helps someone.
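For anyone who wants to keep using file_get_contents, another option that may work without ini_set is to pass the User-Agent through a stream context. The following is only a minimal sketch: the helper names build_agent_context and get_url_with_context are made up for illustration, and it assumes that allow_url_fopen is enabled on the host, which is not guaranteed on shared hosting.

```php
<?php
// Hypothetical helper (not from the thread): builds a stream context whose
// HTTP options carry the given User-Agent string.
function build_agent_context($agent) {
    return stream_context_create(array(
        'http' => array(
            'user_agent' => $agent,
        ),
    ));
}

// Hypothetical helper: fetches $url with file_get_contents using the context
// above. Returns the response body, or false on failure.
// Assumes allow_url_fopen is enabled.
function get_url_with_context($url, $agent = 'Erel Bot') {
    return file_get_contents($url, false, build_agent_context($agent));
}
```

Usage would look like print get_url_with_context($url); with the same API URL as in the original question. If allow_url_fopen is disabled, the cURL function above is probably the safer choice.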