<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to 1: Caching word lookup? Add Index to words table.</title><link>https://sourceforge.net/p/php-crawler/feature-requests/1/</link><description>Recent changes to 1: Caching word lookup? Add Index to words table.</description><atom:link href="https://sourceforge.net/p/php-crawler/feature-requests/1/feed.rss" rel="self"/><language>en</language><lastBuildDate>Thu, 02 Oct 2008 00:45:43 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/php-crawler/feature-requests/1/feed.rss" rel="self" type="application/rss+xml"/><item><title>Caching word lookup? Add Index to words table.</title><link>https://sourceforge.net/p/php-crawler/feature-requests/1/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Hello, a search that returns many results triggers an insane number of word lookups, each one a separate database query. This slows things down quite a bit.&lt;/p&gt;
&lt;p&gt;SUGGESTION 1 (cache word lookups):&lt;br /&gt;
The lookups come from the hashToText method in _search.php.&lt;br /&gt;
Specifically you do:&lt;br /&gt;
$word = sql_fetch("SELECT word FROM `words` WHERE id=$num");&lt;/p&gt;
&lt;p&gt;If you do something like this, you can store the words in a global so you only have to do the lookup when you need to:&lt;br /&gt;
function hashToText($content, $boldMiddle = false) {&lt;br /&gt;
global $cachedWord;&lt;br /&gt;
if (!isset($cachedWord)) $cachedWord = array();&lt;/p&gt;
&lt;p&gt;$pairs = str_split($content, $CRAWL_CHARS_PER_WORD);&lt;br /&gt;
$text = "";&lt;br /&gt;
$i = 0;&lt;br /&gt;
foreach ($pairs as $pair) {&lt;br /&gt;
$num = toDecimal($pair);&lt;/p&gt;
&lt;p&gt;// Was the word cached?&lt;br /&gt;
if (!isset($cachedWord[$num]))&lt;br /&gt;
{    &lt;br /&gt;
// Word was not in cache, we need to fetch it from the database.&lt;br /&gt;
$word = sql_fetch("SELECT word FROM `words` WHERE id=$num");&lt;br /&gt;
$cachedWord[$num] = $word;&lt;br /&gt;
}&lt;br /&gt;
else &lt;br /&gt;
{&lt;br /&gt;
// Found the word in our cache!&lt;br /&gt;
$word = $cachedWord[$num];          &lt;br /&gt;
}&lt;/p&gt;
&lt;p&gt;.........etc&lt;/p&gt;
&lt;p&gt;SUGGESTION 2 (add id index to words table):&lt;br /&gt;
Also, the 'words' table should have an index on the id field to speed up lookups when one must be made.&lt;br /&gt;
This ensures that when a lookup does have to go to the database, it will be quick.&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anonymous</dc:creator><pubDate>Thu, 02 Oct 2008 00:45:43 -0000</pubDate><guid>https://sourceforge.netd6aeca64a7e80b111b35da4ad771d4f63205d272</guid></item></channel></rss>