I'm facing a situation where I need to come up with some really smart way of caching pre-computed variables on many levels.
To explain the situation I have a MIME document built up like this (I'm sure you'll know what I'm on about):
I'll demonstrate with XML so it's easier to follow, although it's not XML I actually use.
Code: Select all
<document>
<part>
<headers>
</headers>
<body>
Some data
<part>
<headers>
</headers>
<body>
More data
<part>
<headers>
</headers>
<body>
Even more data
</body>
</part>
</body>
</part>
</body>
</part>
</document>Processing the output of each of these "parts" is handled separately in a sort of recursive fashion until we get the tree structure. The way that works is that a "part" object can contain other "part" objects and they all have a build().
Because processing may be fairly intensive in order to maintain compliancy with various RFCs I have decided to "cache" each part in a property once it's been computed the first time. If any subsequent changes are made at runtime then the cache is set back to null so it needs to be computed again, Otherwise the cache is returned at the next request rather than processing all the data again.
I cache each "part" independently. The headers are also cached independently. This makes it easier to target any specific "parts" or headers which need to be processed again without having to process the entire document again.
My problem is optimizing the memory I'm using here. Currently I return the cached data by reference everywhere so it's not repeatedly copied into memory BUT I'm still not happy that I'm saving as much memory as I can (and this isn't premature optimization... these "parts" can be several MB).
The problem comes with the fact that each part is cached, then it's parent's cache contains the same data with a few extra bits tagged on; and in turn the parent's parent's cache contains all children below it. So I'm duplicating what's cached.
Can I pick your brains on a clever way to optimize the cache? This needs to be done in memory... no dumping to files or a DB.
Essentially, the question is can I cache substrings along with the parent string whilst only using the amount of memory the parent string uses?
A simple problem following the same logic would be to have this:
Code: Select all
<?php
error_reporting(E_STRICT);
class StringMaker
{
protected $myCache = null;
protected $fullCache = null;
protected $append = array();
protected $label = null;
public function __construct($label)
{
$this->setLabel($label);
}
public function setLabel($label)
{
$this->myCache = null; //Changes made so reset cache
$this->label = (string) $label;
}
protected function getSomeComputedData()
{
return uniqid($this->label, true);
}
public function &fetch()
{
$add = "";
foreach ($this->append as $string)
{
$substring = $string->fetch(); //Could be cached, but we don't worry, we let the part deal with that
$add .= " " . $substring;
}
if ($this->myCache !== null)
{
$this->fullCache = $this->myCache . $add;
return $this->fullCache;
}
$this->myCache = $this->getSomeComputedData();
$this->fullCache = $this->myCache . $add;
return $this->fullCache;
}
public function add(StringMaker $string)
{
$this->append[] = $string;
}
}
$parent_str = new StringMaker("foo");
$child_str = new StringMaker("bar");
$parent_str->add($child_str);
echo "<br />First computation:<br />";
echo $parent_str->fetch();
echo "<br />No changes, second computation:<br />";
echo $parent_str->fetch();
$parent_str->setLabel("xxx");
echo "<br />Changed label in parent, third computation:<br />";
echo $parent_str->fetch();
$child_str->setLabel("yyy");
echo "<br />Changed label in child, but not in parent, fourth computation:<br />";
echo $parent_str->fetch();Code: Select all
First computation:
foo45698d3e75b659.92261672 bar45698d3e75b538.94718683
No changes, second computation:
foo45698d3e75b659.92261672 bar45698d3e75b538.94718683
Changed label in parent, third computation:
xxx45698d3e75b7e6.26268698 bar45698d3e75b538.94718683
Changed label in child, but not in parent, fourth computation:
xxx45698d3e75b7e6.26268698 yyy45698d3e75b8a0.38707792I'm still thinking so if I come up with any ideas I'll post back