Cache

From TYPO3Wiki

Project: The Caching-Rewrite (functions for t3lib_cache)
See Current Project Members and Wishlist; you can help if you like!


current

The current TYPO3 caching-system can:

  • create, delete, update the cache for a page
  • delete the cache by a cHash
  • carry a "reg1" value that an extension can use as a foreign key, e.g. tt_news

proposal t3lib_cache

News todo: write about TYPO3-dev t3lib_cache Steffen Kamper messageID | info-text

News todo: write about TYPO3-dev Data exchange between USER/USER_INT Oliver Hader messageID | info-text

Steffen Kamper: My proposal is to bundle existing functions in one file, eg. t3lib_cache, maybe mostly static functions that can be called from extensions. The sub-Core API should have functionality like

  • creating a cHash from given piVars
  • deleting the cache of a specific pid
  • deleting the cache for a specific cHash
  • registering a cache entry using the reg1 field
  • deleting the cache entries registered with a given reg1

What do you think? Please let me know your ideas.

Best regards, Steffen

solution toi_cache

John Angel: You can check the stable toi_cache extension, 5425: Extending caching system (cache_hash) [new], and the toi_cache documentation.


functions

(Steffen:) I started collecting functions for t3lib_cache: Functions for t3lib_cache

Current Project Members

  • Steffen Kamper
  • ideas: some people in dev-list
  • Tests: DanielB

proposal A

<PHP> :
 
setCacheReg($value, $key = 'reg1')
clearCacheByPage($uid)
clearCacheByReg($value, $key = 'reg1', $pid = 0)
 
# Note: if $pid = 0, clearCacheByReg clears this reg value on every page.
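
The three signatures of proposal A can be tried out against a small in-memory model. This is only a sketch under assumptions: the class name CacheRegSketch, the explicit $pid argument of setCacheReg (the proposal implies the current page), the return value of clearCacheByReg and the helper cachedPages() are all hypothetical and not part of TYPO3.

```php
<?php
// Hypothetical in-memory stand-in for the page cache; not TYPO3 code.
class CacheRegSketch
{
    // Cached entries, keyed by page uid: pid => [regKey => regValue]
    private $entries = [];

    // Tag the cache entry of page $pid with a reg value.
    // Assumption: $pid is passed explicitly here; the proposal implies the current page.
    public function setCacheReg(int $pid, $value, string $key = 'reg1'): void
    {
        $this->entries[$pid][$key] = $value;
    }

    // Drop the cache entry of a single page.
    public function clearCacheByPage(int $uid): void
    {
        unset($this->entries[$uid]);
    }

    // Drop all entries tagged with $value; $pid = 0 means "on any page".
    // Returns the number of removed entries (an addition for illustration).
    public function clearCacheByReg($value, string $key = 'reg1', int $pid = 0): int
    {
        $removed = 0;
        foreach ($this->entries as $page => $regs) {
            $matchesPage = ($pid === 0 || $pid === $page);
            if ($matchesPage && isset($regs[$key]) && $regs[$key] === $value) {
                unset($this->entries[$page]);
                $removed++;
            }
        }
        return $removed;
    }

    // Helper for inspection: page uids that still have a cache entry.
    public function cachedPages(): array
    {
        return array_keys($this->entries);
    }
}
```

With entries for pages 10 and 11 tagged with 666, clearCacheByReg(666, 'reg1', 10) would remove only page 10, while clearCacheByReg(666) would remove the tag's entries on every page.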
 

ideas for step 2

After the refactoring much more is possible.

Wishlist

indexed varchar

Ernesto: An indexed varchar field would be much nicer.

hook for TCEmain for EXT-developers

Ingo: An extension would then need to hook into TCEmain and watch for its records as they get created or changed. When this happens, the extension needs to take care of marking certain parts of the cache as "dirty". This should all be very simple for extension developers by using an API; if it isn't easy, they won't use it...

tagging

Ernesto: And what I was trying to suggest was that we just need one "tagging" method for cached pages, and this doesn't need to be just an integer (like reg1 currently is).

If we have an indexed varchar, this might be enough (e.g. "tx_testextension_table:5454" would identify a cache entry tagged with this table and uid, so the extension knows where the single view was rendered). With this approach, however, only one extension can put its information into reg1 (later "ident" or something). The API for it is pretty "dumb": just set $TSFE->page_cache_reg1. This also means that the last extension called wins.

To overcome that, maybe we could already think about many extensions trying to do that. In order not to pollute the current cache_pages with more fields, it might be interesting to create a cache_pages_ref table like:

<SQL> :
 
id      int(11)   -- foreign key to cache_pages.id
extkey  varchar   -- indexed
ident   varchar   -- indexed
 

This gives a 1:M relation between cached pages and the "tags" for each cache entry, which makes it easy to locate the caches that affect certain records later. E.g. say we have a page with:

  • news single view with uid=666 and
  • two product teasers with uid=777 and 888 on the left border (same page)
  1. cache_pages contains one rendered page with id=2323
  2. cache_pages_ref contains three entries:
     2323, tt_news, 666
     2323, product_ext, 777
     2323, product_ext, 888

Say the extension product_ext knows that product 888 is "obsolete", or something has changed, or "whatever". It will then be able to find all pages that display information about that record and delete all cache related to it (here is where the t3lib_cache API comes in).

The "ident" varchar might also be used in ways other than storing the pure uid. Maybe there is a list of uids, maybe there is nothing at all, maybe there are keywords like "latest_view". It's up to the extension author to think about how he wants to manage the TYPO3 page cache.
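
The lookup described above can be sketched with two in-memory arrays that play the roles of cache_pages and cache_pages_ref. Everything here is an assumption made for illustration: the class TaggedPageCacheSketch and its methods store(), flushByTag() and has() are hypothetical names, and a real implementation would run the equivalent queries against the database.

```php
<?php
// Hypothetical in-memory model of cache_pages and cache_pages_ref; not TYPO3 code.
class TaggedPageCacheSketch
{
    private $pages = []; // cache id => rendered content (stands in for cache_pages)
    private $refs = [];  // rows of [cache id, extkey, ident] (stands in for cache_pages_ref)

    // Store a rendered page together with its tags, given as [extkey, ident] pairs.
    public function store(int $id, string $content, array $tags): void
    {
        $this->pages[$id] = $content;
        foreach ($tags as [$extkey, $ident]) {
            $this->refs[] = [$id, $extkey, (string)$ident];
        }
    }

    // Delete every cached page tagged with ($extkey, $ident); returns the deleted cache ids.
    public function flushByTag(string $extkey, string $ident): array
    {
        $hit = [];
        foreach ($this->refs as [$id, $key, $tag]) {
            if ($key === $extkey && $tag === $ident) {
                $hit[$id] = true;
            }
        }
        foreach (array_keys($hit) as $id) {
            unset($this->pages[$id]);
        }
        // Drop all refs of the deleted pages so no dangling rows remain.
        $this->refs = array_values(array_filter(
            $this->refs,
            function (array $ref) use ($hit) {
                return !isset($hit[$ref[0]]);
            }
        ));
        return array_keys($hit);
    }

    public function has(int $id): bool
    {
        return isset($this->pages[$id]);
    }
}
```

Storing page 2323 with the three tags from the example and then calling flushByTag('product_ext', '888') removes that page and all of its tag rows, while pages tagged only with other records stay cached.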

extra table

Ries: A dedicated table for this purpose would be nice. It could solve many problems quickly. For example, if you have data shown (news, for example) on multiple pages/branches, there should be a method so that whenever a news item is changed or added, the pages/branches where that news item shows up can be purged from the cache.

I think this is really interesting, because then in BE and/or FE you can delete all cached pages where record XY is found with a single efficient query. I had this idea one day when I was working at a company where we had many of these situations. You could also store additional start/end time information in that table to purge pages based on start/end times. This would solve another problem...

However, extensions must be ready and use such an API to make use of this.

An API could look like this: <PHP> :
 
 
// $row['uid'] is the UID of the record (a tt_news record, for example).
// registerDisplayRecord is part of piBase.
// Since it's part of piBase, you can get the prefixId to know which extension is responsible.
 
$this->registerDisplayRecord($row['uid']);
 

registerDisplayRecord will then store the current page uid, the prefixId and the record uid in a separate table.

A table like that could look like this <SQL> :
 
pageuid         integer
recorduid       integer
recordprefixid  varchar
starttime       integer  -- taken from the original record (somehow)
endtime         integer  -- taken from the original record (somehow)
changed         boolean
 
Note: this can be a problem with workspaces.


With the above, the system will 'know' which pages to delete from the cache based on changed records. In the BE or FE you simply call another API with the same parameters that tells whether a record has been changed or not. Using a cron job you can then purge pages at an x-minute interval, or purge them during a record update in the BE/FE.
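
The flow of registering displayed records, flagging a change and purging later could be modelled roughly as follows. This is a sketch under assumptions, not a proposed implementation: the class DisplayRegistrySketch and the methods markChanged() and collectPagesToPurge() are hypothetical names, the page uid and prefixId are passed explicitly instead of being taken from piBase, and the starttime/endtime columns are left out.

```php
<?php
// Hypothetical model of the proposed table; names are illustrative, not TYPO3 API.
class DisplayRegistrySketch
{
    // Each row mirrors the table: pageuid, recorduid, recordprefixid, changed.
    private $rows = [];

    // Called while rendering: remember that $recordUid is displayed on page $pageUid.
    public function registerDisplayRecord(int $pageUid, string $prefixId, int $recordUid): void
    {
        $this->rows[] = [
            'pageuid' => $pageUid,
            'recorduid' => $recordUid,
            'recordprefixid' => $prefixId,
            'changed' => false,
        ];
    }

    // Called from BE/FE when a record was edited: flag all rows that show it.
    public function markChanged(string $prefixId, int $recordUid): void
    {
        foreach ($this->rows as &$row) {
            if ($row['recordprefixid'] === $prefixId && $row['recorduid'] === $recordUid) {
                $row['changed'] = true;
            }
        }
        unset($row); // break the reference left over from the loop
    }

    // What a cron job would run: returns the page uids to purge and forgets their rows.
    public function collectPagesToPurge(): array
    {
        $pages = [];
        foreach ($this->rows as $row) {
            if ($row['changed']) {
                $pages[$row['pageuid']] = true;
            }
        }
        $this->rows = array_values(array_filter(
            $this->rows,
            function (array $row) use ($pages) {
                return !isset($pages[$row['pageuid']]);
            }
        ));
        return array_keys($pages);
    }
}
```

Rows of a purged page are dropped completely in this sketch, on the assumption that the page re-registers its records the next time it is rendered.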


Ernesto: If it were that easy, we would have it already. Your proposed table is exactly what I was trying to propose, except that your solution assumes that rendered pages depend only on the status of specific database records, while my proposal assumes that it is up to the extension to "know" which records affect which parts of the cache.

M-N table

Daniel B: an additional M:N-relation-table so that more than 1 register is possible

ext_conf_template.txt

Daniel B: It could be handled by ext_conf_template.txt, because the extension author knows best which CGI-var combinations should be cached, and that way an admin can change it easily.

nc_staticfilecache

This optimizes the performance for cached pages: nc_staticfilecache (contact: netcreators)

examples

Ries: Caching does not depend only on records, so I would not delegate that task to the core. That would only handle pure "single views" where just one record of one extension is shown.

Here a bunch of examples:

- list view of news, which displays the latest 5 news in category ABC: caching depends on records that are placed in category ABC and not on a record in particular

- product teasers that are attached to a certain news record. Maybe two are selected at random from all records that are linked to that news record for display.

- a news list view with paginator (all cached). The paginator is of course dependent on the total number of records that were selected by the news extension and not on a particular news uid.

- etc

The information about exactly which records were rendered during the creation of a page might be interesting, but it is not the most relevant thing when the extension needs to decide which pages to "obsolete".


Example for using cache for own Extensions

This example is from Ernesto, written in News todo: write about Data exchange between USER/USER_INT Ernesto Baschny messageID | info-text

You have a method in some class that is able to calculate something very fancy but takes lots of processing to do it on every hit, so you want that to be cached. For example:

<PHP> :
 
function getBlurbExpensiveOperation($id) {
   // here comes the code that is expensive and will be cached
   ....
   return $blurb;
}
 

So you make a wrapper around it, and start using that instead of the "expensive" operation:

<PHP> :
 
function getBlurb($id) {
   $hash = md5('blurb'.$id);
   $blurb = $GLOBALS['TSFE']->sys_page->getHash($hash);
   if (!$blurb) {
      // not found in cache, re-calculate
      $blurb = $this->getBlurbExpensiveOperation($id);
   }
   // always store, which refreshes the timestamp of the entry
   $GLOBALS['TSFE']->sys_page->storeHash($hash, $blurb, 'blurb-'.$id);
   return $blurb;
}
 

In this case I try to get it from cache, if it is not there, I just calculate it again and at the end store it back in the cache table. I do this even if it was already cached, because this will refresh the timestamp which you might use to "expire" old cached entries.

The ident field ("blurb-$id") is just for information; it is not used anywhere. But it makes it easy to find and identify your cached entries when using phpMyAdmin, and it might even be used to "clean" your own plugin's cache.

Note: getHash($hash, $expTime=0) has an optional second parameter that gives the expiry time in seconds. If set to 0 (default), the hash will never expire.
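
To try the get-or-compute pattern outside of TYPO3, the following sketch replaces the cache table with an in-memory array. The class HashStoreSketch, the function getBlurbCached(), the $compute callback and the extra $now parameters are all hypothetical and exist only to make the example self-contained; only the method names getHash/storeHash and the meaning of $expTime follow the text above.

```php
<?php
// Hypothetical in-memory stand-in for the getHash()/storeHash() pair; not TYPO3 code.
class HashStoreSketch
{
    private $rows = []; // hash => ['content' => ..., 'ident' => ..., 'tstamp' => ...]

    // Returns the cached content, or null when it is missing or older than $expTime seconds.
    // $expTime = 0 means "never expires". $now is injectable only to make testing easy.
    public function getHash(string $hash, int $expTime = 0, ?int $now = null)
    {
        $now = $now ?? time();
        if (!isset($this->rows[$hash])) {
            return null;
        }
        $row = $this->rows[$hash];
        if ($expTime > 0 && $row['tstamp'] <= $now - $expTime) {
            return null;
        }
        return $row['content'];
    }

    // Stores (or overwrites) an entry and stamps it with the current time.
    public function storeHash(string $hash, string $content, string $ident, ?int $now = null): void
    {
        $this->rows[$hash] = ['content' => $content, 'ident' => $ident, 'tstamp' => $now ?? time()];
    }
}

// Get-or-compute wrapper; $compute stands in for the expensive operation.
function getBlurbCached(HashStoreSketch $store, int $id, callable $compute, int $expTime = 0): string
{
    $hash = md5('blurb' . $id);
    $blurb = $store->getHash($hash, $expTime);
    if ($blurb === null) {
        // Not found in cache (or expired): re-calculate.
        $blurb = $compute($id);
    }
    // Stored on every call so the timestamp is refreshed, as described above.
    $store->storeHash($hash, $blurb, 'blurb-' . $id);
    return $blurb;
}
```

One consequence of refreshing the timestamp on every call is that an entry which is read more often than its expiry interval is never recalculated; whether that is desirable depends on the plugin.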