How to pre-load the cache on a high-end PC?

FAQ, getting help, user experience about PrimoCache
RobF
Level 3
Posts: 14
Joined: Fri Oct 04, 2013 3:12 pm

Re: How to pre-load the cache on a high-end PC?

Post by RobF »

I use this simple, very basic Python script to load my cache. I run it on Python 2.7.11 in a cmd.exe window.

Indentation matters in Python, so take care to preserve it.

I called it loader.py but you can call it whatever you want.

First argument is the starting directory you want to cache. It will walk recursively down from there.
Second argument is the file pattern to read.
I use this to load up all the WoW files before starting the game.

C:\Python27> python loader.py L:\WoW\Data *.*
Python script loader.py

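import fnmatch
import os
import sys

def find_files(directory, pattern):
    # Walk the tree below `directory` and yield each file whose name matches `pattern`.
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                yield os.path.join(root, basename)

# Read in 256 MiB chunks. The data is thrown away; the reads only exist to populate the cache.
bufsize = 1024 * 1024 * 256

for filename in find_files(sys.argv[1], sys.argv[2]):
    sys.stdout.write('Loading: %s ' % filename)
    # The with-block closes the file even if a read fails part-way through.
    with open(filename, 'rb') as fh:
        while fh.read(bufsize):
            # One dot per chunk read.
            sys.stdout.write('.')
            sys.stdout.flush()
    sys.stdout.write(' done\n')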
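
A note on the second argument: the pattern is matched with Python's fnmatch module, not by cmd.exe. The short sketch below shows which names a few patterns select. The file names are made up for illustration and are not taken from a real WoW install.

```python
import fnmatch

def matching(names, pattern):
    """Return the names that fnmatch accepts for the given pattern."""
    return [n for n in names if fnmatch.fnmatch(n, pattern)]

# Hypothetical file names, for illustration only.
names = ["common.MPQ", "patch.MPQ", "readme.txt", "LICENSE"]

print(matching(names, "*.*"))    # names that contain a dot
print(matching(names, "*.MPQ"))  # only the .MPQ files
print(matching(names, "*"))      # every name, including ones without a dot
```

So with fnmatch, *.* skips files that have no extension; use * if you want those read as well.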
Axel Mertes
Level 9
Posts: 180
Joined: Thu Feb 03, 2011 3:22 pm

Re: How to pre-load the cache on a high-end PC?

Post by Axel Mertes »

I would have expected that all reads go automatically into L2 unless L2 is filled. And if L2 is filled, a "last used first out" approach should decide which block of L2 is discarded to make space for what you are just reading. That's the expectation.

I have to think about whether I can verify the correct behaviour...
RobF
Level 3
Posts: 14
Joined: Fri Oct 04, 2013 3:12 pm

Re: How to pre-load the cache on a high-end PC?

Post by RobF »

Axel Mertes wrote:I would have expected that all reads go automatically into L2 unless L2 is filled. And if L2 is filled, a "last used first out" approach should decide which block of L2 is discarded to make space for what you are just reading. That's the expectation.

I have to think about whether I can verify the correct behaviour...

L1 first, then the "old stuff" from L1 goes into L2.

I've never seen L2 fill, and I think that is a problem with the algorithm, but my understanding is that data doesn't get to L2 until you reboot or it expires from L1.
Axel Mertes
Level 9
Posts: 180
Joined: Thu Feb 03, 2011 3:22 pm

Re: How to pre-load the cache on a high-end PC?

Post by Axel Mertes »

Yep, the algorithm may try doing things we don't expect or want from it.

A least-used, first-out approach is simple to understand, reflects most users' needs and would do the job exactly as expected IMHO.

What do you think?
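
To make the expectation concrete, here is a minimal sketch of that policy, reading "least used" as "least recently used". It is a toy model only and says nothing about how PrimoCache actually works; the class name LRUBlockCache and the load_from_disk callback are invented for the example.

```python
from collections import OrderedDict

class LRUBlockCache(object):
    """Toy block cache that evicts the least recently used block when full.

    Assumes capacity >= 1. Not PrimoCache's real algorithm.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> data, least recently used first

    def read(self, block_id, load_from_disk):
        if block_id in self.blocks:
            # Cache hit: take the block out so it can be re-inserted as newest.
            data = self.blocks.pop(block_id)
        else:
            # Cache miss: fetch from the slow disk, evicting only if the cache is full.
            data = load_from_disk(block_id)
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # drop the least recently used block
        self.blocks[block_id] = data  # most recently used goes to the end
        return data

cache = LRUBlockCache(capacity=2)
load = lambda block_id: "data-%d" % block_id  # stand-in for a real disk read
for block in (1, 2, 1, 3):
    cache.read(block, load)
print(list(cache.blocks))  # [1, 3]: block 2 was the least recently used
```

Nothing is ever dropped while there is free space, and every read lands in the cache, which is the behaviour described above.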
gregfreemyer
Level 3
Posts: 16
Joined: Tue Jan 10, 2017 12:34 am

Re: How to pre-load the cache on a high-end PC?

Post by gregfreemyer »

Axel Mertes wrote:I would have expected that all reads go automatically into L2 unless L2 is filled. And if L2 is filled, a "last used first out" approach should decide which block of L2 is discarded to make space for what you are just reading. That's the expectation.

I have to think about whether I can verify the correct behaviour...

From my understanding, data doesn't move from L1 to L2 unless there is a period with no disk I/O. (Disk I/O to what, I don't know.)

If you go into the Cache Config screen of PrimoCache and click on the disk stack icon to the right of the L2 size pull-down, you see the "Gather Interval" option. I have it set at "Fastest (1) sec". I believe that means that when there is a second of no disk I/O, the L1 cache gets flushed to L2.

That may work with a lot of workloads, but it doesn't for my efforts to preload, and I can't see why the Python script would work any better (but I haven't tried it).

To attempt to do a preload, I:

- remove other volumes from what is being cached to ensure plenty of available L2 cache space
- connect a USB drive that is new to the system
- add the new volume to the set of drives being cached
- start a Cygwin shell and cd to the directory full of files I want to cache. Often the dir will have 500GB or more of data in it. Last week I had one with 900GB in it.
- from Cygwin: "cat * | dd of=/dev/null ibs=5M status=progress"

Then inside PrimoCache 2.7.0 I can monitor the volume reads and the L2 cache writes.

If I take no further actions, most of the 500GB of files read is not cached. I have to induce delays every 60GB or so to let the L1 cache flush to the L2 cache during idle time.
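
For anyone who wants to automate those delays, here is a rough sketch of the idea in Python: read everything under a directory and sleep after every so many bytes. The 60 GB threshold comes from the figure above; the 30 second pause, the 64 MiB chunk size and the function name preload are guesses for illustration, and whether an idle pause really triggers the L1 to L2 flush is only my assumption.

```python
import os
import time

CHUNK = 64 * 1024 * 1024       # read size per call (assumed value)
PAUSE_EVERY = 60 * 1024 ** 3   # pause after roughly 60 GB, as described above
PAUSE_SECONDS = 30             # assumed idle time; tune by watching the L2 write counter

def preload(directory, pause_every=PAUSE_EVERY, pause_seconds=PAUSE_SECONDS,
            chunk=CHUNK, sleep=time.sleep):
    """Read every file under `directory`, sleeping after each `pause_every` bytes.

    Returns (total bytes read, number of pauses taken).
    """
    total = 0
    since_pause = 0
    pauses = 0
    for root, dirs, files in os.walk(directory):
        for name in files:
            with open(os.path.join(root, name), 'rb') as fh:
                while True:
                    data = fh.read(chunk)
                    if not data:
                        break
                    total += len(data)
                    since_pause += len(data)
                    if since_pause >= pause_every:
                        # Stop reading for a while so the disk goes idle.
                        sleep(pause_seconds)
                        pauses += 1
                        since_pause = 0
    return total, pauses
```

Call it as preload('/cygdrive/e/evidence') or preload('E:\\evidence'), where the path is just a placeholder for the directory you want cached.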
gregfreemyer
Level 3
Posts: 16
Joined: Tue Jan 10, 2017 12:34 am

Re: How to pre-load the cache on a high-end PC?

Post by gregfreemyer »

To my surprise, the Python script RobF posted does the job. It must use a different API than the process I use from Cygwin.
Axel Mertes
Level 9
Posts: 180
Joined: Thu Feb 03, 2011 3:22 pm

Re: How to pre-load the cache on a high-end PC?

Post by Axel Mertes »

While I understand that RobF is preloading for his gaming experience, I wonder why you are attempting to preload for your forensic analysis (right?). I would expect that it's enough to access the data on request, and step by step it gets cached automatically.

When we push jobs to our render farm, we create massive amounts of data. Unfortunately, that data is not copied to the L2 cache when it is written to the cached network drive (V3 should change this). But after rendering we replay the data, it gets cached, and further reads are usually fed from the L2 cache. We connect to the server via 4 * 10 GbE links; workstations have 10 GbE links. This is really high bandwidth, with usually 500-600 MByte/s per link on reads and >900 MByte/s per link on writes.

A cache scheme which drops data from the L1 cache before copying it to the L2 cache is IMHO a bit stupid, as it would force subsequent reads from HDD for data that has already been read and copied into cache. As long as there is free L2 cache, no single block should be dropped from cache, ever.