Size of L1 and L2 cache index

FAQ, getting help, user experience about PrimoCache
Axel Mertes
Level 9
Posts: 184
Joined: Thu Feb 03, 2011 3:22 pm

Size of L1 and L2 cache index

Post by Axel Mertes »

Hi Support,

I wonder how many bytes are needed per cached block for actually indexing it.

I would think you store the disk block address, the RAM/L1 and SSD/L2 address, possibly a count of how often the block has been read since it entered the cache, and probably a timestamp of the most recent access.

I am currently looking into other storage options and was told that e.g. FreeNAS ZFS would require an amazing 180, 200 or even 300 bytes per record (values differ, depending on which expert is talking to you... ;-) ). I find that astonishing, because in the worst case one might use 512-byte blocks and then need half of that again just for the index?!?! The point is that they say a 4 TByte SSD cache for a larger RAID would require something like 768 GByte of RAM to work.

So I wonder how much RAM PrimoCache actually needs to index e.g. a 4 TByte cache with a block size of say 4KB?
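For reference, here is a back-of-envelope sketch (Python) of how such an index scales. The per-entry byte counts are just the quoted ZFS figures, purely illustrative, not anything confirmed for PrimoCache:

# Assumed linear model: index RAM = entries * bytes_per_entry.
# The per-entry figures below are the quoted ZFS numbers, not PrimoCache's.

def index_ram(cache_bytes, block_bytes, bytes_per_entry):
    """RAM needed to index a cache, assuming a fixed cost per block entry."""
    entries = cache_bytes // block_bytes
    return entries * bytes_per_entry

TiB = 1024**4
GiB = 1024**3

for bpe in (180, 200, 300):
    ram = index_ram(4 * TiB, 4 * 1024, bpe)
    print(f"{bpe} B/entry @ 4 KByte blocks: {ram / GiB:.0f} GiB")
# -> 180, 200 and 300 GiB respectively for a 4 TByte cache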

Btw, I read in some threads that since 1.01 the L2 cache no longer seems to work as before, using only a small portion of it. Could the issue be that the index can't grow big enough?
In fact, if you can't index a complete SSD, you can only use a part of the SSD for caching. My feeling in the past was that this was NEVER an issue. But I never tested with a larger SSD like 1 to 4 TByte so far.

Regards,
Axel
Axel Mertes
Level 9
Posts: 184
Joined: Thu Feb 03, 2011 3:22 pm

Re: Size of L1 and L2 cache index

Post by Axel Mertes »

I just found that PrimoCache is showing me a "Memory Overhead" value.

Here are some example values I got:

16384 MByte @ 512 KByte block = 32,768 blocks = 8.11 MByte Memory Overhead
8192 MByte @ 512 KByte block = 16,384 blocks = 4.98 MByte Memory Overhead
1024 MByte @ 512 KByte block = 2,048 blocks = 2.25 MByte Memory Overhead
This yields an increment of 200 bytes per block and a base use of 1.85 MByte for whatever else.

16384 MByte @ 4 KByte block = 4,194,304 blocks = 557.57 MByte Memory Overhead
8192 MByte @ 4 KByte block = 2,097,152 blocks = 397.57 MByte Memory Overhead
1024 MByte @ 4 KByte block = 262,144 blocks = 257.57 MByte Memory Overhead
This yields an increment of 80 bytes per block and a VERY STRANGE base use of 237.57 MByte for whatever?


For some strange reason it has a fixed minimum size plus apparently 80 bytes per cached block, so I assume that the index entries take 80 bytes each.
Given a 4 KByte block, that's a ratio of 4,096:80 or 51.2:1.
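If you want to reproduce the arithmetic, here is a small Python sketch of the linear fit, using only the Memory Overhead values measured above:

MiB = 1024**2

def fit(blocks_a, mib_a, blocks_b, mib_b):
    """Slope (bytes/block) and base (MiB) of the assumed linear overhead model."""
    slope = (mib_a - mib_b) * MiB / (blocks_a - blocks_b)
    base = mib_a - blocks_a * slope / MiB
    return slope, base

print(fit(32768, 8.11, 16384, 4.98))          # -> (~200 bytes/block, ~1.85 MiB base)
print(fit(4194304, 557.57, 2097152, 397.57))  # -> (80.0 bytes/block, ~237.57 MiB base)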

I don't understand why the index entry is bigger at 512 KByte block size. What is different?
OK, the ratio is much better, because it's 524,288:200 or 2,621.44:1.

Assuming the index entry size is the same for an SSD L2 cache at 4 KByte block size, I would expect to need 20 GBytes of RAM to index a 1 TByte cache, right?
So it would make sense to upgrade to Windows Server 2012 to go well beyond the 32 GByte barrier.
Using a 512 KByte block size, we might get away with just 419,430,400 bytes or 400 MByte for the index. Wow, that's a difference.
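A quick sanity check of those two numbers in Python, assuming the per-entry costs measured above (80 bytes @ 4 KByte blocks, 200 bytes @ 512 KByte blocks) also hold for an L2 index:

TiB, GiB, MiB = 1024**4, 1024**3, 1024**2

entries_4k = TiB // (4 * 1024)        # 268,435,456 blocks in a 1 TByte cache
print(entries_4k * 80 / GiB)          # -> 20.0 GiB of index RAM

entries_512k = TiB // (512 * 1024)    # 2,097,152 blocks
print(entries_512k * 200 / MiB)       # -> exactly 400.0 MiB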

In that context:
We have 48 GBytes of RAM in the server, but Windows Server 2008 R2 only allows us to use 32 GByte, so 16 GBytes sit unused. Is there a chance to make them usable in PrimoCache?
If yes, I would even consider increasing this shadow RAM.

Best regards,
Axel
InquiringMind
Level SS
Posts: 477
Joined: Wed Oct 06, 2010 11:10 pm

Re: Size of L1 and L2 cache index

Post by InquiringMind »

Axel Mertes wrote:This yields an increment of 80 bytes per block and a VERY STRANGE base use of 237.57 MByte for whatever?
An index has to be maintained to show what each cache block contains along with relevant usage statistics (last updated, hit-rate, etc). Such an index will require memory in proportion to the number of blocks available, so fewer (larger) blocks will mean a smaller index.
Axel Mertes wrote:We have 48 GBytes of RAM in the server, but Windows Server 2008 R2 just allows us to use 32 GByte. 16 GBytes are unused. Is there a chance to make them useable in PrimoCache?
If yes, I would even consider increasing this shadow RAM.
Have you tried enabling invisible memory support? This should allow you to bypass 64-bit Windows' licensing limits on RAM usage as well as the 4 GB limit on 32-bit versions.
Axel Mertes
Level 9
Posts: 184
Joined: Thu Feb 03, 2011 3:22 pm

Re: Size of L1 and L2 cache index

Post by Axel Mertes »

The question is about the exact size of the index entry per block. I would not expect this size to change at all, because the index data is rather fixed.
Given the bytes per index entry per block, we can estimate the maximum cache size possible with the available RAM.
See, if I add a 4 TByte cache to a 48 TByte volume, I possibly need hundreds or even thousands of GBytes of RAM just for the index, depending on block size. Given the default small block size of 4 KByte, the index can get really large.

So I want to calculate the sweet spot where the amount of RAM and the cache size are well balanced.
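As a sketch of that calculation (Python; it assumes the two per-entry costs measured above carry over unchanged, and the 16 GiB RAM budget is just an example figure):

TiB, GiB = 1024**4, 1024**3

measured = {4 * 1024: 80, 512 * 1024: 200}   # block size -> bytes per index entry

ram_budget = 16 * GiB                        # hypothetical RAM reserved for the index
for block, bpe in measured.items():
    max_cache = ram_budget // bpe * block
    print(f"{block // 1024} KByte blocks: up to {max_cache / TiB:.1f} TiB of cache")
# -> ~0.8 TiB at 4 KByte blocks, ~41.0 TiB at 512 KByte blocks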

I hope we can try the invisible memory option on the server in the next few days.
AlienTech
Level 1
Posts: 2
Joined: Tue May 19, 2015 7:05 am

Re: Size of L1 and L2 cache index

Post by AlienTech »

I had a question regarding the block size. If the cluster size is 4K and the block size is 512K, what happens to the 508K that's left over in the block after caching that cluster on a read? Or does the program read the 128 sequential clusters to make up the 512K block? And what happens if a program wants data from, say, the middle of this cluster segment?

Since using a small block size makes the memory requirements huge, it stands to reason that using a large block size is the way to go. But block access only has an advantage IF the files are stored sequentially, because a large block size with fragmented files would pull unwanted data into the cache.

I did a test with 64K NTFS clusters and a 512K block size and was getting 80% hit rates. But I had defragmented the drive first to make sure files were stored sequentially.

I was trying to make memory usage as low as possible, and this is one way to do it. But a lot of small fragmented files would make this unfeasible: even with an 80% hit rate, the reads from the cache were only 1 GB while the stored data was 6 GB, so it is wasting a lot of cache. Still, since the data consists of sequential clusters, it stands to reason that what is stored in the cache will come in useful at some point. Unless a lot of fragmented writes also happen, in which case the cache blocks would be marked dirty and reread whenever any cluster in the area is rewritten; even if the particular cluster you are using still holds valid data, it has to be reread because some other data in the area changed. I could not find much technical info on things like this, but you need to know such things to choose optimum block/cluster sizes and deal with fragmentation issues.

Since Microsoft uses 256K file reads for its cache, it stands to reason that this is the optimum read size. But a file-based read doesn't have to worry about fragmentation the way a block-based cache does. I would think databases that use fixed record sizes would not have to worry about fragmentation issues as much, but for variable-sized records, a large block size would waste just too much cache space to be useful.

Something else I noticed: Total Read shows 5 GB, but L2 Storage Write shows 7 GB. So did it waste 2 GB of cache because the block size was so large, i.e. the extra data read to fill out the 512K blocks?
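One way to read those counters (the interpretation that whole 512K blocks are filled on every miss is my assumption, not confirmed PrimoCache behaviour), as a Python sketch:

requested_gb = 5.0     # Total Read: what applications actually asked for
l2_written_gb = 7.0    # L2 Storage Write: what was written into the cache

print(f"amplification ~{l2_written_gb / requested_gb:.1f}x "
      f"({l2_written_gb - requested_gb:.0f} GB of neighbouring data cached)")

# Worst case, if only one 4K cluster per 512K block were ever touched:
print(f"worst case: {512 // 4}x")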
InquiringMind
Level SS
Posts: 477
Joined: Wed Oct 06, 2010 11:10 pm

Re: Size of L1 and L2 cache index

Post by InquiringMind »

AlienTech wrote:I had a question regarding the block size. If the cluster size is 4K and the block size is 512K, what happens to the 508K that's left over in the block after caching that cluster on a read? Or does the program read the 128 sequential clusters to make up the 512K block? And what happens if a program wants data from, say, the middle of this cluster segment...
You can test this using CrystalDiskMark or other disk benchmarking software: set the test file size so it is less than PrimoCache's L1 cache, then look at the 4K read/write results (which simulate worst-case fragmentation). If they come close to the sequential results, that indicates that large blocks are being fully used (the remaining 508K is stored and available for subsequent reads rather than being discarded). If PrimoCache discards the 508K, then the 4K read/write results should show only a marginal improvement over an uncached disk.