My experience has been that sequential performance improves noticeably with larger block sizes. Might you have been talking about random, rather than sequential, access?
With a 1MB test length, sequential performance can indeed improve with larger block sizes, because a larger block size reduces the CPU time spent processing requests. However, with a small test length such as 128KB or 64KB, especially single-threaded, sequential performance is best when the block size equals the cluster size.
I guess my only other question along this topic is: if using a larger block size, what effect might that have on cached writes (especially smaller writes) to the underlying HDD? (I think I mentioned earlier that I am using a 180GB NVMe SSD to cache a 1TB 7200rpm HDD. This cache task also caches the drive with a 4GB L1 cache. Both caches are shared, not separate read/write, both use 4KB block sizes, and Defer-Write is enabled. I don't know if any of this other information matters, but I thought I would include it. My question is still mainly about the effect of larger block sizes on cached writes. For instance, does PrimoCache have to write out the entire larger cache block every time a small write occurs to one of the clusters in the block?)
No, since v3.0 only the dirty data within a cache block is written back, not the entire block.
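To illustrate the idea in that answer, here is a small sketch of how dirty-only write-back can work when a cache block spans multiple clusters. This is a hypothetical model, not PrimoCache's actual implementation: the class, the per-cluster dirty bitmap, and the 64KB/4KB sizes are all assumptions for illustration.

```python
# Hypothetical sketch (NOT PrimoCache's real code): a cache block larger
# than the cluster size tracks which clusters are dirty with a bitmap,
# so a flush writes back only the modified clusters.

CLUSTER_SIZE = 4096           # assumed filesystem cluster size (4 KB)
BLOCK_SIZE = 64 * 1024        # assumed cache block spanning 16 clusters
CLUSTERS_PER_BLOCK = BLOCK_SIZE // CLUSTER_SIZE

class CacheBlock:
    def __init__(self):
        self.data = bytearray(BLOCK_SIZE)
        self.dirty = [False] * CLUSTERS_PER_BLOCK  # per-cluster dirty flags

    def write(self, offset, payload):
        """Apply a small write and mark only the touched clusters dirty."""
        self.data[offset:offset + len(payload)] = payload
        first = offset // CLUSTER_SIZE
        last = (offset + len(payload) - 1) // CLUSTER_SIZE
        for i in range(first, last + 1):
            self.dirty[i] = True

    def flush(self, write_cluster):
        """Write back only dirty clusters, then clear their flags."""
        for i, is_dirty in enumerate(self.dirty):
            if is_dirty:
                start = i * CLUSTER_SIZE
                write_cluster(start, self.data[start:start + CLUSTER_SIZE])
                self.dirty[i] = False

# A 4 KB write into a 64 KB block dirties a single cluster, so the flush
# issues one 4 KB write instead of rewriting the whole 64 KB block.
blk = CacheBlock()
blk.write(8192, b"x" * 4096)
written = []
blk.flush(lambda off, buf: written.append((off, len(buf))))
print(written)  # → [(8192, 4096)]
```

Under this model a larger block size does not by itself inflate write-back traffic for small writes; it mainly changes bookkeeping granularity and per-request CPU overhead.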