Age | Commit message | Author |
|
Major changes:
- compute cycle id (cycle minimum) in parallel to future state
- skip computation of cycle stats for non-canonical cycles
Minor changes:
- move computation of several arrays around to improve locality and
reduce random memory reads
- distinguish between State and StateIter: the latter can contain
maxState + 1 and is used in loops and as a pointer.
- rename maxState to numState, add actual maxState
- time performance of memory allocation
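A minimal serial sketch of what "cycle id (cycle minimum)" could mean, assuming the state space is described by a successor table: every state on a cycle is labelled with the smallest state on that cycle. The names `next`, `cycleIds` and `kNoCycle` are hypothetical and not taken from the code base, and the parallel computation mentioned above is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using State = std::uint32_t;

// Marker for states that do not lie on any cycle (hypothetical name).
const State kNoCycle = std::numeric_limits<State>::max();

// For a successor table `next` (next[s] is the state following s), return
// for each state on a cycle the minimum state of that cycle, and kNoCycle
// for states that only lead into a cycle.
std::vector<State> cycleIds(const std::vector<State>& next) {
    const std::size_t n = next.size();
    // 0 = unvisited, 1 = on the path currently being walked, 2 = finished
    std::vector<std::uint8_t> color(n, 0);
    std::vector<State> id(n, kNoCycle);

    for (std::size_t i = 0; i < n; ++i) {
        const State start = static_cast<State>(i);
        if (color[start] != 0) continue;

        // Walk forward until a previously seen state is reached.
        State s = start;
        while (color[s] == 0) {
            color[s] = 1;
            assert(next[s] < n);  // successor must be a valid state
            s = next[s];
        }

        if (color[s] == 1) {
            // s closes a new cycle: find its minimum, then label the cycle.
            State m = s;
            for (State t = next[s]; t != s; t = next[t]) {
                if (t < m) m = t;
            }
            id[s] = m;
            for (State t = next[s]; t != s; t = next[t]) id[t] = m;
        }

        // Mark the walked path as finished.
        for (State t = start; color[t] == 1; t = next[t]) color[t] = 2;
    }
    return id;
}
```

Under this labelling, a cycle is identified by a single state, so per-cycle statistics only need to be computed once, at the state whose id equals itself.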
|
|
The default value should run quickly, so it is easy to see what the program does.
|
|
In addition to being broken by design, it was order_s_ of magnitude too
slow. Adding cores to the computation increased runtime.
|
|
|
|
Replacing bitset with packed_array is necessary to allow parallel
access to proximate array cells.
|
|
|
|
Ensure that each parallel thread processes at least maxWorkdSize
elements. This guarantees that, even for packed arrays of bit-sized
elements, thread boundaries lie on machine word boundaries.
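One way to obtain word-aligned thread boundaries, sketched under assumptions: round each thread's chunk length up to a whole number of machine words. The 64-bit word size and the names `kWordBits`, `elementsPerWord` and `chunkSize` are hypothetical; the actual code may enforce the minimum differently.

```cpp
#include <cassert>
#include <cstddef>

// Assumed machine word width in bits (hypothetical constant).
const std::size_t kWordBits = 64;

// Number of packed elements that share one machine word.
// The element width is assumed to divide the word width evenly.
inline std::size_t elementsPerWord(std::size_t bitsPerElement) {
    assert(bitsPerElement > 0 && kWordBits % bitsPerElement == 0);
    return kWordBits / bitsPerElement;
}

// Chunk length per thread: ceil(numElements / numThreads), rounded up to
// a multiple of elementsPerWord, so that every chunk starts on a word
// boundary and no two threads write to the same word.
inline std::size_t chunkSize(std::size_t numElements,
                             std::size_t numThreads,
                             std::size_t bitsPerElement) {
    assert(numThreads > 0);
    const std::size_t perWord = elementsPerWord(bitsPerElement);
    const std::size_t raw = (numElements + numThreads - 1) / numThreads;
    return (raw + perWord - 1) / perWord * perWord;
}
```

Thread k would then process the elements from k * chunkSize up to the smaller of (k + 1) * chunkSize and numElements; trailing threads may receive no work when the array is small.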
|