Age | Commit message | Author |
|
1. Introduce acanonize(state), which returns a canonical state and a
(bit-encoded) label indicating whether the given state was itself
canonical.
2. Use this labelled canonical state as the initial color of each
state. Now:
- every redundant cycle has the same color (except for the label)
- of each set of redundant cycles, exactly one has the label set
3. Use the extended label to gather loop statistics.
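
The idea of a canonical state plus a one-bit label can be sketched as follows. This is only an illustration, not the code from this commit: it assumes that states are small bit words and that two states are redundant when one is a bit rotation of the other. The names rotateLeft and canonizeSketch and the choice of rotation as the symmetry are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using State = std::uint32_t;

// Hypothetical symmetry: rotate the low `bits` bits of s left by one.
// Requires 1 <= bits < 32 and s < 2^bits.
inline State rotateLeft(State s, unsigned bits) {
    const State mask = (State{1} << bits) - 1;
    return ((s << 1) | (s >> (bits - 1))) & mask;
}

// Returns (canonical << 1) | label.
// canonical is the smallest rotation of s (an assumed definition);
// label is 1 if and only if s is already that smallest rotation.
inline State canonizeSketch(State s, unsigned bits) {
    State best = s;
    State r = s;
    for (unsigned i = 1; i < bits; ++i) {
        r = rotateLeft(r, bits);
        if (r < best) best = r;
    }
    const State label = (best == s) ? 1u : 0u;
    return (best << 1) | label;
}
```

With 4-bit states, 0b0011, 0b0110, 0b1100 and 0b1001 all map to the same canonical value 0b0011, and only 0b0011 itself gets the label bit set. Statistics can then be gathered once per equivalence class by looking only at labelled states.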
|
A bool is passed to the worker function to record whether any work has
been done in an iterState round. To prevent cache line bouncing, each
worker thread has its own cache-line-aligned bool; the per-thread
flags are merged at the end of the round.
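
A minimal sketch of the per-thread flag pattern, assuming C++17 (for over-aligned elements in std::vector) and a 64-byte cache line. PaddedFlag, worker and runRound are hypothetical names, and the worker body is a stand-in for the real iterState work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// One flag per thread, padded to an assumed 64-byte cache line so that
// writes by different threads never touch the same line.
struct alignas(64) PaddedFlag {
    bool value = false;
};

// Stand-in worker: zeroes nonzero entries in [begin, end) and records
// in didWork whether it changed anything.
void worker(std::vector<int>& data, std::size_t begin, std::size_t end,
            bool& didWork) {
    for (std::size_t i = begin; i < end; ++i) {
        if (data[i] != 0) {
            data[i] = 0;
            didWork = true;
        }
    }
}

// Runs one round on numThreads threads (numThreads >= 1) and returns
// whether any thread reported work.
bool runRound(std::vector<int>& data, unsigned numThreads) {
    std::vector<PaddedFlag> flags(numThreads);
    std::vector<std::thread> threads;
    const std::size_t chunk = (data.size() + numThreads - 1) / numThreads;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        threads.emplace_back(worker, std::ref(data), begin, end,
                             std::ref(flags[t].value));
    }
    for (auto& th : threads) th.join();
    // Merge the per-thread flags only after all threads have finished.
    bool any = false;
    for (const auto& f : flags) any = any || f.value;
    return any;
}
```

Because each flag is written by exactly one thread and read only after join(), no atomics are needed for the flags themselves.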
|
Major changes:
- compute cycle id (cycle minimum) in parallel to future state
- skip computation of cycle stats for non-canonical cycles
Minor changes:
- move computation of several arrays around to improve locality and
reduce random memory reads
- distinguish between State and StateIter: the latter can contain
maxState + 1 and is used in loops and as pointer
- rename maxState to numState, add actual maxState
- time performance of memory allocation
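
Why a separate StateIter type helps can be seen in a small sketch. The concrete widths below are assumptions chosen for illustration (an 8-bit State whose range is fully used); only the names State, StateIter, maxState and numState come from the commit message, and countStates is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using State = std::uint8_t;       // assumed: every value 0..255 is a valid state
using StateIter = std::uint16_t;  // wide enough to hold maxState + 1

constexpr State maxState = 255;                          // largest valid state
constexpr StateIter numState = StateIter{maxState} + 1;  // number of states

// Visits every state exactly once. A loop variable of type State could
// never reach the end value numState, because 256 does not fit in 8 bits
// and the counter would wrap around to 0.
unsigned countStates() {
    unsigned n = 0;
    for (StateIter it = 0; it < numState; ++it) {
        const State s = static_cast<State>(it);  // safe: it <= maxState here
        (void)s;
        ++n;
    }
    return n;
}
```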
|
The default value should run quickly, so that it is easy to see what
the program does.
|
In addition to being broken by design, it was order_s_ of magnitude too
slow. Adding cores to the computation increased the runtime.
|
Replacing bitset with packed_array is necessary to allow parallel
access to neighboring array cells.
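
The underlying problem is that neighboring small elements share a machine word, so every write is a read-modify-write of that word. The sketch below is a hypothetical minimal packed array, not the packed_array from this repository; it assumes 64-bit words and an element width that divides 64, and it makes the word layout explicit through perWord.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed array: Bits-wide elements stored in 64-bit words.
template <unsigned Bits>
class PackedArraySketch {
    static_assert(Bits >= 1 && Bits < 64 && 64 % Bits == 0,
                  "element width must divide 64");

public:
    // Number of elements that share one machine word.
    static constexpr std::size_t perWord = 64 / Bits;

    explicit PackedArraySketch(std::size_t n)
        : size_(n), words_((n + perWord - 1) / perWord, 0) {}

    std::uint64_t get(std::size_t i) const {
        return (words_[i / perWord] >> shift(i)) & mask;
    }

    // Read-modify-write of the whole word that contains element i.
    void set(std::size_t i, std::uint64_t v) {
        std::uint64_t& w = words_[i / perWord];
        w = (w & ~(mask << shift(i))) | ((v & mask) << shift(i));
    }

    std::size_t size() const { return size_; }

private:
    static constexpr std::uint64_t mask = (std::uint64_t{1} << Bits) - 1;
    static unsigned shift(std::size_t i) {
        return static_cast<unsigned>(i % perWord) * Bits;
    }
    std::size_t size_;
    std::vector<std::uint64_t> words_;
};
```

In this sketch two threads may safely write elements i and j only if i / perWord != j / perWord. Knowing perWord lets the caller split the index range so that no word is shared between threads.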
|
Ensure that each parallel thread processes at least maxWorkdSize
elements. This guarantees that, even for packed arrays of bit-sized
elements, thread boundaries lie on machine word boundaries.
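
One way to obtain word-aligned thread boundaries is to round the per-thread chunk size up to a multiple of the number of elements per word. The sketch below assumes 64-bit words holding 64 one-bit elements; kWordElems, chunkSize and chunkBegin are hypothetical names and not taken from the commit.

```cpp
#include <cassert>
#include <cstddef>

// Assumed: one 64-bit machine word holds 64 one-bit elements.
constexpr std::size_t kWordElems = 64;

// Per-thread chunk size for n elements on `threads` threads (threads >= 1),
// rounded up to a multiple of kWordElems.
constexpr std::size_t chunkSize(std::size_t n, std::size_t threads) {
    return ((n + threads - 1) / threads + kWordElems - 1)
           / kWordElems * kWordElems;
}

// First index handled by thread t, clamped to n. Every boundary that is
// smaller than n is a multiple of kWordElems.
constexpr std::size_t chunkBegin(std::size_t n, std::size_t threads,
                                 std::size_t t) {
    return t * chunkSize(n, threads) < n ? t * chunkSize(n, threads) : n;
}
```

Thread t then processes the half-open range from chunkBegin(n, threads, t) to chunkBegin(n, threads, t + 1). For small inputs some threads receive an empty range, which is the price for never sharing a word between two threads.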
|