I've seen optimizations to the Sieve of Eratosthenes that (claim to) use "wheel factorization". If the goal is to generate a list of prime numbers up to a certain value, how exactly is wheel factorization used? The Wikipedia article contains some information, but it doesn't make sense to me.
For example, sieve up to $15$: $\{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15\}$
Starting with 2, strike off its multiples: $\{1,2,3,\_,5,\_,7,\_,9,\_,11,\_,13,\_,15\}$
Then strike off multiples of 3: $\{1,2,3,\_,5,\_,7,\_,\_,\_,11,\_,13,\_,\_\}$
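To make the two strike-off steps above concrete, here is a minimal Python sketch of the plain sieve as I understand it. The names `sieve_steps` and `show` are my own, made up for illustration, and `None` stands in for the `_` placeholder.

```python
def sieve_steps(limit, primes):
    """Return 1..limit after striking the proper multiples of each given prime."""
    numbers = list(range(1, limit + 1))
    for p in primes:
        # Start at 2*p so the prime itself is kept.
        for m in range(2 * p, limit + 1, p):
            numbers[m - 1] = None  # None plays the role of "_"
    return numbers


def show(numbers):
    """Format the list the same way as the sets written above."""
    return ",".join("_" if n is None else str(n) for n in numbers)


print(show(sieve_steps(15, [2])))     # after striking multiples of 2
print(show(sieve_steps(15, [2, 3])))  # after also striking multiples of 3
```

Note that 6 and 12 get written to twice here (once for 2, once for 3), which is the kind of repeated work I am asking about.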
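For wheel factorization with base primes $2$ and $3$ (circumference $6$), the idea is that the multiples of $2$ or $3$ occur periodically: three in a row ($6k+2$, $6k+3$, $6k+4$), then a single one ($6k+6$), so only numbers of the form $6k+1$ and $6k+5$ remain as candidates.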
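If I read the wheel idea correctly, the numbers that survive a wheel of circumference 6 can be generated by starting at 5 and alternately stepping by 2 and 4. The sketch below is only my interpretation, not something taken from the Wikipedia article, and `wheel_candidates` is a hypothetical name.

```python
from itertools import cycle


def wheel_candidates(limit):
    """List the numbers in 5..limit that are divisible by neither 2 nor 3."""
    candidates = []
    n = 5
    # Steps alternate 2, 4, 2, 4, ...: 5 -> 7 -> 11 -> 13 -> 17 -> ...
    for step in cycle((2, 4)):
        if n > limit:
            break
        candidates.append(n)
        n += step
    return candidates


print(wheel_candidates(30))
```

For a limit of 30 the list still contains 25, so the wheel alone does not produce primes; some sieving of the candidates is evidently still required.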
So how are these two ideas "merged" when creating a list of prime numbers? Is it just that wheel factorization is used to create an initial list of candidates before sieving? That doesn't seem to save any time, because the SoE has the pitfall that it strikes off composites that have already been struck off. For example, 15 is struck off for both 3 and 5, so what good would wheel factorization of circumference 6 do?
Is anyone able to provide an example of wheel factorization used with a sieve?
TL;DR: How is wheel factorization used with sieving?