I'll only tackle the first question:
In theory, can $C_S(P\oplus M)$ and $C_W(P)$ taken together be less secure than $C_S(M)$?
Yes, it can be less secure.
Argument: I'll build an explicit example, assuming $M$, $P$, ciphertexts and keys are all 16 bytes, $M$ is UTF-8 text right-padded with zero bytes, under an attack model where the adversary gets a single ciphertext.
Define an easily computed condition $f(X)$, where $X$ is a 16-byte block and $f(X)$ is either true or false, such that
- $f(M)$ is always true
- $f(X)$ is true with probability about $1/2$ for random $X$
A suitable $f$ is: every byte of $X$ is less than $\text{0xF5}=245$. That's because bytes $\text{0xF5}\ldots\text{0xFF}$ are reserved in UTF-8, and $(245/256)^{16}\approx1/2$.
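As a quick sanity check of this choice of $f$, here is a minimal Python sketch. The names `f` and `pad_utf8` are hypothetical helpers introduced only for illustration; they are not part of the original argument.

```python
def f(block: bytes) -> bool:
    """Condition f: every byte of the 16-byte block is below 0xF5."""
    if len(block) != 16:
        raise ValueError("expected a 16-byte block")
    return all(b < 0xF5 for b in block)


def pad_utf8(text: str) -> bytes:
    """Encode text as UTF-8 and right-pad it with zero bytes to 16 bytes."""
    raw = text.encode("utf-8")
    if len(raw) > 16:
        raise ValueError("text does not fit in one 16-byte block")
    return raw.ljust(16, b"\x00")


# Probability that f holds for a uniformly random 16-byte block.
p_random = (245 / 256) ** 16
```

Any well-formed UTF-8 message passes `f`, while `p_random` evaluates to roughly 0.495, which is the "about $1/2$" used in the rest of the argument.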
Denote by $E_K(X)$ the AES-128 encryption of $X$ under key $K$.
Construct $C_S$ for 16-byte key $K$ and block $X$ as follows:
- Set $K'$ to the all-0x00 or all-0xFF 16-byte block according to the low-order bit of $K$.
- If $f(X)$ holds,
  - then set $X\gets E_K(X)$ at least once, repeating until $f(X)$ holds again;
  - otherwise set $X\gets E_{K'}(X)$ at least once, repeating until $f(X)$ again does not hold.
- Output $X$.
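The construction above can be sketched in Python as follows. To keep the sketch runnable with the standard library only, `E` and `D` are a toy 4-round Feistel permutation built from SHA-256, used here as a stand-in for AES-128; that substitution, the function names, and the choice of "low-order bit of $K$" as the least significant bit of the last key byte (with 1 mapping to all-0xFF) are all assumptions of this sketch.

```python
import hashlib

BLOCK = 16


def f(x: bytes) -> bool:
    """Condition f: every byte is below 0xF5."""
    return all(b < 0xF5 for b in x)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel network (NOT AES, NOT secure).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def E(key: bytes, x: bytes) -> bytes:
    """Toy keyed permutation on 16-byte blocks, standing in for AES-128."""
    l, r = x[:8], x[8:]
    for i in range(4):
        l, r = r, _xor(l, _round(key, i, r))
    return l + r


def D(key: bytes, y: bytes) -> bytes:
    """Inverse of E."""
    l, r = y[:8], y[8:]
    for i in reversed(range(4)):
        l, r = _xor(r, _round(key, i, l)), l
    return l + r


def weak_key(key: bytes) -> bytes:
    """K': all-0x00 or all-0xFF, chosen by the low-order bit of K (assumed mapping)."""
    return (b"\xff" if key[-1] & 1 else b"\x00") * BLOCK


def C_S(key: bytes, x: bytes) -> bytes:
    """Cycle-walk under K if f(x) holds, under K' otherwise; f is preserved."""
    target = f(x)
    k = key if target else weak_key(key)
    x = E(k, x)
    while f(x) != target:
        x = E(k, x)
    return x


def C_S_inv(key: bytes, y: bytes) -> bytes:
    """Inverse of C_S: walk backwards until f matches f(y) again."""
    target = f(y)
    k = key if target else weak_key(key)
    y = D(k, y)
    while f(y) != target:
        y = D(k, y)
    return y
```

Each walk ends after about two steps on average, since roughly half of all blocks satisfy $f$. Note that when $f(X)$ is false, `C_S_inv` only ever touches `weak_key(key)`, so any key with the same low-order bit decrypts such a block.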
For any known $K$, $C_S$ is an easily computed and easily inverted bijection on 16-byte blocks, such that $f(C_S(X))=f(X)$ for all blocks $X$. That's using the standard cycle walking technique. When it encrypts a message $M$, the block cipher $C_S$ is strong. But when it encrypts a random $X$, with probability about $1/2$ it uses $K'$, which can take only two values, and is thus weak.
Make $C_W$ the identity regardless of $K$, which ensures "$C_S$ is always more secure than $C_W$": block cipher $C_S$ is not good, but still always better than nothing.
When an adversary not knowing keys gets $C_S(M)$, the condition $f(C_S(M))$ holds and practically nothing is learned about $M$ (beyond confirmation that $f(M)$ holds).
When an adversary gets $C_S(P\oplus M)$ and $C_W(P)$, the latter yields $P$, and
- if $f(C_S(P\oplus M))$ holds, which has probability about $1/2$, it's learned that $f(P\oplus M)$ holds;
- otherwise, decryption of $C_S(P\oplus M)$ can be attempted with both the all-0x00 and the all-0xFF keys, yielding candidates $X_0$ and $X_1$, with both $f(X_0)$ and $f(X_1)$ false. $M$ must be one of the $M_i=X_i\oplus P$. Further, if $f(M_i)$ is false then we can rule that $i$ out, and be certain the other is $M$.
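The adversary's procedure in the two cases above can be sketched in Python. As an assumption of this sketch, `E` and `D` are a toy SHA-256 Feistel permutation standing in for AES-128, the low-order bit of $K$ is taken from the last key byte, and `attack` is a hypothetical name; the attacker never sees `key`.

```python
import hashlib

BLOCK = 16


def f(x: bytes) -> bool:
    """Condition f: every byte is below 0xF5."""
    return all(b < 0xF5 for b in x)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Toy round function (stand-in for a real cipher, not secure).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def E(key: bytes, x: bytes) -> bytes:
    """Toy keyed permutation standing in for AES-128 encryption."""
    l, r = x[:8], x[8:]
    for i in range(4):
        l, r = r, xor(l, _round(key, i, r))
    return l + r


def D(key: bytes, y: bytes) -> bytes:
    """Inverse of E."""
    l, r = y[:8], y[8:]
    for i in reversed(range(4)):
        l, r = xor(r, _round(key, i, l)), l
    return l + r


def C_S(key: bytes, x: bytes) -> bytes:
    """The contrived cipher: cycle-walk under K or under the weak key K'."""
    target = f(x)
    k = key if target else (b"\xff" if key[-1] & 1 else b"\x00") * BLOCK
    x = E(k, x)
    while f(x) != target:
        x = E(k, x)
    return x


def attack(c: bytes, p: bytes):
    """Given c = C_S(P xor M) and P (from the identity C_W), list candidates for M.

    Returns None when f(c) holds: then only f(P xor M) is learned.
    """
    if f(c):
        return None
    candidates = []
    for weak in (b"\x00" * BLOCK, b"\xff" * BLOCK):
        x = D(weak, c)          # walk the cycle backwards under this guess of K'
        while f(x):
            x = D(weak, x)
        m = xor(x, p)
        if f(m):                # a real message always satisfies f
            candidates.append(m)
    return candidates
```

When `attack` returns a list, the true $M$ is always in it, and the list has one or two entries depending on whether the wrong guess happens to fail the $f$ test.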
In the end an adversary learns $M$ with probability about $1/4$, one of two possible values for $M$ with probability about $1/4$, and about one bit worth of information about $M$ otherwise.
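These figures can be checked with a short calculation, under the heuristic assumption (introduced here, not stated above) that the candidate obtained from the wrong guess of $K'$ behaves like a uniformly random block. Write $p=(245/256)^{16}\approx0.495$ for the probability that $f$ holds on a random block; then

$$\begin{aligned}
\Pr[M\text{ is recovered}] &\approx (1-p)^2 \approx 0.255\\
\Pr[\text{two candidates for }M\text{ remain}] &\approx (1-p)\,p \approx 0.250\\
\Pr[\text{only }f(P\oplus M)\text{ is learned}] &= p \approx 0.495
\end{aligned}$$

The three cases are exhaustive, and indeed $(1-p)^2+(1-p)\,p+p=1$.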
We conclude that $C_S(P\oplus M)$ and $C_W(P)$ taken together is less secure than $C_S(M)$.
This could be extended to full-blown ciphers handling variable-length messages, with IV. The idea remains to have $C_S$ secure when encrypting plaintext $M$ with a certain characteristic, but insecure when encrypting random plaintext (e.g. by leaking its key). That's possible even if we add the constraint that encryption never increases size (beyond the IV).