One of the most thorough, highly optimized free FSM libraries available online is the AT&T FSM library. It implements "fsmdifference" exactly as you describe, requiring a determinized, epsilon-free FSM to compute the difference. One idea is to minimize one or both of the FSMs before taking the difference, which may help in some cases (determinizing is not the same as minimizing). The package also has an "approximate" or "greedy" minimization that is intended to be faster than a full minimization, at least in some cases.
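To make the determinism requirement concrete, here is a minimal sketch of a difference computed as a product of two complete DFAs. It is not the AT&T library's implementation; the function names and the dictionary encoding of transitions are assumptions made for illustration. The test "B's state is non-final" only means "B rejects" when B has exactly one run per word, which is why the subtracted machine has to be deterministic.

```python
from collections import deque

def dfa_difference(alphabet, delta_a, start_a, final_a, delta_b, start_b, final_b):
    """Product DFA for L(A) minus L(B). Both inputs must be complete DFAs,
    with transitions given as dicts mapping (state, symbol) -> state."""
    start = (start_a, start_b)
    delta, finals, seen = {}, set(), {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        pa, pb = state
        # Accept only where A accepts and B (deterministically) rejects.
        if pa in final_a and pb not in final_b:
            finals.add(state)
        for sym in alphabet:
            nxt = (delta_a[(pa, sym)], delta_b[(pb, sym)])
            delta[(state, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return delta, start, finals

def dfa_accepts(delta, start, finals, word):
    """Run a complete DFA on a word."""
    state = start
    for sym in word:
        state = delta[(state, sym)]
    return state in finals

# Hypothetical example: A = even number of a's, B = words ending in b.
A = {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}
B = {("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"}
diff = dfa_difference("ab", A, "e", {"e"}, B, "n", {"y"})
```

With this example, `dfa_accepts(*diff, "aa")` is true and `dfa_accepts(*diff, "aab")` is false, since "aab" has an even number of a's but ends in b.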
However, from studying similar problems, I believe there is a generalization or construction of FSMs, one that does not seem to appear in the literature, that can help with this problem by avoiding the determinization step, i.e. basically inverting (complementing) an NFA without creating an additional determinized FSM. The idea is to traverse the NFA edges "in parallel" and keep track of the set of nodes that make up the current "superstate" (set of states), just as in the standard determinizing algorithm. Then the NFA complement accepts if and only if the nodes in the current superstate are "all nonaccepting" (in contrast to the determinizing construction, which accepts iff "any accepting").
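A rough sketch of that idea, for checking a single word, might look like the following. The NFA encoding (a dict from (state, symbol) to a set of next states), the function names, and the example automaton are all assumptions for illustration, and the NFA is assumed to be epsilon-free. Only the superstates actually reached by the input are ever built.

```python
def step(delta, current, symbol):
    """Advance every state of the current superstate on one symbol."""
    nxt = set()
    for q in current:
        nxt |= delta.get((q, symbol), set())
    return frozenset(nxt)

def complement_accepts(delta, starts, accepting, word):
    """True iff the NFA rejects the word, i.e. the complement accepts it.
    Tracks the reachable superstate without building a determinized FSM."""
    current = frozenset(starts)
    for symbol in word:
        current = step(delta, current, symbol)
    # "All nonaccepting": no state in the superstate is accepting.
    # An empty superstate (every run died) also counts as rejecting.
    return current.isdisjoint(accepting)

# Hypothetical NFA over {a, b} accepting words that end in "ab".
delta = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
starts, accepting = {"q0"}, {"q2"}
```

Here `complement_accepts(delta, starts, accepting, "ab")` is false, because the superstate after "ab" is {q0, q2} and contains an accepting state, while the same call on "aba" is true, because the superstate is back to {q0, q1}.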
That said, I have not seen this written up before and did not find it in a quick online search. Many references suggest or imply that the only way to work with the complement of an NFA is to determinize it.
Here are two "nearby" references that might be useful for some ideas. I would be interested to hear of any others that are "closer". You mention you are working on program verification, which may be a field that has more direct research on the problem.
[1] Nazir Ahmad Zafar, Nabeel Sabir, and Amir Ali. Construction of Intersection of Nondeterministic Finite Automata using Z Notation.
[2] Orna Kupferman and Moshe Vardi. Complementation Constructions for Nondeterministic Automata on Infinite Words.